redis数据结构介绍六 快表

这应该是 redis 系列的最后一篇了,讲下快表,其实最前面讲的链表在早先的 redis 版本中也作为 list 的数据结构使用过,但是单纯的链表的缺陷之前也说了,插入便利,但是空间利用率低,并且不能进行二分查找等,检索效率低,ziplist 压缩表的产生也是同理,希望获得更好的性能,包括存储空间和访问性能等,原来我也不懂这个快表要怎么快,然后明白了一个道理,其实并没有什么银弹,只是大牛们会在适合的时候使用最适合的数据结构来实现性能的最大化,这里面有一招就是不同数据结构的组合调整,比如 Java 中的 HashMap,在链表节点数大于 8 时会转变成红黑树,以此提高访问效率,不费话了,回到快表,quicklist,这个数据结构主要使用在 list 类型中,如果我说其实这个 quicklist 就是个链表,可能大家不太会相信,但是事实上的确可以认为 quicklist 是个双向链表,看下代码

/* quicklistNode is a 32 byte struct describing a ziplist for a quicklist.
 * We use bit fields keep the quicklistNode at 32 bytes.
 * count: 16 bits, max 65536 (max zl bytes is 65k, so max count actually < 32k).
 * encoding: 2 bits, RAW=1, LZF=2.
 * container: 2 bits, NONE=1, ZIPLIST=2.
 * recompress: 1 bit, bool, true if node is temporarry decompressed for usage.
 * attempted_compress: 1 bit, boolean, used for verifying during testing.
 * extra: 10 bits, free for future use; pads out the remainder of 32 bits */
typedef struct quicklistNode {
    struct quicklistNode *prev;
    struct quicklistNode *next;
    unsigned char *zl;
    unsigned int sz;             /* ziplist size in bytes */
    unsigned int count : 16;     /* count of items in ziplist */
    unsigned int encoding : 2;   /* RAW==1 or LZF==2 */
    unsigned int container : 2;  /* NONE==1 or ZIPLIST==2 */
    unsigned int recompress : 1; /* was this node previous compressed? */
    unsigned int attempted_compress : 1; /* node can't compress; too small */
    unsigned int extra : 10; /* more bits to steal for future usage */
} quicklistNode;

/* quicklistLZF is a 4+N byte struct holding 'sz' followed by 'compressed'.
 * 'sz' is byte length of 'compressed' field.
 * 'compressed' is LZF data with total (compressed) length 'sz'
 * NOTE: uncompressed length is stored in quicklistNode->sz.
 * When quicklistNode->zl is compressed, node->zl points to a quicklistLZF */
typedef struct quicklistLZF {
    unsigned int sz; /* LZF size in bytes*/
    char compressed[];
} quicklistLZF;

/* quicklist is a 40 byte struct (on 64-bit systems) describing a quicklist.
 * 'count' is the number of total entries.
 * 'len' is the number of quicklist nodes.
 * 'compress' is: -1 if compression disabled, otherwise it's the number
 *                of quicklistNodes to leave uncompressed at ends of quicklist.
 * 'fill' is the user-requested (or default) fill factor. */
typedef struct quicklist {
    quicklistNode *head;
    quicklistNode *tail;
    unsigned long count;        /* total count of all entries in all ziplists */
    unsigned long len;          /* number of quicklistNodes */
    int fill : 16;              /* fill factor for individual nodes */
    unsigned int compress : 16; /* depth of end nodes not to compress;0=off */
} quicklist;

粗略看下,quicklist 里有 head,tail, quicklistNode里有 prev,next 指针,是不是有链表的基本轮廓了,那么为啥这玩意要称为快表呢,快在哪,关键就在这个unsigned char *zl;zl 是不是前面又看到过,就是 ziplist ,这是什么鬼,链表里用压缩表,这不套娃么,先别急,回顾下前面说的 ziplist,ziplist 有哪些特点,内存利用率高,可以从表头快速定位到尾节点,节点可以从后往前找,但是有个缺点,就是从中间插入的效率比较低,需要整体往后移,这个其实是普通数组的优化版,但还是有数组的一些劣势,所以要真的快,是不是可以将链表跟数组真的结合起来。

ziplist

这里有两个 redis 的配置参数,list-max-ziplist-sizelist-compress-depth,先来说第一个,既然快表是将链表跟压缩表数组结合起来使用,那么具体怎么用呢,比如我有一个 10 个元素的 list,那具体怎么放,每个 quicklistNode 里放多大的 ziplist,假如每个快表节点的 ziplist 只放一个元素,那么其实这就退化成了一个链表,如果 10 个元素放在一个 quicklistNode 的 ziplist 里,那就退化成了一个 ziplist,所以有了这个 list-max-ziplist-size,而且它还比较牛,能取正负值,当是正值时,对应的就是每个 quicklistNode 的 ziplist 中的元素个数,比如配置了 list-max-ziplist-size = 5,那么我刚才的 10 个元素的 list 就是一个两个 quicklistNode 组成的快表,每个 quicklistNode 中的 ziplist 包含了五个元素,当 list-max-ziplist-size取负值的时候,它限制了 ziplist 的字节数

size_t offset = (-fill) - 1;
if (offset < (sizeof(optimization_level) / sizeof(*optimization_level))) {
    if (sz <= optimization_level[offset]) {
        return 1;
    } else {
        return 0;
    }
} else {
    return 0;
}

/* Optimization levels for size-based filling */
static const size_t optimization_level[] = {4096, 8192, 16384, 32768, 65536};

/* Create a new quicklist.
 * Free with quicklistRelease(). */
quicklist *quicklistCreate(void) {
    struct quicklist *quicklist;

    quicklist = zmalloc(sizeof(*quicklist));
    quicklist->head = quicklist->tail = NULL;
    quicklist->len = 0;
    quicklist->count = 0;
    quicklist->compress = 0;
    quicklist->fill = -2;
    return quicklist;
}

这个 fill 就是传进来的 list-max-ziplist-size, 具体对应的就是

  • -5: 每个quicklist节点上的ziplist大小不能超过64 Kb。(注:1kb => 1024 bytes)
  • -4: 每个quicklist节点上的ziplist大小不能超过32 Kb。
  • -3: 每个quicklist节点上的ziplist大小不能超过16 Kb。
  • -2: 每个quicklist节点上的ziplist大小不能超过8 Kb。(-2是Redis给出的默认值)也就是上面的 quicklist->fill = -2;
  • -1: 每个quicklist节点上的ziplist大小不能超过4 Kb。

压缩

list-compress-depth这个参数呢是用来配置压缩的,等等压缩是为啥,不是里面已经是压缩表了么,大牛们就是为了性能殚精竭虑,这里考虑到的是一个场景,一般状况下,list 都是两端的访问频率比较高,那么是不是可以对中间的数据进行压缩,那么这个参数就是用来表示

/* depth of end nodes not to compress;0=off */
  • 0,代表不压缩,默认值
  • 1,两端各一个节点不压缩
  • 2,两端各两个节点不压缩
  • … 依次类推
    压缩后的 ziplist 就会变成 quicklistLZF,然后替换 zl 指针,这里使用的是 LZF 压缩算法,压缩后的 quicklistLZF 中的 compressed 也是个柔性数组,压缩后的 ziplist 整个就放进这个柔性数组

插入过程

简单说下插入元素的过程

/* Wrapper to allow argument-based switching between HEAD/TAIL pop */
void quicklistPush(quicklist *quicklist, void *value, const size_t sz,
                   int where) {
    if (where == QUICKLIST_HEAD) {
        quicklistPushHead(quicklist, value, sz);
    } else if (where == QUICKLIST_TAIL) {
        quicklistPushTail(quicklist, value, sz);
    }
}

/* Add new entry to head node of quicklist.
 *
 * Returns 0 if used existing head.
 * Returns 1 if new head created. */
int quicklistPushHead(quicklist *quicklist, void *value, size_t sz) {
    quicklistNode *orig_head = quicklist->head;
    if (likely(
            _quicklistNodeAllowInsert(quicklist->head, quicklist->fill, sz))) {
        quicklist->head->zl =
            ziplistPush(quicklist->head->zl, value, sz, ZIPLIST_HEAD);
        quicklistNodeUpdateSz(quicklist->head);
    } else {
        quicklistNode *node = quicklistCreateNode();
        node->zl = ziplistPush(ziplistNew(), value, sz, ZIPLIST_HEAD);

        quicklistNodeUpdateSz(node);
        _quicklistInsertNodeBefore(quicklist, quicklist->head, node);
    }
    quicklist->count++;
    quicklist->head->count++;
    return (orig_head != quicklist->head);
}

/* Add new entry to tail node of quicklist.
 *
 * Returns 0 if used existing tail.
 * Returns 1 if new tail created. */
int quicklistPushTail(quicklist *quicklist, void *value, size_t sz) {
    quicklistNode *orig_tail = quicklist->tail;
    if (likely(
            _quicklistNodeAllowInsert(quicklist->tail, quicklist->fill, sz))) {
        quicklist->tail->zl =
            ziplistPush(quicklist->tail->zl, value, sz, ZIPLIST_TAIL);
        quicklistNodeUpdateSz(quicklist->tail);
    } else {
        quicklistNode *node = quicklistCreateNode();
        node->zl = ziplistPush(ziplistNew(), value, sz, ZIPLIST_TAIL);

        quicklistNodeUpdateSz(node);
        _quicklistInsertNodeAfter(quicklist, quicklist->tail, node);
    }
    quicklist->count++;
    quicklist->tail->count++;
    return (orig_tail != quicklist->tail);
}

/* Wrappers for node inserting around existing node. */
REDIS_STATIC void _quicklistInsertNodeBefore(quicklist *quicklist,
                                             quicklistNode *old_node,
                                             quicklistNode *new_node) {
    __quicklistInsertNode(quicklist, old_node, new_node, 0);
}

REDIS_STATIC void _quicklistInsertNodeAfter(quicklist *quicklist,
                                            quicklistNode *old_node,
                                            quicklistNode *new_node) {
    __quicklistInsertNode(quicklist, old_node, new_node, 1);
}

/* Insert 'new_node' after 'old_node' if 'after' is 1.
 * Insert 'new_node' before 'old_node' if 'after' is 0.
 * Note: 'new_node' is *always* uncompressed, so if we assign it to
 *       head or tail, we do not need to uncompress it. */
REDIS_STATIC void __quicklistInsertNode(quicklist *quicklist,
                                        quicklistNode *old_node,
                                        quicklistNode *new_node, int after) {
    if (after) {
        new_node->prev = old_node;
        if (old_node) {
            new_node->next = old_node->next;
            if (old_node->next)
                old_node->next->prev = new_node;
            old_node->next = new_node;
        }
        if (quicklist->tail == old_node)
            quicklist->tail = new_node;
    } else {
        new_node->next = old_node;
        if (old_node) {
            new_node->prev = old_node->prev;
            if (old_node->prev)
                old_node->prev->next = new_node;
            old_node->prev = new_node;
        }
        if (quicklist->head == old_node)
            quicklist->head = new_node;
    }
    /* If this insert creates the only element so far, initialize head/tail. */
    if (quicklist->len == 0) {
        quicklist->head = quicklist->tail = new_node;
    }

    if (old_node)
        quicklistCompress(quicklist, old_node);

    quicklist->len++;
}

前面第一步先根据插入的是头还是尾选择不同的 push 函数,quicklistPushHead 或者 quicklistPushTail,举例分析下从头插入的 quicklistPushHead,先判断当前的 quicklistNode 节点还能不能允许再往 ziplist 里添加元素,如果可以就添加,如果不允许就新建一个 quicklistNode,然后调用 _quicklistInsertNodeBefore 将节点插进去,具体插入quicklist节点的操作类似链表的插入。