redis源码 -ziplist

大家好，又见面了，我是你们的朋友全栈君。

数据结构：

注释的翻译:

/* The ziplist is a specially encoded dually linked list that is designed
 * to be very memory efficient. It stores both strings and integer values,
 * where integers are encoded as actual integers instead of a series of
 * characters. It allows push and pop operations on either side of the list
 * in O(1) time. However, because every operation requires a reallocation of
 * the memory used by the ziplist, the actual complexity is related to the
 * amount of memory used by the ziplist.
 * 
 *  ziplist是特别编码的双向链表，设计目标是为了节省内存。保存字符串和整形数值，
 整数就是真正编码，而不是字符序列。在列表的两端push、pop操作都是O(1)复杂度。因
为每一个操作都需要一次重新分配内存，所以实际复杂度和ziplist的内存量有关。
 * ----------------------------------------------------------------------------
 *
 * ZIPLIST OVERALL LAYOUT:
 * The general layout of the ziplist is as follows:
 * <zlbytes><zltail><zllen><entry><entry><zlend>
 *
       zlbytes是一个无符号整型，整个ziplist的占用的字节数。resize整个结构大小的时候，保存整个值，就不需要遍历真个列表。
 * <zlbytes> is an unsigned integer to hold the number of bytes that the
 * ziplist occupies. This value needs to be stored to be able to resize the
 * entire structure without the need to traverse it first.
 *     zltail是最后一个entry的偏移量，这样对最后元素pop的时候避免遍历。
 * <zltail> is the offset to the last entry in the list. This allows a pop
 * operation on the far side of the list without the need for full traversal.
 *     zllen 是条目的数量，数量大于64K就需要遍历了，但是为啥嘞？
 * <zllen> is the number of entries.When this value is larger than 2**16-2,
 * we need to traverse the entire list to know how many items it holds.
 *     zlend等于255是个特殊值，表示列表的结尾。
 * <zlend> is a single byte special value, equal to 255, which indicates the
 * end of the list.
 *  
 * ZIPLIST ENTRIES:
       每个条目都有一个header，包含两块信息：一个是前一个条目的长度，为了从后向前遍历；
另一个是可选的字符串长度编码。
 * Every entry in the ziplist is prefixed by a header that contains two pieces
 * of information. First, the length of the previous entry is stored to be
 * able to traverse the list from back to front. Second, the encoding with an
 * optional string length of the entry itself is stored.
 *     前一个条目的长度这样编码：如果长度小于254bytes，只用一个字节表示长度；如果长度大于等于254，
 需要用5个字节。第一个字节设置成254表示，前一个条目的长度长度保存在后面4个字节内。
 * The length of the previous entry is encoded in the following way:
 * If this length is smaller than 254 bytes, it will only consume a single
 * byte that takes the length as value. When the length is greater than or
 * equal to 254, it will consume 5 bytes. The first byte is set to 254 to
 * indicate a larger value is following. The remaining 4 bytes take the
 * length of the previous entry as value.
 *     header的其他字段依据entry的内容而不同。如果entry是字符串，使用高两位表示编码类型，紧接着是实际长度；
 如果entry是整数，最高两位均设置成1，第3、4位表示使用哪种整型编码。
 * The other header field of the entry itself depends on the contents of the
 * entry. When the entry is a string, the first 2 bits of this header will hold
 * the type of encoding used to store the length of the string, followed by the
 * actual length of the string. When the entry is an integer the first 2 bits
 * are both set to 1. The following 2 bits are used to specify what kind of
 * integer will be stored after this header. An overview of the different
 * types and encodings is as follows:
 *      字符串长度小于等于63
 * |00pppppp| - 1 byte
 *      String value with length less than or equal to 63 bytes (6 bits).
 * |01pppppp|qqqqqqqq| - 2 bytes
 *      String value with length less than or equal to 16383 bytes (14 bits).
        使用14个bit表示长度，字符串长度小于2的14次方 - 1，总共占用2个字节
 * |10______|qqqqqqqq|rrrrrrrr|ssssssss|tttttttt| - 5 bytes
 *      String value with length greater than or equal to 16384 bytes.
        5个字节表示长度，前一个字节的前两位10表示后面4个字节长度
 * |11000000| - 1 byte
 *      Integer encoded as int16_t (2 bytes).
        16位整形
 * |11010000| - 1 byte
 *      Integer encoded as int32_t (4 bytes).
        4字节32位整形
 * |11100000| - 1 byte
 *      Integer encoded as int64_t (8 bytes).
 * |11110000| - 1 byte
 *      Integer encoded as 24 bit signed (3 bytes).
 * |11111110| - 1 byte
 *      Integer encoded as 8 bit signed (1 byte).
 * |1111xxxx| - (with xxxx between 0000 and 1101) immediate 4 bit integer.
 *      Unsigned integer from 0 to 12. The encoded value is actually from
 *      1 to 13 because 0000 and 1111 can not be used, so 1 should be
 *      subtracted from the encoded 4 bit value to obtain the right value.
 * |11111111| - End of ziplist.
 *
 * All the integers are represented in little endian byte order.
 */

拷贝了一张图【http://blog.csdn.net/u012658346/article/details/51321337】，整个结构分为三部分；

header：分成3部分：
zlbytes：整个list所占字节数
zltail：最后一个entry的指针位置
zllen：entry的个数，使用2个字节表示，说明ziplist做多容纳65535个entry。统计总数的时候，无需遍历整个list。
entries：列表条目，zlentry结构表示
end：一个字节，0xFF表示

这里写图片描述

entry的结构是：

typedef struct zlentry {
    unsigned int prevrawlensize, prevrawlen; // 前一个entry的字节数，前一个entry的长度，长度是个什么意思？？？
    unsigned int lensize, len;   // 当前节点的字节数，当前节点长度，同前长度是什么意思？
    unsigned int headersize;     // header里存储的是什么？
    unsigned char encoding;      // 编码方式
    unsigned char *p;            // 字符串指针，这个指针指向的内容是ziplist连续空间内部，还是单独的另一段内存空间？？
} zlentry;

常用的宏：

/* Utility macros */
// 工具宏
// 整个字节数
#define ZIPLIST_BYTES(zl)       (*((uint32_t*)(zl)))
// 尾指针相对于ziplist起始位置的偏移量
#define ZIPLIST_TAIL_OFFSET(zl) (*((uint32_t*)((zl)+sizeof(uint32_t))))
// entry数量
#define ZIPLIST_LENGTH(zl)      (*((uint16_t*)((zl)+sizeof(uint32_t)*2)))
// header的字节数
#define ZIPLIST_HEADER_SIZE     (sizeof(uint32_t)*2+sizeof(uint16_t))
// 首entry指针
#define ZIPLIST_ENTRY_HEAD(zl)  ((zl)+ZIPLIST_HEADER_SIZE)
// 尾entry指针
#define ZIPLIST_ENTRY_TAIL(zl)  ((zl)+intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl)))
// 结束位
#define ZIPLIST_ENTRY_END(zl)   ((zl)+intrev32ifbe(ZIPLIST_BYTES(zl))-1)

一些宏

/* Different encoding/length possibilities */
#define ZIP_STR_MASK 0xc0 // 这个怎么用的？
#define ZIP_INT_MASK 0x30 // 这个怎么用的？
// 00xx xxxx 小于等于63字节的字符串
#define ZIP_STR_06B (0 << 6)
// 01xx xxxx 14位，2的14次方 - 1个字节的字符串
#define ZIP_STR_14B (1 << 6)
// 10__ ____ 后面有4个字节的长度，大于等于2的14次方
#define ZIP_STR_32B (2 << 6)
// 1100 0000, 16位无符号整型（2字节）
#define ZIP_INT_16B (0xc0 | 0<<4)
// 1101 0000, 32位无符号整数（4字节）
#define ZIP_INT_32B (0xc0 | 1<<4)
// 1110 0000, 64位无符号整数（8字节）
#define ZIP_INT_64B (0xc0 | 2<<4)
// 1111 0000, 24位有符号整数（3字节）
#define ZIP_INT_24B (0xc0 | 3<<4)
// 1111 1110 8位有符号整数（1字节）
#define ZIP_INT_8B 0xfe

/* 4 bit integer immediate encoding */
1~13 13个数字和编码encoding公用同一个字节，叫做立即数
#define ZIP_INT_IMM_MASK 0x0f
// 最小值1
#define ZIP_INT_IMM_MIN 0xf1    /* 11110001 */
// 最大值15
#define ZIP_INT_IMM_MAX 0xfd    /* 11111101 */
// 低4位按位取整
#define ZIP_INT_IMM_VAL(v) (v & ZIP_INT_IMM_MASK)

// 24为有符号数最大值
#define INT24_MAX 0x7fffff
// 24为有符号数最小值
#define INT24_MIN (-INT24_MAX - 1)

/* Macro to determine type */
// 小于1100 0000 的都是字符串编码，大于或等于的是整形编码
#define ZIP_IS_STR(enc) (((enc) & ZIP_STR_MASK) < ZIP_STR_MASK)

/* We know a positive increment can only be 1 because entries can only be
 * pushed one at a time. */
// 长度（entry的数量）加一
#define ZIPLIST_INCR_LENGTH(zl,incr) { \
    if (ZIPLIST_LENGTH(zl) < UINT16_MAX) \
        ZIPLIST_LENGTH(zl) = intrev16ifbe(intrev16ifbe(ZIPLIST_LENGTH(zl))+incr); \
}

ziplist的创建：

/* Create a new empty ziplist. */
// 创建一个空的ziplist
unsigned char *ziplistNew(void) {
	// header = 字节数 + 尾指针 + end
    unsigned int bytes = ZIPLIST_HEADER_SIZE+1;
    unsigned char *zl = zmalloc(bytes);
	// 设置长度
    ZIPLIST_BYTES(zl) = intrev32ifbe(bytes);
	// 设置尾指针和开始位置的偏移量
    ZIPLIST_TAIL_OFFSET(zl) = intrev32ifbe(ZIPLIST_HEADER_SIZE);
	// 设置entry的数量
    ZIPLIST_LENGTH(zl) = 0;
	// 结束字节
    zl[bytes-1] = ZIP_END;
    return zl;
}

resize的源码

/* Resize the ziplist. */
static unsigned char *ziplistResize(unsigned char *zl, unsigned int len) {
    zl = zrealloc(zl,len);
    ZIPLIST_BYTES(zl) = intrev32ifbe(len);
    zl[len-1] = ZIP_END;
    return zl;
}

取出编码方式字节；1、2、3、4、8、0，分别对应字节数，这个直观点还是；

/* Return bytes needed to store integer encoded by 'encoding' */
static unsigned int zipIntSize(unsigned char encoding) {
    switch(encoding) {
	// 1111 1110 一个字节有符号整形
    case ZIP_INT_8B:  return 1;
	// 1100 0000 2字节无符号整形
    case ZIP_INT_16B: return 2;
	// 1111 0000 3字节有符号整形
    case ZIP_INT_24B: return 3;
	// 1101 0000 4字节无符号整形
    case ZIP_INT_32B: return 4;
	// 1110 0000 8字节无符号整形
    case ZIP_INT_64B: return 8;
	// 4位无符号整形立即数
    default: return 0; /* 4 bit immediate */
    }
    assert(NULL);
    return 0;
}

设置长度

/* Return bytes needed to store integer encoded by 'encoding' */
static unsigned int zipIntSize(unsigned char encoding) {
    switch(encoding) {
	// 1111 1110 一个字节有符号整形
    case ZIP_INT_8B:  return 1;
	// 1100 0000 2字节无符号整形
    case ZIP_INT_16B: return 2;
	// 1111 0000 3字节有符号整形
    case ZIP_INT_24B: return 3;
	// 1101 0000 4字节无符号整形
    case ZIP_INT_32B: return 4;
	// 1110 0000 8字节无符号整形
    case ZIP_INT_64B: return 8;
	// 4位无符号整形立即数
    default: return 0; /* 4 bit immediate */
    }
    assert(NULL);
    return 0;
}

/* Encode the length 'rawlen' writing it in 'p'. If p is NULL it just returns
 * the amount of bytes required to encode such a length. */
 // 首字节
static unsigned int zipEncodeLength(unsigned char *p, unsigned char encoding, unsigned int rawlen) {
    unsigned char len = 1, buf[5];

    if (ZIP_IS_STR(encoding)) {
		// #define ZIP_STR_06B (0 << 6)  0000 0000
		// #define ZIP_STR_14B (1 << 6)  0100 0000
		// #define ZIP_STR_32B (2 << 6)  1000 0000
        /* Although encoding is given it may not be set for strings,
         * so we determine it here using the raw length. */
		// 把原始长度写入到encoding所在位置
        if (rawlen <= 0x3f) {
			// 小于等于63
            if (!p) return len;
			0000 0000 | rawlen
            buf[0] = ZIP_STR_06B | rawlen;
        } else if (rawlen <= 0x3fff) {
			// 小于2的14次方，这里是大端模式，而不是小端模式了！！！
            len += 1;
            if (!p) return len;
			// 0100 0000，取高六位
            buf[0] = ZIP_STR_14B | ((rawlen >> 8) & 0x3f);
			// 低八位
            buf[1] = rawlen & 0xff;
        } else {
			// 大于等于2的14次方，用4个字节保存长度，大端模式；共5个字节
            len += 4;
            if (!p) return len;
            buf[0] = ZIP_STR_32B;
            buf[1] = (rawlen >> 24) & 0xff;
            buf[2] = (rawlen >> 16) & 0xff;
            buf[3] = (rawlen >> 8) & 0xff;
            buf[4] = rawlen & 0xff;
        }
    } else {
        /* Implies integer encoding, so length is always 1. */
		// 整形占用一个字节
        if (!p) return len;
        buf[0] = encoding;
    }

    /* Store this length at p */
	// 长度拷贝到p指针位置
    memcpy(p,buf,len);
    return len;
}

解析长度编码的宏

/* Decode the length encoded in 'ptr'. The 'encoding' variable will hold the
 * entries encoding, the 'lensize' variable will hold the number of bytes
 * required to encode the entries length, and the 'len' variable will hold the
 * entries length. */
// 从ptr中解析成长度编码。encoding 变量代表整个编码；lensize变量代表整个长度编码所需字节数；
// len 代表整个长度
#define ZIP_DECODE_LENGTH(ptr, encoding, lensize, len) do {                    \
	// 从prt解析出encoding
    ZIP_ENTRY_ENCODING((ptr), (encoding));                                     \
	// 先处理字符串
    if ((encoding) < ZIP_STR_MASK) {                                           \
		// 小于等于63个字节的字符串
        if ((encoding) == ZIP_STR_06B) {                                       \
			// 一个字节存储长度
            (lensize) = 1;                                                     \
			// entry的长度是 第一个字节后六位所表示整数
            (len) = (ptr)[0] & 0x3f;                                           \
		// 小于等于2的14方 entry长度
        } else if ((encoding) == ZIP_STR_14B) {                                \
			两个字节表示
            (lensize) = 2;                                                     \
			// 高有效值存储在低位，大端模式
            (len) = (((ptr)[0] & 0x3f) << 8) | (ptr)[1];                       \
		// 小于2的32次方
        } else if (encoding == ZIP_STR_32B) {                                  \
			//  需要5个字节保存编码和长度
            (lensize) = 5;                                                     \
			// 大端模式
            (len) = ((ptr)[1] << 24) |                                         \
                    ((ptr)[2] << 16) |                                         \
                    ((ptr)[3] <<  8) |                                         \
                    ((ptr)[4]);                                                \
        } else {                                                               \
            assert(NULL);                                                      \
        }                                                                      \
    } else {                                                                   \
        (lensize) = 1;   													   \
		// payload需要的字节数
        (len) = zipIntSize(encoding);                                          \
    }                                                                          \
} while(0);

前一个entry长度

/* Encode the length of the previous entry and write it to "p". Return the
 * number of bytes needed to encode this length if "p" is NULL. */
// 前一个entry的长度写入到p
static unsigned int zipPrevEncodeLength(unsigned char *p, unsigned int len) {
    if (p == NULL) {
        return (len < ZIP_BIGLEN) ? 1 : sizeof(len)+1;
    } else {
		// 小于254，占用一个字节
        if (len < ZIP_BIGLEN) {
            p[0] = len;
            return 1;
        } else {
			// 大于等于245，占用4个字节保存长度，一共需要5个字节
            p[0] = ZIP_BIGLEN;
            memcpy(p+1,&len,sizeof(len));
            memrev32ifbe(p+1);
            return 1+sizeof(len);
        }
    }
}

前一个节点的长度，编码、解码

/* Encode the length of the previous entry and write it to "p". This only
 * uses the larger encoding (required in __ziplistCascadeUpdate). */
// 前一个节点的编码写到p
static void zipPrevEncodeLengthForceLarge(unsigned char *p, unsigned int len) {
    if (p == NULL) return;
    p[0] = ZIP_BIGLEN;
    memcpy(p+1,&len,sizeof(len));
    memrev32ifbe(p+1);
}

/* Decode the number of bytes required to store the length of the previous
 * element, from the perspective of the entry pointed to by 'ptr'. */
 // 从p解析长度编码
#define ZIP_DECODE_PREVLENSIZE(ptr, prevlensize) do {                          \
    if ((ptr)[0] < ZIP_BIGLEN) {                                               \
        (prevlensize) = 1;                                                     \
    } else {                                                                   \
        (prevlensize) = 5;                                                     \
    }                                                                          \
} while(0);

前一个节点能不能保存len长度的字串。

/* Return the difference in number of bytes needed to store the length of the
 * previous element 'len', in the entry pointed to by 'p'. */
 // 判断len长度的字符串，p能不能存得下；如果大于0，小于0分别代表什么含义？
static int zipPrevLenByteDiff(unsigned char *p, unsigned int len) {
    unsigned int prevlensize;
    ZIP_DECODE_PREVLENSIZE(p, prevlensize);
    return zipPrevEncodeLength(NULL, len) - prevlensize;
}

entry的存储结构和struct需要转换一下。

/* Return the total number of bytes used by the entry pointed to by 'p'. */
// 好吧，这个才是最重要的;p指向entry的开始offset
// p-->|prelensize|当前节点lensize|实际payload的len
// prelensize：前一个节点的长度encoding和lensize，1或者5个字节
   lensize：当前节点的长度encoding和lensize，字符串是1 + (0,1,4)；数字是1 + (0,1,2,3,4,8）
// len: 第三部分是真正的数据占用的字节
static unsigned int zipRawEntryLength(unsigned char *p) {
    unsigned int prevlensize, encoding, lensize, len;
    ZIP_DECODE_PREVLENSIZE(p, prevlensize);
    ZIP_DECODE_LENGTH(p + prevlensize, encoding, lensize, len);
    return prevlensize + lensize + len;
}

检查字符串是否可以转成整数表示

/* Check if string pointed to by 'entry' can be encoded as an integer.
 * Stores the integer value in 'v' and its encoding in 'encoding'. */
// 检查entry存储的值，是否能用integer表示；有什么用呢，把字符串转成整形为什么？？
int zipTryEncoding(unsigned char *entry, unsigned int entrylen, long long *v, unsigned char *encoding) {
    long long value;

    if (entrylen >= 32 || entrylen == 0) return 0;
    if (string2ll((char*)entry,entrylen,&value)) {
        /* Great, the string can be encoded. Check what's the smallest
         * of our encoding types that can hold this value. */
		// 
        if (value >= 0 && value <= 12) {
			// 立即数
            *encoding = ZIP_INT_IMM_MIN+value;
        } else if (value >= INT8_MIN && value <= INT8_MAX) {
			// 后面跟一个字节
            *encoding = ZIP_INT_8B;
        } else if (value >= INT16_MIN && value <= INT16_MAX) {
			// 后面跟两个字节
            *encoding = ZIP_INT_16B;
        } else if (value >= INT24_MIN && value <= INT24_MAX) {
			// 后面跟3个字节
            *encoding = ZIP_INT_24B;
        } else if (value >= INT32_MIN && value <= INT32_MAX) {
			// 后面跟4个字节
            *encoding = ZIP_INT_32B;
        } else {
			// 后面跟5个字节
            *encoding = ZIP_INT_64B;
        }
        *v = value;
        return 1;
    }
    return 0;
}

保存整数

/* Store integer 'value' at 'p', encoded as 'encoding' */
// 保存数字的写操作。数字value保存在p，encoding表示编码；小端模式；
void zipSaveInteger(unsigned char *p, int64_t value, unsigned char encoding) {
    int16_t i16;
    int32_t i32;
    int64_t i64;
    if (encoding == ZIP_INT_8B) {
        ((int8_t*)p)[0] = (int8_t)value;
    } else if (encoding == ZIP_INT_16B) {
        i16 = value;
        memcpy(p,&i16,sizeof(i16));
        memrev16ifbe(p);
    } else if (encoding == ZIP_INT_24B) {
        i32 = value<<8;
        memrev32ifbe(&i32);
        memcpy(p,((uint8_t*)&i32)+1,sizeof(i32)-sizeof(uint8_t));
    } else if (encoding == ZIP_INT_32B) {
        i32 = value;
        memcpy(p,&i32,sizeof(i32));
        memrev32ifbe(p);
    } else if (encoding == ZIP_INT_64B) {
        i64 = value;
        memcpy(p,&i64,sizeof(i64));
        memrev64ifbe(p);
    } else if (encoding >= ZIP_INT_IMM_MIN && encoding <= ZIP_INT_IMM_MAX) {
        /* Nothing to do, the value is stored in the encoding itself. */
    } else {
        assert(NULL);
    }
}

读取整数

/* Read integer encoded as 'encoding' from 'p' */
// 读操作；读整数从指针p，根据encoding
int64_t zipLoadInteger(unsigned char *p, unsigned char encoding) {
    int16_t i16;
    int32_t i32;
    int64_t i64, ret = 0;
    if (encoding == ZIP_INT_8B) {
        ret = ((int8_t*)p)[0];
    } else if (encoding == ZIP_INT_16B) {
        memcpy(&i16,p,sizeof(i16));
        memrev16ifbe(&i16);
        ret = i16;
    } else if (encoding == ZIP_INT_32B) {
        memcpy(&i32,p,sizeof(i32));
        memrev32ifbe(&i32);
        ret = i32;
    } else if (encoding == ZIP_INT_24B) {
        i32 = 0;
        memcpy(((uint8_t*)&i32)+1,p,sizeof(i32)-sizeof(uint8_t));
        memrev32ifbe(&i32);
        ret = i32>>8;
    } else if (encoding == ZIP_INT_64B) {
        memcpy(&i64,p,sizeof(i64));
        memrev64ifbe(&i64);
        ret = i64;
    } else if (encoding >= ZIP_INT_IMM_MIN && encoding <= ZIP_INT_IMM_MAX) {
        ret = (encoding & ZIP_INT_IMM_MASK)-1;
    } else {
        assert(NULL);
    }
    return ret;
}

构造一个entry

/* Return a struct with all information about an entry. */
// 读取操作。构造一个entry；回头再看一下entry的结构；
/*  typedef struct zlentry {
        unsigned int prevrawlensize, prevrawlen;
        unsigned int lensize, len;
        unsigned int headersize;
        unsigned char encoding;
        unsigned char *p;
    } zlentry;
*/
// p指针指向哪里？指向entry的header起始位置
void zipEntry(unsigned char *p, zlentry *e) {

    ZIP_DECODE_PREVLEN(p, e->prevrawlensize, e->prevrawlen);
    ZIP_DECODE_LENGTH(p + e->prevrawlensize, e->encoding, e->lensize, e->len);
    e->headersize = e->prevrawlensize + e->lensize;
	// 指针指向存header起始位置
    e->p = p;
}

再插入节点的时候调用

/* When an entry is inserted, we need to set the prevlen field of the next
 * entry to equal the length of the inserted entry. It can occur that this
 * length cannot be encoded in 1 byte and the next entry needs to be grow
 * a bit larger to hold the 5-byte encoded prevlen. This can be done for free,
 * because this only happens when an entry is already being inserted (which
 * causes a realloc and memmove). However, encoding the prevlen may require
 * that this entry is grown as well. This effect may cascade throughout
 * the ziplist when there are consecutive entries with a size close to
 * ZIP_BIGLEN, so we need to check that the prevlen can be encoded in every
 * consecutive entry.
 *     在插入数据的时候，需要设置next节点的prevlen，这长度等于新插入节点的字节数。
 要插入节点的长度需要5个字节，此时next节点的需要从1 byte增加到5 byte。增加字节是简单的，
 但是增加prevlen字节后，next节点的长度也发生了变化。如果遇到连续多个节点的长度都接近254，
 会引起连锁反应，所以需要检查每一个后续节点的prevlen字段是否需要扩容。
 
 * Note that this effect can also happen in reverse, where the bytes required
 * to encode the prevlen field can shrink. This effect is deliberately ignored,
 * because it can cause a "flapping" effect where a chain prevlen fields is
 * first grown and then shrunk again after consecutive inserts. Rather, the
 * field is allowed to stay larger than necessary, because a large prevlen
 * field implies the ziplist is holding large entries anyway.
 *     注意这个过程也可以是相反的，对前一个节点的长度编码所需长度是变小的情况。
 缩小的时候会被忽略掉，插入一个长度大于等于254的entry，然后再插入一个长度小于254，
 后续节点会反复修改。所以prevlen允许比实际需要的长度大，prevlen大一点暗示ziplist持有大的entry。
 * The pointer "p" points to the first entry that does NOT need to be
 * updated, i.e. consecutive fields MAY need an update. 
       p指向第一个entry不需要修改，后面的节点才可能会被修改。
 */
 // zl: ziplist的起始位置指针
 // p : 某一个entry的起始位置指针
 static unsigned char *__ziplistCascadeUpdate(unsigned char *zl, unsigned char *p) {
    size_t curlen = intrev32ifbe(ZIPLIST_BYTES(zl)), rawlen, rawlensize;
    size_t offset, noffset, extra;
    unsigned char *np;
    zlentry cur, next;

    while (p[0] != ZIP_END) {
		// 先取出一个entry
        cur = zipEntry(p);
		// headersize = prevlen + encoding + payload
        rawlen = cur.headersize + cur.len;
		// 前一个节点长度，需要的字节数
        rawlensize = zipPrevEncodeLength(NULL,rawlen);

        /* Abort if there is no next entry. */
		// 最后一个节点，结束
        if (p[rawlen] == ZIP_END) break;
		// 取出下一个节点
        next = zipEntry(p+rawlen);

        /* Abort when "prevlen" has not changed. */
		// 节点的长度没有变化，结束
        if (next.prevrawlen == rawlen) break;

		// p 对应的节点保存节点长度所需字节数 要更多一点；rawlensize
        if (next.prevrawlensize < rawlensize) {
            /* The "prevlen" field of "next" needs more bytes to hold
             * the raw length of "cur". */
			// next节点的prevlen需要扩容，容纳cur节点的长度所需字节数
            offset = p-zl;
            extra = rawlensize-next.prevrawlensize;
			// resize方法调用realloc内存重新分配，zl指针可能会变化
            zl = ziplistResize(zl,curlen+extra);
            p = zl+offset;

            /* Current pointer and offset for next element. */
			// next pointer 下一个节点指针 
            np = p+rawlen;
			// next offset 下一个节点的偏移量
            noffset = np-zl;

            /* Update tail offset when next element is not the tail element. */
			// 如果next 节点不是tail节点；设置ziplist的tail节点offset；
			// 如果是最后一个节点了，就不需要再设置长度
            if ((zl+intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))) != np) {
                ZIPLIST_TAIL_OFFSET(zl) =
                    intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+extra);
            }

            /* Move the tail to the back. */
			// rawlensize = 5, next.prevrawlensize = 1,curlen = ziplist所有字节数
			// np往后一直到tail的节点都向后平移，为什么要减1呢？？？最后一个END也应该在里面才对
            memmove(np+rawlensize,
                np+next.prevrawlensize,
                curlen-noffset-next.prevrawlensize-1);
			// 设置next节点的前置长度字段
            zipPrevEncodeLength(np,rawlen);

            /* Advance the cursor */
			// 移动到下一个entry节点
            p += rawlen;
			// ziplist总长度需要增大
            curlen += extra;
        } else {
			// 如果长度足够
            if (next.prevrawlensize > rawlensize) {
                /* This would result in shrinking, which we want to avoid.
                 * So, set "rawlen" in the available bytes. */
				// next节点的prevlen字段不需要扩容，只是扩大encoding；
				// p+rawlen指向下一个entry节点的起始位置
                zipPrevEncodeLengthForceLarge(p+rawlen,rawlen);
            } else {
				// 长度和encoding都修改
                zipPrevEncodeLength(p+rawlen,rawlen);
            }

            /* Stop here, as the raw length of "next" has not changed. */
			// 重要--直到next节点的prevrawlensize不需要修改
            break;
        }
    }
    return zl;
}

/* Encode the length of the previous entry and write it to "p". This only
 * uses the larger encoding (required in __ziplistCascadeUpdate). */
// p后面的节点, 只是使用更大一点的encoding
static void zipPrevEncodeLengthForceLarge(unsigned char *p, unsigned int len) {
    if (p == NULL) return;
    p[0] = ZIP_BIGLEN;
    memcpy(p+1,&len,sizeof(len));
    memrev32ifbe(p+1);
}

插入节点：唯一不明白的地方是，

 memmove(p+reqlen,p-nextdiff,curlen-offset-1+nextdiff);

为什么要把最后的END标记去掉？

/* Insert item at "p". */
/*
  zl : 压缩表起始指针
  p  ：带插入节点位置指针
  s  ：要插入的节点
  slen：要插入节点长度
*/
static unsigned char *__ziplistInsert(unsigned char *zl, unsigned char *p, unsigned char *s, unsigned int slen) {
	// 整个列表字节数
    size_t curlen = intrev32ifbe(ZIPLIST_BYTES(zl)), reqlen;
    unsigned int prevlensize, prevlen = 0;
    size_t offset;
    int nextdiff = 0;
    unsigned char encoding = 0;
	// 可以知道是否真正初始化了
    long long value = 123456789; /* initialized to avoid warning. Using a value
                                    that is easy to see if for some reason
                                    we use it uninitialized. */
    zlentry tail;

    /* Find out prevlen for the entry that is inserted. */
    if (p[0] != ZIP_END) {
		// 读操作，后一个节点保存的prevlensize 和 prevlen，实际是前一个节点的字节数
        ZIP_DECODE_PREVLEN(p, prevlensize, prevlen);
    } else {
        unsigned char *ptail = ZIPLIST_ENTRY_TAIL(zl);
		// 里面原来至少有一个节点，不是空的
        if (ptail[0] != ZIP_END) {
            prevlen = zipRawEntryLength(ptail);
        }
		
		// p[0] == ZIP_END && ptail[0] == ZIP_END 这种情况表示是空的列表，直接插入即可
    }

    /* See if the entry can be encoded */
	// 检查是否可以用整数表示，整数表示有什么好处吗？？？
    if (zipTryEncoding(s,slen,&value,&encoding)) {
        /* 'encoding' is set to the appropriate integer encoding */
		// 如果可以使用整形表示，使用整形保存；
        reqlen = zipIntSize(encoding);
    } else {
        /* 'encoding' is untouched, however zipEncodeLength will use the
         * string length to figure out how to encode it. */
		// 不能用整形表示，新增的节点的字节数就是slen，要插入的长度
        reqlen = slen;
    }
    /* We need space for both the length of the previous entry and
     * the length of the payload. */
	// 包一个header，prevlen是前一个节点的字节数
    reqlen += zipPrevEncodeLength(NULL,prevlen);
	// 当前节点的编码encoding所占的字节数
    reqlen += zipEncodeLength(NULL,encoding,slen);

    /* When the insert position is not equal to the tail, we need to
     * make sure that the next entry can hold this entry's length in
     * its prevlen field. */
	// 如果插入位置不是结尾，需要确认next的节点可以容纳新插入节点的字节数
	// diff = 新的插入的节点需要字节数 - 原来后面节点已经保存的前一个节点的字节数
    nextdiff = (p[0] != ZIP_END) ? zipPrevLenByteDiff(p,reqlen) : 0;

    /* Store offset because a realloc may change the address of zl. */
	// 因为ziplist的resize使用了realloc方法，address可能会发生变化
    offset = p-zl;
	// reqlen就是entry的长度，nextdiff是下一个节点扩容（4）或者缩容的长度（-4）
	// 小于254 使用一个字节，大于等于254 使用4个字节
    zl = ziplistResize(zl,curlen+reqlen+nextdiff);
	// 要插入的位置
    p = zl+offset;

    /* Apply memory move when necessary and update tail offset. */
	// 不是最后一个节点，需要往后挪数据，腾空位置
    if (p[0] != ZIP_END) {
        /* Subtract one because of the ZIP_END bytes */
		// p+reqlen：空出来reqlen个字节
		// p-nextdiff: 其实就是多拷贝4个字节
		// curlen-offset-1+nextdiff ：整体后移的字节数，为什么最后的END不需要移动呢？？？
		/* 将原来 p-nextdiff 开始的数据全部后移，中间出现reqlen个字节保存即将插入的数据 
            主要需要考虑一下几种情况：
            nextdiff == 0：p节点中用来存储原先前一个节点长度信息的数据区域正好保存待插入节点的长度
            nextdiff == 4：原先p节点只需要1个字节来存储上一个节点的长度，现在需要5个字节。那就将p-4后面的数据偏移到p+reqlen
            nextdiff == -4：原先p节点需要5个字节来存储上一个节点的长度，现在只需要1个字节。那就将p+4后面的数据偏移到p+reqlen
        */
        memmove(p+reqlen,p-nextdiff,curlen-offset-1+nextdiff);

        /* Encode this entry's raw length in the next entry. */
		// p+reqlen是新插入节点的next节点,这一步操是把新插入节点的长度保存在，下一个节点的header里
        zipPrevEncodeLength(p+reqlen,reqlen);

        /* Update offset for tail */
		// 更新tail指针
        ZIPLIST_TAIL_OFFSET(zl) =
            intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+reqlen);

        /* When the tail contains more than one entry, we need to take
         * "nextdiff" in account as well. Otherwise, a change in the
         * size of prevlen doesn't have an effect on the *tail* offset. */
		// 新插入节点的后继节点，不是END；后一个节点扩容的4个字节要加上（nextdiff）；
		// 如果新插入节点是tail，不需要增加，有点多余这个if判断，因为最前面的p[0] != END 已经说明了这个不是最后一个节点
        tail = zipEntry(p+reqlen);
        if (p[reqlen+tail.headersize+tail.len] != ZIP_END) {
            ZIPLIST_TAIL_OFFSET(zl) =
                intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+nextdiff);
        }
    } else {
        /* This element will be the new tail. */
		// 插入到最后一个位置，后面就紧跟着END
        ZIPLIST_TAIL_OFFSET(zl) = intrev32ifbe(p-zl);
    }

    /* When nextdiff != 0, the raw length of the next entry has changed, so
     * we need to cascade the update throughout the ziplist */
	// 新插入节点的next节点发生变化，需要从next节点开始一直到tail节点都连锁更新
    if (nextdiff != 0) {
        offset = p-zl;
		// 使用realloc实现，zl指针可能会变化
        zl = __ziplistCascadeUpdate(zl,p+reqlen);
        p = zl+offset;
    }

    /* Write the entry */
	// 真正写入，先写前一个节点的len
    p += zipPrevEncodeLength(p,prevlen);
	//  然后再写当前节点的encoding和len
    p += zipEncodeLength(p,encoding,slen);
	// 如果是字符串直接拷贝
    if (ZIP_IS_STR(encoding)) {
        memcpy(p,s,slen);
    } else {
		// 否则保存成整数
        zipSaveInteger(p,value,encoding);
    }
	// 长度加1
    ZIPLIST_INCR_LENGTH(zl,1);
    return zl;
}

删除节点：

/* Delete "num" entries, starting at "p". Returns pointer to the ziplist. */
/*
  zl: 起始指针
  p：要删除的节点
  num：要删除的节点个数
*/
static unsigned char *__ziplistDelete(unsigned char *zl, unsigned char *p, unsigned int num) {
    unsigned int i, totlen, deleted = 0;
    size_t offset;
    int nextdiff = 0;
    zlentry first, tail;

	// 从p开始读取第一个节点
    zipEntry(p, &first);
    for (i = 0; p[0] != ZIP_END && i < num; i++) {
		// 逐个删除，指针一直后移，数据没动
        p += zipRawEntryLength(p);
		// 计数器 + 1
        deleted++;
    }

	// 总共删除的字节数
    totlen = p-first.p; /* Bytes taken by the element(s) to delete. */
    if (totlen > 0) {
        if (p[0] != ZIP_END) {
            /* Storing `prevrawlen` in this entry may increase or decrease the
             * number of bytes required compare to the current `prevrawlen`.
             * There always is room to store this, because it was previously
             * stored by an entry that is now being deleted. */
			// first的前置节点后面紧邻p指向的节点；如果大于0，说明p需要扩容4个字节，first前面那个节点的长度大于254
			nextdiff = zipPrevLenByteDiff(p,first.prevrawlen);
			// 往做移动不会覆盖first左边的节点，因为这个时候还没有memmove，
            p -= nextdiff;
			// 写入p的 prevlen  1+4 个字节
            zipPrevEncodeLength(p,first.prevrawlen);

            /* Update offset for tail */
			// 更新tail偏移量
            ZIPLIST_TAIL_OFFSET(zl) =
                intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))-totlen);

            /* When the tail contains more than one entry, we need to take
             * "nextdiff" in account as well. Otherwise, a change in the
             * size of prevlen doesn't have an effect on the *tail* offset. */
			// 当tail含有多个节点，需要考虑链式扩容；后面半句没看懂？？？
            tail = zipEntry(p);
			// 如果不是最后一个节点，要再增加4个字节 扩容时增加的
            if (p[tail.headersize+tail.len] != ZIP_END) {
                ZIPLIST_TAIL_OFFSET(zl) =
                   intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+nextdiff);
            }

            /* Move tail to the front of the ziplist */
			// 向前移动，为什么要减去1呢？？？END起始也应该拷贝过去的啊
            memmove(first.p,p,
                intrev32ifbe(ZIPLIST_BYTES(zl))-(p-zl)-1);
        } else {
            /* The entire tail was deleted. No need to move memory. */
			// p后面全删了，不需要内存移动，first前面的那个节点的偏移量作为tail的offset
            ZIPLIST_TAIL_OFFSET(zl) =
                intrev32ifbe((first.p-zl)-first.prevrawlen);
        }

        /* Resize and update length */
		// delete后p的offset
        offset = first.p-zl;
		// 为什么要resize呢，已经memmove了？？？
        zl = ziplistResize(zl, intrev32ifbe(ZIPLIST_BYTES(zl))-totlen+nextdiff);
        ZIPLIST_INCR_LENGTH(zl,-deleted);
		// 新的指针指向下一个将被删除但是还没有删除的节点
        p = zl+offset;

        /* When nextdiff != 0, the raw length of the next entry has changed, so
         * we need to cascade the update throughout the ziplist */
		// nextdiff != 0，下个将被删除的节点的header（prevlensize）增加了4个字节，所以要处理连锁反应
        if (nextdiff != 0)
            zl = __ziplistCascadeUpdate(zl,p);
    }
    return zl;
}

查找

/* Find pointer to the entry equal to the specified entry. Skip 'skip' entries
 * between every comparison. Returns NULL when the field could not be found. */
 /*
   p: skiplist起始位置
   vstr ：要查找字符串
   vlen ：要查找字符串长度
   skip：每次比较skip跳跃几个entry，到底什么意思不清楚？？？
 */
unsigned char *ziplistFind(unsigned char *p, unsigned char *vstr, unsigned int vlen, unsigned int skip) {
    int skipcnt = 0;
    unsigned char vencoding = 0;
    long long vll = 0;

    while (p[0] != ZIP_END) {
        unsigned int prevlensize, encoding, lensize, len;
        unsigned char *q;

	// q 存储的是payload
        ZIP_DECODE_PREVLENSIZE(p, prevlensize);
        ZIP_DECODE_LENGTH(p + prevlensize, encoding, lensize, len);
        q = p + prevlensize + lensize;

        if (skipcnt == 0) {
            /* Compare current entry with specified entry */
	    // 比较字符串
            if (ZIP_IS_STR(encoding)) {
                if (len == vlen && memcmp(q, vstr, vlen) == 0) {
                    return p;
                }
            } else {
		// 比较整数
                /* Find out if the searched field can be encoded. Note that
                 * we do it only the first time, once done vencoding is set
                 * to non-zero and vll is set to the integer value. */
                if (vencoding == 0) {
                    if (!zipTryEncoding(vstr, vlen, &vll, &vencoding)) {
                        /* If the entry can't be encoded we set it to
                         * UCHAR_MAX so that we don't retry again the next
                         * time. */
                        vencoding = UCHAR_MAX;
                    }
                    /* Must be non-zero by now */
                    assert(vencoding);
                }

                /* Compare current entry with specified entry, do it only
                 * if vencoding != UCHAR_MAX because if there is no encoding
                 * possible for the field it can't be a valid integer. */
                if (vencoding != UCHAR_MAX) {
		    // 加载整数
                    long long ll = zipLoadInteger(q, encoding);
                    if (ll == vll) {
                        return p;
                    }
                }
            }

            /* Reset skip count */
            skipcnt = skip;
        } else {
            /* Skip entry */
            skipcnt--;
        }

        /* Move to next entry */
	// 移到下一个entry
        p = q + len;
    }

    return NULL;
}

发布者：全栈程序员-用户IM，转载请注明出处：https://javaforall.cn/127737.html原文链接：https://javaforall.cn

【正版授权，激活自己账号】： Jetbrains全家桶Ide使用，1年售后保障，每天仅需1毛

【官方授权正版激活】： 官方授权正版激活支持Jetbrains家族下所有IDE 使用个人JB账号...