Redis BigKey – Sugar

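Interview questions

Alibaba advertising platform: how do you query keys with a fixed prefix across a massive data set?
Xiaohongshu: how do you restrict dangerous commands such as keys *, flushdb, and flushall in production to prevent accidental deletion or misuse?
Meituan: have you used the MEMORY USAGE command?
BigKey: how big counts as big? How do you find bigkeys, delete them, and handle them?
Have you tuned for BigKey? Are you familiar with lazy freeing (lazyfree)?
MoreKey: a production Redis database holds 10 million records. How do you iterate over them? Is keys * acceptable?

MoreKey case study

Bulk-insert a large batch of test keys into Redis (1 million in this walkthrough)

Run the following in Linux bash to generate 1 million set commands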

for((i=1;i<=100*10000;i++)); do echo "set k$i v$i" >> /tmp/redisTest.txt ;done;

Use the --pipe mode of redis-cli to bulk-load the 1 million commands

cat /tmp/redisTest.txt | /opt/redis-7.0.0/src/redis-cli -h 127.0.0.1 -p 6379 -a 111111 --pipe
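
The file above uses plain inline commands. The Redis mass-insertion guide instead feeds --pipe with commands encoded in the Redis protocol (RESP), where each command is an array of length-prefixed bulk strings. The following is a minimal Java sketch of that encoding, not a full loader: the class and method names (RespEncodeDemo, encode, buildPayload) are hypothetical, and the generated text would still have to be written to a file and piped into redis-cli as shown above.

```java
import java.nio.charset.StandardCharsets;

class RespEncodeDemo {
    // Encode one command as a RESP array of bulk strings:
    // *<argc>\r\n followed by $<byte length>\r\n<arg>\r\n for each argument.
    static String encode(String... args) {
        StringBuilder sb = new StringBuilder();
        sb.append('*').append(args.length).append("\r\n");
        for (String arg : args) {
            // RESP lengths are byte counts, not character counts
            int byteLen = arg.getBytes(StandardCharsets.UTF_8).length;
            sb.append('$').append(byteLen).append("\r\n").append(arg).append("\r\n");
        }
        return sb.toString();
    }

    // Build the payload for SET k1 v1 ... SET kn vn (same keys as the bash loop)
    static String buildPayload(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 1; i <= n; i++) {
            sb.append(encode("SET", "k" + i, "v" + i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Small demo; a real run would use 1_000_000 and write to a file
        System.out.print(buildPayload(2));
    }
}
```

Counting bytes rather than characters matters as soon as keys or values contain non-ASCII text.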

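Restricting dangerous commands such as keys *, flushdb, and flushall in production to prevent accidental deletion or misuse

Disable these commands through configuration, in the SECURITY section of redis.conf. Note that the comment block itself marks rename-command as deprecated and recommends ACLs instead.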

# Command renaming (DEPRECATED).
#
# ------------------------------------------------------------------------
# WARNING: avoid using this option if possible. Instead use ACLs to remove
# commands from the default user, and put them only in some admin user you
# create for administrative purposes.
# ------------------------------------------------------------------------
#
# It is possible to change the name of dangerous commands in a shared
# environment. For instance the CONFIG command may be renamed into something
# hard to guess so that it will still be available for internal-use tools
# but not available for general clients.
#
# Example:
#
# rename-command CONFIG b840fc02d524045429941cc15f59e41cb7be6c52
#
# It is also possible to completely kill a command by renaming it into
# an empty string:
#
# rename-command CONFIG ""

rename-command keys ""
rename-command flushdb ""
rename-command flushall ""

#
# Please note that changing the name of commands that are logged into the
# AOF file or transmitted to replicas may cause problems.

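If keys * is ruled out because it stalls the server, what should be used instead?

Use the SCAN command; see the official documentation and the Chinese documentation site. It resembles LIMIT in MySQL, but is not identical.

The SCAN command iterates over the keys in the current database
- Syntax
  
  SCAN cursor [MATCH pattern] [COUNT count]
  - It is a cursor-based iterator: each call continues the iteration from the cursor returned by the previous call;
  - Start a new iteration with cursor 0; the traversal is complete once the command returns cursor 0;
  - There is no guarantee that a call returns any given number of elements; pattern matching is supported through MATCH;
  - The number of elements returned per call cannot be controlled exactly; COUNT is only a hint that is usually, but not always, respected.
- Characteristics
  
  SCAN is a cursor-based iterator. Every call returns a new cursor, and the caller passes that cursor as the cursor argument of the next SCAN call to continue the iteration.
  
  SCAN returns an array of two elements: the first is the new cursor for the next call, and the second is an array holding the elements found in this step. A returned cursor of zero means the iteration has finished.
  
  The traversal order of SCAN is unusual. It does not walk the hash table's bucket array from index 0 to the end; instead it advances the cursor by adding with carry from the high-order bit (a reverse binary increment). This order is chosen so that when the dictionary grows or shrinks, no bucket is missed and repeated visits are kept to a minimum.
- Usage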
```
127.0.0.1:6379> keys *
1) "balance"
2) "k300"
3) "list"
4) "k200"
5) "k100"
127.0.0.1:6379> scan 0 match * count 1
1) "4"
2) 1) "balance"
   2) "k300"
   3) "list"
127.0.0.1:6379> scan 4 match * count 1
1) "1"
2) 1) "k100"
127.0.0.1:6379> scan 1 match * count 1
1) "7"
2) 1) "k200"
127.0.0.1:6379> scan 7 match * count 1
1) "0"
2) (empty array)
127.0.0.1:6379> 
```
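
The high-order-bit carry described above can be sketched in a few lines of Java. This is an illustration of the cursor arithmetic only, assuming a hash table whose size is a power of two; the class and method names (ScanCursorDemo, nextCursor, fullCycle) are hypothetical and no Redis connection is involved.

```java
import java.util.ArrayList;
import java.util.List;

class ScanCursorDemo {
    // Advance a cursor by one bucket using reverse binary increment.
    // mask is tableSize - 1, where tableSize is a power of two.
    static long nextCursor(long cursor, long mask) {
        cursor |= ~mask;                 // set all bits above the mask
        cursor = Long.reverse(cursor);   // flip bit order so the high bit becomes the low bit
        cursor++;                        // ordinary increment on the reversed value
        cursor = Long.reverse(cursor);   // flip back
        return cursor;
    }

    // Collect the bucket visiting order for one complete traversal
    static List<Long> fullCycle(int tableSize) {
        long mask = tableSize - 1L;
        List<Long> order = new ArrayList<>();
        long cursor = 0;
        do {
            order.add(cursor);
            cursor = nextCursor(cursor, mask);
        } while (cursor != 0);
        return order;
    }

    public static void main(String[] args) {
        System.out.println(fullCycle(8)); // prints [0, 4, 2, 6, 1, 5, 3, 7]
    }
}
```

The cursors 4, 1, 7, 0 returned in the session above are consistent with this order for an 8-bucket table: a single call may step over several buckets before it returns.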

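BigKey case study

How big counts as big

See the Alibaba Cloud Redis Development Specification

Avoid bigkeys (to prevent NIC traffic saturation and slow queries)

Keep string values under 10 KB, and keep hash, list, set, and zset element counts under 5000.

Counter-example: a list containing 2 million elements

For non-string bigkeys, do not delete with del; use hscan, sscan, or zscan to delete incrementally. Also guard against bigkeys being deleted automatically on expiry: for example, a zset with 2 million elements set to expire after 1 hour triggers a del when it expires, which blocks the server, and that operation does not appear in the slow log (it can be seen through latency monitoring).
string and second-level structures
- string is the value type; the maximum is 512 MB, but a value of 10 KB or more already counts as a bigkey.
- list, hash, set, and zset: more than 5000 elements counts as a bigkey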
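
As a rough illustration, the rule can be written as a small check using the 10 KB and 5000-element limits from the Alibaba Cloud guideline quoted above. The class and method names (BigKeyRuleDemo, isBigKey) are hypothetical, and the caller is assumed to have already obtained the type and size, for instance from TYPE plus STRLEN, HLEN, LLEN, SCARD, or ZCARD.

```java
class BigKeyRuleDemo {
    // Limits taken from the guideline quoted above
    static final long MAX_STRING_BYTES = 10 * 1024; // 10 KB
    static final long MAX_ELEMENTS = 5000;

    // type: the key type name ("string", "hash", "list", "set", "zset")
    // size: byte length for strings, element count for collections
    static boolean isBigKey(String type, long size) {
        switch (type) {
            case "string":
                return size >= MAX_STRING_BYTES;
            case "hash":
            case "list":
            case "set":
            case "zset":
                return size > MAX_ELEMENTS;
            default:
                // Other types are outside the rule sketched here
                return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isBigKey("string", 10 * 1024)); // true
        System.out.println(isBigKey("list", 2_000_000));   // true
        System.out.println(isBigKey("hash", 5000));        // false
    }
}
```

The thresholds are guidelines rather than hard limits, so treating them as constants that can be tuned per workload is probably the safer design.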

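What harm they cause

Uneven memory distribution, which makes cluster migration difficult
Deletion timeouts: deleting a large key blocks the server
Network traffic congestion

How they arise

Social features (for example, a follower list that keeps growing)
Aggregated statistics (for example, a report accumulated over a long period)

How to find them

redis-cli --bigkeys

Pros: reports the top-1 bigkey for each data type, plus the key count and average size per type
Cons: it cannot list every key larger than 10 KB; for that you need MEMORY USAGE to compute the bytes used by each key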

redis-cli --bigkeys -a 111111
redis-cli -h 127.0.0.1 -p 6379 -a 111111 --bigkeys
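redis-cli -h 127.0.0.1 -p 6379 -a 111111 --bigkeys -i 0.1 # sleep 0.1 s after every 100 SCAN commands, so ops does not spike, at the cost of a longer scan

MEMORY USAGE <KEY-NAME> reports the bytes used by a key

The MEMORY USAGE command reports the number of bytes that a key and its value occupy in RAM. The reported figure is the total memory allocated for the data plus the administrative overhead needed to manage the key. For nested data types, the optional SAMPLES option takes a count, the number of nested elements to sample, which defaults to 5. To sample all elements, use SAMPLES 0. See the Redis Chinese documentation site.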
```
127.0.0.1:6379> MEMORY USAGE key [SAMPLES count]
```

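How to delete

See the Alibaba Cloud Redis Development Specification

[Mandatory] Avoid bigkeys (to prevent NIC traffic saturation and slow queries)

Keep string values under 10 KB, and keep hash, list, set, and zset element counts under 5000.

Counter-example: a list containing 2 million elements.

For non-string bigkeys, use hscan, sscan, or zscan to delete incrementally. Also guard against bigkeys being deleted automatically on expiry: for example, a zset with 2 million elements set to expire after 1 hour triggers a del when it expires, which blocks the server, and that operation does not appear in the slow log (it can be seen through latency monitoring).

Official documentation

https://redis.io/commands/scan/

Common commands

string

Usually del is enough; if the value is very large, use unlink

hash

Use hscan to fetch a small batch of field-value pairs at a time, then use hdel to delete each field
Command

The Redis HSCAN command iterates over the field-value pairs of a hash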
```
hscan key cursor [MATCH pattern] [COUNT count]
```

Alibaba manual, hash deletion: hscan + hdel

delBigHash

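// getStringCursor() is the Jedis 2.x accessor; newer Jedis releases expose getCursor() instead.
import java.util.List;
import java.util.Map.Entry;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.ScanParams;
import redis.clients.jedis.ScanResult;

public void delBigHash(String host, int port, String password, String bigHashKey) {
    Jedis jedis = new Jedis(host, port);
    if (password != null && !"".equals(password)) {
        jedis.auth(password);
    }
    // Ask for roughly 100 fields per HSCAN call
    ScanParams scanParams = new ScanParams().count(100);
    // A scan must start from cursor "0" (the original empty string is not a valid cursor)
    String cursor = "0";
    do {
        ScanResult<Entry<String, String>> scanResult = jedis.hscan(bigHashKey, cursor, scanParams);
        List<Entry<String, String>> entryList = scanResult.getResult();
        if (entryList != null && !entryList.isEmpty()) {
            for (Entry<String, String> entry : entryList) {
                jedis.hdel(bigHashKey, entry.getKey());
            }
        }
        cursor = scanResult.getStringCursor();
    } while (!"0".equals(cursor));
    
    // Delete the now-empty bigkey
    jedis.del(bigHashKey);
}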

list

Use ltrim to delete incrementally until the whole list is gone
Command
```
ltrim key_name start stop
```

Alibaba manual, list deletion: ltrim

delBigList

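import redis.clients.jedis.Jedis;

public void delBigList(String host, int port, String password, String bigListKey) {
    Jedis jedis = new Jedis(host, port);
    if (password != null && !"".equals(password)) {
        jedis.auth(password);
    }
    long llen = jedis.llen(bigListKey);
    int counter = 0;
    int left = 100;
    while (counter < llen) {
        // Keep elements [left, llen], i.e. drop the first 100 from the head each round
        jedis.ltrim(bigListKey, left, llen);
        // The original incremented an undeclared variable "count"
        counter += left;
    }
    // Finally delete the key itself
    jedis.del(bigListKey);
}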
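
To see why the loop terminates and how many round trips it needs, the batch arithmetic can be replayed against an in-memory list. This is only a sketch under simplifying assumptions: the ltrim method below is a hypothetical stand-in that handles non-negative indexes only, and the class name LtrimBatchDemo is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class LtrimBatchDemo {
    // In-memory stand-in for "LTRIM key start stop" with non-negative indexes:
    // keeps the elements in [start, stop] (inclusive) and drops the rest.
    static void ltrim(List<Integer> list, int start, int stop) {
        int size = list.size();
        if (start >= size || start > stop) {
            list.clear(); // range is empty, so the list becomes empty
            return;
        }
        int end = Math.min(stop, size - 1);
        List<Integer> kept = new ArrayList<>(list.subList(start, end + 1));
        list.clear();
        list.addAll(kept);
    }

    // Repeat the trimming loop and return the number of rounds used
    static int drain(List<Integer> list, int batch) {
        int total = list.size();
        int rounds = 0;
        for (int removed = 0; removed < total; removed += batch) {
            ltrim(list, batch, total); // drop the first "batch" elements
            rounds++;
        }
        return rounds;
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>();
        for (int i = 0; i < 250; i++) {
            list.add(i);
        }
        System.out.println(drain(list, 100)); // 3
        System.out.println(list.isEmpty());   // true
    }
}
```

The number of rounds is the list length divided by the batch size, rounded up, so a smaller batch shortens each blocking step at the price of more round trips.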

set

Use sscan to fetch a batch of members at a time, then use srem to delete each member
Command
```
sscan set 0
srem set a b
```

Alibaba manual, set deletion: sscan + srem

delBigSet

public void delBigSet(String host, int port, String password, String bigSetKey) {
    Jedis jedis = new Jedis(host, port);
    if (password != null && !"".equals(password)) {
        jedis.auth(password);
    }
    ScanParams scanParams = new ScanParams().count(100);
    String cursor = "0";
    do {
        ScanResult<String> scanResult = jedis.sscan(bigSetKey, cursor, scanParams);
        List<String> memberList = scanResult.getResult();
        if (memberList != null && !memberList.isEmpty()) {
            for (String member : memberList) {
                jedis.srem(bigSetKey, member);
            }
        }
        cursor = scanResult.getStringCursor();
    } while(!"0".equals(cursor));
    jedis.del(bigSetKey);
}

zset

Use zscan to fetch a batch of members at a time and delete each one with zrem, or use ZREMRANGEBYRANK to delete a range of ranks directly

Command

zadd salary 2000 z3 4000 li4 5000 w5 7000 s7
zrange salary 0 -1 withscores
zscan salary 0
zremrangebyrank salary 0 1
zrange salary 0 -1 withscores

Alibaba manual, SortedSet deletion: zscan + zrem

delBigZset

public void delBigZset(String host, int port, String password, String bigZsetKey) {
    Jedis jedis = new Jedis(host, port);
    if (password != null && !"".equals(password)) {
        jedis.auth(password);
    }
    ScanParams scanParams = new ScanParams().count(100);
    String cursor = "0";
    do {
        ScanResult<Tuple> scanResult = jedis.zscan(bigZsetKey, cursor, scanParams);
        List<Tuple> tupleList = scanResult.getResult();
        if (tupleList != null && !tupleList.isEmpty()) {
            for (Tuple tuple : tupleList) {
                jedis.zrem(bigZsetKey, tuple.getElement());
            }
        }
        cursor = scanResult.getStringCursor();
    } while (!"0".equals(cursor));
    
    jedis.del(bigZsetKey);
}

BigKey production tuning

The LAZY FREEING section of the redis.conf configuration file

Blocking and non-blocking deletion commands

############################# LAZY FREEING ####################################

# Redis has two primitives to delete keys. One is called DEL and is a blocking
# deletion of the object. It means that the server stops processing new commands
# in order to reclaim all the memory associated with an object in a synchronous
# way. If the key deleted is associated with a small object, the time needed
# in order to execute the DEL command is very small and comparable to most other
# O(1) or O(log_N) commands in Redis. However if the key is associated with an
# aggregated value containing millions of elements, the server can block for
# a long time (even seconds) in order to complete the operation.
#
# For the above reasons Redis also offers non blocking deletion primitives
# such as UNLINK (non blocking DEL) and the ASYNC option of FLUSHALL and
# FLUSHDB commands, in order to reclaim memory in background. Those commands
# are executed in constant time. Another thread will incrementally free the
# object in the background as fast as possible.
#
# DEL, UNLINK and ASYNC option of FLUSHALL and FLUSHDB are user-controlled.
# It's up to the design of the application to understand when it is a good
# idea to use one or the other. However the Redis server sometimes has to
# delete keys or flush the whole database as a side effect of other operations.
# Specifically Redis deletes objects independently of a user call in the
# following scenarios:
#
# 1) On eviction, because of the maxmemory and maxmemory policy configurations,
#    in order to make room for new data, without going over the specified
#    memory limit.
# 2) Because of expire: when a key with an associated time to live (see the
#    EXPIRE command) must be deleted from memory.
# 3) Because of a side effect of a command that stores data on a key that may
#    already exist. For example the RENAME command may delete the old key
#    content when it is replaced with another one. Similarly SUNIONSTORE
#    or SORT with STORE option may delete existing keys. The SET command
#    itself removes any old content of the specified key in order to replace
#    it with the specified string.
# 4) During replication, when a replica performs a full resynchronization with
#    its master, the content of the whole database is removed in order to
#    load the RDB file just transferred.
#
# In all the above cases the default is to delete objects in a blocking way,
# like if DEL was called. However you can configure each case specifically
# in order to instead release memory in a non-blocking way like if UNLINK
# was called, using the following configuration directives.

lazyfree-lazy-eviction no
lazyfree-lazy-expire no
lazyfree-lazy-server-del no
replica-lazy-flush no

# It is also possible, for the case when to replace the user code DEL calls
# with UNLINK calls is not easy, to modify the default behavior of the DEL
# command to act exactly like UNLINK, using the following configuration
# directive:

lazyfree-lazy-user-del no

# FLUSHDB, FLUSHALL, SCRIPT FLUSH and FUNCTION FLUSH support both asynchronous and synchronous
# deletion, which can be controlled by passing the [SYNC|ASYNC] flags into the
# commands. When neither flag is passed, this directive will be used to determine
# if the data should be deleted asynchronously.

lazyfree-lazy-user-flush no

Recommended settings

lazyfree-lazy-server-del yes
replica-lazy-flush yes

lazyfree-lazy-user-del yes
