Redis BigKey

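Interview questions

  • Alibaba advertising platform: query keys that share a fixed prefix across a massive data set
  • Xiaohongshu: how do you restrict dangerous commands such as keys *, flushdb, and flushall in production to prevent accidental deletion or misuse?
  • Meituan: have you used the MEMORY USAGE command?
  • BigKey: how big counts as big? How do you find one? How do you delete it? How do you handle it?
  • BigKey: have you done any tuning? Do you know about lazy freeing (lazyfree)?
  • MoreKey: a production Redis database holds 10 million records. How do you iterate over them? Is keys * acceptable?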

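MoreKey case study

  • Bulk-insert a large batch of test keys into Redis (the example here uses 1 million)

    Run the following in Linux bash to generate 1 million SET commands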

    for((i=1;i<=100*10000;i++)); do echo "set k$i v$i" >> /tmp/redisTest.txt ;done;

    Load the 1 million commands in bulk with the pipe mode (--pipe) that redis-cli provides

    cat /tmp/redisTest.txt | /opt/redis-7.0.0/src/redis-cli -h 127.0.0.1 -p 6379 -a 111111 --pipe
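  • Restrict dangerous commands such as keys *, flushdb, and flushall in production to prevent accidental deletion or misuse

    Disable these commands through configuration, in the SECURITY section of redis.conf. Note that the configuration comments themselves mark rename-command as deprecated and recommend ACLs instead.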

    # Command renaming (DEPRECATED).
    #
    # ------------------------------------------------------------------------
    # WARNING: avoid using this option if possible. Instead use ACLs to remove
    # commands from the default user, and put them only in some admin user you
    # create for administrative purposes.
    # ------------------------------------------------------------------------
    #
    # It is possible to change the name of dangerous commands in a shared
    # environment. For instance the CONFIG command may be renamed into something
    # hard to guess so that it will still be available for internal-use tools
    # but not available for general clients.
    #
    # Example:
    #
    # rename-command CONFIG b840fc02d524045429941cc15f59e41cb7be6c52
    #
    # It is also possible to completely kill a command by renaming it into
    # an empty string:
    #
    # rename-command CONFIG ""
    
    rename-command keys ""
    rename-command flushdb ""
    rename-command flushall ""
    
    #
    # Please note that changing the name of commands that are logged into the
    # AOF file or transmitted to replicas may cause problems.
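  • If keys * must be avoided because it stalls the server, what should be used instead?

    Use the SCAN command (see the official Redis documentation and its Chinese translation). It is similar to LIMIT in MySQL, but not identical.

    The SCAN command iterates over the keys in the current database

    • Syntax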

      SCAN cursor [MATCH pattern] [COUNT count]

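      • It is a cursor-based iterator: each call continues the iteration from the cursor returned by the previous call;
      • Start a new iteration with cursor 0; one full pass is complete when the command returns cursor 0;
      • There is no guarantee that a call returns any particular number of elements; pattern matching is supported through MATCH;
      • The number of elements returned per call cannot be controlled exactly; COUNT is a hint that is usually, but not always, honored.
    • Characteristics

      SCAN is a cursor-based iterator. Every call returns a new cursor, and the caller passes that cursor as the cursor argument of the next SCAN call to continue the iteration.

      SCAN returns an array of two elements. The first is the new cursor to use for the next call; the second is an array containing the elements found in this step. A returned cursor of 0 means the iteration has finished.

      The traversal order of SCAN is unusual. It does not walk the first-dimension array (the bucket array of the hash table) from slot 0 to the end; instead it advances the cursor by addition with carry from the high-order bit. This order is chosen so that, when the dictionary grows or shrinks, no slot is missed and repeated visits are kept to a minimum.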
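The high-order-carry traversal described above can be illustrated in isolation. What follows is a minimal sketch, assuming a hash table whose size is a power of two; the names `ScanOrderSketch`, `nextCursor`, and `visitOrder` are made up for illustration and are not Redis or Jedis APIs. It advances a cursor by reversing its bits, adding one, and reversing back, which has the same effect as carrying from the high bit toward the low bit.

```java
import java.util.ArrayList;
import java.util.List;

class ScanOrderSketch {
    /**
     * Advance a cursor by "reverse binary increment".
     * sizeMask is tableSize - 1; tableSize is assumed to be a power of two.
     */
    static long nextCursor(long cursor, long sizeMask) {
        cursor |= ~sizeMask;            // set every bit above the mask
        cursor = Long.reverse(cursor);  // mirror: those set bits are now the low bits
        cursor++;                       // the +1 carries through them into the mirrored high bit
        cursor = Long.reverse(cursor);  // mirror back to get the next bucket index
        return cursor;
    }

    /** Bucket indexes in the order one full pass visits them, starting at cursor 0. */
    static List<Long> visitOrder(int tableSize) {
        List<Long> order = new ArrayList<>();
        long cursor = 0;
        do {
            order.add(cursor);
            cursor = nextCursor(cursor, tableSize - 1L);
        } while (cursor != 0);          // cursor 0 again means the pass is complete
        return order;
    }

    public static void main(String[] args) {
        System.out.println(visitOrder(8)); // prints [0, 4, 2, 6, 1, 5, 3, 7]
    }
}
```

For an 8-slot table the cursor visits 0, 4, 2, 6, 1, 5, 3, 7 and then returns to 0. When the table doubles in size, each old bucket splits into two buckets that are adjacent in this order, which is why a scan in progress does not lose track of them.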

    • Usage

      127.0.0.1:6379> keys *
      1) "balance"
      2) "k300"
      3) "list"
      4) "k200"
      5) "k100"
      127.0.0.1:6379> scan 0 match * count 1
      1) "4"
      2) 1) "balance"
         2) "k300"
         3) "list"
      127.0.0.1:6379> scan 4 match * count 1
      1) "1"
      2) 1) "k100"
      127.0.0.1:6379> scan 1 match * count 1
      1) "7"
      2) 1) "k200"
      127.0.0.1:6379> scan 7 match * count 1
      1) "0"
      2) (empty array)
      127.0.0.1:6379> 
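The same cursor loop can be written in application code. The sketch below is only an illustration of the pattern: `KeyScanner` and `Page` are hypothetical stand-ins for whatever client library is in use (they are not Jedis types), and `recordedSession` simply replays the replies from the redis-cli session above so that the example runs without a server.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ScanLoopSketch {
    /** One SCAN reply: the cursor for the next call plus a batch of keys. */
    static final class Page {
        final String cursor;
        final List<String> keys;

        Page(String cursor, String... keys) {
            this.cursor = cursor;
            this.keys = Arrays.asList(keys);
        }
    }

    /** Hypothetical stand-in for a client's SCAN call. */
    interface KeyScanner {
        Page scan(String cursor, int count);
    }

    /** Drive the cursor from "0" until the server returns "0" again. */
    static Set<String> collectAllKeys(KeyScanner scanner, int count) {
        Set<String> keys = new LinkedHashSet<>(); // a set, because SCAN may return a key more than once
        String cursor = "0";
        do {
            Page page = scanner.scan(cursor, count);
            keys.addAll(page.keys);  // a batch can be larger or smaller than count, or empty
            cursor = page.cursor;    // always continue from the cursor the server returned
        } while (!"0".equals(cursor));
        return keys;
    }

    /** Canned replies copied from the redis-cli session above. */
    static KeyScanner recordedSession() {
        Map<String, Page> replies = new HashMap<>();
        replies.put("0", new Page("4", "balance", "k300", "list"));
        replies.put("4", new Page("1", "k100"));
        replies.put("1", new Page("7", "k200"));
        replies.put("7", new Page("0"));
        return (cursor, count) -> replies.get(cursor);
    }

    public static void main(String[] args) {
        System.out.println(collectAllKeys(recordedSession(), 1));
        // prints [balance, k300, list, k100, k200]
    }
}
```

Note that the first reply carries three keys even though COUNT was 1, and the last reply is empty; the loop condition depends only on the cursor, never on the batch size.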

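BigKey case study

How big is "big"

  • See the "Alibaba Cloud Redis Development Guidelines"

    Reject bigkeys (they saturate NIC bandwidth and cause slow queries)

    Keep string values within 10 KB; keep the element count of a hash, list, set, or zset at or below 5000.

    Counterexample: a list containing 2 million elements

    For a non-string bigkey, do not delete it with del; delete it progressively with hscan, sscan, or zscan. Also guard against the automatic deletion that happens when a bigkey expires. For example, a zset with 2 million members and a 1-hour expiry triggers a del when it expires, which blocks the server, and this operation does not appear in the slow log (it can be found through latency monitoring).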

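  • string and second-level structures

    • A string is a single value of at most 512 MB, but a value of 10 KB or more already counts as a bigkey.
    • A list, hash, set, or zset with more than 5000 elements is a bigkey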
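As a quick illustration, the thresholds from the Alibaba Cloud guideline quoted above (10 KB for strings, 5000 elements for collections) can be expressed as a small check. This is a sketch rather than part of any Redis client: `BigKeyThresholds`, `Kind`, and `isBigKey` are hypothetical names, and treating 10 KB as 10 * 1024 bytes is an assumption.

```java
class BigKeyThresholds {
    /** Assumption: 10 KB is taken as 10 * 1024 bytes. */
    static final long STRING_BYTES_LIMIT = 10 * 1024;
    /** Element-count limit for list, hash, set, and zset. */
    static final long ELEMENT_LIMIT = 5000;

    enum Kind { STRING, LIST, HASH, SET, ZSET }

    /**
     * @param kind data type of the key
     * @param size value length in bytes for STRING, element count for every other kind
     * @return true if the key counts as a bigkey under the thresholds above
     */
    static boolean isBigKey(Kind kind, long size) {
        if (kind == Kind.STRING) {
            return size >= STRING_BYTES_LIMIT; // 10 KB or more
        }
        return size > ELEMENT_LIMIT;           // more than 5000 elements
    }

    public static void main(String[] args) {
        System.out.println(isBigKey(Kind.STRING, 10 * 1024)); // prints true
        System.out.println(isBigKey(Kind.LIST, 5000));        // prints false
        System.out.println(isBigKey(Kind.ZSET, 2_000_000));   // prints true
    }
}
```

In practice the size argument would come from a command such as STRLEN, LLEN, HLEN, SCARD, or ZCARD, and the limits would be tuned to the workload.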

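What harm they cause

  • Uneven memory usage, which makes cluster migration difficult
  • Deletion timeouts: deleting a big key blocks the server
  • Network traffic congestion

How they arise

  • Social features (for example, a follower list that keeps growing)
  • Aggregated statistics (for example, data accumulated into one key over a long period)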

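How to find them

  • redis-cli --bigkeys

    Pros: reports the single largest key for each data type, along with the number of keys and the average size per type
    Cons: it cannot list every key larger than 10 KB; for that, use MEMORY USAGE to compute the size of each key in bytes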

    redis-cli --bigkeys -a 111111
    redis-cli -h 127.0.0.1 -p 6379 -a 111111 --bigkeys
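    redis-cli -h 127.0.0.1 -p 6379 -a 111111 --bigkeys -i 0.1 # sleep 0.1 s after every 100 SCAN commands so that ops do not spike, at the cost of a longer scan
  • MEMORY USAGE <KEY-NAME> computes the number of bytes used by a key

    The MEMORY USAGE command reports the number of bytes that a key and its value occupy in RAM. The figure is the total memory allocated for the data plus the administrative overhead of managing the key. For nested data types, the optional SAMPLES count sets how many nested elements are sampled (default 5); use SAMPLES 0 to sample all of them. See the Redis Chinese-language site.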

    127.0.0.1:6379> MEMORY USAGE key [SAMPLES count]

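How to delete

See the "Alibaba Cloud Redis Development Guidelines"

[Mandatory] Reject bigkeys (they saturate NIC bandwidth and cause slow queries)

Keep string values within 10 KB; keep the element count of a hash, list, set, or zset at or below 5000.

Counterexample: a list containing 2 million elements.

Delete a non-string bigkey progressively with hscan, sscan, or zscan, and guard against the automatic deletion that happens when a bigkey expires. For example, a zset with 2 million members and a 1-hour expiry triggers a del when it expires, which blocks the server, and this operation does not appear in the slow log (it can be found through latency monitoring).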

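Official documentation

https://redis.io/commands/scan/

Basic commands

string
  • Usually delete with del; if the value is very large, use unlink
hash
  • Use hscan to fetch a small batch of field-value pairs at a time, then use hdel to delete each field

  • Command

    The Redis HSCAN command iterates over the field-value pairs of a hash

    hscan key cursor [MATCH pattern] [COUNT count]
  • Alibaba guideline for deleting a hash: hscan + hdel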

    delBigHash
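    import java.util.List;
    import java.util.Map.Entry;
    import redis.clients.jedis.Jedis;
    import redis.clients.jedis.ScanParams;
    import redis.clients.jedis.ScanResult;

    public void delBigHash(String host, int port, String password, String bigHashKey) {
        Jedis jedis = new Jedis(host, port);
        if (password != null && !"".equals(password)) {
            jedis.auth(password);
        }
        // Ask for roughly 100 fields per HSCAN call
        ScanParams scanParams = new ScanParams().count(100);
        // An iteration must start from cursor "0"
        String cursor = "0";
        do {
            ScanResult<Entry<String, String>> scanResult = jedis.hscan(bigHashKey, cursor, scanParams);
            List<Entry<String, String>> entryList = scanResult.getResult();
            if (entryList != null && !entryList.isEmpty()) {
                for (Entry<String, String> entry : entryList) {
                    jedis.hdel(bigHashKey, entry.getKey());
                }
            }
            // getStringCursor() is the accessor in older Jedis releases; newer ones name it getCursor()
            cursor = scanResult.getStringCursor();
        } while (!"0".equals(cursor));
        
        // delete bigkey
        jedis.del(bigHashKey);
    }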
list
  • Use ltrim to delete progressively, batch by batch, until the whole list is gone

  • Command

    ltrim key_name start stop
  • Alibaba guideline for deleting a list: ltrim

    delBigList
    public void delBigList(String host, int port, String password, String bigListKey) {
        Jedis jedis = new Jedis(host, port);
        if (password != null && !"".equals(password)) {
            jedis.auth(password);
        }
        long llen = jedis.llen(bigListKey);
        int counter = 0;
        int left = 100;
        while (counter < llen) {
            // Keep the elements from index `left` onward, i.e. drop 100 from the head per call
            jedis.ltrim(bigListKey, left, llen);
            counter += left;
        }
        // Finally remove the key itself
        jedis.del(bigListKey);
    }
set
  • Use sscan to fetch a batch of members at a time, then use srem to delete each member

  • Command

    sscan set 0
    srem set a b
  • Alibaba guideline for deleting a set: sscan + srem

    delBigSet
    public void delBigSet(String host, int port, String password, String bigSetKey) {
        Jedis jedis = new Jedis(host, port);
        if (password != null && !"".equals(password)) {
            jedis.auth(password);
        }
        ScanParams scanParams = new ScanParams().count(100);
        String cursor = "0";
        do {
            ScanResult<String> scanResult = jedis.sscan(bigSetKey, cursor, scanParams);
            List<String> memberList = scanResult.getResult();
            if (memberList != null && !memberList.isEmpty()) {
                for (String member : memberList) {
                    jedis.srem(bigSetKey, member);
                }
            }
            cursor = scanResult.getStringCursor();
        } while(!"0".equals(cursor));
        jedis.del(bigSetKey);
    }
zset
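  • Use zscan to fetch a batch of members at a time, then remove them with zrem; alternatively, ZREMRANGEBYRANK removes a whole range of ranks in one call

  • Command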

    zadd salary 2000 z3 4000 li4 5000 w5 7000 s7
    zrange salary 0 -1 withscores
    zscan salary 0
    zremrangebyrank salary 0 1
    zrange salary 0 -1 withscores
  • Alibaba guideline for deleting a sorted set: zscan + zrem

    delBigZset
    public void delBigZset(String host, int port, String password, String bigZsetKey) {
        Jedis jedis = new Jedis(host, port);
        if (password != null && !"".equals(password)) {
            jedis.auth(password);
        }
        ScanParams scanParams = new ScanParams().count(100);
        String cursor = "0";
        do {
            ScanResult<Tuple> scanResult = jedis.zscan(bigZsetKey, cursor, scanParams);
            List<Tuple> tupleList = scanResult.getResult();
            if (tupleList != null && !tupleList.isEmpty()) {
                for (Tuple tuple : tupleList) {
                    jedis.zrem(bigZsetKey, tuple.getElement());
                }
            }
            cursor = scanResult.getStringCursor();
        } while (!"0".equals(cursor));
        
        jedis.del(bigZsetKey);
    }

BigKey production tuning

Notes on the LAZY FREEING section of the redis.conf configuration file

  • Blocking and non-blocking deletion commands

    ############################# LAZY FREEING ####################################
    
    # Redis has two primitives to delete keys. One is called DEL and is a blocking
    # deletion of the object. It means that the server stops processing new commands
    # in order to reclaim all the memory associated with an object in a synchronous
    # way. If the key deleted is associated with a small object, the time needed
    # in order to execute the DEL command is very small and comparable to most other
    # O(1) or O(log_N) commands in Redis. However if the key is associated with an
    # aggregated value containing millions of elements, the server can block for
    # a long time (even seconds) in order to complete the operation.
    #
    # For the above reasons Redis also offers non blocking deletion primitives
    # such as UNLINK (non blocking DEL) and the ASYNC option of FLUSHALL and
    # FLUSHDB commands, in order to reclaim memory in background. Those commands
    # are executed in constant time. Another thread will incrementally free the
    # object in the background as fast as possible.
    #
    # DEL, UNLINK and ASYNC option of FLUSHALL and FLUSHDB are user-controlled.
    # It's up to the design of the application to understand when it is a good
    # idea to use one or the other. However the Redis server sometimes has to
    # delete keys or flush the whole database as a side effect of other operations.
    # Specifically Redis deletes objects independently of a user call in the
    # following scenarios:
    #
    # 1) On eviction, because of the maxmemory and maxmemory policy configurations,
    #    in order to make room for new data, without going over the specified
    #    memory limit.
    # 2) Because of expire: when a key with an associated time to live (see the
    #    EXPIRE command) must be deleted from memory.
    # 3) Because of a side effect of a command that stores data on a key that may
    #    already exist. For example the RENAME command may delete the old key
    #    content when it is replaced with another one. Similarly SUNIONSTORE
    #    or SORT with STORE option may delete existing keys. The SET command
    #    itself removes any old content of the specified key in order to replace
    #    it with the specified string.
    # 4) During replication, when a replica performs a full resynchronization with
    #    its master, the content of the whole database is removed in order to
    #    load the RDB file just transferred.
    #
    # In all the above cases the default is to delete objects in a blocking way,
    # like if DEL was called. However you can configure each case specifically
    # in order to instead release memory in a non-blocking way like if UNLINK
    # was called, using the following configuration directives.
    
    lazyfree-lazy-eviction no
    lazyfree-lazy-expire no
    lazyfree-lazy-server-del no
    replica-lazy-flush no
    
    # It is also possible, for the case when to replace the user code DEL calls
    # with UNLINK calls is not easy, to modify the default behavior of the DEL
    # command to act exactly like UNLINK, using the following configuration
    # directive:
    
    lazyfree-lazy-user-del no
    
    # FLUSHDB, FLUSHALL, SCRIPT FLUSH and FUNCTION FLUSH support both asynchronous and synchronous
    # deletion, which can be controlled by passing the [SYNC|ASYNC] flags into the
    # commands. When neither flag is passed, this directive will be used to determine
    # if the data should be deleted asynchronously.
    
    lazyfree-lazy-user-flush no
  • Recommended settings

    lazyfree-lazy-server-del yes
    replica-lazy-flush yes
    
    lazyfree-lazy-user-del yes