Migration from AWS ElastiCache Part 1

AWS ElastiCache offers two architectures: cluster mode enabled and cluster mode disabled (non-cluster mode). Montplex Cache follows the same architecture and offers the same two modes.

This article mainly explains how to migrate from AWS ElastiCache non-cluster mode to Montplex Cache non-cluster mode.

Preparation

Prepare the Target Instance

In the VPC where AWS ElastiCache is located, create a non-cluster mode Montplex Cache by referring to the Montplex Cache creation process. It is recommended to choose the same availability zone as the ElastiCache instance to reduce cross-availability zone traffic during migration.

Prepare the Migration Scheduling Machine

Prepare an EC2 instance that can be accessed remotely in the VPC where AWS ElastiCache is located. A configuration of 4 vCPUs and 8 GB of memory (4C8G) or higher is recommended. It is also recommended to place it in the same availability zone as the ElastiCache instance and the target Montplex Cache instance.

Install the Migration Tool

This article uses the RedisShake (redis-shake) tool for migration: https://github.com/tair-opensource/RedisShake

Download the migration tool.

wget https://github.com/tair-opensource/RedisShake/releases/download/v4.1.0/redis-shake-linux-amd64.tar.gz
tar -zxf redis-shake-linux-amd64.tar.gz

Determine Migration Type

Migration Types

Migrating from instance to instance can be divided into online full migration and online full + incremental migration.

Online Full Migration: Migrates data that already exists on the source side; new data will not be migrated.

Online Full + Incremental Migration: Migrates both existing data and new data.

Migration Differences

Because ElastiCache restricts some commands (e.g., config, sync, psync), data cannot be migrated directly through master-slave replication. The scan mode must be used instead. To migrate incremental data in scan mode, the notify-keyspace-events parameter must be enabled on the source instance.

In summary:

Online Full Migration: Does not require any change to the source instance parameters.

Online Full + Incremental Migration: Requires a change to the source instance parameters. The notify-keyspace-events parameter, which makes Redis publish notifications for new and changed keys, must be enabled in ElastiCache. For the procedure, refer to: How to Change the AWS ElastiCache Parameter
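As a rough sketch of the parameter change described above, the function below only prints an AWS CLI call instead of running it, so it can be reviewed first. The parameter group name is hypothetical, and the default value `AE` (key-event notifications for all event classes) is an assumption that should be checked against the RedisShake documentation. Note that the parameter must be changed in a custom parameter group, because default ElastiCache parameter groups cannot be modified.

```shell
# Print (do not run) the AWS CLI call that sets notify-keyspace-events
# on a custom ElastiCache parameter group.
ksn_parameter_command() {
  group="$1"        # hypothetical name of a custom parameter group
  value="${2:-AE}"  # assumed value: E = key-event notifications, A = all event classes
  printf 'aws elasticache modify-cache-parameter-group'
  printf ' --cache-parameter-group-name %s' "$group"
  printf ' --parameter-name-values ParameterName=notify-keyspace-events,ParameterValue=%s\n' "$value"
}
```

Calling `ksn_parameter_command my-redis-params` prints the command for a group named my-redis-params; passing an empty second value such as `''` is not handled by this sketch, so disabling the notifications again is best done by hand.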

Configure the Migration Tool

Migration Modes

RedisShake provides three modes: restore, sync, and scan.

  • Restore Mode: Mainly used for offline recovery from RDB files.
  • Sync Mode: Uses commands such as sync and psync; mainly used for master-slave replication scenarios.
  • Scan Mode: Uses the scan command; mainly used for cloud Redis instances that do not allow commands such as config, sync, and psync.

Because ElastiCache restricts these commands (e.g., config, sync, psync), only scan mode can be used.

Configuration Files

RedisShake ships with a default configuration file, shake.toml, which contains configuration sections for all three migration modes.

Generate Migration Configuration Files (Scan Mode)

Full migration profile

function = ""

[scan_reader]
cluster = false # set to true if source is a redis cluster
address = "single.0d6hxg.ng.0001.use1.cache.amazonaws.com:6379" # when cluster is true, set address to one of the cluster node
username = "" # keep empty if not using ACL
password = "" # keep empty if no authentication is required
tls = false
dbs = [] # set you want to scan dbs such as [1,5,7], if you don't want to scan all
scan = true # set to false if you don't want to scan keys
ksn = false # set to true to enable Redis keyspace notifications (KSN) subscription
count = 50 # number of keys to scan per iteration

[redis_writer]
cluster = false # set to true if target is a redis cluster
sentinel = false # set to true if target is a redis sentinel
master = "" # set to master name if target is a redis sentinel
address = "k8s-proxy-engulapr-995a2c0196-ea82fd9391768799.elb.us-east-1.amazonaws.com:8125" # when cluster is true, set address to one of the cluster node
username = "" # keep empty if not using ACL
password = "50892090597450154656114056972149" # keep empty if no authentication is required
tls = false
off_reply = false # turn off the server reply

[advanced]
dir = "data"
ncpu = 0 # runtime.GOMAXPROCS, 0 means use runtime.NumCPU() cpu cores
pprof_port = 0 # pprof port, 0 means disable
status_port = 0 # status port, 0 means disable

# log
log_file = "shake.log"
log_level = "info" # debug, info or warn
log_interval = 5 # in seconds

rdb_restore_command_behavior = "rewrite" # panic, rewrite or skip
pipeline_count_limit = 1024
target_redis_client_max_querybuf_len = 1024_000_000
target_redis_proto_max_bulk_len = 512_000_000
aws_psync = ""
empty_db_before_sync = false

Full + incremental migration profile

function = ""

[scan_reader]
cluster = false # set to true if source is a redis cluster
address = "single.0d6hxg.ng.0001.use1.cache.amazonaws.com:6379" # when cluster is true, set address to one of the cluster node
username = "" # keep empty if not using ACL
password = "" # keep empty if no authentication is required
tls = false
dbs = [] # set you want to scan dbs such as [1,5,7], if you don't want to scan all
scan = true # set to false if you don't want to scan keys
ksn = true # set to true to enable Redis keyspace notifications (KSN) subscription
count = 50 # number of keys to scan per iteration

[redis_writer]
cluster = false # set to true if target is a redis cluster
sentinel = false # set to true if target is a redis sentinel
master = "" # set to master name if target is a redis sentinel
address = "k8s-proxy-engulapr-995a2c0196-ea82fd9391768799.elb.us-east-1.amazonaws.com:8125" # when cluster is true, set address to one of the cluster node
username = "" # keep empty if not using ACL
password = "50892090597450154656114056972149" # keep empty if no authentication is required
tls = false
off_reply = false # turn off the server reply

[advanced]
dir = "data"
ncpu = 0 # runtime.GOMAXPROCS, 0 means use runtime.NumCPU() cpu cores
pprof_port = 0 # pprof port, 0 means disable
status_port = 0 # status port, 0 means disable

# log
log_file = "shake.log"
log_level = "info" # debug, info or warn
log_interval = 5 # in seconds

rdb_restore_command_behavior = "rewrite" # panic, rewrite or skip
pipeline_count_limit = 1024
target_redis_client_max_querybuf_len = 1024_000_000
target_redis_proto_max_bulk_len = 512_000_000
aws_psync = ""
empty_db_before_sync = false
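The two profiles above differ only in the ksn setting of the scan_reader section. One possible way to avoid maintaining two hand-edited copies is to derive the incremental profile from the full one. The sketch below assumes the profile is stored with one setting per line; the function name and file names are hypothetical.

```shell
# Derive a full + incremental profile from a full-migration profile.
# Usage: make_incremental_profile <source.toml> <destination.toml>
make_incremental_profile() {
  src="$1"
  dst="$2"
  # Flip only the ksn line; every other setting is copied unchanged.
  sed 's/^\(ksn *= *\)false/\1true/' "$src" > "$dst"
  # Fail if the result has no "ksn = true" line, e.g. the source had no ksn setting.
  grep -q '^ksn *= *true' "$dst"
}
```

For example, `make_incremental_profile scan-full.toml scan-incr.toml` would write the incremental profile next to the full one and return a non-zero status if no ksn line was found.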

Execute Migration

Migrate Full Data

Note: When migrating full data, do not enable Redis keyspace notifications (as it affects business performance). Set ksn = false.

./redis-shake scan.toml >shake.log 2>&1 &
ubuntu@ip-10-0-1-211:~$
ubuntu@ip-10-0-1-211:~$ tail -f shake.log
2024-06-17 03:47:44 INF load config from file: scan.toml
2024-06-17 03:47:44 INF log_level: [info], log_file: [/home/ubuntu/data/shake.log]
2024-06-17 03:47:44 INF changed work dir. dir=[/home/ubuntu/data]
2024-06-17 03:47:44 INF GOMAXPROCS defaults to the value of runtime.NumCPU [8]
2024-06-17 03:47:44 INF not set pprof port
2024-06-17 03:47:44 INF create ScanStandaloneReader: single.0d6hxg.ng.0001.use1.cache.amazonaws.com:6379
2024-06-17 03:47:44 INF create RedisStandaloneWriter: k8s-proxy-engulapr-995a2c0196-ea82fd9391768799.elb.us-east-1.amazonaws.com:8125
2024-06-17 03:47:44 INF not set status port
2024-06-17 03:47:44 INF start syncing...
2024-06-17 03:47:49 INF read_count=[469357], read_ops=[109627.91], write_count=[469356], write_ops=[109626.91], scan_dbid=[0], scan_percent=[90.45%], need_update_count=[99997]
2024-06-17 03:47:51 INF [reader_single.0d6hxg.ng.0001.use1.cache.amazonaws.com_6379] scanStandaloneReader dump finished.
2024-06-17 03:47:51 INF [reader_single.0d6hxg.ng.0001.use1.cache.amazonaws.com_6379] scanStandaloneReader restore finished.
2024-06-17 03:47:51 INF all done

Upon completion of the full data migration, the task will exit automatically.
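Since the task runs in the background, it can be convenient to check its state from the log rather than watching `tail -f`. The following is a minimal sketch that relies only on the `scan_percent=[...]` field and the `all done` line as they appear in the log output above; the function name is hypothetical, and other RedisShake versions may log differently.

```shell
# Report the state of a full (ksn = false) migration from a RedisShake log file.
# Prints "finished" once the "all done" line is present, otherwise the latest
# scan_percent value seen so far.
shake_full_status() {
  log="$1"
  if grep -q 'INF all done' "$log"; then
    echo "finished"
    return 0
  fi
  # Take the scan_percent value from the most recent status line, if any.
  percent=$(sed -n 's/.*scan_percent=\[\([^]]*\)].*/\1/p' "$log" | tail -n 1)
  echo "running ${percent:-unknown}"
}
```

With the log shown above, `shake_full_status shake.log` would report the scan percentage while the scan is in progress and `finished` after the final line is written.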

Migrate Full + Incremental Data

Note: When migrating incremental data, Redis keyspace notifications must be enabled. Be aware that enabling them can affect business performance on the source. Set ksn = true.

./redis-shake scan.toml >shake.log 2>&1 &
ubuntu@ip-10-0-1-211:~$
ubuntu@ip-10-0-1-211:~$ tail -f shake.log
2024-06-17 09:26:36 INF load config from file: scan.toml
2024-06-17 09:26:36 INF log_level: [info], log_file: [/home/ubuntu/data/shake.log]
2024-06-17 09:26:36 INF changed work dir. dir=[/home/ubuntu/data]
2024-06-17 09:26:36 INF GOMAXPROCS defaults to the value of runtime.NumCPU [8]
2024-06-17 09:26:36 INF not set pprof port
2024-06-17 09:26:36 INF create ScanStandaloneReader: single.0d6hxg.ng.0001.use1.cache.amazonaws.com:6379
2024-06-17 09:26:36 INF create RedisStandaloneWriter: k8s-proxy-engulapr-995a2c0196-ea82fd9391768799.elb.us-east-1.amazonaws.com:8125
2024-06-17 09:26:36 INF not set status port
2024-06-17 09:26:36 INF start syncing...
2024-06-17 09:26:41 INF read_count=[598150], read_ops=[120415.30], write_count=[598149], write_ops=[120415.30], need_update_count=[34844]
2024-06-17 09:26:46 INF read_count=[635024], read_ops=[0.00], write_count=[635024], write_ops=[0.00], need_update_count=[0]
2024-06-17 09:26:51 INF read_count=[635024], read_ops=[0.00], write_count=[635024], write_ops=[0.00], need_update_count=[0]
2024-06-17 09:26:56 INF read_count=[635024], read_ops=[0.00], write_count=[635024], write_ops=[0.00], need_update_count=[0]
2024-06-17 09:27:01 INF read_count=[635024], read_ops=[0.00], write_count=[635024], write_ops=[0.00], need_update_count=[0]

Upon completion of the full + incremental data migration, the task will not exit automatically and will wait for event notifications.
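Because the task never exits in this mode, the operator has to judge from the status lines when the target has caught up. One simple indicator, sketched below, is the difference between read_count and write_count on a status line. Treating a difference of 0 as "caught up" is an assumption based on the log output above, not a documented RedisShake guarantee, and the function name is hypothetical.

```shell
# Read one RedisShake status line on stdin and print read_count - write_count.
# Returns non-zero if the line does not contain both counters.
shake_lag() {
  line=$(cat)
  read_count=$(printf '%s\n' "$line" | sed -n 's/.*read_count=\[\([0-9]*\)].*/\1/p')
  write_count=$(printf '%s\n' "$line" | sed -n 's/.*write_count=\[\([0-9]*\)].*/\1/p')
  if [ -z "$read_count" ] || [ -z "$write_count" ]; then
    return 1
  fi
  echo $((read_count - write_count))
}
```

A typical use would be `tail -n 1 shake.log | shake_lag`, repeated until the value stays at 0 while the source is no longer receiving writes.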

Simulate New Data from the Source

Connect to the source Redis via the command line and write a key.

redis-cli -h source_ip -p 6379
set newkey newvalue

The redis-shake log output will show that read_count has become 635025, indicating one more key.

2024-06-17 09:27:06 INF read_count=[635025], read_ops=[0.00], write_count=[635025], write_ops=[0.00], need_update_count=[0]

Migration Issues

ERR unexpected EOF Error

If the migration speed is slower than the write speed of the source during incremental migration, the publish/subscribe client buffer may fill up, causing the publish/subscribe client's connection (i.e., the migration tool's connection) to be dropped. The specific error is as follows:

ERR unexpected EOF
RedisShake/internal/reader/scan_standalone_reader.go:102 -> (*scanStandaloneReader).subscript.func1()
runtime/asm_amd64.s:1598 -> goexit()

Solution:

  1. Modify the Source Publish/Subscribe Client Buffer Parameters:
    • client-output-buffer-limit-pubsub-hard-limit: Redis will immediately close the client connection when the client output buffer size exceeds this value.
    • client-output-buffer-limit-pubsub-soft-limit: Redis will start timing when the client output buffer size exceeds this value.
    • client-output-buffer-limit-pubsub-soft-seconds: Specifies how long after the output buffer size exceeds the soft limit before closing the client connection.
  2. Modify the Migration Tool Configuration File:
    • Adjust the scan_reader.count parameter to increase the number of items migrated per batch.
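To make the interaction of the three buffer parameters concrete, the sketch below models the rule described above in a simplified form: the connection is dropped immediately once the buffer exceeds the hard limit, or once it has stayed above the soft limit for longer than the soft-seconds value. The function and argument names are hypothetical, a limit of 0 is treated as disabled, and the real server tracks these values internally.

```shell
# Simplified model: would a pub/sub client be disconnected?
# Arguments: buffer_bytes hard_limit soft_limit soft_seconds seconds_over_soft
# Returns 0 (true) if the client would be dropped, 1 otherwise.
pubsub_client_dropped() {
  buf="$1"; hard="$2"; soft="$3"; soft_secs="$4"; over_secs="$5"
  # Hard limit: exceeding it closes the connection immediately.
  if [ "$hard" -gt 0 ] && [ "$buf" -gt "$hard" ]; then
    return 0
  fi
  # Soft limit: only closes the connection after staying above it too long.
  if [ "$soft" -gt 0 ] && [ "$buf" -gt "$soft" ] && [ "$over_secs" -gt "$soft_secs" ]; then
    return 0
  fi
  return 1
}
```

For example, with a hard limit of 32 MB, a soft limit of 8 MB, and 60 soft seconds, a 10 MB backlog is tolerated for up to a minute, which is the window the migration tool has to catch up before its subscription is cut.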