Redis master-slave mode encounters errors after reaching a certain scale. #83

Salvare330 · 2024-10-31T06:48:59Z

Redis master-slave replication starts failing once the dataset reaches a certain size (RDB file of 60 MB or more).
Slave log:
1765:S 25 Sep 2024 18:06:10.296 * Full resync from master: d87a61aa0f144c8837c48ee045de84d72485ab1f:2824627352
1765:S 25 Sep 2024 18:06:10.412 * MASTER <-> REPLICA sync: receiving streamed RDB from master with EOF to disk
1765:S 25 Sep 2024 18:06:10.565 # I/O error trying to sync with MASTER: Connection reset by peer
1765:S 25 Sep 2024 18:06:10.577 * Reconnecting to MASTER 10.113.7.15:6479 after failure
1765:S 25 Sep 2024 18:06:10.582 * MASTER <-> REPLICA sync started
1765:S 25 Sep 2024 18:06:10.586 * Non blocking connect for SYNC fired the event.
1765:S 25 Sep 2024 18:06:10.589 * Master replied to PING, replication can continue...
1765:S 25 Sep 2024 18:06:10.594 * Partial resynchronization not possible (no cached master)
1765:S 25 Sep 2024 18:06:15.556 * Full resync from master: d87a61aa0f144c8837c48ee045de84d72485ab1f:2824845204
1765:S 25 Sep 2024 18:06:15.672 * MASTER <-> REPLICA sync: receiving streamed RDB from master with EOF to disk
1765:S 25 Sep 2024 18:06:15.815 # I/O error trying to sync with MASTER: Connection reset by peer
1765:S 25 Sep 2024 18:06:15.828 * Reconnecting to MASTER 10.113.7.15:6479 after failure
1765:S 25 Sep 2024 18:06:15.833 * MASTER <-> REPLICA sync started
1765:S 25 Sep 2024 18:06:15.836 * Non blocking connect for SYNC fired the event.
1765:S 25 Sep 2024 18:06:15.841 * Master replied to PING, replication can continue...
1765:S 25 Sep 2024 18:06:15.845 * Partial resynchronization not possible (no cached master)
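
For anyone triaging a longer log of the same shape, here is a minimal sketch that pairs each "Full resync from master" line with the I/O error that follows it and reports how long the transfer survived. The regex assumes the `pid:role DD Mon YYYY HH:MM:SS.mmm` prefix shown above, and the function name `failed_sync_attempts` is made up for illustration. In the excerpt above, both attempts fail roughly 0.26 to 0.27 seconds after the resync starts.

```python
import re
from datetime import datetime

# Matches lines such as:
#   1765:S 25 Sep 2024 18:06:10.296 * Full resync from master: ...
# Assumption: the log uses this exact prefix layout (pid:role, date, time, level mark).
LINE_RE = re.compile(
    r"^\d+:[A-Z] (?P<ts>\d{1,2} \w{3} \d{4} \d{2}:\d{2}:\d{2}\.\d{3}) [*#] (?P<msg>.*)$"
)


def failed_sync_attempts(log_text):
    """Return (start_time, seconds_until_error) for each full resync that ended in an I/O error."""
    attempts = []
    started = None
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip lines that do not follow the assumed layout
        ts = datetime.strptime(match.group("ts"), "%d %b %Y %H:%M:%S.%f")
        msg = match.group("msg")
        if msg.startswith("Full resync from master"):
            started = ts
        elif msg.startswith("I/O error trying to sync with MASTER") and started is not None:
            attempts.append((started, round((ts - started).total_seconds(), 3)))
            started = None
    return attempts
```

Calling `failed_sync_attempts(open("replica.log").read())` (the file name is a placeholder) returns one tuple per failed attempt, which makes it easy to see whether the failure always happens at the same point in the transfer.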

odumetz · 2024-12-18T22:20:57Z

+1, I'm having the same issue. I have tried many configuration changes (increasing the replication log size and replication buffers, trying diskless replication, and so on), but replication no longer works at all. The connection to the replica keeps getting lost. The DB is not that large (a full sync is 30 MB).
Master log:
[014308] 18 Dec 22:44:21.580 * Replica 10.10.28.20:6379 asks for synchronization
[014308] 18 Dec 22:44:21.581 * Full resync requested by replica 10.10.28.20:6379
[014308] 18 Dec 22:44:21.582 * Delay next BGSAVE for diskless SYNC
[014308] 18 Dec 22:44:26.717 * Starting BGSAVE for SYNC with target: replicas sockets
[014308] 18 Dec 22:44:26.748 * Background RDB transfer started by pid 2084
[014308] 18 Dec 22:44:26.964 # fork operation complete
[014308] 18 Dec 22:44:26.991 # Background transfer error
[014308] 18 Dec 22:44:26.991 # SYNC failed. BGSAVE child returned an error
[014308] 18 Dec 22:44:26.991 * Connection with replica 10.10.28.20:6379 lost.

Replica log:
[008748] 18 Dec 22:44:16.071 * MASTER <-> REPLICA sync started
[008748] 18 Dec 22:44:16.071 * Non blocking connect for SYNC fired the event.
[008748] 18 Dec 22:44:16.071 * Master replied to PING, replication can continue...
[008748] 18 Dec 22:44:16.071 * Partial resynchronization not possible (no cached master)
[008748] 18 Dec 22:44:21.259 * Full resync from master: 24eb1a270e365e170f2ac2a7c1cb458c618cb0f2:4953856
[008748] 18 Dec 22:44:21.466 * MASTER <-> REPLICA sync: receiving streamed RDB from master with EOF to disk
[008748] 18 Dec 22:44:21.547 # I/O error trying to sync with MASTER: Unknown error
[008748] 18 Dec 22:44:21.568 * Reconnecting to MASTER 10.10.28.21:6379 after failure

odumetz · 2024-12-18T22:29:53Z

Is this a known bug that was fixed between 7.2.5 and 7.4.0? If so, is it safe to swap in the new executables (same AOF/RDB format, etc.)?
I got replication to work partly by disabling diskless replication and removing buffer limits, but the .exe crashes during a full sync, after fetching the .rdb file.
Thanks!

zkteco-home · 2024-12-25T09:56:16Z

Please give me the details needed to reproduce the issue, including your conf file. How do you run it to trigger the error?

odumetz · 2025-01-06T23:22:32Z

Thanks for replying. The conf is identical on master and replica (the replica has an additional replicaof statement).
The problem happens with the master running, when the replica starts fresh (no RDB or AOF files) or when adding a new replica. I even tried copying the RDB and AOF files from the master to start up the replica, with the same behavior. The RDB and AOF files were at around 15 MB.
The .exe crashes, so I copied over the event log info, the WER dump, and the corresponding Redis logs.
redis-conf.txt
redis-server-crash.txt
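
To double-check that the two conf files differ only by the replicaof line, a small diff helper is enough. This is a sketch under simple assumptions: one directive per line, `#` starts a comment line, and no `include` files are followed. The names `parse_redis_conf` and `diff_conf` are made up for illustration.

```python
def parse_redis_conf(text):
    """Map each directive name to the list of its argument strings, skipping comments and blanks."""
    directives = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)  # directive name, then the rest of the line
        name = parts[0].lower()
        args = parts[1].strip() if len(parts) > 1 else ""
        # Directives such as `save` may legitimately repeat, so keep every occurrence.
        directives.setdefault(name, []).append(args)
    return directives


def diff_conf(master_text, replica_text):
    """Return {directive: (master_values, replica_values)} for every directive that differs."""
    master = parse_redis_conf(master_text)
    replica = parse_redis_conf(replica_text)
    return {
        name: (master.get(name, []), replica.get(name, []))
        for name in sorted(set(master) | set(replica))
        if master.get(name) != replica.get(name)
    }
```

If the configs really are identical apart from replication, `diff_conf` should return a single `replicaof` entry with an empty list on the master side.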
