v2.60.0 cannot read from datastream #1513

Open
yorickdowne opened this issue Nov 28, 2024 · 22 comments

@yorickdowne

yorickdowne commented Nov 28, 2024

Tested on Polygon-zkEVM mainnet and XLayer mainnet

Batches stage shows an error:

cdk-erigon-1  | [INFO] [11-28|08:19:00.584] [3/15 Batches] Starting batches stage 
cdk-erigon-1  | [INFO] [11-28|08:19:03.291] [3/15 Batches] Highest block in datastream datastreamBlock=6950555 stageProgressBlockNo=6950554
cdk-erigon-1  | [INFO] [11-28|08:19:03.291] [3/15 Batches] Reading blocks from the datastream. 
cdk-erigon-1  | [INFO] [11-28|08:19:03.293] [3/15 Batches] Started downloading L2Blocks routine ID: 636467 
cdk-erigon-1  | [WARN] [11-28|08:19:04.813] Error in datastream client, stopping consumption 
cdk-erigon-1  | [WARN] [11-28|08:19:04.813] [3/15 Batches] Error downloading blocks from datastream error="initiateDownloadBookmark: afterStartCommand: readResultEntry: unknown error code: ��n��
           OK�n����n������Π�2 B �\\����=&���\"�y��A���X@G��� J l������=�ԅ��;3*`%=;�i\riNZ�\r��R Z��'����J��Ǐ'���Z��"
cdk-erigon-1  | [INFO] [11-28|08:19:04.813] [3/15 Batches] Total blocks written: 0 
cdk-erigon-1  | [INFO] [11-28|08:19:04.813] [3/15 Batches] Ended downloading L2Blocks routine ID: 636467 
cdk-erigon-1  | [INFO] [11-28|08:19:04.813] [3/15 Batches] Finished Batches stage 

This is followed by the output below, which seems to mean it recovers? The pattern alternates: the next run of 3/15 shows the error again.

cdk-erigon-1  | [INFO] [11-28|08:22:03.910] [Datastream client] Last error detected, trying to reconnect 
cdk-erigon-1  | [INFO] [11-28|08:22:04.870] [3/15 Batches] Highest block in datastream datastreamBlock=17932392 stageProgressBlockNo=17932391
cdk-erigon-1  | [INFO] [11-28|08:22:04.870] [3/15 Batches] Reading blocks from the datastream. 
cdk-erigon-1  | [INFO] [11-28|08:22:04.870] [3/15 Batches] Started downloading L2Blocks routine ID: 288436 
cdk-erigon-1  | [INFO] [11-28|08:22:05.031] [3/15 Batches] Ended downloading L2Blocks routine ID: 288436 
cdk-erigon-1  | [INFO] [11-28|08:22:05.043] [3/15 Batches] Total blocks written: 1 
cdk-erigon-1  | [INFO] [11-28|08:22:05.043] [3/15 Batches] Saving stage progress     lastBlockHeight=17932392
cdk-erigon-1  | [INFO] [11-28|08:22:05.043] [3/15 Batches] Finished writing blocks   blocksWritten=1 elapsed=1.132854715s
cdk-erigon-1  | [INFO] [11-28|08:22:05.043] [3/15 Batches] Finished Batches stage 

v2.0.1 does not show these errors.
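To make the alternating pattern easier to see over a longer log, here is a rough sketch that classifies each download routine as failed or successful. It relies only on the message strings visible in the logs above; the function name `classify_runs` and the shortened sample lines are illustrative assumptions, not part of cdk-erigon.

```python
import re

# Message fragments copied from the log output quoted in this issue.
START = re.compile(r"Started downloading L2Blocks routine ID: (\d+)")
END = re.compile(r"Ended downloading L2Blocks routine ID: (\d+)")
ERROR = "Error downloading blocks from datastream"


def classify_runs(lines):
    """Return a list of (routine_id, 'error' | 'ok') in log order."""
    results = []
    current_id = None
    failed = False
    for line in lines:
        started = START.search(line)
        if started:
            current_id, failed = started.group(1), False
            continue
        if current_id is None:
            continue  # ignore lines outside a download routine
        if ERROR in line:
            failed = True
        ended = END.search(line)
        if ended and ended.group(1) == current_id:
            results.append((current_id, "error" if failed else "ok"))
            current_id = None
    return results


# Sample lines shortened from the logs above (the error payload is elided here).
sample = [
    "[INFO] [11-28|08:19:03.293] [3/15 Batches] Started downloading L2Blocks routine ID: 636467",
    '[WARN] [11-28|08:19:04.813] [3/15 Batches] Error downloading blocks from datastream error="..."',
    "[INFO] [11-28|08:19:04.813] [3/15 Batches] Ended downloading L2Blocks routine ID: 636467",
    "[INFO] [11-28|08:22:04.870] [3/15 Batches] Started downloading L2Blocks routine ID: 288436",
    "[INFO] [11-28|08:22:05.031] [3/15 Batches] Ended downloading L2Blocks routine ID: 288436",
]

for routine_id, outcome in classify_runs(sample):
    print(routine_id, outcome)
```

Feeding it the full container log (for example, the lines from `docker compose logs`) should show how often error runs and clean runs alternate.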

@hexoscott
Collaborator

Could you please share your config? It looks like your node is having trouble getting anything from the stream. I have just synced from scratch using v2.60.0 without issue.

@yorickdowne
Author

yorickdowne commented Nov 28, 2024

@hexoscott For sure! Here's the compose entrypoint and command, and the mainnet.yaml

compose

    command:
      - --zkevm.smt-regenerate-in-memory
      - --prune
      - htc
    entrypoint:
      - cdk-erigon
      - --config
      - /home/erigon/.local/share/erigon-config/mainnet.yaml
      - --zkevm.l1-rpc-url
      - https://eth-rpc.mydomain.tld
      - --zkevm.rpc-ratelimit=250
      - --maxpeers
      - "32"
      - --downloader.disable.ipv6
      - --http
      - --http.api
      - eth,net,trace,web3,erigon,zkevm
      - --http.addr
      - 0.0.0.0
      - --http.port
      - "8545"
      - --http.vhosts=*
      - --http.corsdomain=*
      - --ws
      - --metrics
      - --metrics.addr
      - 0.0.0.0

mainnet.yaml

datadir: /home/erigon/.local/share/erigon
chain: hermez-mainnet
zkevm.l2-chain-id: 1101
zkevm.l2-sequencer-rpc-url: https://zkevm-rpc.com
zkevm.l2-datastreamer-url: stream.zkevm-rpc.com:6900
zkevm.l1-chain-id: 1

zkevm.address-sequencer: "0x148Ee7dAF16574cD020aFa34CC658f8F3fbd2800"
zkevm.address-zkevm: "0x519E42c24163192Dca44CD3fBDCEBF6be9130987"
zkevm.address-admin: "0x242daE44F5d8fb54B198D03a94dA45B5a4413e21"
zkevm.address-rollup: "0x5132A183E9F3CB7C848b0AAC5Ae0c4f0491B7aB2"
zkevm.address-ger-manager: "0x580bda1e7A0CFAe92Fa7F6c20A3794F169CE3CFb"

zkevm.default-gas-price: 1000000000
zkevm.max-gas-price: 0
zkevm.gas-price-factor: 0.0375

zkevm.l1-rollup-id: 1
zkevm.l1-block-range: 20000
zkevm.l1-query-delay: 6000
zkevm.l1-first-block: 16896700
zkevm.datastream-version: 2

externalcl: true

@zfy0701

zfy0701 commented Dec 3, 2024

Same here, but for XLayer:

             cdk-erigon --datadir=/data  --port=30303
              --http.addr=0.0.0.0 --http.port=8545 --http.vhosts=*
              --private.api.addr=127.0.0.1:9090
              --authrpc.jwtsecret=/data/jwt.hex --authrpc.addr=0.0.0.0
              --authrpc.port=8551 --authrpc.vhosts=* --torrent.port=42069
              --metrics --metrics.addr=0.0.0.0 --metrics.port=6060
              --http.api=eth,erigon,web3,net,debug,trace,zkevm
              --rpc.returndata.limit=100000000 --rpc.gascap=10000000000 --ws
              --torrent.download.rate=200mb --port=0 --zkevm.l2-chain-id=196
              --zkevm.l2-sequencer-rpc-url=https://rpc.xlayer.tech
              --zkevm.l2-datastreamer-url=stream.xlayer.tech:8800
              --zkevm.l1-chain-id=1
              --zkevm.l1-rpc-url=http://local-l1node:8545
              --zkevm.address-sequencer="0xAF9d27ffe4d51eD54AC8eEc78f2785D7E11E5ab1"
              --zkevm.address-zkevm="0x2B0ee28D4D51bC9aDde5E58E295873F61F4a0507"
              --zkevm.address-admin="0x491619874b866c3cDB7C8553877da223525ead01"
              --zkevm.address-rollup="0x5132A183E9F3CB7C848b0AAC5Ae0c4f0491B7aB2"
              --zkevm.address-ger-manager="0x580bda1e7A0CFAe92Fa7F6c20A3794F169CE3CFb"
              --zkevm.l1-rollup-id=3 
              --zkevm.l1-first-block=19218658
              --zkevm.l1-block-range=2000
              --zkevm.l1-query-delay=1000
              --zkevm.datastream-version=3 
              --chain=xlayer-mainnet

@yorickdowne
Author

I was unaware that 2.60.0 requires a full re-sync. Re-syncing now to see whether that solves the issue.

"As for older DBs, version 2.60.X is incompatible with a DB created by a 2.0.X version so will require a full re-sync on the newer version."

@zfy0701

zfy0701 commented Dec 5, 2024

I was unaware that 2.60.0 requires a full re-sync. Re-syncing now to see whether that solves the issue.

"As for older DBs, version 2.60.X is incompatible with a DB created by a 2.0.X version so will require a full re-sync on the newer version."

Mine was already synced from scratch.

@yorickdowne
Author

Yes, same result here. After resyncing xlayer and polygon-zkevm on 2.60.0, I still see these "Error downloading blocks from datastream" error messages.

@hexoscott
Collaborator

Can you try netcat or similar to connect to the datastream port from your Docker environment? We have been unable to replicate this issue on our side.

@yorickdowne
Author

yorickdowne commented Dec 5, 2024

/home/erigon # nc -zv -w2 stream.zkevm-rpc.com 6900
stream.zkevm-rpc.com (35.189.86.110:6900) open

That's from inside the cdk-erigon container
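For anyone without `nc` in their image, the same reachability check can be sketched in Python. The helper name `port_is_open` is made up for illustration. Like `nc -z`, it only confirms that the TCP handshake succeeds; it says nothing about whether the datastream protocol exchange that follows is healthy.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

Calling `port_is_open("stream.zkevm-rpc.com", 6900)` from inside the container would be the equivalent of the `nc -zv -w2` command above.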

During sync, the datastream connection is healthy:

cdk-erigon-1  | [INFO] [12-05|13:17:41.833] [3/15 Batches] Saving stage progress     lastBlockHeight=9300000
cdk-erigon-1  | [INFO] [12-05|13:17:43.293] [3/15 Batches] Downloaded blocks from datastream progress: 9316375 
cdk-erigon-1  | [INFO] [12-05|13:17:53.293] [3/15 Batches] Downloaded blocks from datastream progress: 9387547 
cdk-erigon-1  | [INFO] [12-05|13:17:55.457] [3/15 Batches] Saving stage progress     lastBlockHeight=9400000
cdk-erigon-1  | [INFO] [12-05|13:18:02.749] [3/15 Batches] Saving stage progress     lastBlockHeight=9500000
cdk-erigon-1  | [INFO] [12-05|13:18:03.300] [3/15 Batches] Downloaded blocks from datastream progress: 9500000 
cdk-erigon-1  | [INFO] [12-05|13:18:09.720] [3/15 Batches] Saving stage progress     lastBlockHeight=9600000
cdk-erigon-1  | [INFO] [12-05|13:18:13.293] [3/15 Batches] Downloaded blocks from datastream progress: 9657974 
cdk-erigon-1  | [INFO] [12-05|13:18:16.212] [3/15 Batches] Saving stage progress     lastBlockHeight=9700000
cdk-erigon-1  | [INFO] [12-05|13:18:22.240] [3/15 Batches] Saving stage progress     lastBlockHeight=9800000
cdk-erigon-1  | [INFO] [12-05|13:18:23.293] [3/15 Batches] Downloaded blocks from datastream progress: 9806737 
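To put a number on "healthy", here is a small sketch that estimates the average download rate from the "Downloaded blocks from datastream progress" lines. The function name `blocks_per_second` and the `year` parameter are assumptions for illustration (the log timestamps carry no year).

```python
import re
from datetime import datetime

# Captures the "MM-DD|HH:MM:SS.mmm" timestamp and the progress block number.
PROGRESS = re.compile(
    r"\[(\d{2}-\d{2})\|(\d{2}:\d{2}:\d{2}\.\d{3})\].*"
    r"Downloaded blocks from datastream progress: (\d+)"
)


def blocks_per_second(lines, year=2024):
    """Average rate between the first and last progress lines, or None if not computable."""
    samples = []
    for line in lines:
        match = PROGRESS.search(line)
        if match:
            stamp = datetime.strptime(
                f"{year}-{match.group(1)} {match.group(2)}", "%Y-%m-%d %H:%M:%S.%f"
            )
            samples.append((stamp, int(match.group(3))))
    if len(samples) < 2:
        return None
    (t0, b0), (t1, b1) = samples[0], samples[-1]
    elapsed = (t1 - t0).total_seconds()
    return (b1 - b0) / elapsed if elapsed > 0 else None


# First and last progress lines from the log above, plus one line that is ignored.
sample_progress = [
    "cdk-erigon-1  | [INFO] [12-05|13:17:43.293] [3/15 Batches] Downloaded blocks from datastream progress: 9316375 ",
    "cdk-erigon-1  | [INFO] [12-05|13:17:55.457] [3/15 Batches] Saving stage progress     lastBlockHeight=9400000",
    "cdk-erigon-1  | [INFO] [12-05|13:18:23.293] [3/15 Batches] Downloaded blocks from datastream progress: 9806737 ",
]

print(blocks_per_second(sample_progress))
```

For the excerpt above this works out to 490362 blocks over 40 seconds, roughly 12,000 blocks per second during the initial sync.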

@yorickdowne
Author

Here is the log of an XLayer sync; maybe with more context you can see where these errors originate.
xlayer-sync.log

@yorickdowne
Author

Issue also seen in v2.60.1-RC1

@yorickdowne
Author

To clarify, even with these warn messages and "Error downloading blocks from datastream error" including the "garbage" output on that line, the node still keeps up with head. It is staying synced from what I can see; it is also throwing out these warnings frequently.

@hexoscott
Collaborator

Could you confirm what these values are set to on the sequencer node this RPC is consuming from?

zkevm.data-stream-inactivity-timeout
zkevm.data-stream-inactivity-check-interval
zkevm.data-stream-writeTimeout

Are you using any datastream repeaters in between the original sequencer and the RPC consuming them or anything like that?

@hexoscott
Collaborator

If you're staying synced but the connection is frequently dropped, the issue is likely in the repeater, or these config settings are set to too short a timeframe.

@yorickdowne
Author

These are the sequencer and the streams we are using in the config yaml. No repeaters, and I don't know how the sequencer is configured, but maybe you do?

zkevm.l2-chain-id: 196
zkevm.l2-sequencer-rpc-url: https://rpc.xlayer.tech
zkevm.l2-datastreamer-url: stream.xlayer.tech:8800
zkevm.l1-chain-id: 1

For Polygon zkEVM it's similar:

zkevm.l2-chain-id: 1101
zkevm.l2-sequencer-rpc-url: https://zkevm-rpc.com
zkevm.l2-datastreamer-url: stream.zkevm-rpc.com:6900
zkevm.l1-chain-id: 1

@yorickdowne
Author

yorickdowne commented Dec 5, 2024

Erigon command as seen by ps -auxww, here for Polygon zkEVM:

cdk-erigon --config /home/erigon/.local/share/erigon-config/mainnet.yaml --zkevm.l1-rpc-url https://eth-rpc.mydomain.tld --zkevm.rpc-ratelimit 250 --maxpeers 32 --downloader.disable.ipv6 --http --http.api eth,net,trace,web3,erigon,zkevm --http.addr 0.0.0.0 --http.port 8545 --http.vhosts=* --http.corsdomain=* --ws --ws.api eth,net,trace,web3,erigon,zkevm --ws.addr 0.0.0.0 --ws.port 8546 --metrics --metrics.addr 0.0.0.0 --zkevm.smt-regenerate-in-memory --prune htc

@hdiass

hdiass commented Dec 18, 2024

Is there any update regarding this error? I also see it after syncing both zkevm and xlayer from scratch on v2.61.0.

@revitteth
Collaborator

@yorickdowne do you encounter this error syncing from datastreams other than xlayer? Are we able to establish the datastreamer version being used by xlayer?

@yorickdowne
Author

yorickdowne commented Jan 7, 2025

Hello @revitteth, I’ve encountered this on Polygon zkEVM and on XLayer.

I do not know how to check data stream version for either.

@hdiass

hdiass commented Jan 7, 2025

Hello @revitteth we see it on both xlayer and zkevm mainnets and testnets

@LinkRiver-Vitor

Hey @revitteth

We are using version 2, per our hermezconfig-mainnet.yaml file:

zkevm.datastream-version: 2

Is there a preferred version to use? Could you please confirm?
Thanks in advance!

@northwestnodes-eric

Same issue for us.

@revitteth
Collaborator

I have raised a task to remove the datastream-version flag: version 1 denotes the 'pre-proto' version of the stream, and 2 indicates the proto version of the stream (now ubiquitous).

I am currently hooking up to the xlayer stream to test, and I will report my findings. I suspect the stream file is corrupt; a sensible plan of action that doesn't require too much input would be to regenerate the file on the stream server.
