feat: abci state sync #2413

Open
wants to merge 35 commits into
base: v2.0-dev
Changes from 30 commits
2d60300
feat: state sync
ogabrielides Jan 8, 2025
134855b
feat: move state sync read operations in consensus app
ogabrielides Jan 9, 2025
c00ecbd
fixes for checkpoints_path
ogabrielides Jan 13, 2025
420f84c
suggestions
ogabrielides Jan 13, 2025
1512e85
fix: correct return value for offer_snapshot
ogabrielides Jan 14, 2025
94777c7
refactor: more logs for load_snapshot_chunk API
ogabrielides Jan 14, 2025
caed905
fix: spelling
ogabrielides Jan 14, 2025
cacdc8c
temp
ogabrielides Jan 15, 2025
c686f27
increased snapshots max and freq
ogabrielides Jan 15, 2025
e6bb363
fix: return CompleteSnapshot on success
ogabrielides Jan 15, 2025
3b9e2a2
refactor: moved read state_sync into check_tx
ogabrielides Jan 15, 2025
667a3c6
chore: updated tenderdash-abci
ogabrielides Jan 15, 2025
a876678
refactor: remove list_snapshots load_snapshot_chunk from consensus app
ogabrielides Jan 15, 2025
76a0576
feat: added finalize_snapshot
ogabrielides Jan 16, 2025
3b80c45
feat: snapshot finalize
ogabrielides Jan 24, 2025
08b7854
fix: correct handling of last_commited_block
ogabrielides Jan 24, 2025
a41e993
chore: added logs for info and finalize_snapshot
ogabrielides Jan 24, 2025
523dece
chore: more logs
ogabrielides Jan 27, 2025
9a70f4c
fix: state reconstruct finalize_snapshot
ogabrielides Jan 27, 2025
293c71f
refactor: cleaning and formatter
ogabrielides Jan 28, 2025
1798eb7
feat: calculate next_validator_set_quorum_hash
ogabrielides Jan 29, 2025
0d447ee
feat: add check for next_validator_set_quorum_hash calculation
ogabrielides Jan 29, 2025
b321239
fix: for next_validator_set_quorum_hash calculation
ogabrielides Jan 29, 2025
a819364
fix: non empty current_validator_set_quorum_hash
ogabrielides Jan 29, 2025
ad58081
build: update dockerfile and dashmate config for state sync testing
lklimek Jan 29, 2025
c560323
Merge tag 'v1.8.0' into feat/abci-state-sync
lklimek Jan 29, 2025
b38cfe8
build(deps): upgrade rs-tenderdash-abci and fix serde_json dependency
lklimek Jan 29, 2025
143cccc
build(deps): update serde_json
lklimek Jan 29, 2025
4d58403
build(deps): upgrade sccache to 0.9.1
lklimek Jan 29, 2025
dad9fc2
build: disable sccache
lklimek Jan 29, 2025
a8a71d1
fix: fix for calculation of genesis time + cleanup
ogabrielides Jan 30, 2025
6785e6e
Revert "build: disable sccache"
lklimek Jan 30, 2025
3914225
refactor: clean and format
ogabrielides Jan 30, 2025
60eb7ff
Merge branch 'feat/abci-state-sync' of github.com:dashpay/platform in…
ogabrielides Jan 30, 2025
c16c1d9
chore: reduced tracing level
ogabrielides Jan 31, 2025
2 changes: 1 addition & 1 deletion .github/actions/docker/action.yaml
@@ -192,7 +192,7 @@ runs:
AWS=${{ env.HOME }}/.aws/credentials
build-args: |
CARGO_BUILD_PROFILE=${{ inputs.cargo_profile }}
-${{ steps.sccache.outputs.env_vars }}
+# ${{ steps.sccache.outputs.env_vars }}
cache-from: ${{ steps.layer_cache_settings.outputs.cache_from }}
cache-to: ${{ steps.layer_cache_settings.outputs.cache_to }}
outputs: type=image,name=${{ inputs.image_org }}/${{ inputs.image_name }},push-by-digest=${{ inputs.push_tags != 'true' }},name-canonical=true,push=true
2 changes: 1 addition & 1 deletion .github/actions/sccache/action.yaml
@@ -34,7 +34,7 @@ inputs:
default: "true"
version:
description: "sccache version"
-default: "0.8.2"
+default: "0.9.1"
required: false
outputs:
env_vars:
12 changes: 6 additions & 6 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

18 changes: 12 additions & 6 deletions Dockerfile
@@ -147,7 +147,7 @@ ENV NODE_ENV=${NODE_ENV}
FROM deps-base AS deps-sccache

# SCCACHE_VERSION must be the same as in github actions, to avoid cache incompatibility
-ARG SCCHACHE_VERSION=0.8.2
+ARG SCCHACHE_VERSION=0.9.1

⚠️ Potential issue

Fix typo in variable name.

The variable name SCCHACHE_VERSION contains a typo and should be SCCACHE_VERSION.

Apply this diff to fix the typo:

-ARG SCCHACHE_VERSION=0.9.1
+ARG SCCACHE_VERSION=0.9.1


# Install sccache for caching
RUN if [[ "$TARGETARCH" == "arm64" ]] ; then export SCC_ARCH=aarch64; else export SCC_ARCH=x86_64; fi; \
@@ -552,19 +552,24 @@ LABEL description="Drive ABCI Rust"
RUN apk add --no-cache libgcc libstdc++

ENV DB_PATH=/var/lib/dash/rs-drive-abci/db
ENV CHECKPOINTS_PATH=/var/lib/dash/rs-drive-abci/db-checkpoints
ENV REJECTIONS_PATH=/var/log/dash/rejected

RUN mkdir -p /var/log/dash \
/var/lib/dash/rs-drive-abci/db \
${REJECTIONS_PATH}

COPY --from=build-drive-abci /artifacts/drive-abci /usr/bin/drive-abci
COPY packages/rs-drive-abci/.env.mainnet /var/lib/dash/rs-drive-abci/.env

# Create a volume
VOLUME /var/lib/dash/rs-drive-abci/db
VOLUME /var/log/dash

# Ensure required paths do exist
# TODO: remove /var/lib/dash-platform/data/checkpoints when drive-abci is fixed
RUN mkdir -p /var/log/dash \
${DB_PATH} \
${CHECKPOINTS_PATH} \
${REJECTIONS_PATH} \
/var/lib/dash-platform/data/checkpoints

# Double-check that we don't have missing deps
RUN ldd /usr/bin/drive-abci

@@ -574,9 +579,10 @@ RUN ldd /usr/bin/drive-abci
ARG USERNAME=dash
ARG USER_UID=1000
ARG USER_GID=$USER_UID
# TODO: remove /var/lib/dash-platform/data/checkpoints when drive-abci is fixed
RUN addgroup -g $USER_GID $USERNAME && \
adduser -D -u $USER_UID -G $USERNAME -h /var/lib/dash/rs-drive-abci $USERNAME && \
-chown -R $USER_UID:$USER_GID /var/lib/dash/rs-drive-abci /var/log/dash
+chown -R $USER_UID:$USER_GID /var/lib/dash/rs-drive-abci /var/log/dash /var/lib/dash-platform/data/checkpoints
Comment on lines +582 to +585

🛠️ Refactor suggestion

Consolidate checkpoint directory paths.

The ownership command includes a temporary checkpoint directory path (/var/lib/dash-platform/data/checkpoints). Once the TODO is addressed, this should be consolidated to use only the ${CHECKPOINTS_PATH} variable.

Apply this diff after the drive-abci fix is implemented:

-    chown -R $USER_UID:$USER_GID /var/lib/dash/rs-drive-abci /var/log/dash /var/lib/dash-platform/data/checkpoints
+    chown -R $USER_UID:$USER_GID /var/lib/dash/rs-drive-abci /var/log/dash ${CHECKPOINTS_PATH}

Committable suggestion skipped: line range outside the PR's diff.


USER $USERNAME

3 changes: 2 additions & 1 deletion packages/dapi-grpc/Cargo.toml
@@ -42,8 +42,9 @@ tonic = { version = "0.12.3", features = [
serde = { version = "1.0.197", optional = true, features = ["derive"] }
serde_bytes = { version = "0.11.12", optional = true }
serde_json = { version = "1.0", optional = true }
-tenderdash-proto = { git = "https://github.com/dashpay/rs-tenderdash-abci", version = "1.2.1", tag = "v1.2.1+1.3.0", default-features = false, features = [
+tenderdash-proto = { git = "https://github.com/dashpay/rs-tenderdash-abci", rev = "b55bed9f574b68f2b6c96cbc80da41072056781d", default-features = false, features = [
"grpc",
"serde",
] }
dapi-grpc-macros = { path = "../rs-dapi-grpc-macros" }
platform-version = { path = "../rs-platform-version" }
@@ -309,7 +309,7 @@ export default function getBaseConfigFactory() {
tenderdash: {
mode: 'full',
docker: {
-image: 'dashpay/tenderdash:1',
+image: 'dashpay/tenderdash:feat-statesync-integration',
},
p2p: {
host: '0.0.0.0',
@@ -81,7 +81,7 @@ filter-peers = false
# Example for routed multi-app setup:
# abci = "routed"
# address = "Info:socket:unix:///tmp/socket.1,Info:socket:unix:///tmp/socket.2,CheckTx:socket:unix:///tmp/socket.1,*:socket:unix:///tmp/socket.3"
-address = "CheckTx:grpc:drive_abci:26670,*:socket:tcp://drive_abci:26658"
+address = "ListSnapshots:grpc:drive_abci:26670,LoadSnapshotChunk:grpc:drive_abci:26670,CheckTx:grpc:drive_abci:26670,*:socket:tcp://drive_abci:26658"
# Transport mechanism to connect to the ABCI application: socket | grpc | routed
transport = "routed"
# Maximum number of simultaneous connections to the ABCI application
@@ -97,6 +97,10 @@ transport = "routed"
#]
grpc-concurrency = [
{ "check_tx" = {{= it.platform.drive.tenderdash.mempool.maxConcurrentCheckTx }} },
{ "list_snapshots" = {{= it.platform.drive.tenderdash.mempool.maxConcurrentCheckTx }} },
{ "load_snapshot_chunk" = {{= it.platform.drive.tenderdash.mempool.maxConcurrentCheckTx }} },
{ "offer_snapshot" = 1 },
{ "apply_snapshot_chunk" = 1 },
]
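
Note on the routed `address` value in the hunk above: each comma-separated entry reads as `Method:transport:address`, and the `*` entry appears to act as a catch-all for methods not listed explicitly. The following is a minimal Rust sketch of that reading, not Tenderdash's actual parser. The names `Route`, `parse_routes`, and `route_for` are hypothetical, and the wildcard fallback is an assumption based on the commented example in the template.

```rust
use std::collections::HashMap;

/// One routing rule: which transport and address serve a given ABCI method.
/// Hypothetical type, for illustration only.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Route {
    pub transport: String,
    pub address: String,
}

/// Parse a routed address string such as
/// "CheckTx:grpc:drive_abci:26670,*:socket:tcp://drive_abci:26658".
/// This mirrors the format seen in the config template; it is not Tenderdash code.
pub fn parse_routes(spec: &str) -> Result<HashMap<String, Route>, String> {
    let mut routes = HashMap::new();
    for entry in spec.split(',').map(str::trim).filter(|e| !e.is_empty()) {
        // Split into at most three parts so colons inside the address survive.
        let mut parts = entry.splitn(3, ':');
        match (parts.next(), parts.next(), parts.next()) {
            (Some(method), Some(transport), Some(address))
                if !method.is_empty() && !transport.is_empty() && !address.is_empty() =>
            {
                routes.insert(
                    method.to_string(),
                    Route {
                        transport: transport.to_string(),
                        address: address.to_string(),
                    },
                );
            }
            _ => return Err(format!("malformed route entry: {}", entry)),
        }
    }
    Ok(routes)
}

/// Look up the route for a method, falling back to the "*" entry (assumed catch-all).
pub fn route_for<'a>(routes: &'a HashMap<String, Route>, method: &str) -> Option<&'a Route> {
    routes.get(method).or_else(|| routes.get("*"))
}
```

Under this reading, the new value sends `ListSnapshots`, `LoadSnapshotChunk`, and `CheckTx` to the gRPC endpoint on port 26670, while every other method goes to the socket endpoint on port 26658.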


@@ -414,26 +418,17 @@ ttl-num-blocks = {{=it.platform.drive.tenderdash.mempool.ttlNumBlocks}}
# the network to take and serve state machine snapshots. State sync is not attempted if the node
# has any local state (LastBlockHeight > 0). The node will have a truncated block history,
# starting from the height of the snapshot.
-enable = false
+enable = true

# State sync uses light client verification to verify state. This can be done either through the
# P2P layer or RPC layer. Set this to true to use the P2P layer. If false (default), RPC layer
# will be used.
-use-p2p = false
+use-p2p = true

# If using RPC, at least two addresses need to be provided. They should be compatible with net.Dial,
# for example: "host.example.com:2125"
rpc-servers = ""

# The hash and height of a trusted block. Must be within the trust-period.
trust-height = 0
trust-hash = ""

# The trust period should be set so that Tendermint can detect and gossip misbehavior before
# it is considered expired. For chains based on the Cosmos SDK, one day less than the unbonding
# period should suffice.
trust-period = "168h0m0s"

# Time to spend discovering snapshots before initiating a restore.
discovery-time = "15s"

2 changes: 1 addition & 1 deletion packages/rs-dapi-client/Cargo.toml
@@ -35,7 +35,7 @@ sha2 = { version = "0.10", optional = true }
hex = { version = "0.4.3", optional = true }
lru = { version = "0.12.3" }
serde = { version = "1.0.197", optional = true, features = ["derive"] }
-serde_json = { version = "1.0.120", optional = true }
+serde_json = { version = "1.0", optional = true }
chrono = { version = "0.4.38", features = ["serde"] }

[dev-dependencies]
6 changes: 6 additions & 0 deletions packages/rs-drive-abci/.env.local
@@ -12,6 +12,12 @@ ABCI_LOG_STDOUT_FORMAT=pretty
ABCI_LOG_STDOUT_COLOR=true

DB_PATH=/tmp/db

CHECKPOINTS_PATH=${DB_PATH}/checkpoints

💡 Codebase verification

⚠️ Critical: Production environments configured to use temporary storage

The codebase has a proper default production path (/var/lib/dash-platform/data), but it's being overridden by environment files to use temporary storage:

  • .env.mainnet: DB_PATH=/tmp/db
  • .env.testnet: DB_PATH=/tmp/db

This configuration will lead to data loss on system reboot. Update the environment files to use persistent storage paths instead of /tmp/db.

🔗 Analysis chain

Verify database paths for production environments.

Both CHECKPOINTS_PATH and GROVEDB_LATEST_FILE inherit from DB_PATH which is set to /tmp/db. While this is fine for local development, using /tmp in production could lead to data loss as it's typically cleared on system reboot.

Run this script to check if there are any production environment files with similar configuration:

Also applies to: 19-19

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for production environment files and their DB_PATH configuration
# to ensure they don't use temporary directories.

echo "Searching for environment files:"
fd -t f "^\.env\.(production|prod)" packages/rs-drive-abci/

echo -e "\nChecking DB_PATH in found files:"
rg "^DB_PATH=" packages/rs-drive-abci/.env.*

Length of output: 702


Script:

#!/bin/bash
# Check for deployment documentation and configuration
echo "Searching for deployment-related files and documentation:"
fd -t f "^(deploy|docker|k8s|kubernetes|README)" packages/rs-drive-abci/

echo -e "\nChecking content of found files for database path configuration:"
rg -i "(\bdb[_\s-]path|database[_\s-]path|data[_\s-]dir)" packages/rs-drive-abci/

Length of output: 2638
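
The concern in this comment is that `CHECKPOINTS_PATH` and `GROVEDB_LATEST_FILE` are derived from `DB_PATH`, so a temporary `DB_PATH` makes both ephemeral. A minimal Rust sketch of a startup check along those lines follows. The functions `is_ephemeral_path` and `expand_db_path` are hypothetical helpers, not part of drive-abci, and the list of ephemeral prefixes is an assumption.

```rust
use std::path::Path;

/// Returns true when `path` sits under a directory that is commonly cleared on reboot.
/// The prefix list is an assumption for illustration; adjust it per deployment.
pub fn is_ephemeral_path(path: &str) -> bool {
    const EPHEMERAL_PREFIXES: [&str; 2] = ["/tmp", "/dev/shm"];
    let path = Path::new(path);
    // `Path::starts_with` compares whole components, so "/tmpfoo" does not match "/tmp".
    EPHEMERAL_PREFIXES
        .iter()
        .any(|prefix| path.starts_with(prefix))
}

/// Expand a `${DB_PATH}` reference the way the .env files in this PR use it.
/// Only this one variable is handled; this is not a general env expander.
pub fn expand_db_path(template: &str, db_path: &str) -> String {
    template.replace("${DB_PATH}", db_path)
}
```

With `DB_PATH=/tmp/db`, expanding `${DB_PATH}/checkpoints` gives a path that this check flags, which is the situation the comment warns about for non-local environments.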


# GroveDB database file
GROVEDB_LATEST_FILE=${DB_PATH}/latest_state

REJECTIONS_PATH=/tmp/rejected

# Cache size for Data Contracts
4 changes: 3 additions & 1 deletion packages/rs-drive-abci/Cargo.toml
@@ -49,8 +49,10 @@ tracing-subscriber = { version = "0.3.16", default-features = false, features =
"registry",
"tracing-log",
], optional = false }
-tenderdash-abci = { git = "https://github.com/dashpay/rs-tenderdash-abci", version = "1.2.1", tag = "v1.2.1+1.3.0", features = [
+tenderdash-abci = { git = "https://github.com/dashpay/rs-tenderdash-abci", rev = "b55bed9f574b68f2b6c96cbc80da41072056781d", features = [
"crypto",
"grpc",
"serde",
] }
lazy_static = "1.4.0"
itertools = { version = "0.13" }
43 changes: 41 additions & 2 deletions packages/rs-drive-abci/src/abci/app/check_tx.rs
@@ -1,7 +1,8 @@
-use crate::abci::app::PlatformApplication;
+use crate::abci::app::{PlatformApplication, SnapshotManagerApplication};
use crate::abci::handler;
use crate::error::Error;
use crate::platform_types::platform::Platform;
use crate::platform_types::snapshot::SnapshotManager;
use crate::rpc::core::CoreRPCLike;
use crate::utils::spawn_blocking_task_with_name_if_supported;
use async_trait::async_trait;
@@ -22,6 +23,8 @@ where
/// Platform
platform: Arc<Platform<C>>,
core_rpc: Arc<C>,
/// Snapshot manager
snapshot_manager: SnapshotManager,
}

impl<C> PlatformApplication<C> for CheckTxAbciApplication<C>
@@ -33,13 +36,31 @@ where
}
}

impl<C> SnapshotManagerApplication for CheckTxAbciApplication<C>
where
C: CoreRPCLike + Send + Sync + 'static,
{
fn snapshot_manager(&self) -> &SnapshotManager {
&self.snapshot_manager
}
}

impl<C> CheckTxAbciApplication<C>
where
C: CoreRPCLike + Send + Sync + 'static,
{
/// Create new ABCI app
pub fn new(platform: Arc<Platform<C>>, core_rpc: Arc<C>) -> Self {
-Self { platform, core_rpc }
+let snapshot_manager = SnapshotManager::new(
+platform.config.state_sync_config.checkpoints_path.clone(),
+platform.config.state_sync_config.max_num_snapshots,
+platform.config.state_sync_config.snapshots_frequency,
+);
+Self {
+platform,
+core_rpc,
+snapshot_manager,
+}
}
}
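
The constructor above passes `max_num_snapshots` and `snapshots_frequency` to `SnapshotManager::new`, but the manager's internals are not part of this diff. The sketch below shows one plausible way these two settings could interact: a snapshot every N blocks, with the oldest evicted once the cap is exceeded. `SketchSnapshotManager` and its policy are assumptions for illustration and may differ from the real `SnapshotManager`.

```rust
/// Hypothetical stand-in; the real `SnapshotManager` internals are not shown in this PR page.
pub struct SketchSnapshotManager {
    max_num_snapshots: usize,
    snapshots_frequency: u64,
    /// Heights of retained snapshots, oldest first.
    heights: Vec<u64>,
}

impl SketchSnapshotManager {
    pub fn new(max_num_snapshots: usize, snapshots_frequency: u64) -> Self {
        Self {
            max_num_snapshots,
            snapshots_frequency,
            heights: Vec::new(),
        }
    }

    /// Assumed policy: take a snapshot every `snapshots_frequency` blocks.
    pub fn should_snapshot(&self, height: u64) -> bool {
        self.snapshots_frequency > 0 && height > 0 && height % self.snapshots_frequency == 0
    }

    /// Record a snapshot if one is due; return the heights evicted to respect the cap.
    pub fn on_block(&mut self, height: u64) -> Vec<u64> {
        if !self.should_snapshot(height) {
            return Vec::new();
        }
        self.heights.push(height);
        let excess = self.heights.len().saturating_sub(self.max_num_snapshots);
        // Drop the oldest entries first.
        self.heights.drain(..excess).collect()
    }

    /// Heights currently retained, oldest first.
    pub fn retained(&self) -> &[u64] {
        &self.heights
    }
}
```

Under this assumed policy, a frequency of 3 and a cap of 2 would leave heights 6 and 9 retained after block 9, with height 3 evicted.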

@@ -92,6 +113,24 @@ where
.await
.map_err(|error| tonic::Status::internal(format!("check tx panics: {}", error)))?
}

async fn list_snapshots(
&self,
request: tonic::Request<proto::RequestListSnapshots>,
) -> Result<tonic::Response<proto::ResponseListSnapshots>, tonic::Status> {
handler::list_snapshots(self, request.into_inner())
.map(tonic::Response::new)
.map_err(|e| tonic::Status::internal(format!("list_snapshots failed: {}", e)))
}

async fn load_snapshot_chunk(
&self,
request: tonic::Request<proto::RequestLoadSnapshotChunk>,
) -> Result<tonic::Response<proto::ResponseLoadSnapshotChunk>, tonic::Status> {
handler::load_snapshot_chunk(self, request.into_inner())
.map(tonic::Response::new)
.map_err(|e| tonic::Status::internal(format!("load_snapshot_chunk failed: {}", e)))
}
}
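
The two gRPC methods above share one shape: call a handler that only needs access to the snapshot manager (via the `SnapshotManagerApplication` accessor trait), then map any error into a status with a method-specific prefix. The sketch below reproduces that shape with standard-library stand-ins so it compiles on its own. `Snapshots`, `Status`, `HandlerError`, `HasSnapshots`, and `SketchApp` are hypothetical substitutes for the real `SnapshotManager`, `tonic::Status`, `Error`, `SnapshotManagerApplication`, and `CheckTxAbciApplication`, and the empty-list error is contrived to exercise the error path.

```rust
use std::fmt;

/// Stand-in for the snapshot store held by the application.
pub struct Snapshots {
    pub heights: Vec<u64>,
}

/// Stand-in for `tonic::Status`.
#[derive(Debug, PartialEq)]
pub struct Status {
    pub message: String,
}

/// Stand-in for the crate's error type.
#[derive(Debug)]
pub struct HandlerError(pub String);

impl fmt::Display for HandlerError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

/// Accessor trait, playing the role of `SnapshotManagerApplication`.
pub trait HasSnapshots {
    fn snapshots(&self) -> &Snapshots;
}

/// Handler written once against the trait, so any application type can reuse it.
pub fn list_snapshots<A: HasSnapshots>(app: &A) -> Result<Vec<u64>, HandlerError> {
    let heights = &app.snapshots().heights;
    if heights.is_empty() {
        // Contrived failure, only to demonstrate error mapping.
        return Err(HandlerError("no snapshots available".to_string()));
    }
    Ok(heights.clone())
}

/// Stand-in for the application struct that owns the snapshot store.
pub struct SketchApp {
    snapshots: Snapshots,
}

impl HasSnapshots for SketchApp {
    fn snapshots(&self) -> &Snapshots {
        &self.snapshots
    }
}

impl SketchApp {
    pub fn new(heights: Vec<u64>) -> Self {
        Self {
            snapshots: Snapshots { heights },
        }
    }

    /// Service-layer wrapper: delegate to the handler, then map errors to a status.
    pub fn list_snapshots_rpc(&self) -> Result<Vec<u64>, Status> {
        list_snapshots(self).map_err(|e| Status {
            message: format!("list_snapshots failed: {}", e),
        })
    }
}
```

Keeping the handler generic over the accessor trait is what allows the read-only snapshot calls to live on the check-tx application without duplicating handler logic.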

pub fn error_into_status(error: Error) -> tonic::Status {