Crates.io | zksync_dal |
lib.rs | zksync_dal |
version | 0.1.0 |
source | src |
created_at | 2024-07-12 12:41:47.669515 |
updated_at | 2024-07-12 12:41:47.669515 |
description | ZKsync data access layer |
homepage | https://zksync.io/ |
repository | https://github.com/matter-labs/zksync-era |
id | 1300637 |
size | 1,189,877 |
This crate provides read and write access to the main database (Postgres), which acts as the primary source of truth.
The current schema is managed by sqlx. Schema changes are stored in the migrations directory.
This overview skips prover-related and Ethereum sender-related tables, which are specific to the main node.
miniblocks. Stores L2 block headers. The naming is due to historical reasons.
miniblocks_consensus. Stores L2 block data related to the consensus algorithm used by the decentralized sequencer. Tied one-to-one to L2 blocks (the consensus side of the relation is optional).
l1_batches. Stores L1 batch headers.
commitments. Stores a part of L1 batch commitment data (event queue and bootloader memory commitments). In the future, other commitment-related columns will be moved here from l1_batches.
transactions. Stores all transactions received by the node, both L2 and L1 ones. Transactions in this table are not necessarily included in an L2 block; i.e., the table also serves as a persistent mempool. See the zksync_state crate for more context.
storage_logs. Stores all the VM storage write logs for all transactions, as well as non-transaction writes generated by the bootloader. This is the source of truth for the VM storage; all other VM storage implementations (see the zksync_state crate) are based on it (e.g., by adding persistent or in-memory caching). Used by multiple components, including the Metadata calculator, Commitment generator, and API server (both for reading one-off values like an account balance and as a part of the VM sandbox).
initial_writes. Stores initial write information for each L1 batch, i.e., the enumeration index assigned to each key. Used when generating L1 batch metadata in the Metadata calculator and Commitment generator components, and in the VM sandbox in the API server for fee estimation.
protective_reads. Stores protective read information for each L1 batch, i.e., keys influencing VM execution for the batch that were not modified. Used when generating L1 batch metadata in the Commitment generator.
factory_deps. Stores bytecodes of all deployed L2 contracts.
storage. Obsolete, going to be removed; must not be used in new code.
events. Stores all events (aka logs) emitted by smart contracts during VM execution.
l2_to_l1_logs. Stores L2-to-L1 logs emitted by smart contracts during VM execution.
call_traces. Stores call traces for transactions emitted during VM execution. (Unlike L1 node implementations, Era currently generates call traces proactively for all transactions.)
tokens. Stores all ERC-20 tokens registered in the L1–L2 bridge.
transaction_traces. Obsolete, going to be removed; must not be used in new code.
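To make the role of storage_logs as the source of truth more concrete, the sketch below rebuilds the VM storage state at a given L2 block by replaying write logs in order, with the last write to each key winning. This is only an illustration: the StorageLog struct, its field names, and the replay function are hypothetical simplifications, not the actual zksync_dal types or table columns.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified storage write log.
/// The real `storage_logs` table has more columns; names here are assumptions.
#[derive(Debug, Clone, PartialEq)]
struct StorageLog {
    /// L2 block in which the write happened.
    l2_block: u32,
    /// Position of the write within the block.
    operation_number: u32,
    /// Key of the storage slot.
    hashed_key: [u8; 32],
    /// Value written to the slot.
    value: [u8; 32],
}

/// Replays all writes up to and including `max_block`.
/// Logs are applied in (block, operation) order, so the last write per key wins.
fn replay(logs: &[StorageLog], max_block: u32) -> HashMap<[u8; 32], [u8; 32]> {
    let mut ordered: Vec<&StorageLog> = logs
        .iter()
        .filter(|log| log.l2_block <= max_block)
        .collect();
    ordered.sort_by_key(|log| (log.l2_block, log.operation_number));

    let mut state = HashMap::new();
    for log in ordered {
        state.insert(log.hashed_key, log.value);
    }
    state
}
```

Any cache layered on top of the logs (persistent or in-memory) would be expected to return the same values as such a replay for the block it claims to represent.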
See the snapshots_creator and snapshots_applier crates for an overview of application-level node snapshots.
snapshots. Stores metadata for all snapshots generated by snapshots_creator, such as the L1 batch of the snapshot.
snapshot_recovery. Stores metadata for the snapshot used during node recovery, if any. Currently, this table is expected to have no more than one row.
In addition to foreign key constraints and other constraints manifested directly in the DB schema, the following invariants are expected to be upheld:
If an L2 block header is present in the miniblocks table, it is expected that the DB contains all artifacts associated with the L2 block execution, such as events, l2_to_l1_logs, call_traces, tokens etc. (See State keeper I/O logic for the exact definition of these artifacts.)
If an L1 batch header is present in the l1_batches table, all artifacts associated with the L1 batch execution are also expected in the DB, e.g. initial_writes and protective_reads. (See State keeper I/O logic for the exact definition of these artifacts.)
If the node was recovered from a snapshot, the first L2 block / L1 batch in the DB is determined by the snapshot_recovery table. Otherwise, L2 blocks / L1 batches must start from number 0 (aka genesis).
The address and key fields in the storage_logs table are not null for all blocks executed on the node (i.e., blocks whose header is present in miniblocks). On the other hand, the address and key fields may be null for snapshot storage logs. These fields are needed by some components that post-process L1 batches, such as the Merkle tree and the commitment generator. Both use (address, key) tuples to sort logs in a batch to get a canonical ordering. Since a snapshot is not post-processed in such a way, it is acceptable to skip them for the snapshot logs (and only for them).
Some tips and tricks to make contributing to DAL easier:
Before adding a new query, search the .sqlx directory for identical / equivalent queries. Reuse is almost always better than duplication.
Instrument queries using the instrument tooling. See the instrument module docs for details.
sqlx has compile-time schema checking, but it's not a panacea.
Check query performance by running the query with EXPLAIN / EXPLAIN ANALYZE prefixes against a production-size database.
All DB schema changes are expected to be backward-compatible. That is, old code must be able to function with the new schema. As an example, dropping / renaming columns is not allowed. Instead, a 2-phase migration should be used.
We cannot afford a non-trivial amount of downtime caused by a data migration. That is, if a migration may cause such downtime (e.g., it copies a non-trivial amount of data), it must be organized as a programmatic migration and run in the background of the node (perhaps splitting the work into chunks with a delay between them so that the migration doesn't hog all DB resources).
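The chunking idea can be sketched as follows. This is a minimal illustration under stated assumptions rather than the node's actual migration code: chunk_ranges and run_chunked_migration are hypothetical helpers, rows are assumed to be addressable by a numeric ID range, and a blocking std::thread::sleep stands in for whatever delay mechanism the real background task would use (most likely an async timer).

```rust
use std::ops::RangeInclusive;
use std::thread;
use std::time::Duration;

/// Splits an inclusive ID range into consecutive chunks of at most `chunk_size` IDs.
/// (Hypothetical helper; not part of zksync_dal.)
fn chunk_ranges(ids: RangeInclusive<u64>, chunk_size: u64) -> Vec<RangeInclusive<u64>> {
    assert!(chunk_size > 0, "chunk size must be positive");
    let end = *ids.end();
    let mut chunks = Vec::new();
    let mut lo = *ids.start();
    while lo <= end {
        // Clamp the chunk to the end of the range; saturate to avoid overflow.
        let hi = end.min(lo.saturating_add(chunk_size - 1));
        chunks.push(lo..=hi);
        if hi == u64::MAX {
            break;
        }
        lo = hi + 1;
    }
    chunks
}

/// Runs `migrate_chunk` on each chunk in order, pausing for `delay` between chunks
/// so that the migration does not monopolize the database.
/// Returns the number of processed chunks, or the first error encountered.
fn run_chunked_migration<E>(
    ids: RangeInclusive<u64>,
    chunk_size: u64,
    delay: Duration,
    mut migrate_chunk: impl FnMut(RangeInclusive<u64>) -> Result<(), E>,
) -> Result<usize, E> {
    let chunks = chunk_ranges(ids, chunk_size);
    let total = chunks.len();
    for (i, chunk) in chunks.into_iter().enumerate() {
        migrate_chunk(chunk)?;
        // No need to wait after the last chunk.
        if i + 1 < total {
            thread::sleep(delay);
        }
    }
    Ok(total)
}
```

In such a design, migrate_chunk would typically run one bounded UPDATE or INSERT per chunk in its own transaction, so that an interrupted migration can resume from the last completed chunk.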