# rusty-blockparser

rusty-blockparser is a Bitcoin blockchain parser written in **Rust**.

It allows extraction of various data types (blocks, transactions, scripts, public keys/hashes, balances, ...)
and UTXO dumps from Bitcoin-based blockchains.

##### **Currently Supported Blockchains:**

`Bitcoin`, `Namecoin`, `Litecoin`, `Dogecoin`, `Myriadcoin`, `Unobtanium` and `NoteBlockchain`.

**IMPORTANT:** It assumes a local unpruned copy of the blockchain with intact block index and blk files,
downloaded with [Bitcoin Core](https://github.com/bitcoin/bitcoin) 0.15.1+ or similar clients.
If you are not sure whether your local copy is valid, you can pass `--verify` to validate the chain and block merkle trees.
If something doesn't match, the parser exits.

## Usage
```
Usage: rusty-blockparser [OPTIONS] [COMMAND]

Commands:
  unspentcsvdump  Dumps the unspent outputs to CSV file
  csvdump         Dumps the whole blockchain into CSV files
  simplestats     Shows various Blockchain stats
  balances        Dumps all addresses with non-zero balance to CSV file
  opreturn        Shows embedded OP_RETURN data that is representable as UTF8
  help            Print this message or the help of the given subcommand(s)

Options:
      --verify          Verifies merkle roots and block hashes
  -v...                 Increases verbosity level. Info=0, Debug=1, Trace=2 (default: 0)
  -c, --coin            Specify blockchain coin (default: bitcoin) [possible values: bitcoin, testnet3, namecoin, litecoin, dogecoin, myriadcoin, unobtanium, noteblockchain]
  -d, --blockchain-dir  Sets blockchain directory which contains blk.dat files (default: ~/.bitcoin/blocks)
  -s, --start           Specify starting block for parsing (inclusive)
  -e, --end             Specify last block for parsing (inclusive) (default: all known blocks)
  -h, --help            Print help
  -V, --version         Print version
```
### Example

To make an `unspentcsvdump` of the Bitcoin blockchain, your command would look like this:
```
# ./blockparser unspentcsvdump /path/to/dump/
[6:02:53] INFO - main: Starting rusty-blockparser v0.7.0 ...
[6:02:53] INFO - index: Reading index from ~/.bitcoin/blocks/index ...
[6:02:54] INFO - index: Got longest chain with 639626 blocks ...
[6:02:54] INFO - blkfile: Reading files from ~/.bitcoin/blocks ...
[6:02:54] INFO - parser: Parsing Bitcoin blockchain (range=0..) ...
[6:02:54] INFO - callback: Using `unspentcsvdump` with dump folder: /path/to/dump ...
[6:03:04] INFO - parser: Status: 130885 Blocks processed. (left: 508741, avg: 13088 blocks/sec)
...
[10:28:47] INFO - parser: Status: 639163 Blocks processed. (left: 463, avg: 40 blocks/sec)
[10:28:57] INFO - parser: Status: 639311 Blocks processed. (left: 315, avg: 40 blocks/sec)
[10:29:07] INFO - parser: Status: 639452 Blocks processed. (left: 174, avg: 40 blocks/sec)
[10:29:17] INFO - parser: Status: 639596 Blocks processed. (left: 30, avg: 40 blocks/sec)
[10:29:19] INFO - parser: Done. Processed 639626 blocks in 266.43 minutes. (avg: 40 blocks/sec)
[10:32:01] INFO - callback: Done.
Dumped all 639626 blocks:
        -> transactions: 549390991
        -> inputs:       1347165535
        -> outputs:      1359449320
[10:32:01] INFO - main: Fin.
```

## Installing

This tool should run on Windows, OS X and Linux.
All you need is `rust` and `cargo`.

### Latest Release

You can download the latest release from crates.io:
```bash
cargo install rusty-blockparser
```

### Build from source

```bash
git clone https://github.com/gcarq/rusty-blockparser.git
cd rusty-blockparser
cargo build --release
cargo test --release
./target/release/rusty-blockparser --help
```

It is important to build with `--release`; a debug build parses far more slowly.

*Tested on Gentoo Linux with rust-stable 1.44.1*

## Supported Transaction Types

Bitcoin and Bitcoin Testnet transactions are parsed using [rust-bitcoin](https://github.com/rust-bitcoin/rust-bitcoin);
this includes transactions of type P2SH, P2PKH, P2PK, P2WSH, P2WPKH, P2TR, OP_RETURN and SegWit.

Bitcoin forks (e.g. Dogecoin, Litecoin, ...)
are evaluated via a custom script implementation which includes P2PK, [P2PKH](https://en.bitcoin.it/wiki/Transaction#Pay-to-PubkeyHash), [P2SH](https://github.com/bitcoin/bips/blob/master/bip-0016.mediawiki) and some non-standard transactions.

## Memory Usage

The required memory depends on the callback used:

* simplestats: ~100MB
* csvdump: ~100MB
* unspentcsvdump: ~18GB
* balances: ~18GB

NOTE: These values were measured while parsing up to block height 639631 (17.07.2020).

## Callbacks

Callbacks are built on top of the core parser. They can be implemented to extract specific types of information.

* `balances`: dumps all addresses with a non-zero balance.
    The csv file is in the following format:
    ```
    balances.csv
    address ; balance
    ```

* `unspentcsvdump`: dumps all UTXOs along with the address balance.
    The csv file is in the following format:
    ```
    unspent.csv
    txid ; indexOut ; height ; value ; address
    ```
    NOTE: The total size of the csv dump is at least 8 GiB (height 635000).

* `opreturn`: shows transactions with embedded OP_RETURN data that is representable as UTF8.

* `csvdump`: dumps all parsed data as CSV files into the specified `folder`. See [Usage](#usage) for an example. I chose CSV dumps instead of an active db-connection because `LOAD DATA INFILE` is the most performant way for bulk inserts.
    The files are in the following format:
    ```
    blocks.csv
    block_hash ; height ; version ; blocksize ; hashPrev ; hashMerkleRoot ; nTime ; nBits ; nNonce
    ```
    ```
    transactions.csv
    txid ; hashBlock ; version ; lockTime
    ```
    ```
    tx_in.csv
    txid ; hashPrevOut ; indexPrevOut ; scriptSig ; sequence
    ```
    ```
    tx_out.csv
    txid ; indexOut ; height ; value ; scriptPubKey ; address
    ```
    If it is unclear what some of these fields are, see the [block](https://en.bitcoin.it/wiki/Protocol_documentation#block) and [transaction](https://en.bitcoin.it/wiki/Protocol_documentation#tx) specifications.
    If you want to insert the files into MySQL, see [sql/schema.sql](sql/schema.sql).
    It contains all table structures and SQL statements for bulk inserting. Also see [sql/views.sql](sql/views.sql) for some query examples.

    NOTE: The total size of the csv dump is at least 731 GiB (height 635000).

* `simplestats`: prints some blockchain statistics like block count, transaction count, avg transactions per block, largest transaction, transaction types etc.

You can also define custom callbacks. A callback gets called at startup, on each block and at the end. See [src/callbacks/mod.rs](src/callbacks/mod.rs) for more information.

## Contributing

Use the issue tracker to report problems, suggestions and questions. You may also contribute by submitting pull requests.

If you find this project helpful, please consider making a donation:
`1LFidBTeg5joAqjw35ksebiNkVM8azFM1K`

## Customizing the tool for your coin

The tool can easily be customized for your coin. This section outlines the changes that need to be made and is aimed at beginners (with both Rust and blockchains). (This guide was made possible by reviewing the commits made by MerlinMagic2018.) The example coin name used throughout is NoCoinium.

* The main change is in `src/blockchain/parser/types.rs`.
* Add a new entry `pub struct NoCoinium` above the line `//pub struct Dash`. (The case you use here must be carried through all subsequent references, except where noted.)
* You will then need to add an `impl Coin for NoCoinium`. The easiest way is to copy an existing block, e.g. the one for Bitcoin. The changes you need to make are marked below as comments.

```rust
// The name here should use the same case as defined in the `pub struct` line
impl Coin for NoCoinium {
    fn name(&self) -> String {
        // This is primarily for display. Use the same case as before
        String::from("NoCoinium")
    }
    fn magic(&self) -> u32 {
        // Magic bytes are a string of hex characters that prefix messages in the chain.
        // To find this value, look for the fields pchMessageStart[0-3] in the file chainparams.cpp under CMainParams
        // The value to be used here is 0x + pchMessageStart[3] + pchMessageStart[2] + pchMessageStart[1] + pchMessageStart[0]
        // i.e. string the values together in reverse order.
        0xd9b4bef9
    }
    fn version_id(&self) -> u8 {
        // Version ID is used to identify the address prefix for Base58 encoding of the public address
        // Found this using the Stack Exchange answer - https://bitcoin.stackexchange.com/questions/62781/litecoin-constants-and-prefixes
        // Again in chainparams.cpp under CMainParams, look for base58Prefixes[PUBKEY_ADDRESS]. Convert the decimal value to hex and add it here
        0x00
    }
    fn genesis(&self) -> sha256d::Hash {
        // This is the genesis block hash - get the value from consensus.hashGenesisBlock, again found in chainparams.cpp
        sha256d::Hash::from_str("000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f").unwrap()
    }
    fn default_folder(&self) -> PathBuf {
        // This is the path, relative to the user's home folder, where the block files are found
        // Note the case here. It is not CamelCase, as most coin directories are lower case. However, use the actual folder name
        // from your coin implementation.
        Path::new(".nocoinium").join("blocks")
    }
}
```

* Finally, tie these changes together within `impl FromStr for CoinType` under `match coin`. The string on the left is the value passed as an argument to the program (see the next bullet point) and the name within `from()` is the struct name used above.

```rust
"nocoinium" => Ok(CoinType::from(NoCoinium)),
```

* The next change is in `src/main.rs`. In the fn `parse_args()`, add your coin to the array of coins. The case you use here must match the value you pass when running the parser (using the `-c` argument).
* Finally, add your coin name to the README.md file so others know your coin is supported.

## TODO

* Implement Pay2MultiSig script evaluation
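
## Example: deriving the `magic()` and `version_id()` values

The customization guide above describes two small conversions: reversing the four `pchMessageStart` bytes to obtain the magic value, and converting the decimal `base58Prefixes[PUBKEY_ADDRESS]` value to hex. The following is a minimal sketch of both, not part of rusty-blockparser itself; the helper names `magic_from_message_start` and `version_id_literal` are hypothetical and exist only for illustration.

```rust
/// Builds the value returned by `Coin::magic()` from the four
/// `pchMessageStart` bytes, given in the order they appear in
/// chainparams.cpp (index 0 first).
///
/// Interpreting the bytes as a little-endian u32 is equivalent to
/// concatenating them in reverse order, as the guide describes.
/// (Hypothetical helper, not part of the parser.)
pub fn magic_from_message_start(pch_message_start: [u8; 4]) -> u32 {
    u32::from_le_bytes(pch_message_start)
}

/// Formats a decimal `base58Prefixes[PUBKEY_ADDRESS]` value as the
/// two-digit hex literal to paste into `Coin::version_id()`.
/// (Hypothetical helper, not part of the parser.)
pub fn version_id_literal(pubkey_address_prefix: u8) -> String {
    // `#` adds the `0x` prefix; the width of 4 includes that prefix,
    // so the value is zero-padded to two hex digits.
    format!("{:#04x}", pubkey_address_prefix)
}
```

For Bitcoin's bytes `[0xf9, 0xbe, 0xb4, 0xd9]`, `magic_from_message_start` yields `0xd9b4bef9`, the value used in the `magic()` example above. Likewise, `version_id_literal(0)` gives `0x00`, and a decimal prefix of 48 would become `0x30`.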