| Crates.io | rcp-tools-throttle |
| lib.rs | rcp-tools-throttle |
| version | 0.27.0 |
| created_at | 2025-10-11 21:18:06.837581+00 |
| updated_at | 2026-01-23 15:51:48.933171+00 |
| description | Internal library for RCP tools - resource throttling and rate limiting (not intended for direct use) |
| homepage | |
| repository | https://github.com/wykurz/rcp |
| max_upload_size | |
| id | 1878558 |
| size | 41,518 |
This repo contains tools to efficiently copy, remove and link large filesets, both locally and across remote hosts.

rcp is for copying files; similar to cp but generally MUCH faster when dealing with large filesets.
Supports both local and remote copying using host:/path syntax (similar to scp).
Inspired by tools like dsync(1) and pcp(2).
rrm is for removing large filesets.
rlink allows hard-linking filesets with optional update path; typically used for hard-linking datasets with a delta.
rcmp tool is for comparing filesets.
filegen tool generates sample filesets, useful for testing.
API documentation for the command-line tools is available on docs.rs.
For contributors: the tools above are built on internal library crates.
Design and reference documents are in the docs/ directory.
Basic local copy with a progress bar and a summary at the end:
> rcp <foo> <bar> --progress --summary
Copy while preserving metadata, overwrite/update destination if it already exists:
> rcp <foo> <bar> --preserve --progress --summary --overwrite
Remote copy from one host to another:
> rcp user@host1:/path/to/source user@host2:/path/to/dest --progress --summary
Copies files from host1 to host2. The rcpd process is automatically started on both hosts via SSH.
Copy from remote host to local machine:
> rcp host:/remote/path /local/path --progress --summary
Copy from local machine to remote host and preserve metadata:
> rcp /local/path host:/remote/path --progress --summary --preserve
Remote copy with automatic rcpd deployment:
> rcp /local/data remote-host:/backup --auto-deploy-rcpd --progress
Automatically deploys rcpd to the remote host if not already installed. Useful for dynamic infrastructure, development environments, or when rcpd is not pre-installed on remote hosts.
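The host:/path form used in the remote examples above can be taken apart with plain shell parameter expansion. The sketch below only illustrates the syntax; the helper names remote_host and remote_path are hypothetical, and rcp's own parser may handle more cases than this does.

```shell
# Hypothetical helpers (not part of rcp): split "[user@]host:/path"
# at the first colon using POSIX parameter expansion.
remote_host() {
    printf '%s\n' "${1%%:*}"   # everything before the first ':'
}

remote_path() {
    printf '%s\n' "${1#*:}"    # everything after the first ':'
}

remote_host "user@host1:/path/to/source"   # prints user@host1
remote_path "user@host1:/path/to/source"   # prints /path/to/source
```

This can be handy in wrapper scripts that need to log or validate the host part before calling rcp.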
Log tool output to a file while using progress bar:
> rcp <foo> <bar> --progress --summary > copy.log
The progress bar is sent to stderr while log messages go to stdout. This lets you redirect stdout to a file to preserve the tool output while still viewing the interactive progress bar. This works for all RCP tools.
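The stdout/stderr split can be tried without any RCP tool installed. In the sketch below, fake_tool is a stand-in invented for this example; it only mimics a tool that writes progress to stderr and log output to stdout.

```shell
# fake_tool is a stand-in, not an RCP tool: it writes a progress line to
# stderr and a log line to stdout, mirroring the split described above.
fake_tool() {
    echo "progress: 50%" >&2
    echo "copied 3 files"
}

log_file=$(mktemp)
# Only stdout is redirected; stderr (the "progress bar") still reaches the terminal.
fake_tool > "$log_file"
cat "$log_file"   # prints: copied 3 files
```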
Tilde (~) support
For local paths, a leading ~ or ~/... expands to your local $HOME.
For remote paths, a leading ~/... expands to the remote user’s $HOME (resolved over SSH). Other ~user forms are not supported.
Remote paths may be absolute, start with ~/, or be relative. Relative remote paths are resolved against the local current working directory before being used remotely.
Remove a path recursively:
> rrm <bar> --progress --summary
Hard-link contents of one path to another:
> rlink <foo> <bar> --progress --summary
Roughly equivalent to: cp -p --link <foo> <bar>.
Hard-link contents of <foo> to <baz> if they are identical to <bar>:
> rlink <foo> --update <bar> <baz> --update-exclusive --progress --summary
Using --update-exclusive means that if a file is present in <foo> but not in <bar> it will be ignored.
Roughly equivalent to: rsync -a --link-dest=<foo> <bar> <baz>.
Compare <foo> vs. <bar>:
# differences are printed to stdout by default
> rcmp <foo> <bar> --progress --summary
# use --log to write differences to a file instead
> rcmp <foo> <bar> --progress --summary --log compare.log
# use --quiet to suppress stdout output (exit code only)
> rcmp <foo> <bar> --quiet
# --quiet with --log: silent stdout but differences still written to file
> rcmp <foo> <bar> --quiet --log compare.log
All tools are available via nixpkgs under rcp package name.
The following command will install all the tools on your system:
> nix-env -iA nixpkgs.rcp
All tools are available on crates.io. Individual tools can be installed using cargo install:
> cargo install rcp-tools-rcp
Starting with release v0.10.1, .deb and .rpm packages are available as part of each release.
The repository is configured to build static musl binaries by default via .cargo/config.toml. Simply run cargo build or cargo build --release to produce fully static binaries. To build glibc binaries instead, use cargo build --target x86_64-unknown-linux-gnu.
For development, enter the nix environment (nix develop) to get all required tools including the musl toolchain. Outside nix shell, install the musl target with rustup target add x86_64-unknown-linux-musl and ensure you have musl-tools installed (e.g., apt-get install musl-tools on Ubuntu/Debian).
The copy semantics of the RCP tools differ slightly from those of tools such as cp. The goal is to avoid an ambiguity in the result of a cp operation.
Specifically, the result of cp foo/x bar/x depends on whether bar/x is an existing directory. If it is, the resulting path will be bar/x/x (which is usually undesired); otherwise it will be bar/x.
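The ambiguity is easy to reproduce in a scratch directory. The sketch below uses cp -r because x is a directory here; the directory names bar_empty and bar_existing are made up for the example.

```shell
# Scratch directory so nothing outside it is touched.
tmp=$(mktemp -d)
mkdir -p "$tmp/foo/x" "$tmp/bar_empty" "$tmp/bar_existing/x"
touch "$tmp/foo/x/file"

# Destination bar_empty/x does not exist yet: the copy is named x.
cp -r "$tmp/foo/x" "$tmp/bar_empty/x"

# Destination bar_existing/x is already a directory: the copy lands inside it.
cp -r "$tmp/foo/x" "$tmp/bar_existing/x"

# Lists bar_empty/x/file and bar_existing/x/x/file
find "$tmp/bar_empty" "$tmp/bar_existing" -type f
```

The same command line produces two different layouts depending only on the prior state of the destination.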
To avoid this confusion, RCP tools:
do not overwrite existing data by default (use --overwrite to change)
The following examples illustrate this (these rules apply to both rcp and rlink):
rcp A/B C/D - copy A/B into C/ and name it D; if C/D exists, fail immediately
rcp A/B C/D/ - copy B into D WITHOUT renaming, i.e., the resulting path will be C/D/B; if C/D/B exists, fail immediately
Using rcp it's also possible to copy multiple sources into a single destination, but the destination MUST have a trailing slash (/):
rcp A B C D/ - copy A, B and C into D WITHOUT renaming, i.e., the resulting paths will be D/A, D/B and D/C; if any of these exist, fail immediately

set --ops-throttle to limit the maximum number of operations per second
set --iops-throttle to limit the maximum number of I/O operations per second
--iops-throttle depends on --chunk-size, which is used to calculate I/O operations per file
set --max-open-files to limit the maximum number of open files
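As a rough illustration of how a chunk size can translate into I/O operations per file, the sketch below assumes one operation per chunk, rounded up. That formula is an assumption made for this example, as is the helper name io_ops_for_file; the tools' actual accounting may differ.

```shell
# Hypothetical helper: number of chunk-sized I/O operations for one file,
# ASSUMING one operation per (possibly partial) chunk.
#   $1 = file size in bytes, $2 = chunk size in bytes
io_ops_for_file() {
    echo $(( ($1 + $2 - 1) / $2 ))   # ceiling division
}

io_ops_for_file 10485760 1048576   # 10 MiB file, 1 MiB chunks: prints 10
io_ops_for_file 1048577 1048576    # one byte over a chunk: prints 2
io_ops_for_file 1 1048576          # tiny file: prints 1
```

Under this assumption, a limit of 100 I/O operations per second would allow roughly ten 10 MiB files per second with 1 MiB chunks.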
rcp tools will log non-terminal errors and continue by default
this can be changed with the --fail-early flag

When using remote paths (host:/path syntax), rcp automatically starts rcpd daemons on remote hosts via SSH.
Requirements:
rcpd binary available on remote hosts (see Auto-deployment below for automatic setup)
Auto-deployment:
Starting with v0.22.0, rcp can automatically deploy rcpd to remote hosts using the --auto-deploy-rcpd flag. This eliminates the need to manually install rcpd on each remote host.
# automatic deployment - no manual setup required
> rcp --auto-deploy-rcpd host1:/source host2:/dest --progress
When auto-deployment is enabled:
rcp finds the local rcpd binary (same directory or PATH)
rcp deploys it to ~/.cache/rcp/bin/rcpd-{version} on remote hosts via SSH
Manual deployment is still supported and may be preferred for:
Configuration options:
--port-ranges - restrict TCP data ports to specific ranges (e.g., "8000-8999")
--remote-copy-conn-timeout-sec - connection timeout in seconds (default: 15)
Architecture: The remote copy uses a three-node architecture:
Master (rcp) orchestrates the copy operation
Source rcpd reads files from source host
Destination rcpd writes files to destination host
For detailed network connectivity and troubleshooting information, see docs/remote_copy.md.
Remote copy uses SSH for authentication and TLS 1.3 for encrypted data transfer.
Security Model:
What's Protected:
Performance Option:
For trusted networks where encryption overhead is undesirable, use --no-encryption:
> rcp --no-encryption source:/path dest:/path
Warning: This disables both encryption and authentication on the data path.
Best Practices:
Use --no-encryption only on isolated, trusted networks
For detailed security architecture and threat model, see docs/security.md.
Log messages: written to stdout; verbosity is set with -v/-vv/-vvv for INFO/DEBUG/TRACE, and -q/--quiet disables them
Progress: written to stderr (both ProgressBar and TextUpdates); enabled with -p/--progress, with an optional --progress-type=... override
Summary: written to stdout; enabled with --summary
rcp tools will not overwrite pre-existing data unless used with the --overwrite flag.
For maximum throughput, especially with remote copies over high-speed networks, consider these optimizations.
rcp automatically requests larger TCP socket buffers for high-throughput transfers, but the kernel caps these to system limits. Increase the limits to allow full utilization of high-bandwidth links:
# Check current limits
sysctl net.core.rmem_max net.core.wmem_max
# Increase to 16 MiB (requires root, temporary until reboot)
sudo sysctl -w net.core.rmem_max=16777216
sudo sysctl -w net.core.wmem_max=16777216
# Make permanent (add to /etc/sysctl.d/99-rcp-perf.conf)
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
The default is often insufficient for 10+ Gbps links.
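Large buffers matter because of the bandwidth-delay product: a single TCP connection needs roughly bandwidth × round-trip time of data in flight to keep a link full. A minimal sketch of that arithmetic follows; the helper name bdp_bytes and the round-trip times used are illustrative assumptions, not values taken from rcp.

```shell
# Hypothetical helper: bandwidth-delay product in bytes.
#   $1 = link speed in Gbit/s, $2 = round-trip time in milliseconds
bdp_bytes() {
    # Gbit/s -> bytes/s is * 1e9 / 8 = * 125000000; ms -> s is / 1000
    echo $(( $1 * 125000000 * $2 / 1000 ))
}

bdp_bytes 10 10   # prints 12500000 (about 11.9 MiB)
bdp_bytes 10 1    # prints 1250000
```

For a 10 Gbps link with an assumed 10 ms round trip, the result fits within the 16 MiB limit configured above; longer round trips or faster links would call for more.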
When copying large filesets with many concurrent operations, you may hit the open file limit:
# Check current limit
ulimit -n
# Increase for current session
ulimit -n 65536
# Make permanent (add to /etc/security/limits.conf)
* soft nofile 65536
* hard nofile 65536
rcp automatically queries the system limit and uses --max-open-files to self-throttle, but higher limits allow more parallelism.
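The filegen notes later in this document mention that the other tools default to 80% of the system's open file limit. A quick way to estimate that figure for the current shell is sketched below; the helper name default_max_open_files and the rounding down are assumptions made for this example.

```shell
# Hypothetical helper: 80% of a given open-file limit, rounded down.
default_max_open_files() {
    echo $(( $1 * 80 / 100 ))
}

limit=$(ulimit -n)
if [ "$limit" = "unlimited" ]; then
    echo "no open-file limit set for this shell"
else
    default_max_open_files "$limit"
fi

default_max_open_files 65536   # prints 52428
```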
For very high-speed networks, increase the kernel's packet processing capacity:
sudo sysctl -w net.core.netdev_max_backlog=16384
sudo sysctl -w net.core.netdev_budget=600
Control parallelism with --max-workers:
# Use all CPU cores (default)
rcp --max-workers=0 /source /dest
# Limit to 4 workers (reduce I/O contention)
rcp --max-workers=4 /source /dest
For remote copies, the --remote-copy-buffer-size flag controls the size of data chunks sent over TCP:
# Larger buffers for high-bandwidth links (default: 16 MiB for datacenter)
rcp --remote-copy-buffer-size=32MiB host1:/data host2:/data
# Smaller buffers for constrained memory
rcp --remote-copy-buffer-size=4MiB host1:/data host2:/data
Use --network-profile to optimize for your network type:
# For datacenter/local networks (aggressive settings)
rcp --network-profile=datacenter host1:/data host2:/data
# For internet transfers (conservative settings)
rcp --network-profile=internet host1:/data host2:/data
Control concurrent TCP connections for file transfers (default: 100):
# Increase for many small files on high-bandwidth links
rcp --max-connections=200 host1:/many-small-files host2:/dest
# Decrease to reduce resource usage
rcp --max-connections=16 host1:/data host2:/dest
When rcp starts, it logs the actual buffer sizes achieved (visible with -v). If the actual sizes are much smaller than requested, increase your system's rmem_max/wmem_max.
# Network interface drops
ip -s link show eth0 | grep -A1 RX | grep dropped
# TCP retransmissions (high values indicate network congestion)
ss -ti | grep retrans
For optimal performance on high-speed networks:
Increase rmem_max/wmem_max to 16+ MiB
Increase ulimit -n if copying many files
Use --network-profile=datacenter for local/datacenter networks
Use --progress to monitor throughput in real-time
Check -v output to verify buffer sizes and connection setup

The filegen tool generates random test data, which is CPU-intensive. Unlike other rcp tools that are typically I/O-bound, filegen's bottleneck is often the CPU generating random bytes.
Default behavior: filegen defaults --max-open-files to the number of physical CPU cores, rather than 80% of the system's open file limit used by other tools. This matches concurrency to compute capacity, avoiding excessive parallelism that would cause CPU contention.
Tuning for your workload:
# Use default (physical cores) - optimal for fast storage
filegen /tmp 3,2 10 1M --progress
# Increase for slow storage where I/O latency dominates
filegen /tmp 3,2 10 1M --max-open-files=64 --progress
# No limit (unlimited concurrency)
filegen /tmp 3,2 10 1M --max-open-files=0 --progress
rcp supports several profiling and debugging options.
Produces JSON trace files viewable in Perfetto UI or chrome://tracing.
# Profile a local copy
rcp --chrome-trace=/tmp/trace /source /dest
# Profile a remote copy (traces produced on all hosts)
rcp --chrome-trace=/tmp/trace host1:/path host2:/path
Output files are named: {prefix}-{identifier}-{hostname}-{pid}-{timestamp}.json
Example output:
/tmp/trace-rcp-master-myhost-12345-2025-01-15T10:30:45.json
/tmp/trace-rcpd-source-host1-23456-2025-01-15T10:30:46.json
/tmp/trace-rcpd-destination-host2-34567-2025-01-15T10:30:46.json
View traces by opening https://ui.perfetto.dev and dragging the JSON file into the browser.
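The naming pattern for trace files can be reproduced in a script, for example to predict which files to collect after a run. The helper trace_file_name below is hypothetical and simply joins the five documented parts with dashes.

```shell
# Hypothetical helper: assemble a trace file name from its parts, following
# the {prefix}-{identifier}-{hostname}-{pid}-{timestamp}.json pattern.
trace_file_name() {
    printf '%s-%s-%s-%s-%s.json\n' "$1" "$2" "$3" "$4" "$5"
}

trace_file_name /tmp/trace rcp-master myhost 12345 2025-01-15T10:30:45
# prints /tmp/trace-rcp-master-myhost-12345-2025-01-15T10:30:45.json
```

Because every file name starts with the prefix passed to --chrome-trace, a glob such as /tmp/trace-*.json should match all traces written on a given host.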
Produces folded stack files convertible to SVG flamegraphs using inferno.
# Profile and generate flamegraph data
rcp --flamegraph=/tmp/flame /source /dest
# Convert to SVG (requires: cargo install inferno)
cat /tmp/flame-*.folded | inferno-flamegraph > flamegraph.svg
# Or use inferno-flamechart to preserve chronological order
cat /tmp/flame-*.folded | inferno-flamechart > flamechart.svg
Output files are named: {prefix}-{identifier}-{hostname}-{pid}-{timestamp}.folded
Control which spans are captured with --profile-level (default: trace):
# Capture only info-level and above spans
rcp --chrome-trace=/tmp/trace --profile-level=info /source /dest
Only spans from rcp crates are captured (not tokio internals).
Enable tokio-console for real-time async task inspection:
# Start rcp with tokio-console enabled
rcp --tokio-console /source /dest
# Or specify a custom port
rcp --tokio-console --tokio-console-port=6670 /source /dest
# Connect with tokio-console CLI
tokio-console http://127.0.0.1:6669
Trace events are retained for 60s by default. This can be modified with RCP_TOKIO_TRACING_CONSOLE_RETENTION_SECONDS=120.
All profiling options can be used together:
rcp --chrome-trace=/tmp/trace --flamegraph=/tmp/flame --tokio-console /source /dest