| Crates.io | s3find |
| lib.rs | s3find |
| version | 0.16.0 |
| created_at | 2018-08-16 21:23:04.50192+00 |
| updated_at | 2026-01-10 22:18:47.014331+00 |
| description | A command line utility to walk an Amazon S3 hierarchy. s3find is an analog of find for Amazon S3. |
| homepage | https://github.com/AnderEnder/s3find-rs |
| repository | https://github.com/AnderEnder/s3find-rs |
| max_upload_size | |
| id | 79836 |
| size | 511,720 |
A powerful command line utility to walk an Amazon S3 hierarchy. Think of it as the find command but specifically designed for Amazon S3.
The GitHub Releases page provides ready-to-use binaries for both Intel-based and ARM-based machines, so you can run s3find natively on hardware such as Apple M1/M2/M3/M4, AWS Graviton, and Raspberry Pi.
Requirements: Rust and Cargo
# Build
cargo build --release
# Install from local source
cargo install --path .
# Install latest from git
cargo install --git https://github.com/AnderEnder/s3find-rs
# Install from crates.io
cargo install s3find
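# Verify the installation (prints the installed version)
s3find --version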
s3find [OPTIONS] <s3-path> [COMMAND]
Where:
* <s3-path> is formatted as s3://bucket/path
* [OPTIONS] are filters and controls
* [COMMAND] is the action to perform on matched objects

s3find supports multiple AWS authentication methods, in the following priority:
1. Command-line arguments (--aws-access-key and --aws-secret-key)
2. Environment variables (AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY)
3. AWS profile credentials (AWS_PROFILE and AWS_SHARED_CREDENTIALS_FILE)
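For example, credentials can be supplied through environment variables or a named profile instead of command-line arguments (the bucket, key values, and profile name below are placeholders):

# Environment variable credentials
export AWS_ACCESS_KEY_ID='your-access-key'
export AWS_SECRET_ACCESS_KEY='your-secret-key'
s3find 's3://example-bucket/path' ls

# Named profile credentials
AWS_PROFILE=myprofile s3find 's3://example-bucket/path' ls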
s3find supports self-hosted and third-party S3-compatible services such as MinIO, Ceph, and others. Use the following options to connect to these services:
* --endpoint-url <URL> - Custom S3 endpoint URL
* --force-path-style - Use path-style bucket addressing (required for most non-AWS S3 services)

s3find 's3://mybucket/path' \
--endpoint-url 'http://localhost:9000' \
--force-path-style \
--aws-access-key 'minioadmin' \
--aws-secret-key 'minioadmin' \
--aws-region 'us-east-1' \
ls
s3find 's3://mybucket/data' \
--endpoint-url 'https://ceph.example.com' \
--force-path-style \
--aws-access-key 'your-access-key' \
--aws-secret-key 'your-secret-key' \
--name '*.log' \
print
Walk an Amazon S3 path hierarchy
The authorization flow is the following chain:
* use credentials from arguments provided by users
* use environment variable credentials: AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY
* use credentials from an AWS profile file.
Profile can be set via environment variable AWS_PROFILE
Profile file can be set via environment variable AWS_SHARED_CREDENTIALS_FILE
* use AWS instance IAM profile
* use AWS container IAM profile
Usage: s3find [OPTIONS] <path> [COMMAND]
Commands:
exec Exec any shell program with every key
print Extended print with detail information
delete Delete matched keys
download Download matched keys
copy Copy matched keys to an S3 destination
move Move matched keys to an S3 destination
ls Print the list of matched keys
lstags Print the list of matched keys with tags
tags Set the tags (overwrite) for the matched keys
public Make the matched keys publicly available (read-only)
restore Restore objects from Glacier and Deep Archive storage
change-storage Change storage class of matched objects and move objects to Glacier or Deep Archive
nothing Do not do anything with keys, do not print them as well
help Print this message or the help of the given subcommand(s)
Arguments:
<path>
S3 path to walk through. It should be s3://bucket/path
Options:
--aws-access-key <aws-access-key>
AWS access key. Required only for credential-based authentication
--aws-secret-key <aws-secret-key>
AWS secret key from the AWS credential pair. Required only for credential-based authentication
--aws-region <aws-region>
[default: us-east-1]
--name <pattern>
Glob pattern for match, can be multiple
--iname <ipattern>
Case-insensitive glob pattern for match, can be multiple
--regex <rpattern>
Regex pattern for match, can be multiple
--storage-class <storage-class>
Object storage class for match
Valid values are:
DEEP_ARCHIVE
EXPRESS_ONEZONE
GLACIER
GLACIER_IR
INTELLIGENT_TIERING
ONEZONE_IA
OUTPOSTS
REDUCED_REDUNDANCY
SNOW
STANDARD
STANDARD_IA
Unknown values are also supported
--mtime <time>
Modification time for match, a time period:
-5d - for period from now-5d to now
+5d - for period before now-5d
Possible time units are as follows:
s - seconds
m - minutes
h - hours
d - days
w - weeks
Can be multiple, but the periods should be overlapping
--bytes-size <bytes-size>
File size for match:
5k - exact match 5k,
+5k - bigger than 5k,
-5k - smaller than 5k,
Possible file size units are as follows:
k - kilobytes (1024 bytes)
M - megabytes (1024 kilobytes)
G - gigabytes (1024 megabytes)
T - terabytes (1024 gigabytes)
P - petabytes (1024 terabytes)
--limit <limit>
Limit result
--number <number>
The number of results to return in each response to a
list operation. The default value is 1000 (the maximum
allowed). Using a lower value may help if an operation
times out.
[default: 1000]
-s, --summarize
Print summary statistic
-h, --help
Print help (see a summary with '-h')
-V, --version
Print version
Use the --name option with glob patterns to match objects:
# Find all objects in a path
s3find 's3://example-bucket/example-path' --name '*' ls
# Find objects with specific extension
s3find 's3://example-bucket/example-path' --name '*.json' ls
The print command supports different output formats:
# Default format
s3find 's3://example-bucket/example-path' --name '*' print
# Text format
s3find 's3://example-bucket/example-path' --name '*' print --format text
# JSON format
s3find 's3://example-bucket/example-path' --name '*' print --format json
# CSV format
s3find 's3://example-bucket/example-path' --name '*' print --format csv
Use --iname for case-insensitive glob pattern matching:
s3find 's3://example-bucket/example-path' --iname '*data*' ls
Use --regex for regular expression pattern matching:
# Find objects ending with a number
s3find 's3://example-bucket/example-path' --regex '[0-9]$' print
# Exact match - files exactly 0 bytes
s3find 's3://example-bucket/example-path' --size 0 print
# Larger than 10 megabytes
s3find 's3://example-bucket/example-path' --size +10M print
# Smaller than 10 kilobytes
s3find 's3://example-bucket/example-path' --size -10k print
# Files modified in the last 10 seconds
s3find 's3://example-bucket/example-path' --mtime -10s print
# Files modified more than 10 minutes ago
s3find 's3://example-bucket/example-path' --mtime +10m print
# Files modified in the last 10 hours
s3find 's3://example-bucket/example-path' --mtime -10h print
Filter objects by their storage class:
# Find objects in STANDARD storage class
s3find 's3://example-bucket/example-path' --storage-class STANDARD print
# Find objects in GLACIER storage class
s3find 's3://example-bucket/example-path' --storage-class GLACIER print
Filter objects based on their S3 tags using --tag for key-value matching or --tag-exists for key presence:
# Find objects with a specific tag key and value
s3find 's3://example-bucket/example-path' --tag 'environment=production' ls
# Find objects with multiple tags (AND logic - all must match)
s3find 's3://example-bucket/example-path' --tag 'environment=production' --tag 'team=data' ls
# Find objects that have a specific tag key (any value)
s3find 's3://example-bucket/example-path' --tag-exists 'owner' ls
# Combine tag filters with other filters
s3find 's3://example-bucket/example-path' --name '*.log' --tag 'retention=long-term' ls
# Control concurrency for tag fetching (default: 50)
s3find 's3://example-bucket/example-path' --tag 'env=prod' --tag-concurrency 100 ls
Required IAM Permissions:
Tag filtering requires the s3:GetObjectTagging permission on the objects being searched.
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"s3:ListBucket",
"s3:GetObjectTagging"
],
"Resource": [
"arn:aws:s3:::example-bucket",
"arn:aws:s3:::example-bucket/*"
]
}
]
}
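If you manage permissions with the AWS CLI, a policy document like the one above can be attached to a user with put-user-policy (the user name, policy name, and file path below are placeholders):

aws iam put-user-policy \
  --user-name s3find-user \
  --policy-name s3find-tagging \
  --policy-document file://s3find-policy.json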
Combine filters to create more specific queries:
# Files between 10 and 20 bytes
s3find 's3://example-bucket/example-path' --size +10 --size -20 print
# Combine different filter types
s3find 's3://example-bucket/example-path' --size +10M --name '*backup*' print
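# Delete matched objects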
s3find 's3://example-bucket/example-path' --name '*.tmp' delete
# List objects
s3find 's3://example-bucket/example-path' --name '*' ls
# List objects with their tags
s3find 's3://example-bucket/example-path' --name '*' lstags
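# Execute a shell command for each matched key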
s3find 's3://example-bucket/example-path' --name '*' exec 'echo {}'
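# Download matched keys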
s3find 's3://example-bucket/example-path' --name '*.pdf' download
# Copy files to another location
s3find 's3://example-bucket/example-path' --name '*.dat' copy -f 's3://example-bucket/example-path2'
# Move files to another location
s3find 's3://example-bucket/example-path' --name '*.dat' move -f 's3://example-bucket/example-path2'
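# Set (overwrite) tags on matched keys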
s3find 's3://example-bucket/example-path' --name '*archive*' tags 'key:value' 'env:staging'
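# Make matched keys publicly available (read-only)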
s3find 's3://example-bucket/example-path' --name '*.public.html' public
Control the number of results and request behavior:
# Limit to first 10 matching objects
s3find 's3://example-bucket/example-path' --name '*' --limit 10 ls
# Control page size for S3 API requests
s3find 's3://example-bucket/example-path' --name '*' --number 100 ls
Limit how deep s3find descends into the object hierarchy:
# Only objects at the bucket root level (no subdirectories)
s3find 's3://example-bucket/' --maxdepth 0 ls
# Objects up to one subdirectory level deep
s3find 's3://example-bucket/' --maxdepth 1 ls
# Objects up to two levels deep
s3find 's3://example-bucket/data/' --maxdepth 2 ls
The --maxdepth option uses S3's delimiter-based traversal for efficient server-side filtering, avoiding the need to fetch objects beyond the specified depth.
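For comparison, a single delimiter-scoped listing (roughly what this traversal relies on at each level) can be reproduced with the AWS CLI; the bucket and prefix below are placeholders:

aws s3api list-objects-v2 \
  --bucket example-bucket \
  --prefix 'data/' \
  --delimiter '/'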
List all versions of objects in versioned buckets:
# List all versions of all objects
s3find 's3://example-bucket/' --all-versions ls
# List all versions matching a pattern
s3find 's3://example-bucket/' --all-versions --name '*.log' ls
# Print all versions in JSON format
s3find 's3://example-bucket/' --all-versions print --format json
When --all-versions is enabled:
* Version IDs are appended to the displayed keys (e.g., file.txt?versionId=abc123); the underlying S3 key remains unchanged
* The latest version of each object is marked with (latest)
* Delete markers are marked with (delete marker)

Version-aware operations: When using --all-versions, operations work on specific versions:
* delete - Deletes specific object versions (not just the current version)
* copy/move - Copies or moves specific versions to the destination
* download - Downloads specific versions of objects

Note: Delete markers are automatically skipped for operations that don't support them (copy, move, download, tags, restore, change-storage, public) since they have no content.
Note: --all-versions is not compatible with --maxdepth. If both are specified, --all-versions takes precedence and --maxdepth is ignored.
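For example, a version-aware cleanup that removes every version of matched temporary files might look like this (bucket and pattern are placeholders; use with care, since deleted versions cannot be recovered):

s3find 's3://example-bucket/tmp/' --all-versions --name '*.tmp' delete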
Tag filtering requires an individual GetObjectTagging API call for each object that passes the other filters. To optimize performance and cost:
Apply other filters first: Use --name, --mtime, --bytes-size, or --storage-class filters to reduce the number of objects before tag filtering is applied.
Use --limit: When testing or exploring, use --limit to cap the number of objects processed.
Adjust concurrency: Use --tag-concurrency to tune parallel API calls (default: 50). Higher values increase throughput but may cause throttling.
Cost awareness: GetObjectTagging requests incur S3 API charges that vary by region. For large buckets with millions of objects, apply filters to reduce the number of tag fetch operations. Refer to the AWS S3 pricing documentation for current rates.
Example: Optimized tag filtering
# Bad: Fetches tags for ALL objects (expensive for large buckets)
s3find 's3://large-bucket/' --tag 'env=prod' ls
# Good: Apply cheap filters first, then tag filter
s3find 's3://large-bucket/' --name '*.log' --mtime -7d --tag 'env=prod' ls
# Good: Use limit when exploring
s3find 's3://large-bucket/' --tag 'env=prod' --limit 100 ls
When --summarize is enabled, tag fetch statistics are displayed showing success/failure counts.
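For instance, combining tag filtering with the summary output (bucket and tag are placeholders):

s3find 's3://example-bucket/' --tag 'env=prod' --summarize ls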
For more information, see the GitHub repository.