| Crates.io | hdfs-native |
| lib.rs | hdfs-native |
| version | 0.12.2 |
| created_at | 2023-09-11 22:47:53.453538+00 |
| updated_at | 2025-08-17 23:07:44.991215+00 |
| description | Native HDFS client implementation in Rust |
| homepage | https://github.com/Kimahriman/hdfs-native |
| repository | https://github.com/Kimahriman/hdfs-native |
| max_upload_size | |
| id | 970202 |
| size | 811,918 |
hdfs-native is an HDFS client written natively in Rust. It supports nearly all major features of an HDFS client, as well as several key client configuration options, which are listed below.
The sections that follow describe the currently supported features; features that are unsupported today may be added in the future.
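For example, a basic write-then-read round trip looks roughly like the following. This is a minimal sketch, not a definitive example: the method names (Client::new, create, write, read, read_range, file_length) reflect the crate's documented public API, but the URL and path are placeholders, and it assumes the tokio and bytes crates as dependencies, so check the docs for the exact signatures in your version.

```rust
use bytes::Bytes;
use hdfs_native::{Client, WriteOptions};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Connect to a specific NameNode; Client::default() would instead
    // resolve the address from fs.defaultFS (see the config section below).
    let client = Client::new("hdfs://localhost:9000")?;

    // Write a small file.
    let mut writer = client.create("/tmp/example.txt", WriteOptions::default()).await?;
    writer.write(Bytes::from("hello from hdfs-native")).await?;
    writer.close().await?;

    // Read the whole file back.
    let reader = client.read("/tmp/example.txt").await?;
    let data = reader.read_range(0, reader.file_length()).await?;
    println!("{}", String::from_utf8_lossy(&data));

    Ok(())
}
```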
The Kerberos (SASL GSSAPI) mechanism is supported through a runtime dynamic link to libgssapi_krb5. This library must be installed separately, but it is likely already installed on your system. If not, you can install it as follows:

Debian-based systems:
apt-get install libgssapi-krb5-2

RHEL-based systems:
yum install krb5-libs

macOS:
brew install krb5

Windows:
Download and install the MIT Kerberos package from https://web.mit.edu/kerberos/dist/, then copy <INSTALL FOLDER>\MIT\Kerberos\bin\gssapi64.dll to a folder in %PATH% and rename it to gssapi_krb5.dll
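Once the library is available, no Kerberos-specific client code is needed: with a valid ticket (for example, obtained via kinit), the client negotiates GSSAPI during its SASL handshake automatically. A hedged sketch, with a placeholder NameNode URL:

```rust
use hdfs_native::Client;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // No credentials are passed explicitly: on a secure cluster the client
    // performs SASL GSSAPI negotiation using the local Kerberos ticket cache.
    let client = Client::new("hdfs://secure-namenode.example.com:8020")?;
    for status in client.list_status("/", false).await? {
        println!("{}", status.path);
    }
    Ok(())
}
```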
The client will attempt to read the Hadoop configs core-site.xml and hdfs-site.xml from the directory specified by $HADOOP_CONF_DIR or, if that is not set, from $HADOOP_HOME/etc/hadoop. The currently supported configs are:
- fs.defaultFS - Client::default() support
- dfs.ha.namenodes - name service support
- dfs.namenode.rpc-address.* - name service support
- dfs.client.failover.resolve-needed.* - DNS based NameNode discovery
- dfs.client.failover.resolver.useFQDN.* - DNS based NameNode discovery
- dfs.client.failover.random.order.* - randomize the order of NameNodes to try
- dfs.client.failover.proxy.provider.* - supports the behavior of the following proxy providers; any other value defaults back to the ConfiguredFailoverProxyProvider behavior:
  - org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
  - org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider
  - org.apache.hadoop.hdfs.server.namenode.ha.RouterObserverReadConfiguredFailoverProxyProvider
- dfs.client.block.write.replace-datanode-on-failure.enable
- dfs.client.block.write.replace-datanode-on-failure.policy
- dfs.client.block.write.replace-datanode-on-failure.best-effort
- fs.viewfs.mounttable.*.link.* - ViewFS links
- fs.viewfs.mounttable.*.linkFallback - ViewFS link fallback

All other settings are generally assumed to be the defaults currently. For instance, security is assumed to be enabled and SASL negotiation is always done, but on insecure clusters this will just do SIMPLE authentication. Any setup that requires other customized Hadoop client configs may not work correctly.
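With those config files in place, Client::default() needs no hard-coded URL. A minimal sketch of a config-driven connection (the file path is a placeholder):

```rust
use hdfs_native::Client;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Resolves the NameNode(s) from fs.defaultFS and the failover settings
    // found in $HADOOP_CONF_DIR (or $HADOOP_HOME/etc/hadoop).
    let client = Client::default();

    let reader = client.read("/tmp/example.txt").await?;
    println!("file is {} bytes", reader.file_length());
    Ok(())
}
```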
To build the crate from source:
cargo build
An object_store implementation for HDFS is provided in the hdfs-native-object-store crate.
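A hedged sketch of how that crate is typically wired up, assuming its HdfsObjectStore::with_url constructor (check the hdfs-native-object-store docs for your version):

```rust
use hdfs_native_object_store::HdfsObjectStore;
use object_store::{path::Path, ObjectStore};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Wrap an HDFS connection in the generic object_store interface.
    let store = HdfsObjectStore::with_url("hdfs://localhost:9000")?;

    // Standard object_store operations are now backed by HDFS.
    let data = store.get(&Path::from("/tmp/example.txt")).await?.bytes().await?;
    println!("read {} bytes", data.len());
    Ok(())
}
```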
The tests are mostly integration tests that use a small Java application in rust/minidfs/ that runs a custom MiniDFSCluster. To run the tests, you need Java, Maven, Hadoop binaries, and Kerberos tools available on your PATH. Any Java version between 8 and 17 should work.
cargo test -p hdfs-native --features integration-test
For the Python bindings, see the Python README.
Some of the benchmarks compare performance against the JVM-based client, which goes through libhdfs via the fs-hdfs3 crate. Because of that, some extra setup is required to run the benchmarks:
export HADOOP_CONF_DIR=$(pwd)/rust/target/test
export CLASSPATH=$(hadoop classpath)
Then you can run the benchmarks with:
cargo bench -p hdfs-native --features benchmark
The benchmark feature is required to expose the minidfs module and the internal erasure coding functions being benchmarked.
The examples use the minidfs module to spin up a simple HDFS cluster to run against; this requires the integration-test feature to enable the minidfs module. Alternatively, to run an example against an existing HDFS cluster, exclude the integration-test feature and make sure your HADOOP_CONF_DIR points to a directory with the HDFS configs for talking to your cluster.
cargo run --example simple --features integration-test
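For reference, the rough shape of such an example, assuming the minidfs module's MiniDfs::with_features helper and its url field (these names are taken from the crate's test utilities and are assumptions; they are only compiled in with the integration-test feature):

```rust
use std::collections::HashSet;

use hdfs_native::minidfs::MiniDfs;
use hdfs_native::Client;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Starts a small MiniDFSCluster; requires Java, Maven, and Hadoop
    // binaries on your PATH, just like the integration tests.
    let dfs = MiniDfs::with_features(&HashSet::new());

    // Connect to the freshly started cluster and do something trivial.
    let client = Client::new(&dfs.url)?;
    client.mkdirs("/example", 0o755, true).await?;
    println!("{:?}", client.list_status("/", false).await?);
    Ok(())
}
```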