fs-hdfs

Crates.iofs-hdfs
lib.rsfs-hdfs
version0.1.12
sourcesrc
created_at2021-09-24 15:47:27.288462
updated_at2023-09-07 10:44:18.509287
descriptionlibhdfs binding library and safe Rust APIs
homepagehttps://github.com/datafusion-contrib/fs-hdfs
repositoryhttps://github.com/datafusion-contrib/fs-hdfs
max_upload_size
id455890
size297,555
(yahoNanJing)

documentation

https://docs.rs/crate/fs-hdfs

README

fs-hdfs

It's based on the version 0.0.4 of http://hyunsik.github.io/hdfs-rs to provide libhdfs binding library and rust APIs which safely wraps libhdfs binding APIs.

Current Status

  • All libhdfs FFI APIs are ported.
  • Safe Rust wrapping APIs to cover most of the libhdfs APIs except those related to zero-copy read.
  • Compared to hdfs-rs, it removes the lifetime in HdfsFs, which will be more friendly for others to depend on.

Documentation

Requirements

  • The C related files are from the branch 2.7.3 of hadoop repository. For rust usage, a few changes are also applied.
  • No need to compile the Hadoop native library by yourself. However, the Hadoop jar dependencies are still required.

Usage

Add this to your Cargo.toml:

[dependencies]
fs-hdfs = "0.1.12"

Build

We need to specify $JAVA_HOME to make Java shared library available for building.

Run

Since our compiled libhdfs is JNI-based implementation, it requires Hadoop-related classes available through CLASSPATH. An example,

export CLASSPATH=$CLASSPATH:`hadoop classpath --glob`

Also, we need to specify the JVM dynamic library path for the application to load the JVM shared library at runtime.

For jdk8 and macOS, it's

export DYLD_LIBRARY_PATH=$JAVA_HOME/jre/lib/server

For jdk11 (or later jdks) and macOS, it's

export DYLD_LIBRARY_PATH=$JAVA_HOME/lib/server

For jdk8 and Centos

export LD_LIBRARY_PATH=$JAVA_HOME/jre/lib/amd64/server

For jdk11 (or later jdks) and Centos

export LD_LIBRARY_PATH=$JAVA_HOME/lib/server

Testing

The test also requires the CLASSPATH and DYLD_LIBRARY_PATH (or LD_LIBRARY_PATH). In case that the java class of org.junit.Assert can't be found. Refine the $CLASSPATH as follows:

export CLASSPATH=$CLASSPATH:`hadoop classpath --glob`:$HADOOP_HOME/share/hadoop/tools/lib/*

Here, $HADOOP_HOME need to be specified and exported.

Then you can run

cargo test

Example

use std::sync::Arc;
use hdfs::hdfs::{get_hdfs_by_full_path, HdfsFs};

let fs: Arc<HdfsFs> = get_hdfs_by_full_path("hdfs://localhost:8020/").ok().unwrap();
match fs.mkdir("/data") {
    Ok(_) => { println!("/data has been created") },
    Err(_)  => { panic!("/data creation has failed") }
};
Commit count: 56

cargo fmt