| Crates.io | lance-hdfs-provider |
| lib.rs | lance-hdfs-provider |
| version | 0.1.0 |
| created_at | 2026-01-08 08:07:19.367676+00 |
| updated_at | 2026-01-08 08:07:19.367676+00 |
| description | HDFS store provider for lance |
| homepage | |
| repository | https://github.com/fMeow/lance-hdfs-provider |
| max_upload_size | |
| id | 2029738 |
| size | 190,425 |
HDFS store provider for Lance built on top of the OpenDAL hdfs service. It lets Lance and LanceDB read and write datasets directly to Hadoop HDFS.
Add the crate to your Cargo.toml:
[dependencies]
lance-hdfs-provider = "0.1.0"
Register the provider, then read or write using HDFS URIs:
use std::sync::Arc;
use lance::{io::ObjectStoreRegistry, session::Session,
dataset::{DEFAULT_INDEX_CACHE_SIZE, DEFAULT_METADATA_CACHE_SIZE}
};
use lance::dataset::builder::DatasetBuilder;
use lance_hdfs_provider::HdfsStoreProvider;
# #[tokio::main(flavor = "current_thread")]
# async fn main() -> Result<(), Box<dyn std::error::Error>> {
let mut registry = ObjectStoreRegistry::default();
registry.insert("hdfs", Arc::new(HdfsStoreProvider));
let session = Arc::new(Session::new(
DEFAULT_INDEX_CACHE_SIZE,
DEFAULT_METADATA_CACHE_SIZE,
Arc::new(registry),
));
let uri = "hdfs://127.0.0.1:9000/sample-dataset";
// Load an existing dataset
let _dataset = DatasetBuilder::from_uri(uri)
.with_session(session.clone())
.load()
.await?;
// Or write a new dataset (see examples)
Ok(())
# }
Use the same registry when creating the LanceDB session:
use std::sync::Arc;
use lance::{io::ObjectStoreRegistry, session::Session,
dataset::{DEFAULT_INDEX_CACHE_SIZE, DEFAULT_METADATA_CACHE_SIZE}
};
use lance_hdfs_provider::HdfsStoreProvider;
# #[tokio::main(flavor = "current_thread")]
# async fn main() -> Result<(), Box<dyn std::error::Error>> {
let mut registry = ObjectStoreRegistry::default();
registry.insert("hdfs", Arc::new(HdfsStoreProvider));
let session = Arc::new(Session::new(
DEFAULT_INDEX_CACHE_SIZE,
DEFAULT_METADATA_CACHE_SIZE,
Arc::new(registry),
));
let db = lancedb::connect("hdfs://127.0.0.1:9000/test-db")
.session(session.clone())
.execute()
.await?;
let table = db.open_table("table1").execute().await?;
Ok(())
# }
The URI may point at a specific namenode (e.g. hdfs://127.0.0.1:9000/path) or a named cluster. Additional connection settings can be passed through StorageOptions; any key supported by OpenDAL's HDFS service can be provided.
Licensed under either of