
rust-sequencefile

Hadoop SequenceFile library for Rust

Documentation

```toml
# Cargo.toml
[dependencies]
sequencefile = "0.1.4"
```

Status

Prototype status! I'm in the process of learning Rust. :) Feedback appreciated.

Unfortunately that means the API will change. If you depend on this crate, please pin an exact version for now (for example, `sequencefile = "=0.1.4"`).

Currently supports reading out your garden-variety sequence file. Handles uncompressed sequencefiles as well as record compressed files (deflate only). The most common type of sequence file, block compressed, isn't supported yet.

There's a lot more to do:

- [X] Varint decoding
  - Block sizes are written with Varints
- [X] Block decompression
- [X] Gzip support
- [X] Bzip2 support
- [X] Sequencefile metadata
- [X] Better error handling
- [X] Tests
- [X] Better error handling2
  - Iterator should return `Result<(ByteString, ByteString)>`
- [ ] More tests
- [ ] Better documentation
- [ ] Snappy support
- [ ] CRC file support
- [ ] 'Writables', e.g. generic deserialization for common Hadoop writable types
  - TODO: "Reflection" of some sort to allow registration of custom types.
- [ ] Writer
- [ ] Gracefully handle version 4 sequencefiles
- [ ] Zero-copy implementation.
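For readers curious about the varint item above, here is a minimal sketch of decoding a Hadoop-style variable-length integer, following the scheme used by Hadoop's `WritableUtils`. It is not this crate's actual code: the names `vint_size` and `read_vlong` are made up for illustration, and only the standard library is used.

```rust
use std::io::{self, Read};

/// Total encoded size in bytes (including the first byte), derived from
/// the first byte interpreted as a signed value.
/// Hypothetical helper, not part of the sequencefile crate.
fn vint_size(first: i8) -> usize {
    if first >= -112 {
        // Small values are stored directly in a single byte.
        1
    } else if first < -120 {
        // Negative multi-byte value.
        (-119 - first as i32) as usize
    } else {
        // Positive multi-byte value.
        (-111 - first as i32) as usize
    }
}

/// Read one variable-length integer from `reader`.
/// Hypothetical helper, not part of the sequencefile crate.
fn read_vlong<R: Read>(reader: &mut R) -> io::Result<i64> {
    let mut buf = [0u8; 1];
    reader.read_exact(&mut buf)?;
    let first = buf[0] as i8;
    let size = vint_size(first);
    if size == 1 {
        return Ok(first as i64);
    }

    // The remaining bytes hold the magnitude, most significant byte first.
    let mut value: i64 = 0;
    for _ in 0..size - 1 {
        reader.read_exact(&mut buf)?;
        value = (value << 8) | buf[0] as i64;
    }

    // Negative numbers are stored as their bitwise complement.
    let negative = first < -120;
    Ok(if negative { !value } else { value })
}
```

A byte slice implements `Read`, so something like `read_vlong(&mut &bytes[..])` is enough to try it out on a buffer.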

Usage

```rust
use std::fs::File;
use std::path::Path;

use byteorder::{BigEndian, ByteOrder};

let path = Path::new("/path/to/seqfile");
let file = File::open(&path).unwrap();

let seqfile = match sequencefile::Reader::new(file) {
    Ok(val) => val,
    Err(err) => panic!("Failed to open sequence file: {}", err),
};

for kv in seqfile {
    println!("{:?}", kv); // Some(([123, 123], [456, 456]))
}

// Until there's automatic deserialization, you can do something like this
// (the loop above consumes the reader, so use this in place of it):
// VERY hacky
let kvs = seqfile.map(|e| e.unwrap()).map(|(key, value)| {
    (BigEndian::read_i64(&key),
     String::from_utf8_lossy(&value[2..value.len()]).to_string())
});

for (k, v) in kvs {
    println!("key: {}, value: {}", k, v);
}
```
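If you'd rather not panic on malformed records, the hacky decoding above can be wrapped in small helpers that return `Option`. This is only a sketch: `decode_long` and `decode_string` are hypothetical names, not part of this crate. It assumes keys are 8-byte big-endian integers (as a Hadoop `LongWritable` would be) and that values carry a fixed-size prefix before the text, which is what the 2-byte skip in the usage example implies.

```rust
use std::convert::TryInto;

/// Decode an 8-byte big-endian signed integer.
/// Returns None if the slice is not exactly 8 bytes long.
/// Hypothetical helper, not part of the sequencefile crate.
fn decode_long(bytes: &[u8]) -> Option<i64> {
    let arr: [u8; 8] = bytes.try_into().ok()?;
    Some(i64::from_be_bytes(arr))
}

/// Skip `prefix_len` bytes, then lossily decode the rest as UTF-8.
/// Returns None if the slice is shorter than the prefix.
/// Hypothetical helper; the right prefix length depends on the value type.
fn decode_string(bytes: &[u8], prefix_len: usize) -> Option<String> {
    let payload = bytes.get(prefix_len..)?;
    Some(String::from_utf8_lossy(payload).into_owned())
}
```

With these, the closure in the usage example could map each `(key, value)` pair to `(decode_long(&key), decode_string(&value, 2))` and let the caller decide what to do with a `None`.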

License

rust-sequencefile is primarily distributed under the terms of both the MIT license and the Apache License (Version 2.0), with portions covered by various BSD-like licenses.

See LICENSE-APACHE and LICENSE-MIT for details.