EBML stands for Extensible Binary Meta Language and is roughly a binary counterpart to XML. It is used by container formats such as WebM and MKV.
IMPORTANT: The iterator contained in this crate is spec-agnostic and requires a specification implementing the `TagSpec` trait to read files. Typically, you would only use this crate to implement a custom specification; most often you would prefer a crate providing an existing specification, like webm-iterable.
KNOWN LIMITATION: This library was not built to handle elements with an "Unknown Data Size" as defined in RFC 8794. As such, it likely will not support streaming applications and will only work on complete datasets.
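To make the limitation concrete, here is a minimal sketch of how an element data size is laid out as a variable-size integer in RFC 8794, and how the reserved "unknown" value (all data bits set to 1) can be recognized. The names `DataSize` and `read_data_size` are hypothetical and exist only for this illustration; they are not part of this crate.

```rust
/// Outcome of decoding an EBML element data size (hypothetical type).
#[derive(Debug, PartialEq)]
enum DataSize {
    Known(u64),
    Unknown,
}

/// Decodes an element data size from the start of `bytes`.
/// Returns the size and the number of bytes consumed, or `None` if the
/// input is too short or the length marker is not in the first byte.
fn read_data_size(bytes: &[u8]) -> Option<(DataSize, usize)> {
    let first = *bytes.first()?;
    if first == 0 {
        // Lengths above 8 bytes are not handled in this sketch.
        return None;
    }
    // The number of leading zero bits plus one gives the total length (1..=8).
    let len = first.leading_zeros() as usize + 1;
    if bytes.len() < len {
        return None;
    }
    // Drop the marker bit from the first byte, then append the remaining bytes.
    let mut value = u64::from(first) & ((1u64 << (8 - len)) - 1);
    for &b in &bytes[1..len] {
        value = (value << 8) | u64::from(b);
    }
    // A size whose data bits are all 1 is reserved to mean "unknown".
    let all_ones = (1u64 << (7 * len)) - 1;
    let size = if value == all_ones {
        DataSize::Unknown
    } else {
        DataSize::Known(value)
    };
    Some((size, len))
}
```

For example, the single byte `0x84` decodes to a known size of 4, while `0xFF` decodes to the unknown size that this library does not handle.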
Cargo.toml
[dependencies]
ebml-iterable = "0.1.0"
The `TagIterator` struct implements Rust's standard `Iterator` trait. This struct can be created with the `new` function on any source that implements the standard `Read` trait. The iterator outputs `SpecTag` objects reflecting the type of tag (based on the defined specification) and the tag data.
Note: The `with_capacity` method can be used to construct a `TagIterator` with a specified default buffer size. This is only useful as a micro-optimization to memory management if you know the maximum tag size of the file you're reading.
The data in the `EbmlTag` property can then be modified as desired (encryption, compression, etc.) and re-encoded using the `TagWriter` struct. This struct can be created with the `new` function on any destination that implements the standard `Write` trait. Once created, this struct can encode EBML using the `write` method on any `EbmlTag` objects, regardless of whether they came from a `TagIterator`. This will emit binary EBML to the underlying `Write` destination.
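As a rough picture of what "binary EBML" means for a single non-master tag, the sketch below writes an element id, a data size, and the raw data to any `Write` destination. It is not the `TagWriter` implementation; `encode_size` and `write_full_tag` are hypothetical helpers that rely only on the element layout from RFC 8794.

```rust
use std::io::{self, Write};

/// Encodes `size` as the shortest EBML variable-size integer (hypothetical helper).
fn encode_size(size: u64) -> io::Result<Vec<u8>> {
    for len in 1..=8usize {
        // The all-ones pattern is reserved for "unknown", so it is excluded.
        let max = (1u64 << (7 * len)) - 2;
        if size <= max {
            // Set the length marker bit just above the 7 * len data bits.
            let marked = size | (1u64 << (7 * len));
            return Ok(marked.to_be_bytes()[8 - len..].to_vec());
        }
    }
    Err(io::Error::new(io::ErrorKind::InvalidInput, "size too large"))
}

/// Writes one complete tag: id, data size, then the data bytes.
fn write_full_tag<W: Write>(dest: &mut W, id: u64, data: &[u8]) -> io::Result<()> {
    // Element ids already carry their own length marker, so they are
    // emitted as-is using the minimal number of big-endian bytes.
    let id_bytes = id.to_be_bytes();
    let first = id_bytes.iter().position(|&b| b != 0).unwrap_or(7);
    dest.write_all(&id_bytes[first..])?;
    dest.write_all(&encode_size(data.len() as u64)?)?;
    dest.write_all(data)
}
```

Calling `write_full_tag(&mut buffer, 0x4282, b"webm")` on a `Vec<u8>` buffer would produce the seven bytes `42 82 84 77 65 62 6D`, which is the EBML `DocType` element with the value "webm".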
`EbmlTag` is an enumeration of three different classifications of tags that this library understands:
- `StartTag(u64)` is a marker for the beginning of a "master" tag as defined in EBML. Master tags are simply containers for other tags. The u64 value is the "id" of the tag.
- `EndTag(u64)` is a marker for the end of a "master" tag. The u64 value is the "id" of the tag.
- `FullTag(DataTag)` is a complete tag that includes both the id and full data of the tag. The DataTag value is described in more detail below.

```rs
pub struct DataTag {
    pub id: u64,
    pub data_type: DataTagType,
}

pub enum DataTagType {
    Master(Vec
```
A DataTag is a simple struct containing a tag id and the tag "type". Tag types indicate the type of data stored within the tag. It is important to note that the type of data contained in the tag directly corresponds to the tag id as defined in whichever specification is in use. Because EBML is binary, the correct specification is required to parse tag content.
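One way to picture how the three `EbmlTag` classifications relate is to flatten a nested tag into a sequence of start, full, and end events. The sketch below redefines the types locally so it stands alone; the `Vec<DataTag>` payload of `Master` and the leaf variants `UnsignedInt` and `Utf8` are assumptions made for illustration, and `flatten` is a hypothetical helper rather than something this library provides.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct DataTag {
    pub id: u64,
    pub data_type: DataTagType,
}

#[derive(Debug, Clone, PartialEq)]
pub enum DataTagType {
    Master(Vec<DataTag>), // assumed payload: the child tags
    UnsignedInt(u64),     // assumed leaf variant
    Utf8(String),         // assumed leaf variant
}

#[derive(Debug, Clone, PartialEq)]
pub enum EbmlTag {
    StartTag(u64),
    EndTag(u64),
    FullTag(DataTag),
}

/// Turns a tag tree into a flat event list: master tags become a
/// StartTag/EndTag pair around their children, leaves become FullTag.
fn flatten(tag: DataTag, out: &mut Vec<EbmlTag>) {
    match tag.data_type {
        DataTagType::Master(children) => {
            out.push(EbmlTag::StartTag(tag.id));
            for child in children {
                flatten(child, out);
            }
            out.push(EbmlTag::EndTag(tag.id));
        }
        leaf => out.push(EbmlTag::FullTag(DataTag {
            id: tag.id,
            data_type: leaf,
        })),
    }
}
```

Flattening an EBML header (id `0x1A45DFA3`) that contains a version and a doc type would yield one `StartTag`, two `FullTag` events, and one `EndTag`, in that order.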
Note: This library made a conscious decision not to parse "Date" elements from EBML due to the lack of built-in support for dates in Rust. Specification implementations should treat Date elements as Binary so that consumers have the option of parsing the unaltered data using their library of choice, if needed.
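For consumers who do need the value, a Date element in RFC 8794 is a signed 8-byte big-endian integer counting nanoseconds since 2001-01-01T00:00:00 UTC. The following is a minimal sketch of converting such binary data to nanoseconds since the Unix epoch using only the standard library; `date_to_unix_nanos` is a hypothetical helper, and zero-length Date elements are deliberately not handled here.

```rust
use std::convert::TryFrom;

/// Seconds between 1970-01-01T00:00:00 UTC and 2001-01-01T00:00:00 UTC.
const EBML_EPOCH_UNIX_SECS: i64 = 978_307_200;

/// Converts the raw bytes of an EBML Date element into nanoseconds since
/// the Unix epoch. Returns `None` unless exactly 8 bytes are supplied.
fn date_to_unix_nanos(raw: &[u8]) -> Option<i128> {
    let bytes = <[u8; 8]>::try_from(raw).ok()?;
    // The stored value is an offset in nanoseconds from the EBML epoch.
    let offset_nanos = i64::from_be_bytes(bytes);
    // Widen to i128 so that adding the epoch difference cannot overflow.
    Some(i128::from(offset_nanos) + i128::from(EBML_EPOCH_UNIX_SECS) * 1_000_000_000)
}
```

The resulting value can then be handed to whichever date library the consumer prefers.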
Any specification based on EBML can use this library to parse or write binary data. Writing needs nothing special, but parsing requires a struct implementing the `TagSpec` trait. This trait currently requires implementation of two methods: `get_tag` and `get_tag_type`. These are used respectively to identify a specific tag instance (based on the tag id) and to identify the type of data stored in the tag. Custom specification implementations can refer to webm-iterable as an example.
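To illustrate the idea of a specification, the sketch below defines a stand-in trait with the two methods named above and implements it for a handful of well-known EBML and Matroska element ids. The trait `DemoTagSpec`, its method signatures, and the `TagDataType` and `DemoTag` enums are all assumptions for this example; the real `TagSpec` trait in this crate may be shaped differently, so consult the crate documentation or webm-iterable before implementing it.

```rust
/// Kinds of data a tag may hold (hypothetical, for illustration only).
#[derive(Debug, Clone, Copy, PartialEq)]
enum TagDataType {
    Master,
    UnsignedInt,
    Utf8,
    Binary,
}

/// Stand-in for a specification trait; the signatures are guesses.
trait DemoTagSpec {
    type SpecType;
    /// Identifies a specific tag from its id.
    fn get_tag(&self, id: u64) -> Self::SpecType;
    /// Reports the type of data stored in that tag.
    fn get_tag_type(&self, tag: &Self::SpecType) -> TagDataType;
}

/// A tiny subset of tags, keyed by their well-known element ids.
#[derive(Debug, Clone, Copy, PartialEq)]
enum DemoTag {
    Ebml,        // 0x1A45DFA3
    EbmlVersion, // 0x4286
    DocType,     // 0x4282
    Segment,     // 0x18538067
    Unknown(u64),
}

struct DemoSpec;

impl DemoTagSpec for DemoSpec {
    type SpecType = DemoTag;

    fn get_tag(&self, id: u64) -> DemoTag {
        match id {
            0x1A45DFA3 => DemoTag::Ebml,
            0x4286 => DemoTag::EbmlVersion,
            0x4282 => DemoTag::DocType,
            0x18538067 => DemoTag::Segment,
            other => DemoTag::Unknown(other),
        }
    }

    fn get_tag_type(&self, tag: &DemoTag) -> TagDataType {
        match tag {
            DemoTag::Ebml | DemoTag::Segment => TagDataType::Master,
            DemoTag::EbmlVersion => TagDataType::UnsignedInt,
            DemoTag::DocType => TagDataType::Utf8,
            // Unrecognized ids are left as opaque bytes in this sketch.
            DemoTag::Unknown(_) => TagDataType::Binary,
        }
    }
}
```

Because the data type is derived purely from the tag id, the same bytes can only be interpreted correctly when the matching specification is supplied.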
Parsing and writing complete files should both work. Streaming isn't supported yet, but may be an option in the future. If something is broken, please create an issue.
Any additional feature requests can also be submitted as an issue.