Dasher

A Directory Hasher

dasher is a small utility intended to allow you to condense the "status" of an entire directory tree down to a single hash digest. Thus, you can tell that something has changed in the tree (but not what has changed) if the hash has changed.

Installation

dasher is easiest to install via Cargo:

$ cargo install dasher

You can, of course, also clone this repository and `cargo build`/`cargo run` the code that way.

Usage

dasher currently does not have a CLI, but it is in the works.

Hashing scheme

The hashing scheme is, in essence, generating a Merkle tree, but with extra steps. Each node in the directory tree has its name hashed, then its contents, then those hashes are concatenated with a separator byte based on the node's type, and that data is hashed again to generate the node's hash. This process is repeated, from the bottom up in the directory tree, until all nodes have been hashed and a final hash for the entire directory can be returned.

For normal files, the node hash is simply: `hash(hash(name) + byte + hash(content))`

For directories, the node hash includes arbitrarily many content hashes, one per sub-node: `hash(hash(name) + byte + hash(content_1) + byte + hash(content_2) + ... + byte + hash(content_n))`

Finally, for symlinks, the link isn't followed. Instead, the content hash is the hash of the path to the file the link points to: `hash(hash(name) + byte + hash(path))`
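The three formulas above can be sketched in a few lines of Rust. This is only an illustration, not dasher's actual code: the `hash` function is a stand-in (64-bit FNV-1a, chosen so the sketch has no dependencies), the separator byte values are made up, and the sketch assumes that each `hash(content_i)` of a directory is the node hash of the corresponding sub-node. All function and constant names here are hypothetical.

```rust
/// Digest type for this sketch (8 bytes, because the stand-in hash is 64-bit).
type Digest = [u8; 8];

/// Stand-in hash: 64-bit FNV-1a. The hash function dasher really uses is not
/// stated in this README; this one just keeps the sketch dependency-free.
fn hash(data: &[u8]) -> Digest {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325; // FNV offset basis
    for &b in data {
        h ^= u64::from(b);
        h = h.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
    }
    h.to_be_bytes()
}

// Hypothetical separator bytes, one per node type. The real values are an
// assumption; the README only says the byte depends on the node's type.
const FILE_SEP: u8 = 0x00;
const DIR_SEP: u8 = 0x01;
const LINK_SEP: u8 = 0x02;

/// hash(hash(name) + byte + hash(content))
fn file_hash(name: &str, content: &[u8]) -> Digest {
    let mut buf = Vec::new();
    buf.extend_from_slice(&hash(name.as_bytes()));
    buf.push(FILE_SEP);
    buf.extend_from_slice(&hash(content));
    hash(&buf)
}

/// hash(hash(name) + byte + hash(path)); the link target is hashed as a
/// path string, the link itself is never followed.
fn symlink_hash(name: &str, target: &str) -> Digest {
    let mut buf = Vec::new();
    buf.extend_from_slice(&hash(name.as_bytes()));
    buf.push(LINK_SEP);
    buf.extend_from_slice(&hash(target.as_bytes()));
    hash(&buf)
}

/// hash(hash(name) + byte + hash(content_1) + ... + byte + hash(content_n)),
/// where `children` holds the already-computed node hashes of the sub-nodes.
fn dir_hash(name: &str, children: &[Digest]) -> Digest {
    let mut buf = Vec::new();
    buf.extend_from_slice(&hash(name.as_bytes()));
    for child in children {
        buf.push(DIR_SEP);
        buf.extend_from_slice(child);
    }
    hash(&buf)
}
```

Because the separator byte differs per node type, a file and a symlink with the same name and the same content bytes still get different node hashes in this sketch.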

Traversal of the directory is not recursive; rather, the process starts with the leaves in the lexicographically "first" directory. For example, in the directory

```
.git
├── COMMIT_EDITMSG
├── config
├── info
│   └── exclude
└── logs
    ├── HEAD
    └── refs
        ├── heads
        │   ├── add-cli
        │   ├── dh-main
        │   └── main
        └── remotes
            └── origin
                └── main
```

the first item to be hashed would be `info/exclude`, followed by the directory hash of the `info` directory. After that, `logs/refs/heads/*` would be hashed, then `logs/refs/remotes/origin/main`, then `logs/refs/remotes/origin` as a directory, then `logs/refs/remotes` as a directory, then finally climbing back up to hash `logs/refs`, since both its sub-directories have been hashed.

In a way, I guess you could consider this as being recursive, since each directory's hash depends on the hashes of everything beneath it, but it is not implemented recursively.
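To make the order concrete, here is a rough sketch of a bottom-up walk that uses an explicit stack instead of recursion. It is not dasher's implementation: the in-memory `Node` type, the function names, and the rule that sub-directories are visited before the files next to them (which is what makes `info/exclude` come out first for the example tree) are all assumptions made for illustration.

```rust
/// Hypothetical in-memory stand-in for a directory tree.
enum Node {
    File(&'static str),
    Dir(&'static str, Vec<Node>),
}

fn name_of(n: &Node) -> &'static str {
    match n {
        Node::File(name) | Node::Dir(name, _) => name,
    }
}

fn is_file(n: &Node) -> bool {
    matches!(n, Node::File(_))
}

/// Returns the paths in the order their node hashes would be computed.
fn hash_order(root: &Node) -> Vec<String> {
    let mut order = Vec::new();
    // Stack entries: (node, path, whether its children were already pushed).
    let mut stack = vec![(root, name_of(root).to_string(), false)];
    while let Some((node, path, expanded)) = stack.pop() {
        match node {
            Node::File(_) => order.push(path),
            // Second visit: everything inside has been handled already.
            Node::Dir(_, _) if expanded => order.push(path),
            Node::Dir(_, children) => {
                // Come back to this directory after its contents.
                stack.push((node, path.clone(), true));
                // Assumed ordering: sub-directories first, then files,
                // each group sorted by name.
                let mut sorted: Vec<&Node> = children.iter().collect();
                sorted.sort_by_key(|c| (is_file(c), name_of(c)));
                // Push in reverse so the first entry is popped first.
                for child in sorted.into_iter().rev() {
                    stack.push((child, format!("{}/{}", path, name_of(child)), false));
                }
            }
        }
    }
    order
}

/// The `.git` example tree from above.
fn example_tree() -> Node {
    use Node::{Dir, File};
    Dir(".git", vec![
        File("COMMIT_EDITMSG"),
        File("config"),
        Dir("info", vec![File("exclude")]),
        Dir("logs", vec![
            File("HEAD"),
            Dir("refs", vec![
                Dir("heads", vec![File("add-cli"), File("dh-main"), File("main")]),
                Dir("remotes", vec![Dir("origin", vec![File("main")])]),
            ]),
        ]),
    ])
}
```

Calling `hash_order(&example_tree())` yields `.git/info/exclude` first and `.git` last, with every directory appearing only after everything inside it. A real implementation would compute the node hash at each point where this sketch records a path.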

License

dasher is licensed under either of

at your option.

Is it any good?

yes.