autograd


Differentiable operations and tensors backed by ndarray.

Installation

```toml
[dependencies]
autograd = { version = "0.9.6", features = ["mkl"] }
```

The `mkl` feature is recommended to speed up gemm operations using Intel MKL.

Features

Lazy, zero-copy tensor evaluation

Computation graphs are created on the fly (a.k.a. define-by-run), but they are not evaluated until `Tensor::eval` or `ag::eval` is called. This mechanism strikes a balance between performance and flexibility.

```rust
extern crate autograd as ag;

let a: ag::Tensor<f32> = ag::ones(&[60]);
let b: ag::Tensor<f32> = ag::ones(&[24]);
let c: ag::Tensor<f32> = ag::reshape(a, &[3, 4, 5]);
let d: ag::Tensor<f32> = ag::reshape(b, &[4, 3, 2]);
let e: ag::Tensor<f32> = ag::tensordot(c, d, &[1, 0], &[0, 1]);
e.eval(&[]); // Getting an `ndarray::Array` here.
```
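Several nodes of the same graph can also be evaluated in one call with `ag::eval`. A minimal sketch, reusing only the calls shown above (the shapes and values are arbitrary):

```rust
extern crate autograd as ag;

// Building these nodes performs no numerical work yet.
let a: ag::Tensor<f32> = ag::ones(&[2, 3]);
let b: ag::Tensor<f32> = ag::ones(&[3, 4]);
let c = ag::matmul(&a, &b);
let d = ag::reshape(&c, &[8]);

// Both requested nodes are materialized in one evaluation pass.
let results = ag::eval(&[&c, &d], &[]);
println!("{:?}", results[0]); // => Some(2x4 array filled with 3.)
```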

Reverse-mode automatic differentiation

Many built-in operations support higher-order derivatives, and you can also define your own differentiable ops backed by ndarray with little effort.

Here we are just computing partial derivatives of `z = 2x^2 + 3y + 1`.

```rust
extern crate autograd as ag;

let ref x = ag::placeholder(&[]);
let ref y = ag::placeholder(&[]);
let ref z = 2.*x*x + 3.*y + 1.;

// dz/dy
let gy = &ag::grad(&[z], &[y])[0];
println!("{:?}", gy.eval(&[])); // => Some(3.)

// dz/dx (requires filling the placeholder `x`)
let gx = &ag::grad(&[z], &[x])[0];
println!("{:?}", gx.eval(&[(x, &ag::ndarray::arr0(2.).into_dyn())])); // => Some(8.)

// ddz/dx (differentiates `z` again)
let ggx = &ag::grad(&[gx], &[x])[0];
println!("{:?}", ggx.eval(&[])); // => Some(4.)
```
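The returned gradients are ordinary tensors, so the same pattern works for non-scalar parameters such as weight matrices. A minimal sketch, assuming the same 0.9 API used throughout this README (the shapes and the zero initialization are arbitrary):

```rust
extern crate autograd as ag;

// A 3x2 parameter matrix and a scalar objective built from it.
let ref w = ag::variable(ag::ndarray_ext::zeros::<f32>(&[3, 2]));
let ref x = ag::placeholder(&[-1, 3]);
let ref y = ag::matmul(x, w); // shape [-1, 2]
let ref loss = ag::reduce_mean(&ag::reduce_mean(y, &[0], false), &[0], false); // scalar

// d(loss)/dw has the same shape as `w`.
let grad_w = &ag::grad(&[loss], &[w])[0];

// As above, the placeholder must be fed before evaluation.
let feed: ag::ndarray::ArrayD<f32> = ag::ndarray::arr2(&[[1., 2., 3.]]).into_dyn();
println!("{:?}", grad_w.eval(&[(x, &feed)])); // => Some(3x2 gradient array)
```

Gradient tensors have the same shape as the parameters they differentiate, which is what the optimizer calls in the next section rely on.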

Neural networks

This crate has various low-level features inspired by tensorflow/theano to train neural networks.

```rust
// This is a softmax regression for MNIST digits classification with Adam.
// This achieves 0.918 test accuracy after 3 epochs (0.11 sec/epoch on 2.7GHz Intel Core i5).
let ref w = ag::variable(ag::ndarray_ext::glorot_uniform::<f32>(&[28*28, 10]));
let ref b = ag::variable(ag::ndarray_ext::zeros::<f32>(&[1, 10]));
let ref x = ag::placeholder(&[-1, 28*28]);
let ref y = ag::placeholder(&[-1]);
let ref z = ag::matmul(x, w) + b;
let ref loss = ag::sparse_softmax_cross_entropy(z, y);
let ref params = [w, b];
let ref grads = ag::grad(&[loss], params);
let ref predictions = ag::argmax(z, -1, true);
let ref accuracy = ag::reduce_mean(&ag::equal(predictions, y), &[0], false);
let ref adam = ag::gradient_descent_ops::Adam::default();
let mut stateful_params = ag::gradient_descent_ops::Adam::vars_with_states(params);
let ref update_ops = adam.compute_updates(&stateful_params, grads);

// -- dataset --
let ((x_train, y_train), (x_test, y_test)) = dataset::load();

// -- training loop --
for epoch in 0..max_epoch {
    ...
    ag::eval(update_ops, &[(x, &x_batch), (y, &y_batch)]);
}
```
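After training, the same graph can be reused for evaluation by feeding the test split into the placeholders. A short sketch that continues the snippet above (it assumes `x_test`/`y_test` returned by the `dataset::load()` helper are `ndarray` arrays, which the original example does not show):

```rust
// Reuse the `accuracy` node defined above on the held-out test split.
let results = ag::eval(&[accuracy], &[(x, &x_test), (y, &y_test)]);
println!("test accuracy: {:?}", results[0]);
```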

ConvNet and LSTM examples can be found in examples.

Hooks

You can register hooks on `ag::Tensor` objects for debugging.

```rust
extern crate autograd as ag;

// `.p()` is a shorthand for `.with(ag::Hook::Print)`.
let a: ag::Tensor<f32> = ag::zeros(&[4, 2]).p();
let b: ag::Tensor<f32> = ag::ones(&[2, 3]);
let c = ag::matmul(a, b);

c.eval(&[]);
// Zeros:
// [[0.0, 0.0],
//  [0.0, 0.0],
//  [0.0, 0.0],
//  [0.0, 0.0]] shape=[4, 2], strides=[2, 1], layout=C (0x1)
```
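For reference, the long form mentioned in the comment above looks like this; a minimal sketch:

```rust
extern crate autograd as ag;

// Equivalent to `ag::zeros(&[4, 2]).p()` above, written without the shorthand.
let a: ag::Tensor<f32> = ag::zeros(&[4, 2]).with(ag::Hook::Print);
a.eval(&[]); // The hook prints the zeros array when this node is evaluated.
```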

Why Rust?

For more, see the documentation or the examples.