program-induction

Build Status crates.io docs.rs

A library for program induction and learning representations.

Implements Bayesian program learning and genetic programming.

Installation

Install rust. In a new or existing project, add the following to your Cargo.toml:

```toml [dependencies] programinduction = "0.2"

many examples also depend on polytype for the tp! and arrow! macros:

polytype = "1.2" ```

The documentation requires a custom HTML header to include KaTeX for math support. This isn't supported by cargo doc, so to build the documentation you may use:

sh cargo rustdoc -- --html-in-header rustdoc-include-katex-header.html

Usage

Specify a probabilistic context-free grammar (PCFG; see pcfg::Grammar) and induce a sentence that matches an example:

```rust

[macro_use]

extern crate polytype; extern crate programinduction;

use programinduction::{ECParams, EC}; use programinduction::pcfg::{taskbysimple_evaluation, Grammar, Rule};

fn simple_evaluator(name: &str, inps: &[i32]) -> i32 { match name { "0" => 0, "1" => 1, "plus" => inps[0] + inps[1], _ => unreachable!(), } }

fn main() { let g = Grammar::new( tp!(EXPR), vec![ Rule::new("0", tp!(EXPR), 1.0), Rule::new("1", tp!(EXPR), 1.0), Rule::new("plus", arrow![tp!(EXPR), tp!(EXPR), tp!(EXPR)], 1.0), ], ); let ecparams = ECParams { frontierlimit: 1, searchlimit: 50, }; // task: the number 4 let task = taskbysimpleevaluation(&simple_evaluator, &4, tp!(EXPR));

let frontiers = g.explore(&ec_params, &[task]);
let sol = &frontiers[0].best_solution().unwrap().0;
println!("{}", g.display(sol));

} ```

The Exploration-Compression (EC) algorithm iteratively learns a better representation by finding common structure in induced programs. We can run the EC algorithm with a polymorphically-typed lambda calculus representation lambda::Language in a Boolean circuit domain:

```rust

[macro_use]

extern crate polytype; extern crate programinduction;

use programinduction::{domains, lambda, ECParams, EC};

fn main() { // circuit DSL let dsl = lambda::Language::uniform(vec![ // NAND takes two bools and returns a bool ("nand", arrow![tp!(bool), tp!(bool), tp!(bool)]), ]); // parameters let lambdaparams = lambda::CompressionParams::default(); let ecparams = ECParams { frontierlimit: 1, searchlimit: 50, }; // randomly sample 250 circuit tasks let tasks = domains::circuits::make_tasks(250);

// one iteration of EC:
let (new_dsl, _solutions) = dsl.ec(&ec_params, &lambda_params, &tasks);
// print the new concepts it invented, based on common structure:
for &(ref expr, _, _) in &new_dsl.invented {
    println!("invented {}", new_dsl.display(expr))
    // one of the inventions was "(λ (nand $0 $0))",
    // which is the common and useful NOT operation!
}

} ```

You may have noted the above use of domains::circuits. Some domains are already implemented for you. Currently, this only consists of circuits and strings. The strings domain uses a rich set of primitives and thus depends on lambda::LispEvaluator. If you find this evaluator to be slow, you may install racket and enable the racket feature in your Cargo.toml:

toml [dependencies.programinduction] version = "0.2" features = ["racket"]

See the documentation for more details.

TODO

(you could be the one who does one of these!)

[ ] PCFG compression is currently only estimating parameters, not actually learning pieces of programs. An adaptor grammar approach seems like a good direction to go, perhaps minus the Bayesian non-parametrics. [ ] impl GP for pcfg::Grammar is not yet complete. [ ] Add more representations [ ] Add more domains [ ] Add task generation function in domains::strings