postgres-parser

This project is the beginnings of using Postgres' SQL Parser (effectively gram.y and the List *raw_parser(const char *str) function) from Rust.

The way this works is by downloading the Postgres source code, patching a few of its Makefiles (see patches/makefiles-12.3.patch), compiling it to LLVM IR, converting that to LLVM bitcode, and linking against it with Rust. LTO (link-time optimization) ensures that the resulting Rust library contains only the bits needed to parse SQL statements, and not the entirety of Postgres.
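To make the C-to-bitcode part of that pipeline concrete, here is a rough sketch of the commands involved for a single source file. It only prints the command lines and does not run them. The `Step` type, the `bitcode_pipeline` function, and the choice of `gram.c` as the input are illustrative assumptions; the real build drives this through Postgres' patched Makefiles across the whole source tree, and the exact flags it uses may differ.

```rust
/// One external command in the pipeline: a program name plus its arguments.
/// (Hypothetical type, used only for this illustration.)
#[derive(Debug, PartialEq)]
struct Step {
    program: &'static str,
    args: Vec<String>,
}

/// Returns the commands that would take one C source file to a bitcode archive.
/// The file names are derived from `source`; nothing is executed here.
fn bitcode_pipeline(source: &str, archive: &str) -> Vec<Step> {
    let stem = source.strip_suffix(".c").unwrap_or(source);
    let ir = format!("{}.ll", stem);
    let bc = format!("{}.bc", stem);
    vec![
        // Compile C to textual LLVM IR.
        Step {
            program: "clang",
            args: vec![
                "-S".to_string(),
                "-emit-llvm".to_string(),
                source.to_string(),
                "-o".to_string(),
                ir.clone(),
            ],
        },
        // Assemble the textual IR into LLVM bitcode.
        Step {
            program: "llvm-as",
            args: vec![ir, "-o".to_string(), bc.clone()],
        },
        // Collect the bitcode into a static archive.
        Step {
            program: "llvm-ar",
            args: vec!["rcs".to_string(), archive.to_string(), bc],
        },
    ]
}

fn main() {
    for step in bitcode_pipeline("gram.c", "libpostgres.a") {
        println!("{} {}", step.program, step.args.join(" "));
    }
}
```

Because the archive holds bitcode rather than native object code, the final link has to go through an LTO-capable clang, which is why the linker settings matter.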

This is accomplished via a custom build.rs program, which shells out to build.sh to perform all the hard work.

At the end of the process we're left with a libpostgres.a archive, which build.rs instructs cargo to link against.
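As a minimal sketch of what such a build script could look like (not the project's actual build.rs), the following shells out to a script and then prints the Cargo directives that request static linking. The helper names, the argument passed to build.sh, and the assumption that the archive lands in `OUT_DIR` are all hypothetical; only the `cargo:rustc-link-search` and `cargo:rustc-link-lib` directives are standard Cargo.

```rust
use std::env;
use std::path::{Path, PathBuf};
use std::process::{self, Command};

/// Builds the directives that tell cargo where to find a static library
/// and to link against it. `lib_name` is the name without `lib` or `.a`.
fn link_directives(lib_dir: &Path, lib_name: &str) -> Vec<String> {
    vec![
        format!("cargo:rustc-link-search=native={}", lib_dir.display()),
        format!("cargo:rustc-link-lib=static={}", lib_name),
    ]
}

/// Runs the shell script that does the heavy lifting.
/// Passing the output directory as an argument is an assumption of this sketch.
fn run_build_script(script: &Path, out_dir: &Path) -> Result<(), String> {
    let status = Command::new("sh")
        .arg(script)
        .arg(out_dir)
        .status()
        .map_err(|e| format!("could not start {}: {}", script.display(), e))?;
    if status.success() {
        Ok(())
    } else {
        Err(format!("{} failed: {}", script.display(), status))
    }
}

fn main() {
    // Cargo sets OUT_DIR for build scripts; the fallback is only for standalone runs.
    let out_dir = PathBuf::from(env::var("OUT_DIR").unwrap_or_else(|_| "target".to_string()));

    // A real build script would treat a missing build.sh as an error;
    // this check just keeps the sketch runnable on its own.
    let script = Path::new("build.sh");
    if script.is_file() {
        if let Err(message) = run_build_script(script, &out_dir) {
            eprintln!("{}", message);
            process::exit(1);
        }
    }

    // Ask cargo to link against libpostgres.a in the output directory.
    for directive in link_directives(&out_dir, "postgres") {
        println!("{}", directive);
    }
}
```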

A few RUSTFLAGS are set in .cargo/config. They tell Rust which linker to use (we don't want to mix and match clang and gcc; we only want clang!) and pass the LTO flags.

Using this Crate

Using this crate is just like using any other. Add it as a dependency to your Cargo.toml:

```toml
[dependencies]
postgres-parser = "0.0.1"
```

Note that any crate that depends on postgres-parser, whether directly or through another crate, needs a custom .cargo/config.

This is necessary to ensure that Rust is using clang proper, and enabling LTO during the build process:

```toml
[target.'cfg(target_os="macos")']
rustflags = ["-C", "linker=clang", "-C", "link-arg=-flto"]

[target.'cfg(target_os="linux")']
rustflags = ["-C", "linker=clang", "-C", "link-arg=-fuse-ld=gold", "-C", "link-arg=-flto"]
```

Additionally, see the System Requirements section below.

Here's a simple example that outputs a SQL parse tree as JSON.

```rust
use postgres_parser::*;

fn main() {
    // The SQL to parse is taken from the first command-line argument.
    let args: Vec<String> = std::env::args().collect();
    let query_string = args.get(1).expect("no arguments");

    // Parse the query; on failure, print the error and exit.
    let parse_list = match parse_query(query_string) {
        Ok(query) => query,
        Err(e) => {
            eprintln!("{:?}", e);
            std::process::exit(1);
        }
    };

    let as_json = serde_json::to_string_pretty(&parse_list).expect("failed to convert to json");
    println!("{}", as_json);
}
```

System Requirements

For an Ubuntu-based Linux system you'll need:

```shell
$ sudo apt-get install clang llvm make curl
$ curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
```

For macOS you'll need:

```shell
$ brew install wget
$ brew install llvm
$ curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
```

As far as Linux goes, so far I've tested this on Ubuntu 18.04 with LLVM 6.0.0, and on Ubuntu 20.04 with LLVM 10.0.0.

You'll also want to make sure the LLVM and clang tools are on your $PATH, especially clang, opt, and llvm-ar.
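If you'd like to check this before kicking off a long build, a small standalone program along these lines can do it. This is only a sketch: the `find_in_dirs` and `missing_tools` helpers are hypothetical names, and the check looks only for a file with the right name in each $PATH directory without verifying that it is executable.

```rust
use std::env;
use std::path::PathBuf;

/// Returns the first `dir/tool` path that exists as a file, if any.
/// Note: this does not check the executable bit.
fn find_in_dirs(tool: &str, dirs: &[PathBuf]) -> Option<PathBuf> {
    dirs.iter()
        .map(|dir| dir.join(tool))
        .find(|candidate| candidate.is_file())
}

/// Returns the subset of `tools` that cannot be found in any of `dirs`.
fn missing_tools<'a>(tools: &[&'a str], dirs: &[PathBuf]) -> Vec<&'a str> {
    tools
        .iter()
        .copied()
        .filter(|tool| find_in_dirs(tool, dirs).is_none())
        .collect()
}

fn main() {
    // Split $PATH into its directories; an unset PATH yields an empty list.
    let dirs: Vec<PathBuf> = match env::var_os("PATH") {
        Some(path) => env::split_paths(&path).collect(),
        None => Vec::new(),
    };

    let missing = missing_tools(&["clang", "opt", "llvm-ar"], &dirs);
    if missing.is_empty() {
        println!("all required LLVM tools were found on PATH");
    } else {
        println!("missing from PATH: {}", missing.join(", "));
    }
}
```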

Building

Build this just like any other Rust binary crate:

```shell
$ cargo build [--release]
```

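This will take a while because, again, the build process downloads the Postgres source code, patches its Makefiles, compiles it to LLVM IR, converts that to LLVM bitcode, and produces the libpostgres.a archive.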

On my relatively new MacBook Pro 16", this process takes about 2.5 minutes the first time.

On my incredibly old Mac Mini, running Ubuntu 16.04 (yikes!), this process takes about 25 minutes. So be patient if you have an older computer.

Subsequent builds (assuming no cargo clean) are able to elide all of the above steps as the final libpostgres.a archive artifact is cached in the target/ directory.
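The caching idea can be sketched as a simple existence check on the archive. This is an illustration under assumptions, not the project's actual logic: the `needs_build` helper is a hypothetical name, and the exact location of libpostgres.a inside target/ is assumed here.

```rust
use std::path::Path;

/// The expensive download/compile steps are only needed when the
/// archive is not already present on disk.
fn needs_build(archive: &Path) -> bool {
    !archive.is_file()
}

fn main() {
    // Assumed location; the real path inside target/ may differ.
    let archive = Path::new("target").join("libpostgres.a");
    if needs_build(&archive) {
        println!("{} not found: running the full build", archive.display());
    } else {
        println!("{} found: skipping the build", archive.display());
    }
}
```

This also shows why `cargo clean` forces the slow path again: it removes target/, and the cached archive with it.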

Known Issues

Please Help!

We'd sincerely appreciate the time and effort you spend cloning this repo and at least trying to cargo test --all. If it doesn't work, or if these instructions are bad, we definitely want to know. We'd like this to be as easy as possible for everyone.

Furthermore, this is v0.0.1. Please feel free to submit bug reports, feature requests, and most especially Pull Requests.

Thanks

Thanks for checking this out. Here's the obligatory GitHub Sponsors link.

If you like what we're doing and where this is going, your sponsorship will keep us motivated.