This crate aims to make it extremely simple to implement UDFs for SQL, in a minimally error-prone fashion.
Basic SQL UDFs consist of three exposed functions:
This wrapper greatly simplifies the process so that you only need to worry about checking arguments and performing the task.
There are also aggregate UDFs, which simply need to register two to three additional functions.
A quick overview of the workflow process is:
Create a new rust project (cargo new --lib my-udf
), add udf
as a
dependency (cd my-udf; cargo add udf
) and change the crate type to a
cdylib
by adding the following to Cargo.toml
:
toml
[lib]
crate-type = ["cdylib"]
Make a struct or enum that will share data between initializing and processing steps (it may be empty). The name of this struct will be the name of your function in SQL, as converted to snake case (adjustable names are planned but not yet available).
BasicUdf
trait on this structAggregateUdf
trait if you want it to be an aggregate function#[udf::register]
to each of these impl
blockscargo build --release
(output will be
target/release/libmy_udf.so
)CREATE FUNCTION ...
This section goes into the details of implementing a UDF with this library, but
it is non-exhaustive. For that, see the documentation, or the udf-examples
directory for well-annotated examples.
The first step is to create a struct (or enum) that will be used to share data between all relevant SQL functions. These include:
init
Called once per result set. Here, you can store const data to your
struct (if applicable)process
Called once per row (or per group for aggregate functions). This
function uses data in the struct and in the current row's arguments clear
Aggregate only, called once per group at the beginning. Reset the
struct as needed.add
Aggregate only, called once per row within a group. Perform needed
calculations and save the data in the struct.remove
Window functions only, called to remove a value from a groupIt is quite possible, especially for simple functions, that there is no data that needs sharing. In this case, just make an empty struct and no allocation will take place.
``rust
/// Function
sum_int` just adds all arguments as integers and needs no shared data
struct SumInt;
/// Function avg
on the other hand may want to save data to perform aggregation
struct Avg {
running_total: f64
}
```
There is a bit of a caveat for functions returning buffers (string & decimal
functions): if there is a possibility that string length exceeds
MYSQL_RESULT_BUFFER_SIZE
(255), then the string to be returned must be
contained within the struct (the process
function will then return a
reference).
rust
/// Generate random lipsum that may be longer than 255 bytes
struct Lipsum {
res: String
}
The next step is to implement the BasicUdf
and optionally AggregateUdf
traits. See the docs for more information.
If you use rust-analyzer with your IDE, it can help you out. Just type
impl BasicUdf for MyStruct {}
and place your cursor between the brackets
- it will offer to autofill the function skeletons.
```rust use udf::prelude::*;
struct SumInt;
impl BasicUdf for SumInt {
type Returns<'a> = Option
fn init<'a>(
cfg: &UdfCfg<Init>,
args: &'a ArgList<'a, Init>
) -> Result<Self, String> {
// ...
}
fn process<'a>(
&'a mut self,
cfg: &UdfCfg<Process>,
args: &ArgList<Process>,
error: Option<NonZeroU8>,
) -> Result<Self::Returns<'a>, ProcessError> {
// ...
}
} ```
Assuming the above has been followed, all that is needed is to produce a C
dynamic library for the project. This can be done by specifying
crate-type = ["cdylib"]
in your Cargo.toml
. After this, compiling with
cargo build --release
will produce a loadable .so
file (located in
target/release
).
Important version note: this crate relies on a feature called generic associated
types (GATs) which are only available on rust >= 1.65. This version only just
became stable (2022-11-03), so be sure to run rustup update
, or use the
nightly toolchain.
If you would like to verify that the correct C-callable functions are present,
you can inspect the dynamic library with nm
.
```sh
$ nm -gC --defined-only target/release/libudfexamples.so 00000000000081b0 T avgcost 0000000000008200 T avgcostadd 00000000000081e0 T avgcostclear 0000000000008190 T avgcostdeinit 0000000000008100 T avgcostinit 0000000000009730 T isconst 0000000000009710 T isconstdeinit 0000000000009680 T isconstinit 0000000000009320 T sqlsequence ... ```
Once compiled, the produced object file needs to be copied to the location of
the plugin_dir
SQL variable - usually, this is /usr/lib/mysql/plugin/
.
Once that has been done, CREATE FUNCTION
can be used in MariaDB/MySql to load
it.
If you require a linux object file but are compiling on a different platform, building in docker is a convenient option:
```sh
target/
directorybash -c
command based on what you want to build.docker run --rm -it \ -v "$(pwd):/build" \ -e CARGO_HOME=/build/.docker-cargo \ rustlang/rust:nightly \ bash -c "cd /build; cargo build -p udf-examples --release" ```
Testing in Docker is highly recommended, so as to avoid disturbing a host SQL installation. See the udf-examples readme for instructions on how to do this.
The udf-examples
crate contains examples of various UDFs, as well as
instructions on how to compile them. See the readme
there.
If you need to log things like warnings during normal use of the function,
eprintln!()
can be used to print to stderr
. This will show up in the SQL
server logs; these can be viewed with e.g. docker logs mariadb_udf_test
if
testing in docker
.
The quickest way to do simple debugging is by using the dbg!(...)
macro (rust
builtin). This also writes to stderr
but prints file & line information and
the value of its argument (prettyprinted), and returns the argument for further
assignment or use.
rust
dbg!(&self);
let arg0 = dbg!(args.get(0).unwrap())
``` [udfexamples/src/avgcost.rs:58] &self = AvgCost { count: 0, totalqty: 0, total_price: 0.0, }
[udfexamples/src/avgcost.rs:60] args.get(0).unwrap() = SqlArg {
value: Int(
Some(
10,
),
),
attribute: "qty",
maybenull: true,
argtype: Cell {
value: INTRESULT,
},
marker: PhantomData