DataFusion

DataFusion is an extensible query planning, optimization, and execution framework, written in Rust, that uses Apache Arrow as its in-memory format.
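
To make the three stages named above (planning, optimization, execution) concrete, here is a deliberately tiny sketch in plain Rust using only the standard library. Everything in it is hypothetical and for illustration only: `Batch`, `Plan`, `optimize`, and `execute` are not DataFusion types or functions, and the `Batch` struct is only a loose stand-in for a columnar Arrow batch. It shows a logical plan being rewritten by one optimizer rule and then evaluated over column-oriented data.

```rust
// Hypothetical toy types for illustration only; this is NOT DataFusion's API.

/// A columnar batch: one Vec per column, loosely analogous to an Arrow record batch.
struct Batch {
    a: Vec<i64>,
    b: Vec<i64>,
}

/// A tiny logical plan: scan every row, or keep rows where `a > min` from an input plan.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan,
    FilterAGreaterThan { min: i64, input: Box<Plan> },
}

/// Optimization: collapse stacked filters into a single filter with the stricter bound.
fn optimize(plan: Plan) -> Plan {
    match plan {
        Plan::Scan => Plan::Scan,
        Plan::FilterAGreaterThan { min, input } => match optimize(*input) {
            Plan::FilterAGreaterThan { min: inner, input } => Plan::FilterAGreaterThan {
                min: min.max(inner),
                input,
            },
            other => Plan::FilterAGreaterThan { min, input: Box::new(other) },
        },
    }
}

/// Execution: evaluate the plan against a batch and return the surviving rows.
fn execute(plan: &Plan, batch: &Batch) -> Vec<(i64, i64)> {
    match plan {
        Plan::Scan => batch.a.iter().copied().zip(batch.b.iter().copied()).collect(),
        Plan::FilterAGreaterThan { min, input } => execute(input, batch)
            .into_iter()
            .filter(|(a, _)| a > min)
            .collect(),
    }
}

fn main() {
    let batch = Batch { a: vec![1, 5, 10], b: vec![10, 20, 30] };
    // Planning: two stacked filters, `a > 2` then `a > 4`.
    let plan = Plan::FilterAGreaterThan {
        min: 4,
        input: Box::new(Plan::FilterAGreaterThan { min: 2, input: Box::new(Plan::Scan) }),
    };
    let optimized = optimize(plan.clone());
    println!("{:?}", optimized); // FilterAGreaterThan { min: 4, input: Scan }
    println!("{:?}", execute(&optimized, &batch)); // [(5, 20), (10, 30)]
}
```

The optimized plan returns the same rows as the original but filters the data once instead of twice. A real engine applies the same idea with many more plan nodes and rewrite rules.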

Features

Use Cases

DataFusion is modular in design, with many extension points. It can be used without modification as an embedded query engine, or it can serve as a foundation for building new systems. Here are some example use cases:

Why DataFusion?

DataFusion Community Extensions

There are a number of community projects that extend DataFusion or provide integrations with other systems.

Language Bindings

Integrations

Known Uses

Here are some of the projects known to use DataFusion:

(if you know of another project, please submit a PR to add a link!)

Example Usage

Please see the example usage to learn how to use DataFusion.

Roadmap

Please see the Roadmap for information about where the project is headed.

Architecture Overview

There is no formal document describing DataFusion's architecture yet, but the following presentations offer a good overview of its components and how they interact.

User Guide

Please see the User Guide for more information about DataFusion.

Contributor Guide

Please see the Contributor Guide for information about contributing to DataFusion.