Documentation: Python - Rust - Node.js | StackOverflow: Python - Rust - Node.js | User Guide | Discord
Polars is a blazingly fast DataFrames library implemented in Rust using Apache Arrow Columnar Format as the memory model.
To learn more, read the User Guide.
```python
>>> import polars as pl
>>> df = pl.DataFrame(
...     {
...         "A": [1, 2, 3, 4, 5],
...         "fruits": ["banana", "banana", "apple", "apple", "banana"],
...         "B": [5, 4, 3, 2, 1],
...         "cars": ["beetle", "audi", "beetle", "beetle", "beetle"],
...     }
... )
>>> (
...     df
...     .sort("fruits")
...     .select(
...         [
...             "fruits",
...             "cars",
...             pl.lit("fruits").alias("literal_string_fruits"),
...             pl.col("B").filter(pl.col("cars") == "beetle").sum(),
...             pl.col("A").filter(pl.col("B") > 2).sum().over("cars").alias("sum_A_by_cars"),  # groups by "cars"
...             pl.col("A").sum().over("fruits").alias("sum_A_by_fruits"),  # groups by "fruits"
...             pl.col("A").reverse().over("fruits").alias("rev_A_by_fruits"),  # groups by "fruits"
...             pl.col("A").sort_by("B").over("fruits").alias("sort_A_by_B_by_fruits"),  # groups by "fruits"
...         ]
...     )
... )
shape: (5, 8)
┌──────────┬──────────┬──────────────┬─────┬─────────────┬─────────────┬─────────────┬─────────────┐
│ fruits   ┆ cars     ┆ literal_stri ┆ B   ┆ sum_A_by_ca ┆ sum_A_by_fr ┆ rev_A_by_fr ┆ sort_A_by_B │
│ ---      ┆ ---      ┆ ng_fruits    ┆ --- ┆ rs          ┆ uits        ┆ uits        ┆ _by_fruits  │
│ str      ┆ str      ┆ ---          ┆ i64 ┆ ---         ┆ ---         ┆ ---         ┆ ---         │
│          ┆          ┆ str          ┆     ┆ i64         ┆ i64         ┆ i64         ┆ i64         │
╞══════════╪══════════╪══════════════╪═════╪═════════════╪═════════════╪═════════════╪═════════════╡
│ "apple"  ┆ "beetle" ┆ "fruits"     ┆ 11  ┆ 4           ┆ 7           ┆ 4           ┆ 4           │
├╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ "apple"  ┆ "beetle" ┆ "fruits"     ┆ 11  ┆ 4           ┆ 7           ┆ 3           ┆ 3           │
├╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ "banana" ┆ "beetle" ┆ "fruits"     ┆ 11  ┆ 4           ┆ 8           ┆ 5           ┆ 5           │
├╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ "banana" ┆ "audi"   ┆ "fruits"     ┆ 11  ┆ 2           ┆ 8           ┆ 2           ┆ 2           │
├╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ "banana" ┆ "beetle" ┆ "fruits"     ┆ 11  ┆ 4           ┆ 8           ┆ 1           ┆ 1           │
└──────────┴──────────┴──────────────┴─────┴─────────────┴─────────────┴─────────────┴─────────────┘
```
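The `# groups by` comments in the query refer to window expressions: `.over("fruits")` evaluates the expression once per group and writes the result back to every row of that group, so the number of rows does not change. As a rough illustration of that behavior only, here is a plain-Python sketch. It is not how Polars is implemented, and the helper name `window_sum` is made up for this example. The values are shown in the original row order, before `.sort("fruits")`.

```python
from collections import defaultdict

def window_sum(values, keys, mask=None):
    """Sum `values` per key and broadcast each group's sum back to its rows.

    `mask` optionally selects which rows contribute to the sums, mimicking a
    `.filter(...)` placed before `.sum().over(...)`.
    """
    if mask is None:
        mask = [True] * len(values)
    totals = defaultdict(int)
    for value, key, keep in zip(values, keys, mask):
        if keep:
            totals[key] += value
    # One output value per input row, in the original row order.
    return [totals[key] for key in keys]

A = [1, 2, 3, 4, 5]
B = [5, 4, 3, 2, 1]
fruits = ["banana", "banana", "apple", "apple", "banana"]
cars = ["beetle", "audi", "beetle", "beetle", "beetle"]

sum_A_by_fruits = window_sum(A, fruits)
sum_A_by_cars = window_sum(A, cars, mask=[b > 2 for b in B])
print(sum_A_by_fruits)  # [8, 8, 7, 7, 8]
print(sum_A_by_cars)    # [4, 2, 4, 4, 4]
```

These per-group sums agree with the `sum_A_by_fruits` and `sum_A_by_cars` columns of the table once the rows are sorted by `fruits`.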
Polars is very fast. In fact, it is one of the best performing solutions available. See the results in h2oai's db-benchmark.
Install the latest Polars version with one of the following commands; the bracketed extras pull in optional dependencies:
```
$ pip3 install -U 'polars'
$ pip3 install -U 'polars[all]'
$ pip3 install -U 'polars[numpy]'
$ pip3 install -U 'polars[pyarrow]'
$ pip3 install -U 'polars[pyarrow,fsspec]'
$ pip3 install -U 'polars[connectorx]'
$ pip3 install -U 'polars[xlsx2csv]'
$ pip3 install -U 'polars[timezone]'
```
Releases happen quite often (weekly / every few days) at the moment, so updating polars regularly to get the latest bugfixes / features might not be a bad idea.
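Because releases are frequent, it can help to check which version is actually installed before upgrading. A minimal sketch using only the standard library's `importlib.metadata`; the helper name `installed_version` is an illustration and not part of Polars:

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(dist_name="polars"):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None

found = installed_version("polars")
print(f"polars {found}" if found else "polars is not installed")
```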
You can take the latest release from crates.io, or, if you want to use the latest features / performance improvements, point to the `master` branch of this repo.
In `Cargo.toml`:
`polars = { git = "https://github.com/pola-rs/polars", rev = "<optional git tag>" }`
Required Rust version >=1.58
Want to know about all the features Polars supports? Read the docs!
To install, run `$ pip3 install polars` for Python or `$ yarn add nodejs-polars` for Node.js.
Want to contribute? Read our contribution guideline.
If you want a bleeding edge release or maximal performance you should compile polars from source.
This can be done by going through the following steps in sequence:
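1. Install maturin: `$ pip3 install maturin`
2. Build and install the bindings with one of the following:
   - `$ cd py-polars && maturin develop --release -- -C target-cpu=native`
   - `$ cd py-polars && maturin develop --release -- -C codegen-units=16 -C lto=thin -C target-cpu=native` (overrides codegen units and LTO, which typically shortens compile time at some cost in runtime speed)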
Note that the Rust crate implementing the Python bindings is called `py-polars` to distinguish it from the wrapped Rust crate `polars` itself. However, both the Python package and the Python module are named `polars`, so you can `pip install polars` and `import polars`.
Polars has transitioned to arrow2. Arrow2 is a faster and safer implementation of the Apache Arrow Columnar Format. Arrow2 also has a more granular code base, helping to reduce the compiler bloat.
See this example.
Do you expect more than `2^32` (~4.3 billion) rows? Compile polars with the `bigidx` feature flag. Or, for Python users, install the 64-bit index build with `$ pip install -U polars-u64-idx`.
Don't use this unless you hit the row boundary, as the default polars is faster and consumes less memory.
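To make the threshold concrete: 2^32 is 4,294,967,296. A small sketch that takes the limit quoted above at face value; the constant and function names are hypothetical and exist only for this illustration:

```python
DEFAULT_ROW_LIMIT = 2**32  # 4_294_967_296, the boundary quoted above

def needs_bigidx(n_rows):
    """Return True if n_rows exceeds the row limit quoted for default builds."""
    return n_rows > DEFAULT_ROW_LIMIT

print(DEFAULT_ROW_LIMIT)            # 4294967296
print(needs_bigidx(1_000_000_000))  # False
print(needs_bigidx(5_000_000_000))  # True
```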
Do you want polars to run on an old CPU (e.g. dating from before 2011)? Install `$ pip install -U polars-lts-cpu`. This version of polars is compiled without AVX target features.
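On Linux, one way to see whether a CPU advertises AVX is to look at the `flags` line of `/proc/cpuinfo`. The sketch below assumes that file format (x86 Linux), and the helper name `cpu_has_flag` is made up for this example; other platforms need a different check.

```python
from pathlib import Path

def cpu_has_flag(cpuinfo_text, flag="avx"):
    """Return True if any 'flags' line in /proc/cpuinfo-style text lists `flag`."""
    for line in cpuinfo_text.splitlines():
        name, _, value = line.partition(":")
        # Compare whole tokens so that "avx2" alone does not count as "avx".
        if name.strip() == "flags" and flag in value.split():
            return True
    return False

cpuinfo = Path("/proc/cpuinfo")
if cpuinfo.exists():  # Linux only; absent on macOS and Windows
    supported = cpu_has_flag(cpuinfo.read_text())
    print("AVX supported" if supported else "no AVX: consider polars-lts-cpu")
else:
    print("/proc/cpuinfo not available on this platform")
```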
Development of Polars is proudly powered by