angle-grinder

Slice and dice log files on the command line.

Angle-grinder allows you to parse, aggregate, sum, average, take percentiles of, and sort your data. You can see it, live-updating, in your terminal. Angle grinder is designed for when, for whatever reason, you don't have your data in graphite/honeycomb/kibana/sumologic/splunk/etc. but still want to be able to do sophisticated analytics.

Angle grinder can process well above 1M rows per second (simple pipelines as high as 5M), so it's usable for fairly meaty aggregation. The results will live update in your terminal as data is processed. Angle grinder is a bare bones functional programming language coupled with a pretty terminal UI.


Installation

Binaries are available for Linux and OS X. Many more platforms (including Windows) are available if you compile from source. In all of the commands below, the resulting binary will be called agrind.

With Brew (OS X)

```bash
brew install angle-grinder
```

With Curl (Single binary)

Linux:

```bash
curl -L https://github.com/rcoh/angle-grinder/releases/download/v0.7.6/angle_grinder-v0.7.6-x86_64-unknown-linux-musl.tar.gz \
  | tar Ozxf - \
  | sudo tee /usr/local/bin/agrind > /dev/null && sudo chmod +x /usr/local/bin/agrind
```

OS X:

```bash
curl -L https://github.com/rcoh/angle-grinder/releases/download/v0.7.6/angle_grinder-v0.7.6-x86_64-apple-darwin.tar.gz \
  | tar Ozxf - \
  | sudo tee /usr/local/bin/agrind > /dev/null && sudo chmod +x /usr/local/bin/agrind
```

From Source

If you have Cargo installed, you can compile & install from source (works with stable Rust >= 1.26):

```bash
cargo install ag
```

Query Syntax

An angle grinder query is composed of filters followed by a series of operators. The filters select the lines from the input stream to be transformed by the operators. Typically, the initial operators will transform the data in some way by parsing fields or JSON from the log line. The subsequent operators can then aggregate or group the data via operators like sum, average, percentile, etc.

```bash
agrind '<filter1> [... <filterN>] | operator1 | operator2 | operator3 | ...'
```

A simple query that operates on JSON logs and counts the number of logs per level could be:

```bash
agrind '* | json | count by log_level'
```
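To make this concrete, here is a hedged end-to-end sketch; the sample log lines are made up and the output shown in the comments is approximate, not captured output:

```bash
# Feed a few JSON lines into the query above (sample data is invented):
printf '%s\n' \
  '{"log_level": "info",  "message": "hello"}' \
  '{"log_level": "error", "message": "oh no"}' \
  '{"log_level": "info",  "message": "bye"}' \
  | agrind '* | json | count by log_level'

# Expected shape of the aggregate output (formatting approximate):
#   log_level    _count
#   info         2
#   error        1
```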

Filters

Filters may be `*` (which matches all lines), a bare word such as filter-me, or a quoted string such as "filter me!". Only lines that match all filters are passed to the subsequent operators.
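For example, a sketch that combines a bare-word filter with a quoted filter (the file name and log contents are hypothetical):

```bash
# Only lines containing both ERROR and the exact phrase "connection timeout"
# reach the operators; everything else is dropped before any parsing happens.
tail -F myapp.log | agrind 'ERROR "connection timeout" | count'
```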

Operators

Non Aggregate Operators

These operators have at most a one-to-one correspondence between input and output: one row in, zero or one rows out (rows that fail to parse or match are dropped).

JSON

json [from other_field]: Extract JSON-serialized rows into fields for later use. If the row is not valid JSON, it is dropped. Optionally, from other_field reads the JSON from a specific field instead of the raw input line.

Examples:

```bash
agrind '* | json'
agrind '* | parse "INFO *" as js | json from js'
```

Parse

parse "* pattern * otherpattern *" [from field] as a,b,c: Parse text that matches the pattern into variables. Lines that don't match the pattern will be dropped. * is equivalent to regular expression .* and is greedy. By default, parse operates on the raw text of the message. With from field_name, parse will instead process input from a specific column.

Examples:

```bash
agrind '* | parse "[status_code=*]" as status_code'
```
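A couple of additional sketches, including the from field form; the log formats and field names (method, path, status, message, duration_ms) are assumptions for illustration:

```bash
# Multiple captures from a single pattern:
tail -F access.log | agrind '* | parse "method=* path=* status=*" as method, path, status'

# Parse a field that was itself extracted by json (assumes a "message" field exists):
tail -F myjsonlogs | agrind '* | json | parse "took * ms" from message as duration_ms'
```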

Fields

fields [only|except|-|+] a, b: Keep only fields a, b (only / +) or drop fields a, b (except / -), depending on the specified mode.

Examples:

Drop all fields except event and timestamp:

```bash
agrind '* | json | fields + event, timestamp'
```

Drop only the event field:

```bash
agrind '* | fields except event'
```

Where

where a == b: Drop rows where the condition is not met. Note that None == None, so a row where both the left and right sides match a non-existent key will match.

Examples:

```bash
agrind '* | json | where status_code >= 400'
agrind '* | json | where user_id_a == user_id_b'
agrind '* | json | where url != "/hostname"'
```
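To illustrate the None == None note above, a hedged sketch with hypothetical key names: rows in which neither key exists satisfy the equality and are kept.

```bash
# Rows where both experiment_group and control_group are absent still match,
# because a missing key on each side compares as None == None.
tail -F myjsonlogs | agrind '* | json | where experiment_group == control_group'
```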

Aggregate Operators

Aggregate operators group and combine your data by 0 or more key fields. The same query can include multiple aggregates. The general syntax is:

```noformat
(operator [as renamed_column])+ [by key_col1, key_col2]
```

In the simplest form, key fields refer to columns, but they can also be generalized expressions (see examples).

Examples:

```bash
agrind '* | count'
agrind '* | json | count by status_code'
agrind '* | json | count, p50(response_ms), p90(response_ms) by status_code'
agrind '* | json | count as num_requests, p50(response_ms), p90(response_ms) by status_code'
agrind '* | json | count, p50(response_ms), p90(response_ms), count by status_code >= 400, url'
```

There are several aggregate operators available.

Count

count [as count_column]: Counts the number of input rows. The output column defaults to _count.

Examples:

Count the number of rows by source_host:

```bash
agrind '* | count by source_host'
```

Count the number of source_hosts:

```bash
agrind '* | count by source_host | count'
```

Sum

sum(column) [as sum_column]: Sum values in column. If the value in column is non-numeric, the row will be ignored.

Examples:

```bash
agrind '* | json | sum(num_records) by action'
```

Average

average(column) [as average_column] [by a, b]: Average values in column. If the value in column is non-numeric, the row will be ignored.

Examples:

```bash
agrind '* | json | average(response_time)'
```
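Since average follows the general aggregate syntax, it can also be grouped with by; a sketch with assumed field names:

```bash
# Average response time per endpoint and status code (field names are assumptions):
tail -F myjsonlogs | agrind '* | json | average(response_ms) by endpoint, status_code'
```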

Percentile

pXX(column): Calculate the XXth percentile of column (e.g. p50 for the median, p90 for the 90th percentile).

Examples:

```bash
agrind '* | json | p50(response_time), p90(response_time) by endpoint_url, status_code'
```

Sort

sort by a, [b, c] [asc|desc]: Sort aggregate data by a collection of columns. Defaults to ascending.

Examples:

```bash
agrind '* | json | count by endpoint_url, status_code | sort by endpoint_url desc'
```
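sort is commonly used to order grouped counts. This sketch relies on the default _count column name described above; the input field names are assumptions:

```bash
# Most common URLs first:
tail -F myjsonlogs | agrind '* | json | count by url | sort by _count desc'
```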

Total

total(a) [as renamed_total]: Compute the running total of a given field. Total does not currently support grouping!

Examples:

```bash
agrind '* | json | total(num_requests) as tot_requests'
```
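A hedged sketch of the running-total behavior; the sample data is made up and the exact output formatting may differ:

```bash
# The running_items column should grow row by row: 1, 5, 7
printf '%s\n' '{"items": 1}' '{"items": 4}' '{"items": 2}' \
  | agrind '* | json | total(items) as running_items'
```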

Count Distinct

count_distinct(a): Count the distinct values of column a. Warning: this does not run in fixed memory; usage grows with the number of distinct values, so be careful about processing too many groups.

Examples:

```bash
agrind '* | json | count_distinct(ip_address)'
```
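Like the other aggregates, count_distinct should accept a by clause per the general aggregate syntax above (this grouping form is an assumption, and the field names are illustrative):

```bash
# Distinct client IPs per URL:
tail -F myjsonlogs | agrind '* | json | count_distinct(ip_address) by url'
```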

Example Queries

Take the 50th percentile of response time by url:

```bash
tail -F myjsonlogs | agrind '* | json | p50(responsetime) by url'
```

Count the number of status codes by url:

```bash
tail -F myjsonlogs | agrind '* | json | count by statuscode, url'
```

More example queries can be found in the tests folder.

Rendering

Non-aggregate data is simply written row-by-row to the terminal as it is received:

```noformat
tail -f live_pcap | agrind '* | parse "* > *:" as src, dest | parse "length *" as length'
[dest=111.221.29.254.https]        [length=0]          [src=21:50:18.458331 IP 10.0.2.243.47152]
[dest=111.221.29.254.https]        [length=310]        [src=21:50:18.458527 IP 10.0.2.243.47152]
```

Aggregate data is written to the terminal and will live-update until the stream ends:

```noformat
k2                        avg
--------------------------------
test longer test          500.50
test test                 375.38
alternate input           4.00
hello                     3.00
hello thanks              2.00
```

The renderer will do its best to keep the data nicely formatted as it changes, and the number of output rows is limited to the height of your terminal. Currently, it refreshes at about 20 Hz.

Contributing

angle-grinder builds with Rust >= 1.26. There are a number of ways you can contribute:
- Add new special-purpose operators
- Improve documentation of existing operators and provide more usage examples
- Provide more test cases of real queries on real-world data
- Tell more people about angle grinder!

```bash
cargo build
cargo test
cargo install --path .
agrind --help
```

See the open issues for specific potential improvements/bugs.

Similar Projects