Super Speedy Syslog Searcher! (s4)

Speedily search and sort many syslog files by datetime.

Super Speedy Syslog Searcher (s4) is a command-line tool to search and sort syslog files within compressed files (.gz, .xz) and archives (.tar). The first goal of s4 is speedy searching and printing.



Use

Install super_speedy_syslog_searcher

    cargo install super_speedy_syslog_searcher

Run s4

For example, print all the syslog lines in syslog files under /var/log/:

    s4 /var/log

Print only the syslog lines since yesterday:

    s4 /var/log -a $(date -d "yesterday" '+%Y-%m-%d')

Print only the syslog lines that occurred two days ago:

    s4 /var/log -a $(date -d "2 days ago" '+%Y-%m-%d') -b $(date -d "1 days ago" '+%Y-%m-%d')

Print only the syslog lines that occurred two days ago during the noon hour:

    s4 /var/log -a $(date -d "2 days ago 12:00" '+%Y-%m-%dT%H:%M:%S') -b $(date -d "2 days ago 13:00" '+%Y-%m-%dT%H:%M:%S')

Print only the syslog lines that occurred two days ago during the noon hour in Bengaluru, India (timezone offset +05:30), and prepend each line with the equivalent UTC datetime:

    s4 /var/log -u -a "$(date -d "2 days ago 12:00" '+%Y-%m-%dT%H:%M:%S') +05:30" -b "$(date -d "2 days ago 13:00" '+%Y-%m-%dT%H:%M:%S') +05:30"
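
To sanity-check what the prepended UTC datetimes should look like for that window, the offset arithmetic can be worked out separately. This is a minimal Python sketch using an arbitrary example date in place of "2 days ago"; the helper name `to_utc` is made up for illustration and is not part of s4.

```python
from datetime import datetime, timedelta, timezone

# Bengaluru is UTC+05:30.
IST = timezone(timedelta(hours=5, minutes=30))

def to_utc(local_dt: datetime) -> datetime:
    """Convert a timezone-aware datetime to UTC."""
    return local_dt.astimezone(timezone.utc)

# Arbitrary example date standing in for "2 days ago".
window_start = datetime(2022, 10, 8, 12, 0, 0, tzinfo=IST)
window_end = datetime(2022, 10, 8, 13, 0, 0, tzinfo=IST)

print(to_utc(window_start).isoformat())  # 2022-10-08T06:30:00+00:00
print(to_utc(window_end).isoformat())    # 2022-10-08T07:30:00+00:00
```

So the noon hour in Bengaluru corresponds to 06:30 through 07:30 UTC on the same day.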

--help

```lang-text
Super Speedy Syslog Searcher will search syslog files and sort entries by datetime.
DateTime filters may be passed to narrow the search. It aims to be very fast.

USAGE: s4 [OPTIONS] <PATHS>...

ARGS: <PATHS>...
        Path(s) of syslog files or directories. Directories will be recursed, remaining on the
        same filesystem. Symlinks will be followed

OPTIONS:

-a, --dt-after <DT_AFTER>
        DateTime After filter - print syslog lines with a datetime that is at or after this
        datetime. For example, '20200102T123000'

-b, --dt-before <DT_BEFORE>
        DateTime Before filter - print syslog lines with a datetime that is at or before this
        datetime. For example, '20200102T123001'

-t, --tz-offset <TZ_OFFSET>
        DateTime Timezone offset - for syslines with a datetime that does not include a
        timezone, this will be used. For example, '-0800', '+02:00', 'EDT' (to pass a value with
        leading '-', use '=', e.g. '-t=-0800'). Default is local system timezone offset.
        [default: -08:00]

-u, --prepend-utc
        Prepend DateTime in the UTC Timezone for every line

-l, --prepend-local
        Prepend DateTime in the Local Timezone for every line

-d, --prepend-dt-format <PREPEND_DT_FORMAT>
        Prepend DateTime using strftime format string [default: %Y%m%dT%H%M%S%.3f%z:]

-n, --prepend-filename
        Prepend file basename to every line

-p, --prepend-filepath
        Prepend file full path to every line

-w, --prepend-file-align
        Align column widths of prepended data

-c, --color <COLOR_CHOICE>
        Choose to print to terminal using colors [default: auto] [possible values: always, auto,
        never]

-z, --blocksz <BLOCKSZ>
        Read blocks of this size in bytes. May pass decimal or hexadecimal numbers. Using the
        default value is recommended [default: 65535]

-s, --summary
        Print a summary of files processed. Printed to stderr

-h, --help
        Print help information

-V, --version
        Print version information

DateTime Filter patterns may be: "%Y%m%dT%H%M%S" "%Y%m%dT%H%M%S%z" "%Y%m%dT%H%M%S%:z" "%Y%m%dT%H%M%S%#z" "%Y%m%dT%H%M%S%Z" "%Y-%m-%d %H:%M:%S" "%Y-%m-%d %H:%M:%S %z" "%Y-%m-%d %H:%M:%S %:z" "%Y-%m-%d %H:%M:%S %#z" "%Y-%m-%d %H:%M:%S %Z" "%Y-%m-%dT%H:%M:%S" "%Y-%m-%dT%H:%M:%S %z" "%Y-%m-%dT%H:%M:%S %:z" "%Y-%m-%dT%H:%M:%S %#z" "%Y-%m-%dT%H:%M:%S %Z" "%Y/%m/%d %H:%M:%S" "%Y/%m/%d %H:%M:%S %z" "%Y/%m/%d %H:%M:%S %:z" "%Y/%m/%d %H:%M:%S %#z" "%Y/%m/%d %H:%M:%S %Z" "%Y%m%d" "%Y-%m-%d" "%Y/%m/%d" "%Y%m%d %z" "%Y%m%d %:z" "%Y%m%d %#z" "%Y%m%d %Z" "+%s"

Pattern "+%s" is Unix epoch timestamp in seconds with a preceding "+".
Without a timezone offset ("%z" or "%Z"), the Datetime Filter is presumed to be the local system timezone.
Ambiguous named timezones will be rejected, e.g. "SST".

DateTime formatting specifiers are described at https://docs.rs/chrono/latest/chrono/format/strftime/

DateTimes supported are only of the Gregorian calendar.
DateTimes supported language is English.
```
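
When s4 is driven from a script, the `-a` and `-b` values can be generated rather than typed. The sketch below is a hypothetical helper (the name `s4_window_args` is invented for illustration) that formats a datetime window with the `"%Y%m%dT%H%M%S%z"` pattern from the list above. It only builds the argument list and does not run s4.

```python
from datetime import datetime, timedelta, timezone

# One of the filter patterns listed in the --help output.
FILTER_FORMAT = "%Y%m%dT%H%M%S%z"

def s4_window_args(path, start, end):
    """Build an s4 argument list for a datetime window (hypothetical helper).

    `start` and `end` must be timezone-aware so that "%z" is not empty.
    """
    if start.tzinfo is None or end.tzinfo is None:
        raise ValueError("start and end must be timezone-aware")
    if end < start:
        raise ValueError("end must not precede start")
    return [
        "s4", path,
        "-a", start.strftime(FILTER_FORMAT),
        "-b", end.strftime(FILTER_FORMAT),
    ]

tz = timezone(timedelta(hours=-8))
args = s4_window_args(
    "/var/log",
    datetime(2020, 1, 2, 12, 30, 0, tzinfo=tz),
    datetime(2020, 1, 2, 12, 30, 1, tzinfo=tz),
)
print(args)
```

On a system where s4 is installed, the resulting list could be passed to `subprocess.run`.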

About

Super Speedy Syslog Searcher (s4) is meant to aid engineers in reviewing varying syslog files from any Unix system in a time-sorted manner. The primary use-case is investigating problems where the time of occurrence is known but there is little other evidence of the problem.

Currently, Unix log file formats vary widely. Most logs use an ad-hoc format. Even separate log files on the same system for the same service may have different message formats! 😵 Sorting these logged messages by datetime may be prohibitively difficult. As a result, an engineer may have to "hunt and peck" among many log files, looking for problem clues around some datetime; very tedious!

Enter Super Speedy Syslog Searcher 🦸 ‼

s4 will print syslog file messages in datetime-sorted order. A "window" of datetimes may be passed to constrain the period of printed messages. This will assist an engineer who, for example, needs to view all syslog messages that occurred two days ago among log files taken from multiple systems.
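
The datetime-sorted output can be pictured as a merge of several streams that are each already in time order, followed by a window filter. The following is a minimal sketch of that idea using Python's `heapq.merge`. It is not how s4 is implemented, and the sample log entries and host names are invented for illustration.

```python
import heapq
from datetime import datetime

def parse(line):
    """Return (datetime, line) for a line starting with 'YYYY-MM-DD HH:MM:SS'."""
    return datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S"), line

# Two invented logs, each already in time order.
host_a = [
    "2022-10-08 22:26:35 host-a service started",
    "2022-10-08 22:27:40 host-a request failed",
]
host_b = [
    "2022-10-08 22:26:50 host-b disk check ok",
    "2022-10-08 22:28:01 host-b backup started",
]

def in_window(dt, after, before):
    """True when dt falls inside the inclusive [after, before] window."""
    return after <= dt <= before

after = datetime(2022, 10, 8, 22, 26, 40)
before = datetime(2022, 10, 8, 22, 28, 0)

# heapq.merge lazily interleaves the sorted inputs by their datetime key.
merged = heapq.merge(map(parse, host_a), map(parse, host_b), key=lambda pair: pair[0])
selected = [line for dt, line in merged if in_window(dt, after, before)]

for line in selected:
    print(line)
```

Only the two entries inside the window are printed, interleaved by time regardless of which log they came from.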

The ulterior motive for Super Speedy Syslog Searcher was that the primary developer wanted an excuse to learn Rust 🦀, and wanted to create an open-source tool for a recurring need of some Software Test Engineers 😄.

A longer rambling pontification about this project is in Extended-Thoughts.md.

Features

Limitations

Hacks


"syslog" definition chaos

In this project, the term "syslog" is used casually to refer to any log message that has a datetime stamp on the first line of log text.
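
Under that casual definition, a log entry begins at any line that carries a datetime stamp, and lines without one belong to the preceding entry. Below is a rough sketch of that grouping rule. It assumes, for simplicity, that the datetime sits at the very start of the line and matches one hard-coded pattern; real log files need far more patterns than this, and the function name `group_entries` is hypothetical.

```python
import re

# Assumed pattern: "2019-04-06T01:07:40" or "2019/06/23 21:13:34" at line start.
DATETIME_START = re.compile(
    r"^\d{4}[-/]\d{2}[-/]\d{2}[T ]\d{2}:\d{2}:\d{2}"
)

def group_entries(lines):
    """Group lines into entries; a line starting with a datetime opens a new entry."""
    entries = []
    for line in lines:
        if DATETIME_START.match(line) or not entries:
            entries.append([line])
        else:
            # Continuation line (e.g. a stack trace) joins the current entry.
            entries[-1].append(line)
    return entries

sample = [
    "2019/06/23 21:13:34 (system) trigger DownloadStation 3.8.13-3519 Begin start-stop-status start",
    "    continuation line without a datetime (invented for this example)",
    "2019-04-06T01:07:40-07:00 dsnet sfdisk: Device /dev/sdq change partition.",
]
entries = group_entries(sample)
print(len(entries))  # 2
```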


Technically, "syslog" is defined among several RFCs prescribing fields, formats, maximum lengths, and other technical constraints.

Here is an example of a syslog message that conforms to RFC 5424:

    <165>1 2003-10-11T22:14:15.003Z mymachine.example.com eventslog - ID47 [exampleSDID@32473 iut="3" eventSource="Application" eventID="1011"][examplePriority@32473 class="high"]
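
To make the formal structure concrete, here is a small sketch that splits the header of that example into its RFC 5424 fields and decodes the priority value, which encodes facility * 8 + severity. The regular expression is a simplification written for this example, not a complete RFC 5424 parser, and `parse_header` is a hypothetical name.

```python
import re
from datetime import datetime

# Header layout per RFC 5424: <PRI>VERSION TIMESTAMP HOSTNAME APP-NAME PROCID MSGID ...
HEADER = re.compile(
    r"^<(?P<pri>\d{1,3})>(?P<version>\d{1,2}) "
    r"(?P<timestamp>\S+) (?P<hostname>\S+) (?P<app_name>\S+) "
    r"(?P<procid>\S+) (?P<msgid>\S+) (?P<rest>.*)$"
)

def parse_header(line):
    """Parse the header fields; returns a dict, or None if the line does not match."""
    m = HEADER.match(line)
    if m is None:
        return None
    fields = m.groupdict()
    # PRI encodes facility * 8 + severity.
    fields["facility"], fields["severity"] = divmod(int(fields["pri"]), 8)
    if fields["timestamp"] == "-":
        # "-" is the RFC 5424 NILVALUE, meaning no timestamp is available.
        fields["datetime"] = None
    else:
        # "Z" means UTC; rewrite it so older fromisoformat versions accept it.
        fields["datetime"] = datetime.fromisoformat(
            fields["timestamp"].replace("Z", "+00:00")
        )
    return fields

message = (
    '<165>1 2003-10-11T22:14:15.003Z mymachine.example.com eventslog - ID47 '
    '[exampleSDID@32473 iut="3" eventSource="Application" eventID="1011"]'
    '[examplePriority@32473 class="high"]'
)
fields = parse_header(message)
print(fields["facility"], fields["severity"])  # 20 5
print(fields["datetime"])
```

Facility 20 is local4 and severity 5 is Notice in the RFC 5424 tables.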


In practice, many logged messages on a Unix system use an ad-hoc format that may not follow any formal definition; they are merely "log" messages.

For example, the nginx web server logs access attempts in an ad-hoc format in the access.log:

    192.168.0.115 - - [08/Oct/2022:22:26:35 +0000] "GET / HTTP/1.1" 200 7620 "-" "curl/7.76.1" "-"

which is an entirely different log format from that of the neighboring log file, error.log:

    2022/10/08 22:27:40 [error] 6068#6068: *3 open() "/usr/share/nginx/html/DOES-NOT-EXIST" failed (2: No such file or directory), client: 165.227.95.115, server: _, request: "GET /DOES-NOT-EXIST HTTP/1.0", host: "165.227.95.115"


Commercial computer appliance vendors, NAS vendors, router vendors, etc., often use ad-hoc log message formatting that is even more unpredictable.

For example, from the Netgear Orbi Router SOAP client per-host log file:

    [SOAPClient]{DEBUG}{2022-05-10 16:19:13}[soap.c:1060] generate soap request, action=ParentalControl, method=Authenticate

Here is a log snippet from a Synology DiskStation package DownloadStation:

    2019/06/23 21:13:34 (system) trigger DownloadStation 3.8.13-3519 Begin start-stop-status start

And a snippet from a Synology DiskStation OS log file sfdisk.log:

    2019-04-06T01:07:40-07:00 dsnet sfdisk: Device /dev/sdq change partition.

And a snippet from a Synology DiskStation OS log file synobackup.log on the same host:

    info 2018/02/24 02:30:04 SYSTEM: [Local][Backup Task Backup1] Backup task started.

(yes, those are tab characters)
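
The samples above show why a single datetime format cannot cover real log files. One way to cope, sketched below, is to try a table of (regex, `strptime` format) pairs against each line. The table covers only samples quoted in this section and is an illustration, not s4's actual pattern list; `find_datetime` is a hypothetical name. Parsing the month abbreviation with `%b` assumes an English locale.

```python
import re
from datetime import datetime

# Each pair: a regex capturing the datetime text, and the strptime format for it.
PATTERNS = [
    # nginx access.log: [08/Oct/2022:22:26:35 +0000]
    (re.compile(r"\[(\d{2}/[A-Z][a-z]{2}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4})\]"),
     "%d/%b/%Y:%H:%M:%S %z"),
    # Synology sfdisk.log: 2019-04-06T01:07:40-07:00
    (re.compile(r"(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}[+-]\d{2}:\d{2})"),
     "%Y-%m-%dT%H:%M:%S%z"),
    # nginx error.log and Synology package logs: 2022/10/08 22:27:40
    (re.compile(r"(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})"),
     "%Y/%m/%d %H:%M:%S"),
    # Netgear SOAP client: {2022-05-10 16:19:13}
    (re.compile(r"\{(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\}"),
     "%Y-%m-%d %H:%M:%S"),
]

def find_datetime(line):
    """Return the first datetime found in line, or None if no pattern matches."""
    for regex, fmt in PATTERNS:
        m = regex.search(line)
        if m:
            return datetime.strptime(m.group(1), fmt)
    return None

samples = [
    '192.168.0.115 - - [08/Oct/2022:22:26:35 +0000] "GET / HTTP/1.1" 200 7620 "-" "curl/7.76.1" "-"',
    "[SOAPClient]{DEBUG}{2022-05-10 16:19:13}[soap.c:1060] generate soap request, action=ParentalControl, method=Authenticate",
    "2019-04-06T01:07:40-07:00 dsnet sfdisk: Device /dev/sdq change partition.",
    "info 2018/02/24 02:30:04 SYSTEM: [Local][Backup Task Backup1] Backup task started.",
]
for line in samples:
    print(find_datetime(line))
```

Note that some results carry a timezone offset and others do not, so they cannot be compared directly until a default offset is applied to the naive ones. That is the gap the `--tz-offset` option described in the help text is meant to fill.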


To be fair to nginx, Netgear, and Synology, this chaotic approach to logging is typical of commercial and open-source software.

Hence the need for Super Speedy Syslog Searcher!

Further Reading

