Super Speedy Syslog Searcher! (s4)

Speedily search and merge log file entries by datetime.

Super Speedy Syslog Searcher (s4) is a command-line tool to search and merge log files by datetime, including log files that are compressed (.gz, .xz), archived (.tar), utmpx user accounting records (utmp, wtmp), systemd journal files (.journal), or Microsoft Event Logs (.evtx). It will parse a variety of formal and ad-hoc log message datetime formats.

The first goal of s4 is speedy searching and printing.



- Use
  - Install super_speedy_syslog_searcher
  - Run s4
  - --help
- About
  - Features
  - Limitations
  - Hacks
- More
  - Building locally
  - Parsing .journal files
  - Requesting Support For DateTime Formats; your particular log file
- "syslog" and other project definitions
  - syslog
  - log message
- logging chaos; the problem s4 solves
- Further Reading


Use

Install super_speedy_syslog_searcher

lang-text cargo install super_speedy_syslog_searcher

Run s4

For example, print all the log messages in syslog files under /var/log/

lang-text s4 /var/log

On Windows, the ad-hoc logs under C:\Windows\Logs

lang-text s4.exe C:\Windows\Logs

Or the [Windows Event logs]

lang-text s4.exe C:\Windows\System32\winevt\Logs

Print the log messages after January 1, 2022 at 00:00:00

lang-text s4 /var/log -a 20220101

Print the log messages from January 1, 2022 00:00:00 to January 2, 2022

lang-text s4 /var/log -a 20220101 -b 20220102

or

lang-text s4 /var/log -a 20220101 -b @+1d

Print the log messages on January 1, 2022, from 12:00:00 to 16:00:00

lang-text s4 /var/log -a 20220101T120000 -b 20220101T160000

Print only the log messages since yesterday at this time

lang-text s4 /var/log -a=-1d

Print only the log messages that occurred two days ago (with the help of GNU date)

lang-text s4 /var/log -a $(date -d "2 days ago" '+%Y%m%d') -b @+1d

Print only the log messages that occurred two days ago during the noon hour (with the help of GNU date)

lang-text s4 /var/log -a $(date -d "2 days ago 12" '+%Y%m%dT%H%M%S') -b @+1h

Print only the log messages that occurred two days ago during the noon hour in Bengaluru, India (timezone offset +05:30) and prepended with equivalent UTC datetime (with the help of GNU date)

lang-text s4 /var/log -u -a $(date -d "2 days ago 12" '+%Y%m%dT%H%M%S+05:30') -b @+1h
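The arithmetic behind that last example can be checked with a small Python sketch. It is only an illustration and not part of s4; the date `20220101` is a made-up stand-in for the output of the `date` command. It shows that the noon hour at offset +05:30 corresponds to 06:30 through 07:30 in UTC, which is what the prepended UTC datetimes would be expected to show.

```python
from datetime import datetime, timedelta, timezone

# Made-up filter value in the same shape as the `-a` argument above.
dt_after = datetime.strptime("20220101T120000+05:30", "%Y%m%dT%H%M%S%z")
# The "@+1h" part: one hour after the other filter value.
dt_before = dt_after + timedelta(hours=1)

# Convert both ends of the window to UTC.
utc_after = dt_after.astimezone(timezone.utc)
utc_before = dt_before.astimezone(timezone.utc)

print(utc_after.isoformat())
print(utc_before.isoformat())
```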

--help

```lang-text
Speedily search and merge log file entries by datetime.
DateTime filters may be passed to narrow the search.
It aims to be very fast.

Usage: s4 [OPTIONS] ...

Arguments:
  ...
          Path(s) of log files or directories.
          Directories will be recursed.
          Symlinks will be followed.
          Paths may also be passed via STDIN, one per line.
          The user must supply argument "-" to signify PATHS are available from STDIN.

Options:
  -a, --dt-after
          DateTime Filter After: print syslog lines with a datetime that is at or after this datetime.
          For example, "20200102T120000" or "-5d".
  -b, --dt-before
          DateTime Filter Before: print syslog lines with a datetime that is at or before this datetime.
          For example, "20200103T230000" or "@+1d+11h"
  -t, --tz-offset
          Default timezone offset for datetimes without a timezone.
          For example, log message "20200102T120000 Starting" has a datetime substring "20200102T120000".
          The datetime substring does not have a timezone offset so the TZOFFSET value would be used.
          Example values, "+12", "-0800", "+02:00", or "EDT".
          To pass a value with leading "-" use "=" notation, e.g. "-t=-0800".
          If not passed then the local system timezone offset is used.
          [default: -07:00]
  -z, --prepend-tz
          Prepend a DateTime in the timezone PREPENDTZ for every line.
          Used in PREPENDDTFORMAT.
  -u, --prepend-utc
          Prepend a DateTime in the UTC timezone offset for every line.
          This is the same as "--prepend-tz Z".
          Used in PREPENDDTFORMAT.
  -l, --prepend-local
          Prepend DateTime in the local system timezone offset for every line.
          This is the same as "--prepend-tz +XX" where +XX is the local system timezone offset.
          Used in PREPENDDTFORMAT.
  -d, --prepend-dt-format
          Prepend a DateTime using the strftime format string.
          If PREPEND_TZ is set then that value is used for any timezone offsets,
          i.e. strftime "%z" "%:z" "%Z" values, otherwise the timezone offset value
          is the local system timezone offset.
          [Default: %Y%m%dT%H%M%S%.3f%z]
  -n, --prepend-filename
          Prepend file basename to every line.
  -p, --prepend-filepath
          Prepend file full path to every line.
  -w, --prepend-file-align
          Align column widths of prepended data.
      --prepend-separator
          Separator string for prepended data.
          [default: :]
      --separator
          An extra separator string between printed log messages.
          Per log message not per line of text.
          Accepts a basic set of backslash escape sequences, e.g. "\0" for the null character.
      --journal-output
          The format for .journal file log messages.
          Matches journalctl --output options.
          [default: short]
          [possible values: short, short-precise, short-iso, short-iso-precise, short-full, short-monotonic, short-unix, verbose, export, cat]
  -c, --color
          Choose to print to terminal using colors.
          [default: auto]
          [possible values: always, auto, never]
      --blocksz
          Read blocks of this size in bytes.
          May pass value as any radix (hexadecimal, decimal, octal, binary).
          Using the default value is recommended.
          Most useful for developers.
          [default: 65535]
  -s, --summary
          Print a summary of files processed to stderr.
          Most useful for developers.
  -h, --help
          Print help
  -V, --version
          Print version

DateTime Filters may be strftime specifier patterns:
    "%Y%m%dT%H%M%S"
    "%Y%m%dT%H%M%S%z"
    "%Y%m%dT%H%M%S%:z"
    "%Y%m%dT%H%M%S%#z"
    "%Y%m%dT%H%M%S%Z"
    "%Y-%m-%d %H:%M:%S"
    "%Y-%m-%d %H:%M:%S %z"
    "%Y-%m-%d %H:%M:%S %:z"
    "%Y-%m-%d %H:%M:%S %#z"
    "%Y-%m-%d %H:%M:%S %Z"
    "%Y-%m-%dT%H:%M:%S"
    "%Y-%m-%dT%H:%M:%S %z"
    "%Y-%m-%dT%H:%M:%S %:z"
    "%Y-%m-%dT%H:%M:%S %#z"
    "%Y-%m-%dT%H:%M:%S %Z"
    "%Y/%m/%d %H:%M:%S"
    "%Y/%m/%d %H:%M:%S %z"
    "%Y/%m/%d %H:%M:%S %:z"
    "%Y/%m/%d %H:%M:%S %#z"
    "%Y/%m/%d %H:%M:%S %Z"
    "%Y%m%d"
    "%Y-%m-%d"
    "%Y/%m/%d"
    "%Y%m%d %z"
    "%Y%m%d %:z"
    "%Y%m%d %#z"
    "%Y%m%d %Z"
    "+%s"

Or, DateTime Filter may be custom relative offset patterns:
    "+DwDdDhDmDs" or "-DwDdDhDmDs"
    "@+DwDdDhDmDs" or "@-DwDdDhDmDs"

Pattern "+%s" is Unix epoch timestamp in seconds with a preceding "+". For example, value "+946684800" is January 1, 2000 at 00:00, GMT.

Custom relative offset pattern "+DwDdDhDmDs" and "-DwDdDhDmDs" is the offset from now (program start time) where "D" is a decimal number. Each lowercase identifier is an offset duration: "w" is weeks, "d" is days, "h" is hours, "m" is minutes, "s" is seconds. For example, value "-1w22h" is one week and twenty-two hours in the past. Value "+30s" is thirty seconds in the future.

Custom relative offset pattern "@+DwDdDhDmDs" and "@-DwDdDhDmDs" is relative offset from the other datetime. Arguments "-a 20220102 -b @+1d" are equivalent to "-a 20220102 -b 20220103". Arguments "-a @-6h -b 20220101T120000" are equivalent to "-a 20220101T060000 -b 20220101T120000".

Without a timezone offset (strftime specifier "%z" or "%Z"), the Datetime Filter is presumed to be the local system timezone.

Ambiguous named timezones will be rejected, e.g. "SST".

--prepend-tz and --tz-offset function independently: --prepend-tz affects what is pre-printed before each printed log message line. --tz-offset is used to interpret processed log message datetime stamps that do not have a timezone offset.

--prepend-tz accepts numeric timezone offsets, e.g. "+09:00", "+0900", or "+09", and named timezone offsets, e.g. "JST".

Backslash escape sequences accepted by "--separator" are: "\0", "\a", "\b", "\e", "\f", "\n", "\r", "\\", "\t", "\v"

Resolved values of "--dt-after" and "--dt-before" can be reviewed in the "--summary" output.

DateTime strftime specifiers are described at https://docs.rs/chrono/latest/chrono/format/strftime/

DateTimes supported are only of the Gregorian calendar.

DateTimes supported language is English.

Is s4 failing to parse a log file? Report an Issue at https://github.com/jtmoon79/super-speedy-syslog-searcher/issues/new/choose
```
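The relative offset patterns described in the help text can be sketched in Python. This is a minimal illustration, not s4's actual implementation: the names `parse_offset`, `parse_absolute`, and `resolve_window` are hypothetical, only two of the absolute patterns are handled, absolute values are treated as UTC (s4 presumes the local system timezone), and chained forms such as "@+1d+11h" are not covered.

```python
import re
from datetime import datetime, timedelta, timezone

# Hypothetical helpers for illustration; not s4's actual implementation.
OFFSET_RE = re.compile(
    r"^(?P<at>@)?(?P<sign>[+-])"
    r"(?:(?P<w>\d+)w)?(?:(?P<d>\d+)d)?(?:(?P<h>\d+)h)?"
    r"(?:(?P<m>\d+)m)?(?:(?P<s>\d+)s)?$"
)

def parse_offset(value):
    """Return (relative_to_other, timedelta), or None if value is not an offset."""
    match = OFFSET_RE.match(value)
    if match is None or not any(match.group(unit) for unit in "wdhms"):
        return None
    n = {unit: int(match.group(unit) or 0) for unit in "wdhms"}
    delta = timedelta(weeks=n["w"], days=n["d"], hours=n["h"],
                      minutes=n["m"], seconds=n["s"])
    if match.group("sign") == "-":
        delta = -delta
    return match.group("at") is not None, delta

def parse_absolute(value):
    """Parse two of the absolute patterns; UTC is assumed for reproducibility."""
    for fmt in ("%Y%m%dT%H%M%S", "%Y%m%d"):
        try:
            return datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue
    raise ValueError(f"unsupported filter value: {value!r}")

def resolve_window(after, before, now):
    """Resolve both filter values; at most one of them may start with "@"."""
    def resolve(value, other):
        parsed = parse_offset(value)
        if parsed is None:
            if re.fullmatch(r"\+\d+", value):  # the "+%s" Unix epoch pattern
                return datetime.fromtimestamp(int(value[1:]), tz=timezone.utc)
            return parse_absolute(value)
        relative_to_other, delta = parsed
        return (other if relative_to_other else now) + delta

    if after.startswith("@"):
        dt_before = resolve(before, None)
        return resolve(after, dt_before), dt_before
    dt_after = resolve(after, None)
    return dt_after, resolve(before, dt_after)

# A fixed "now" keeps the output reproducible; s4 uses the program start time.
now = datetime(2024, 1, 1, tzinfo=timezone.utc)
for pair in (("20220102", "@+1d"), ("@-6h", "20220101T120000"), ("+946684800", "+30s")):
    dt_a, dt_b = resolve_window(*pair, now)
    print(pair, "->", dt_a.isoformat(), dt_b.isoformat())
```

The first two pairs reproduce the equivalences stated in the help text; the third combines the epoch pattern with an offset from "now".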

About

Super Speedy Syslog Searcher (s4) is meant to help engineers review varying log files in datetime-sorted order. The primary use case is investigating problems where the time of occurrence is known but there is otherwise little evidence of the source.

Currently, log file formats vary widely. Most logs are an ad-hoc format. Even separate log files on the same system for the same service may have different message formats! 😵 Sorting these logged messages by datetime may be prohibitively difficult. The result is an engineer may have to "hunt and peck" among many log files, looking for problem clues around some datetime; so tedious!

Enter Super Speedy Syslog Searcher 🦸 ‼

s4 will print log messages from multiple log files in datetime-sorted order. A "window" of datetimes may be passed to constrain the period of printed messages. This will assist an engineer who, for example, needs to view all syslog messages that occurred two days ago among log files taken from multiple systems.
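The idea of a datetime-sorted merge constrained by a window can be sketched in Python. This is an illustration of the general technique (a lazy k-way merge), not a description of how s4 is implemented. The log entries, file names, and the function name `merge_window` are made up, and each per-file list is assumed to already be in datetime order.

```python
import heapq
from datetime import datetime

# Made-up, already-parsed entries: (datetime, source file, message).
# Each per-file list is assumed to be in datetime order already.
log_a = [
    (datetime(2022, 1, 1, 12, 0, 5), "a.log", "service started"),
    (datetime(2022, 1, 1, 12, 3, 0), "a.log", "connection lost"),
]
log_b = [
    (datetime(2022, 1, 1, 11, 59, 0), "b.log", "disk check"),
    (datetime(2022, 1, 1, 12, 1, 30), "b.log", "retrying"),
    (datetime(2022, 1, 1, 12, 10, 0), "b.log", "giving up"),
]

def merge_window(sources, dt_after, dt_before):
    """Lazily merge sorted sources, yielding entries within [dt_after, dt_before]."""
    for entry in heapq.merge(*sources, key=lambda e: e[0]):
        if entry[0] > dt_before:
            break  # everything after this point is later still
        if entry[0] >= dt_after:
            yield entry

merged = list(merge_window(
    [log_a, log_b],
    datetime(2022, 1, 1, 12, 0, 0),
    datetime(2022, 1, 1, 12, 5, 0),
))
for dt, name, message in merged:
    print(f"{dt:%Y%m%dT%H%M%S} {name}: {message}")
```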

The ulterior motive for Super Speedy Syslog Searcher was that the primary developer wanted an excuse to learn Rust 🦀, and wanted to create an open-source tool for a recurring need of some Software Test Engineers 😄

A longer rambling pontification about this project is in [Extended-Thoughts.md].

Features

Limitations

Hacks


More

Building locally

Building on Linux requires:

From the git cloned project directory run cargo build.

Parsing .journal files

Parsing .journal files requires libsystemd to be installed so that s4 can use libsystemd.so.

Requesting Support For DateTime Formats; your particular log file

If you have found a log file that Super Speedy Syslog Searcher does not parse then you may create a [new Issue type Feature request (datetime format)].

Here is [an example user-submitted Issue].

"syslog" and other project definitions

syslog

In this project, the term "syslog" is used generously to refer to any log message that has a datetime stamp on the first line of log text.

Technically, "syslog" is [defined among several RFCs] prescribing fields, formats, lengths, and other technical constraints. In this project, the term "syslog" is used interchangeably with "log".

The term "sysline" refers to one log message, which may comprise multiple text lines.
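A short Python sketch can make the "sysline" idea concrete: a new log message starts at each line that begins with a datetime stamp, and lines without one belong to the preceding message. The function name `group_syslines`, the sample lines, and the single datetime shape recognized here are assumptions made for illustration; s4 itself recognizes many more formats.

```python
import re

# Assumed datetime shape for this sketch only: "YYYY-MM-DD HH:MM:SS" at line start.
STAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")

def group_syslines(lines):
    """Group text lines into log messages; each message is a list of lines."""
    syslines = []
    for line in lines:
        if STAMP.match(line) or not syslines:
            syslines.append([line])    # a datetime stamp starts a new message
        else:
            syslines[-1].append(line)  # continuation of the previous message
    return syslines

# Made-up sample: the first message spans three text lines.
sample = [
    "2022-10-10 15:15:02 request failed",
    "Traceback (most recent call last):",
    '  File "app.py", line 3, in <module>',
    "2022-10-10 15:15:03 retrying",
]
grouped = group_syslines(sample)
for number, sysline in enumerate(grouped, start=1):
    print(number, len(sysline), sysline[0])
```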

See [docs section Definitions of data] for more project definitions.

log message

A "log message" is a single log entry for any type of logging scheme; an entry in a utmpx file, an entry in a systemd journal, an entry in a Windows Event Log, a formal syslog message, or an ad-hoc log message.

logging chaos; the problem s4 solves

In practice, most log file formats are an ad-hoc format that may not follow any formal definition. Sorting varying log messages by datetime is prohibitively tedious.

The following real-world example log files are available in project directory ./logs.

For example, the open-source nginx web server [logs access attempts in an ad-hoc format] in the file access.log

text 192.168.0.115 - - [08/Oct/2022:22:26:35 +0000] "GET /DOES-NOT-EXIST HTTP/1.1" 404 0 "-" "curl/7.76.1" "-"

which is an entirely dissimilar log format to the neighboring nginx log file, error.log

text 2022/10/08 22:26:35 [error] 6068#6068: *3 open() "/usr/share/nginx/html/DOES-NOT-EXIST" failed (2: No such file or directory), client: 192.168.0.115, server: _, request: "GET /DOES-NOT-EXIST HTTP/1.0", host: "192.168.0.100"

nginx is following the bad example set by the apache web server.
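To illustrate how dissimilar the two nginx datetime stamps are, here is a small Python sketch that parses both. It is not taken from s4. The format strings are written by hand for these two lines, `%b` assumes an English locale, and treating the error.log stamp as UTC is an assumption, since that line carries no timezone offset.

```python
from datetime import datetime, timezone

# Datetime substrings copied from the two nginx lines above.
access_stamp = "08/Oct/2022:22:26:35 +0000"
error_stamp = "2022/10/08 22:26:35"

# access.log carries its own timezone offset.
access_dt = datetime.strptime(access_stamp, "%d/%b/%Y:%H:%M:%S %z")
# error.log has no offset; UTC is assumed here for the comparison.
error_dt = datetime.strptime(error_stamp, "%Y/%m/%d %H:%M:%S").replace(tzinfo=timezone.utc)

same_instant = access_dt == error_dt
print(access_dt.isoformat(), error_dt.isoformat(), same_instant)
```

Two format strings are needed for one service on one host, and the second one only compares correctly once a default timezone offset is supplied.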


Commercial software and computer hardware vendors nearly always use ad-hoc log message formatting that is even more unpredictable among the log files on the same system.


Here is a log snippet from a Debian 11 host, file /var/log/alternatives.log:

text update-alternatives 2022-10-10 23:59:47: run with --quiet --remove rcp /usr/bin/ssh

And a snippet from the same Debian 11 host, file /var/log/dpkg.log:

text 2022-10-10 15:15:02 upgrade gpgv:amd64 2.2.27-2 2.2.27-2+deb11u1

And a snippet from the same Debian 11 host, file /var/log/kern.log:

text Oct 10 23:07:16 debian11-b kernel: [ 0.10034] Linux version 5.10.0-11-amd64

And a snippet from the same Debian 11 host, file /var/log/unattended-upgrades/unattended-upgrades-shutdown.log:

text 2022-10-10 23:07:16,775 WARNING - Unable to monitor PrepareForShutdown() signal, polling instead.
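The four Debian snippets above need three different datetime patterns, and one of them has no year at all. A hedged Python sketch of a pattern table that copes with these four lines follows. The table `PATTERNS`, the function `extract_datetime`, and the `assumed_year` parameter are hypothetical names for this illustration and do not reflect s4's internals.

```python
import re
from datetime import datetime

# Hypothetical table: (regex locating the datetime substring, strptime format, has year).
# More specific patterns come first.
PATTERNS = [
    (re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}"), "%Y-%m-%d %H:%M:%S,%f", True),
    (re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}"), "%Y-%m-%d %H:%M:%S", True),
    (re.compile(r"[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2}"), "%b %d %H:%M:%S", False),
]

def extract_datetime(line, assumed_year):
    """Return the first datetime found in line, or None if no pattern matches."""
    for regex, fmt, has_year in PATTERNS:
        found = regex.search(line)
        if found is None:
            continue
        text = found.group(0)
        if not has_year:
            # kern.log style stamps omit the year, so one must be assumed.
            text, fmt = f"{assumed_year} {text}", "%Y " + fmt
        return datetime.strptime(text, fmt)
    return None

# The four Debian 11 lines quoted above.
lines = [
    "update-alternatives 2022-10-10 23:59:47: run with --quiet --remove rcp /usr/bin/ssh",
    "2022-10-10 15:15:02 upgrade gpgv:amd64 2.2.27-2 2.2.27-2+deb11u1",
    "Oct 10 23:07:16 debian11-b kernel: [ 0.10034] Linux version 5.10.0-11-amd64",
    "2022-10-10 23:07:16,775 WARNING - Unable to monitor PrepareForShutdown() signal, polling instead.",
]
found = [extract_datetime(line, assumed_year=2022) for line in lines]
for dt in found:
    print(dt.isoformat())
```

Even for this small sample, the datetime stamp is not always at the start of the line, and the missing year has to come from somewhere outside the log text.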


Here is a log snippet from a Synology DiskStation package DownloadStation:

text 2019/06/23 21:13:34 (system) trigger DownloadStation 3.8.13-3519 Begin start-stop-status start

And a snippet from a Synology DiskStation OS log file sfdisk.log on the same host:

text 2019-04-06T01:07:40-07:00 dsnet sfdisk: Device /dev/sdq change partition.

And a snippet from a Synology DiskStation OS log file synobackup.log on the same host:

text info 2018/02/24 02:30:04 SYSTEM: [Local][Backup Task Backup1] Backup task started.

(yes, those are tab characters)


Here is a snippet from a Windows 10 Pro host, log file ${env:SystemRoot}\debug\mrt.log

text Microsoft Windows Malicious Software Removal Tool v5.83, (build 5.83.13532.1) Started On Thu Sep 10 10:08:35 2020

And a snippet from the same Windows host, log file ${env:SystemRoot}\comsetup.log

text COM+[12:24:34]: ******************************************************************************** COM+[12:24:34]: Setup started - [DATE:05,27,2020 TIME: 12:24 pm] COM+[12:24:34]: ********************************************************************************

And a snippet from the same Windows host, log file ${env:SystemRoot}\DirectX.log

text 11/01/19 20:03:40: infinst: Installed file C:\WINDOWS\system32\xactengine2_1.dll

And a snippet from the same Windows host, log file ${env:SystemRoot}/Microsoft.NET/Framework/v4.0.30319/ngen.log

text 09/15/2022 14:13:22.951 [515]: 1>Warning: System.IO.FileNotFoundException: Could not load file or assembly

And a snippet from the same Windows host, log file ${env:SystemRoot}/Performance/WinSAT/winsat.log

text 68902359 (21103) - exe\logging.cpp:0841: --- START 2022\5\17 14:26:09 PM --- 68902359 (21103) - exe\main.cpp:4363: WinSAT registry node is created or present

(yes, it reads hour 14, and PM… 🙄)


This chaotic logging approach is typical of commercial and open-source software. And it's a mess! Attempting to sort log messages by their natural sort mechanism, a datetime stamp, is difficult to impossible.

Hence the need for Super Speedy Syslog Searcher! 🦸

Further Reading

