Speedily search and merge many syslog files by datetime.
Super Speedy Syslog Searcher (s4) is a command-line tool to search
and merge plain log files, including compressed log files (.gz, .xz)
and log files within archives (.tar).
The first goal of s4 is speedy searching and printing.
Install super_speedy_syslog_searcher
```lang-text
cargo install super_speedy_syslog_searcher
```
Run s4
For example, print all the syslog lines in syslog files under /var/log/
```lang-text
s4 /var/log
```
On Windows under C:\Windows\Logs
```lang-text
s4.exe C:\Windows\Logs
```
Print the syslog lines after January 1, 2022 at 00:00:00
```lang-text
s4 /var/log -a 20220101
```
Print the syslog lines from January 1, 2022 00:00:00 to January 2, 2022
```lang-text
s4 /var/log -a 20220101 -b 20220102
```
or
```lang-text
s4 /var/log -a 20220101 -b @+1d
```
Print the syslog lines on January 1, 2022, from 12:00:00 to 16:00:00
```lang-text
s4 /var/log -a 20220101T120000 -b 20220101T160000
```
Print only the syslog lines since yesterday at this time
```lang-text
s4 /var/log -a=-1d
```
Print only the syslog lines that occurred two days ago (with the help of GNU date)
```lang-text
s4 /var/log -a $(date -d "2 days ago" '+%Y%m%d') -b @+1d
```
Print only the syslog lines that occurred two days ago during the noon hour (with the help of GNU date)
```lang-text
s4 /var/log -a $(date -d "2 days ago 12" '+%Y%m%dT%H%M%S') -b @+1h
```
Print only the syslog lines that occurred two days ago during the noon hour in
Bengaluru, India (timezone offset +05:30) and prepended with the equivalent UTC
datetime (with the help of GNU date)
```lang-text
s4 /var/log -u -a $(date -d "2 days ago 12" '+%Y%m%dT%H%M%S+05:30') -b @+1h
```
--help
```lang-text
Super Speedy Syslog Searcher will search syslog files and sort entries by datetime. DateTime filters may be passed to narrow the search. It aims to be very fast.
USAGE:
s4 [OPTIONS]
ARGS:
OPTIONS:
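-a, --dt-after <DT_AFTER>
DateTime After filter - print syslog lines with a datetime that is at or after this
datetime. For example, "20200102T123000"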
-b, --dt-before <DT_BEFORE>
DateTime Before filter - print syslog lines with a datetime that is at or before this
datetime. For example, "20200102T123001"
-t, --tz-offset <TZ_OFFSET>
DateTime Timezone offset - for syslines with a datetime that does not include a
timezone, this will be used. For example, "-0800", "+02:00", or "EDT". Ambiguous named
timezones parsed from logs will use this value, e.g. timezone "IST". (to pass a value
with leading "-", use "=", e.g. "-t=-0800"). Default is local system timezone offset.
[default: -08:00]
-u, --prepend-utc
Prepend DateTime in the UTC Timezone for every line
-l, --prepend-local
Prepend DateTime in the Local Timezone for every line
-d, --prepend-dt-format <PREPEND_DT_FORMAT>
Prepend DateTime using strftime format string [default: %Y%m%dT%H%M%S%.3f%z:]
-n, --prepend-filename
Prepend file basename to every line
-p, --prepend-filepath
Prepend file full path to every line
-w, --prepend-file-align
Align column widths of prepended data
-c, --color <COLOR_CHOICE>
Choose to print to terminal using colors [default: auto] [possible values: always, auto,
never]
-z, --blocksz <BLOCKSZ>
Read blocks of this size in bytes. May pass decimal or hexadecimal numbers. Using the
default value is recommended. Most useful for developers [default: 65535]
-s, --summary
Print a summary of files processed to stderr. Most useful for developers
-h, --help
Print help information
-V, --version
Print version information
DateTime Filter strftime specifier patterns may be:
"%Y%m%dT%H%M%S"
"%Y%m%dT%H%M%S%z"
"%Y%m%dT%H%M%S%:z"
"%Y%m%dT%H%M%S%#z"
"%Y%m%dT%H%M%S%Z"
"%Y-%m-%d %H:%M:%S"
"%Y-%m-%d %H:%M:%S %z"
"%Y-%m-%d %H:%M:%S %:z"
"%Y-%m-%d %H:%M:%S %#z"
"%Y-%m-%d %H:%M:%S %Z"
"%Y-%m-%dT%H:%M:%S"
"%Y-%m-%dT%H:%M:%S %z"
"%Y-%m-%dT%H:%M:%S %:z"
"%Y-%m-%dT%H:%M:%S %#z"
"%Y-%m-%dT%H:%M:%S %Z"
"%Y/%m/%d %H:%M:%S"
"%Y/%m/%d %H:%M:%S %z"
"%Y/%m/%d %H:%M:%S %:z"
"%Y/%m/%d %H:%M:%S %#z"
"%Y/%m/%d %H:%M:%S %Z"
"%Y%m%d"
"%Y-%m-%d"
"%Y/%m/%d"
"%Y%m%d %z"
"%Y%m%d %:z"
"%Y%m%d %#z"
"%Y%m%d %Z"
"+%s"
"+DwDdDhDmDs" or "-DwDdDhDmDs"
"@+DwDdDhDmDs" or "@-DwDdDhDmDs"
Pattern "+%s" is Unix epoch timestamp in seconds with a preceding "+".
Custom pattern "+DwDdDhDmDs" and "-DwDdDhDmDs" is relative offset from now (program start time) where "D" is a decimal number. Each lowercase identifier is an offset duration: "w" is weeks, "d" is days, "h" is hours, "m" is minutes, "s" is seconds. Value "-1w22h" would be one week and twenty-two hours in the past. Value "+30s" would be thirty seconds in the future.
Custom pattern "@+DwDdDhDmDs" and "@-DwDdDhDmDs" is relative offset from the other datetime. Arguments "-a 20220102 -b @+1d" are equivalent to "-a 20220102 -b 20220103". Arguments "-a @-6h -b 20220101T120000" are equivalent to "-a 20220101T060000 -b 20220101T120000".
Without a timezone offset ("%z" or "%Z"), the Datetime Filter is presumed to be the local system timezone.
Ambiguous user-passed named timezones will be rejected, e.g. "SST".
Resolved values of "--dt-after" and "--dt-before" can be reviewed in the "--summary" output.
DateTime strftime specifier patterns are described at https://docs.rs/chrono/latest/chrono/format/strftime/
DateTimes supported are only of the Gregorian calendar.
DateTimes supported language is English.
```
Super Speedy Syslog Searcher (s4) is meant to aid engineers in reviewing varying log files in a datetime-sorted manner. The primary use-case is to aid in investigating problems wherein the time of problem occurrence is known but otherwise there is little source evidence.
Currently, log file formats vary widely. Most logs are an ad-hoc format. Even separate log files on the same system for the same service may have different message formats! 😵 Sorting these logged messages by datetime may be prohibitively difficult. The result is an engineer may have to "hunt and peck" among many log files, looking for problem clues around some datetime; so tedious!
Enter Super Speedy Syslog Searcher 🦸 ‼
s4 will print log messages from multiple log files in datetime-sorted order. A "window" of datetimes may be passed, to constrain the period of printed messages. This will assist an engineer who, for example, needs to view all syslog messages that occurred two days ago among log files taken from multiple systems.
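The idea of a datetime-sorted merge with an optional window can be sketched in a few lines of Python. This only illustrates the concept and is not how s4 is implemented. It assumes every line starts with a datetime in the single format `%Y-%m-%d %H:%M:%S` and that each file is already in datetime order. The names `entries` and `merge_logs` are hypothetical.

```python
import heapq
from datetime import datetime

# Assumption: every line begins with a 19-character datetime in this format.
FMT = "%Y-%m-%d %H:%M:%S"


def entries(name, lines):
    """Yield (datetime, file name, line) for each line of one log."""
    for line in lines:
        yield datetime.strptime(line[:19], FMT), name, line


def merge_logs(logs, after=None, before=None):
    """Merge per-file lines by datetime, keeping those within [after, before].

    `logs` maps a file name to its lines; each list must already be sorted.
    """
    streams = [entries(name, lines) for name, lines in logs.items()]
    for when, name, line in heapq.merge(*streams):
        if after is not None and when < after:
            continue  # earlier than the window
        if before is not None and when > before:
            break  # merged stream is sorted, so nothing later can match
        yield f"{name}: {line}"


if __name__ == "__main__":
    sample = {
        "a.log": ["2022-01-01 10:00:00 a1", "2022-01-01 12:00:00 a2"],
        "b.log": ["2022-01-01 11:00:00 b1", "2022-01-02 09:00:00 b2"],
    }
    for out in merge_logs(sample, after=datetime(2022, 1, 1, 11)):
        print(out)
```

`heapq.merge` reads lazily from each stream, so only one pending line per file is held at a time. Real log files break both assumptions (mixed formats, multi-line messages), which is the harder problem s4 addresses.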
The ulterior motive for Super Speedy Syslog Searcher was that the primary developer wanted an excuse to learn Rust 🦀, and wanted to create an open-source tool for a recurring need of some Software Test Engineers 😄
A longer rambling pontification about this project is in Extended-Thoughts.md.
./logs/)
grep and sort (see project tool ./tools/compare-grep-sort.sh; run in GitHub Actions, Job run s4, Step Run script compare-grep-sort)
.gz files (only processes first stream found) (Issue #8)
.xz files (only processes first stream found) (Issue #11)
logs.tgz. (Issue #14) e.g. file syslog.xz within file logs.tar will not be processed
.zip archives (Issue #39)
YYYY-MM-DDThh:mm:ss
YYYY-MM-DDThhmmss
YYYYMMDDThhmmss (may use date-time separator character 'T' or character blank space ' ')
YYYY-DDD, e.g. "2022-321"
YYYY-Www-D, e.g. "2022-W25-1"
hh)
.xz files are read into memory during the initial open
(Issue #12)
In this project, the term "syslog" is used generously to refer to any log message that has a datetime stamp on the first line of log text.
Technically, "syslog" is defined among several RFCs prescribing fields, formats, lengths, and other technical constraints. In this project, the term "syslog" is used interchangeably with "log".
In practice, most log file formats are an ad-hoc format that may not follow any formal definition.
The following real-world example log files are available in project directory ./logs.
For example, the open-source nginx web server logs access attempts in an ad-hoc format in the file access.log
```text
192.168.0.115 - - [08/Oct/2022:22:26:35 +0000] "GET /DOES-NOT-EXIST HTTP/1.1" 404 0 "-" "curl/7.76.1" "-"
```
which is an entirely dissimilar log format to the neighboring nginx log file, error.log
```text
2022/10/08 22:26:35 [error] 6068#6068: *3 open() "/usr/share/nginx/html/DOES-NOT-EXIST" failed (2: No such file or directory), client: 192.168.0.115, server: _, request: "GET /DOES-NOT-EXIST HTTP/1.0", host: "192.168.0.100"
```
nginx is following the example set by the Apache web server (a bad example!).
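To make the difficulty concrete, here is a minimal Python sketch that extracts a datetime from each of the two nginx lines above. Each format needs its own regular expression and strptime pattern. The pattern list and the names `PATTERNS` and `find_datetime` are illustrative assumptions, not s4's internals. The fallback timezone loosely mirrors the `--tz-offset` option, and `%b` assumes an English locale.

```python
import re
from datetime import datetime, timezone

# Illustrative pattern list: one (regex, strptime format) pair per log format.
PATTERNS = [
    # nginx access.log, e.g. [08/Oct/2022:22:26:35 +0000]
    (
        re.compile(r"\[(\d{2}/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4})\]"),
        "%d/%b/%Y:%H:%M:%S %z",
    ),
    # nginx error.log, e.g. 2022/10/08 22:26:35 (no timezone in the line)
    (
        re.compile(r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})"),
        "%Y/%m/%d %H:%M:%S",
    ),
]

ACCESS_LINE = (
    '192.168.0.115 - - [08/Oct/2022:22:26:35 +0000] '
    '"GET /DOES-NOT-EXIST HTTP/1.1" 404 0 "-" "curl/7.76.1" "-"'
)
ERROR_LINE = (
    '2022/10/08 22:26:35 [error] 6068#6068: *3 open() '
    '"/usr/share/nginx/html/DOES-NOT-EXIST" failed (2: No such file or directory)'
)


def find_datetime(line, default_tz=timezone.utc):
    """Return the first datetime found in `line`, or None if no pattern matches.

    Lines without a timezone are assigned `default_tz` so results are comparable.
    """
    for regex, fmt in PATTERNS:
        match = regex.search(line)
        if match is None:
            continue
        found = datetime.strptime(match.group(1), fmt)
        if found.tzinfo is None:
            found = found.replace(tzinfo=default_tz)
        return found
    return None
```

Every additional log format in the wild means another entry in such a list, which is why sorting messages from mixed sources by hand becomes tedious so quickly.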
Commercial computer appliance vendors (NAS vendors, router vendors, etc.) often use ad-hoc log message formatting that is even more unpredictable.
For example, from the Netgear Orbi Router SOAP client per-host log file:
```text
[SOAPClient]{DEBUG}{2022-05-10 16:19:13}[soap.c:1060] generate soap request, action=ParentalControl, method=Authenticate
```
Here is a log snippet from a Synology DiskStation package DownloadStation:
```text
2019/06/23 21:13:34 (system) trigger DownloadStation 3.8.13-3519 Begin start-stop-status start
```
And a snippet from a Synology DiskStation OS log file sfdisk.log:
```text
2019-04-06T01:07:40-07:00 dsnet sfdisk: Device /dev/sdq change partition.
```
And a snippet from a Synology DiskStation OS log file synobackup.log on the same host:
```text
info 2018/02/24 02:30:04 SYSTEM: [Local][Backup Task Backup1] Backup task started.
```
(yes, those are tab characters)
Here is a snippet from a Windows 10 Pro host, log file ${env:SystemRoot}\debug\mrt.log
```text
Microsoft Windows Malicious Software Removal Tool v5.83, (build 5.83.13532.1)
Started On Thu Sep 10 10:08:35 2020
```
And a snippet from the same Windows host, log file ${env:SystemRoot}\comsetup.log
```text
COM+[12:24:34]: ********************************************************************************
COM+[12:24:34]: Setup started - [DATE:05,27,2020 TIME: 12:24 pm]
COM+[12:24:34]: ********************************************************************************
```
And a snippet from the same Windows host, log file ${env:SystemRoot}\DirectX.log
```text
11/01/19 20:03:40: infinst: Installed file C:\WINDOWS\system32\xactengine2_1.dll
```
To be fair to nginx, Netgear, Synology, and Microsoft, this chaotic logging data is typical of commercial and open-source software. But it's a mess!
Hence the need for Super Speedy Syslog Searcher!