Strainer: Find copypasta in your project

Strainer is a command-line tool that will recursively search the text files in a directory, track all duplicate lines across files, and output the matched lines and where they reside in each file.
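The core idea can be illustrated with a small sketch. This is not Strainer's actual implementation: `find_duplicates` and the `Location` alias are hypothetical names, the "files" are in-memory strings rather than a directory walk, and skipping blank lines is an assumption made only to keep the output readable.

```rust
use std::collections::HashMap;

/// Where a line was seen: (file name, 1-based line number).
type Location = (String, usize);

/// Hypothetical helper: groups identical lines from several in-memory "files"
/// and keeps only the lines that occur more than once.
fn find_duplicates(files: &[(&str, &str)], trim_whitespace: bool) -> Vec<(String, Vec<Location>)> {
    let mut seen: HashMap<String, Vec<Location>> = HashMap::new();

    for (name, contents) in files {
        for (index, raw) in contents.lines().enumerate() {
            let line = if trim_whitespace { raw.trim() } else { raw };
            if line.is_empty() {
                continue; // assumption: blank lines are not interesting matches
            }
            seen.entry(line.to_string())
                .or_default()
                .push(((*name).to_string(), index + 1));
        }
    }

    // Keep only lines recorded at two or more locations.
    let mut duplicates: Vec<(String, Vec<Location>)> = seen
        .into_iter()
        .filter(|(_, locations)| locations.len() > 1)
        .collect();
    duplicates.sort(); // deterministic output order
    duplicates
}

fn main() {
    let files = [
        ("a.rs", "let x = 1;\n    let y = 2;\n"),
        ("b.rs", "let y = 2;\nlet z = 3;\n"),
    ];
    for (line, locations) in find_duplicates(&files, true) {
        println!("{}", line);
        for (file, number) in locations {
            println!("  {}:{}", file, number);
        }
    }
}
```

Keying a map on the line text makes each lookup cheap, so the work grows roughly with the total number of lines rather than with the number of file pairs.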

Installation

Strainer is available on crates.io: `cargo install strainer`

It has one compile-time feature flag: syntax-highlighting. With this flag enabled, the syntect library is used to automatically syntax-highlight code lines in the output. This roughly doubles the binary size (it's still small) and slows down the output a bit (not the search itself). The coloration is also broken in the default macOS Terminal app (iTerm2 works fine). To enable it: `cargo install strainer --features "syntax-highlighting"`

Usage

```
USAGE:
    strainer [FLAGS] [OPTIONS]

FLAGS:
    -h, --help                Prints help information
    -r, --removeduplicates    Remove duplicate lines (keep the first occurrence). Requires --samefile.
                              DANGER: Overwrites source files, use with caution!
    -s, --samefile            Only check for duplicate lines within the same file.
    -t, --trimwhitespace      Trim whitespace from the start and end of each line before comparing.
    -V, --version             Prints version information

OPTIONS:
    -d, --linedelimiter       The character that delimits 'lines'. Can be used, for example, to search a
                              natural-language file by passing '.' to split on sentences. [default: \n]
    -l, --linepattern         A basic pattern string to filter which lines will show up in results.
                              Asterisks ('*') will match any substring. [default: *]
    -p, --path_pattern        A basic pattern string to filter which files will be searched.
                              Asterisks ('*') will match any substring. [default: *]
    -s, --squash_chars ...    Characters that should be 'squashed' when processing a line. When a character
                              is 'squashed', any continuous sequence of that character will be treated as a
                              single instance. This can be used to, for example, normalize indentation.
                              [default: false]

ARGS:
    The root directory to search within
```
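To make the 'squash' and asterisk-pattern descriptions above concrete, here is a rough sketch of how such behavior could work. Both `squash` and `matches_pattern` are hypothetical helpers, not functions from Strainer's source, and the sketch assumes a pattern must match the whole string; the real tool may differ.

```rust
/// Hypothetical helper: collapses every run of a squashed character into a
/// single instance of that character.
fn squash(line: &str, squash_chars: &[char]) -> String {
    let mut out = String::with_capacity(line.len());
    let mut previous: Option<char> = None;
    for c in line.chars() {
        // Drop `c` only when it repeats the previous char and is squashable.
        let repeats = previous == Some(c) && squash_chars.contains(&c);
        if !repeats {
            out.push(c);
        }
        previous = Some(c);
    }
    out
}

/// Hypothetical helper: '*' matches any substring (including the empty one).
/// Assumption: the pattern has to cover the entire text.
fn matches_pattern(pattern: &str, text: &str) -> bool {
    let parts: Vec<&str> = pattern.split('*').collect();
    if parts.len() == 1 {
        return pattern == text; // no wildcard: exact match
    }
    let first = parts[0];
    let last = parts[parts.len() - 1];
    if text.len() < first.len() + last.len()
        || !text.starts_with(first)
        || !text.ends_with(last)
    {
        return false;
    }
    // The literal pieces between wildcards must appear in order in the middle.
    let mut rest = &text[first.len()..text.len() - last.len()];
    for part in &parts[1..parts.len() - 1] {
        match rest.find(*part) {
            Some(pos) => rest = &rest[pos + part.len()..],
            None => return false,
        }
    }
    true
}

fn main() {
    // Squashing spaces makes differently indented lines compare equal.
    println!("{:?}", squash("    let  x = 1;", &[' ']));
    println!("{:?}", squash("  let x = 1;", &[' ']));
    println!("{}", matches_pattern("*.rs", "src/main.rs"));
}
```

Under these assumptions, two lines that differ only in the amount of indentation squash to the same string, which is why squashing can be used to normalize indentation before comparing.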