Gene Michaels

Status: Alpha. Tested against various code bases without breaking them, but some cases may still be missed. As a safety check, after formatting and before writing, it re-parses the formatted source and confirms that all comments were consumed. Files over 500 kB may also use up all your memory and invoke the OOM killer.

Named after Gene Michaels.

Everything includes macros and comments. Dogfooded in this repo.

Differences to Rustfmt

Usage

Run cargo install genemichaels

If you're using VS Code, add the setting:

"rust-analyzer.rustfmt.overrideCommand": [ "${userHome}/.cargo/bin/genemichaels" ]

to use it with reckless abandon.

To skip specific files, in the first 5 lines of the source add a comment containing nogenemichaels.
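As a rough illustration of the skip rule above, here is a minimal sketch of such a check. The function name and signature are hypothetical and this is not genemichaels' actual implementation; it also only looks at // line comments, which is a simplifying assumption.

```rust
/// Returns true if a `//` comment within the first `max_lines` lines of
/// `source` contains `marker`.
/// Hypothetical helper for illustration only, not the real implementation.
fn has_skip_marker(source: &str, marker: &str, max_lines: usize) -> bool {
    source
        .lines()
        .take(max_lines)
        // Keep only the text that follows a line-comment opener, if any.
        .filter_map(|line| line.find("//").map(|i| &line[i + 2..]))
        .any(|comment| comment.contains(marker))
}
```

Calling has_skip_marker(source, "nogenemichaels", 5) would then decide whether a file is left alone. Note that the marker has to appear inside a comment; an identifier with the same name does not count in this sketch.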

Programmatic usage

Do cargo add genemichaels

There are three main functions:

The format functions also return lost comments - comments not formatted/added to the formatted source after processing. In an ideal world this wouldn't exist, but right now comments are added on a case by case basis and not all source tokens support comments.

How it works

At a very high level:

Then the algorithm basically wraps nodes until all lines are less than the max line width.
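To make the wrapping idea concrete, here is a toy sketch of a top-down greedy wrapper. It is not the real algorithm: the Node type, the helper names, and the 4-space indent are all assumptions made for illustration, and the real formatter works on syn syntax trees rather than this simplified tree. A group stays on one line if it fits, otherwise it is split and each child is tried again at a deeper indent.

```rust
/// A toy syntax tree: a leaf token or a delimited, comma-separated group.
/// (Hypothetical type for illustration.)
enum Node {
    Leaf(String),
    Group { open: String, children: Vec<Node>, close: String },
}

fn leaf(text: &str) -> Node {
    Node::Leaf(text.to_string())
}

fn group(open: &str, children: Vec<Node>, close: &str) -> Node {
    Node::Group { open: open.to_string(), children, close: close.to_string() }
}

/// Render a node on a single line.
fn flat(node: &Node) -> String {
    match node {
        Node::Leaf(text) => text.clone(),
        Node::Group { open, children, close } => {
            let inner: Vec<String> = children.iter().map(flat).collect();
            format!("{}{}{}", open, inner.join(", "), close)
        }
    }
}

/// Emit `node` at `indent`, followed by `suffix` (e.g. a trailing comma).
/// A group is split only when its one-line form would exceed `max_width`.
/// Leaves cannot be split, so an over-long leaf is emitted as-is.
fn wrap(node: &Node, indent: usize, max_width: usize, suffix: &str, out: &mut Vec<String>) {
    let pad = " ".repeat(indent);
    let one_line = flat(node);
    match node {
        Node::Group { open, children, close }
            if indent + one_line.len() + suffix.len() > max_width =>
        {
            out.push(format!("{}{}", pad, open));
            for child in children {
                // Each child gets its own line(s) and a trailing comma.
                wrap(child, indent + 4, max_width, ",", out);
            }
            out.push(format!("{}{}{}", pad, close, suffix));
        }
        _ => out.push(format!("{}{}{}", pad, one_line, suffix)),
    }
}
```

With a tree for foo(alpha, bar(beta, gamma), delta), a max width of 40 keeps everything on one line, while a max width of 20 splits both foo(...) and the nested bar(...) so that every emitted line fits.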

Leveraging multi-threading

genemichaels now uses multi-threading!

This is the time output of a single-threaded run, formatting a large codebase (78730 lines of Rust):

```
Finished workspace formatting successfully in 15.99s

Executed in   16.00 secs    fish           external
   usr time   14.59 secs  331.00 micros   14.59 secs
   sys time    1.19 secs  174.00 micros    1.19 secs
```

and with multi-threading:

```
Finished workspace formatting successfully in 11.99s

Executed in   12.06 secs    fish           external
   usr time   35.99 secs  349.00 micros   35.99 secs
   sys time    4.87 secs   64.00 micros    4.87 secs
```
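To read these numbers side by side, the small sketch below computes the wall-clock speedup and the change in total CPU time (usr + sys) from the two runs. The Run type and function names are made up for this example; only the timings come from the output above. It works out to roughly a 1.33x wall-clock speedup for about 2.6x the CPU time.

```rust
/// Timings of one run, in seconds. (Hypothetical type for illustration.)
struct Run {
    wall: f64,
    usr: f64,
    sys: f64,
}

impl Run {
    /// Total CPU time spent across all threads.
    fn cpu(&self) -> f64 {
        self.usr + self.sys
    }
}

/// How many times faster the `after` run finished on the wall clock.
fn wall_speedup(before: &Run, after: &Run) -> f64 {
    before.wall / after.wall
}

/// How many times more CPU time the `after` run consumed.
fn cpu_overhead(before: &Run, after: &Run) -> f64 {
    after.cpu() / before.cpu()
}
```

For the runs above, the inputs would be Run { wall: 16.00, usr: 14.59, sys: 1.19 } and Run { wall: 12.06, usr: 35.99, sys: 4.87 }.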

Comments

Comments deserve a special mention since they're handled out of band.

syn doesn't parse comments (except sometimes) so all the comments are extracted at the start of processing. Generally comments are associated with the next syntax element, except for end of line // comments which get associated with the first syntax element on the current line.

When building split groups, if the current syntax element has a token with a line/column matching an extracted comment, the comment is added to the split group.
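The association rule can be sketched roughly as follows. This is a simplified model, not the actual code: Pos, Comment, anchor, and associate are hypothetical names, whether a comment is an end-of-line comment is passed in as a flag rather than detected from the source, and comments with no element to attach to are simply collected in a separate list.

```rust
use std::collections::BTreeMap;

/// A line/column position; ordering is by line, then column.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Pos {
    line: usize,
    col: usize,
}

/// An extracted comment. (Hypothetical type for illustration.)
struct Comment {
    pos: Pos,
    end_of_line: bool,
    text: String,
}

/// Find the element a comment attaches to. `elements` must be sorted.
fn anchor(comment: &Comment, elements: &[Pos]) -> Option<Pos> {
    if comment.end_of_line {
        // End-of-line comment: first syntax element on the same line.
        elements.iter().copied().find(|e| e.line == comment.pos.line)
    } else {
        // Any other comment: the next syntax element after it.
        elements.iter().copied().find(|e| *e > comment.pos)
    }
}

/// Build a position -> comments map that can be consulted by line/column
/// while walking the syntax tree, plus the comments that found no element.
fn associate(
    comments: Vec<Comment>,
    elements: &[Pos],
) -> (BTreeMap<Pos, Vec<String>>, Vec<String>) {
    let mut attached: BTreeMap<Pos, Vec<String>> = BTreeMap::new();
    let mut unattached = Vec::new();
    for comment in comments {
        match anchor(&comment, elements) {
            Some(pos) => attached.entry(pos).or_default().push(comment.text),
            None => unattached.push(comment.text),
        }
    }
    (attached, unattached)
}
```

While visiting a syntax element, looking up its token position in the map yields the comments that belong in front of it.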

Verbatim comments

Gene Michaels supports an extra comment type, //. (two slashes followed by a period), which signals a verbatim comment that is left unprocessed. Use these for commenting out source code.

Macros

Macros are formatted with a couple of tricks:

  1. If it parses as rust code (either an expression or statement list), it's formatted normally.
  2. If it doesn't, it's split by ; and , since those are usually separators, then the above is tried for each chunk.
  3. Otherwise each token in the macro is concatenated with spaces (with a couple of other per-case tweaks).
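The three-step fallback above might look something like the sketch below. Everything here is a stand-in: try_format is a hypothetical callback representing "parse as Rust and format normally", the functions work on plain strings instead of token streams, whitespace-separated words stand in for tokens, and brackets or separators inside string literals are not handled.

```rust
/// Split `body` on `;` and `,` that sit outside any brackets. Each chunk is
/// returned with the separator that ended it (None for the final chunk).
/// Simplification: brackets inside string literals are not recognised.
fn split_top_level(body: &str) -> Vec<(String, Option<char>)> {
    let mut chunks = Vec::new();
    let mut current = String::new();
    let mut depth = 0usize;
    for ch in body.chars() {
        if depth == 0 && (ch == ';' || ch == ',') {
            chunks.push((current.trim().to_string(), Some(ch)));
            current.clear();
            continue;
        }
        match ch {
            '(' | '[' | '{' => depth += 1,
            ')' | ']' | '}' => depth = depth.saturating_sub(1),
            _ => {}
        }
        current.push(ch);
    }
    let rest = current.trim();
    if !rest.is_empty() {
        chunks.push((rest.to_string(), None));
    }
    chunks
}

/// Format a macro body with the three-step fallback.
/// `try_format` returns Some(formatted) when its input parses as Rust.
fn format_macro_body(body: &str, try_format: &dyn Fn(&str) -> Option<String>) -> String {
    // Step 1: the whole body parses, so format it normally.
    if let Some(formatted) = try_format(body) {
        return formatted;
    }
    split_top_level(body)
        .into_iter()
        .map(|(text, sep)| {
            // Step 2: retry on each chunk.
            // Step 3: otherwise join the chunk's tokens with single spaces.
            let mut piece = try_format(&text)
                .unwrap_or_else(|| text.split_whitespace().collect::<Vec<_>>().join(" "));
            if let Some(sep) = sep {
                piece.push(sep);
            }
            piece
        })
        .collect::<Vec<_>>()
        .join(" ")
}
```

Keeping the separator apart from the chunk text means the callback only ever sees the chunk itself, which gives each piece a fair chance of parsing on its own.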

Q&A

See this Reddit post for many questions and answers.