Scan your Rust crate for semver violations.
Queries rustdoc-generated crate documentation using the trustfall
"query everything" engine.
Each query looks for a particular kind of semver violation, such as:
- public struct was removed
- public enum's variant was removed
- public struct is no longer Send/Sync/Debug/Clone etc.
- public enum has a new variant, but wasn't non-exhaustive in the last version
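
To make the last item concrete, here is a minimal sketch of why adding a variant to an enum that is not `#[non_exhaustive]` is breaking. The module and type names (`baseline`, `current`, `Status`) are illustrative assumptions and are not taken from this crate's test suite.

```rust
pub mod baseline {
    /// Last published version: two variants, no `#[non_exhaustive]`.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum Status {
        Active,
        Inactive,
    }
}

pub mod current {
    /// New version: a variant was added without the enum ever being
    /// marked `#[non_exhaustive]`.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum Status {
        Active,
        Inactive,
        Suspended,
    }
}

/// Downstream code written against the baseline. The match is exhaustive
/// without a wildcard arm, which is legal because the enum is exhaustive.
pub fn describe(status: baseline::Status) -> &'static str {
    match status {
        baseline::Status::Active => "active",
        baseline::Status::Inactive => "inactive",
    }
}

/// The same function against the new version only compiles once the
/// extra arm is added, so every downstream exhaustive match breaks.
pub fn describe_current(status: current::Status) -> &'static str {
    match status {
        current::Status::Active => "active",
        current::Status::Inactive => "inactive",
        current::Status::Suspended => "suspended",
    }
}
```

Had the baseline enum been `#[non_exhaustive]`, downstream crates would already have been forced to include a wildcard arm, and the new variant would not break them.
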
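```
cargo install cargo-semver-checks

# Generate rustdoc JSON; run once for the baseline and once for the version being checked.
cargo +nightly rustdoc --all-features -- --document-private-items -Zunstable-options --output-format json

cargo semver-checks check-release --current <new-rustdoc> --baseline <previous-rustdoc>
cargo publish
```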
Each failing check references specific items in the Cargo SemVer reference
or other reference pages, as appropriate. It also includes the item name
and file location that are the cause of the problem, as well as a link
to the implementation of that query in the current version of the tool.
This crate is functional and capable of catching many semver violations. However, it won't catch every kind of semver issue, and its performance on massive crates (hundreds of thousands of lines or more) has not been optimized. If you run into any problems, please open an issue!
Using cargo-semver-checks to check your crate
The easiest way to use this crate is via the corresponding GitHub Action that will automatically do all the steps for you.
If you'd like to perform those steps manually, here they are:
- Perform a git checkout of your crate's last published version*,
which will represent your semver baseline.
- Generate rustdoc documentation in JSON format for the crate's last published version
by running cargo +nightly rustdoc --all-features -- --document-private-items -Zunstable-options --output-format json.
- The above command will generate a file named doc/<your-crate-name>.json in your crate's
build target directory. Copy this file somewhere else; otherwise it will be overwritten
by the next steps.
- Switch to the version of your crate that you'd like to check.
- Repeat the cargo rustdoc command above, and note the newly-generated
doc/<your-crate-name>.json file in your build target directory.
- Run cargo semver-checks check-release --current <new-rustdoc> --baseline <previous-rustdoc>.
This step will run multiple queries that look for particular kinds of semver violations,
and report violations they find.
*: Specifically, we want the largest published version number that is smaller than the version that we are preparing to publish. The distinction matters if, say, you've already published v1.2.2 and v1.3.0, and you need to backport some fixes and release v1.2.3: 1.2.2 would be your baseline, and you'd compare 1.2.3 -> 1.2.2 and not 1.2.3 -> 1.3.0.
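
The baseline rule in the footnote can be sketched in a few lines of Rust. This is only an illustration under simplifying assumptions: versions are modeled as plain `(major, minor, patch)` tuples, pre-release and build metadata are ignored, and the `pick_baseline` helper is hypothetical rather than part of cargo-semver-checks.

```rust
/// A release version as (major, minor, patch). Tuple ordering in Rust is
/// lexicographic, which matches semver precedence for plain releases.
pub type Version = (u64, u64, u64);

/// Returns the largest published version that is strictly smaller than
/// the version about to be released, or `None` if there is no such version.
pub fn pick_baseline(published: &[Version], next: Version) -> Option<Version> {
    published.iter().copied().filter(|v| *v < next).max()
}
```

With v1.2.2 and v1.3.0 already published, `pick_baseline(&[(1, 2, 2), (1, 3, 0)], (1, 2, 3))` yields `Some((1, 2, 2))`, matching the backport scenario described above.
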
Broadly, there are two approaches for semver-checkers: compiler-based like rust-semverver
or cargo-breaking, and rustdoc-based like cargo-semver-checks, cargo-crate-api,
cargo-public-api, and others.
Compiler-based approaches interface directly with rustc, whereas rustdoc-based approaches
use the JSON output of rustdoc.
Interfacing with rustc directly allows compiler-based tools to wield the full power of
the Rust compiler, so they can check for many more kinds of semver violations. However, rustc
internals are not stable and not meant to be used as a library, making maintainability difficult
for compiler-based approaches. They typically require running the checks on nightly Rust,
and sometimes even mandate a single specific nightly version: rust-semverver as of this writing
needs nightly-2022-08-03. As rustc internals change, constant work is required
to keep compiler-based semver tools up to date and working. It's not clear that there is a path
toward minimizing this maintenance burden, or a path toward getting these tools running
on stable Rust.
Meanwhile, rustdoc-based tools are limited by the information contained in the rustdoc JSON output,
but have a clearer stable Rust story: the rustdoc JSON generation itself is unstable
(though working with a much wider range of nightlies, and on a path toward stabilization),
but the checking of the JSON files can be done on stable Rust. There is interest in
hosting rustdoc JSON on docs.rs, meaning that semver-checking could one day download
the baseline rustdoc JSON file instead of generating it.
Also, generally speaking, inspecting JSON data is likely going to be faster than full compilation.
cargo-semver-checks vs cargo-crate-api and cargo-public-api
TL;DR: cargo-semver-checks is a linter, not a differ.
You probably want both a linter and a differ in your toolbox, just like you'd want
both a screwdriver and a wrench.
cargo-crate-api and cargo-public-api output a diff ("what changed") of your public API.
This diff is primarily useful for humans instead of in CI: just because a diff is non-empty
doesn't mean you've violated semver. These tools are most useful when generating a changelog,
or when manually inspecting the summary of changes happening in the public API.
cargo-semver-checks is meant to be clippy-like and is especially useful in CI:
- outputs specific issues it finds, rather than a diff you have to read and understand
- no false positives: if it reports a problem, you know it's real
- fast runtime
- clear and actionable error messages that cite relevant sections of the Rust and cargo books
Why cargo-semver-checks instead of adding a linter to an existing tool?
In short:
- Checks should be configuration, not code.
- That helps us ensure we don't have to trade off ergonomics versus maintainability.
To make a semver-checker that is pleasant to use (and therefore gets widely adopted), we have to go beyond being merely "technically correct" when reporting problems.
For example, say the tool has discovered that a pub struct no longer implements some trait:
this is a breaking change and semver requires a major version bump. It's technically correct to
state this fact and move on, but it's more helpful to have contextually-appropriate advice and
reference links based on whether the trait in question is:
- an auto-trait like Send, Sync, or Sized
- a trait that is usually added via #[derive(...)], like Debug or Clone
- a built-in trait that is usually not derived, like From
- one of the crate's own traits
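
As a rough illustration of why the first two categories call for different advice, the sketch below shows a struct losing `Send` as a side effect of a field change and losing `Clone` because a derive was dropped. The `Config` type and the `assert_send` / `assert_clone` helpers are assumptions made up for this example.

```rust
pub mod baseline {
    /// Every field is `Send`, so the compiler infers `Send` for the struct.
    /// `Clone` is implemented explicitly via the derive.
    #[derive(Debug, Clone)]
    pub struct Config {
        pub name: String,
    }
}

pub mod current {
    use std::rc::Rc;

    /// `Rc<T>` is not `Send`, so the struct silently stops being `Send`
    /// even though no trait impl was touched. `Clone` disappears for a
    /// different reason: it was removed from the derive list.
    #[derive(Debug)]
    pub struct Config {
        pub name: Rc<String>,
    }
}

/// Compiles only when `T` is `Send`.
pub fn assert_send<T: Send>() {}

/// Compiles only when `T` is `Clone`.
pub fn assert_clone<T: Clone>() {}

/// Both checks pass for the baseline type. Replacing `baseline::Config`
/// with `current::Config` in either call would be a compile error.
pub fn baseline_is_send_and_clone() -> bool {
    assert_send::<baseline::Config>();
    assert_clone::<baseline::Config>();
    true
}
```

For the auto-trait, useful advice points at the field whose type changed; for the derived trait, it points at the missing `#[derive(...)]` entry.
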
If all our semver checks were written imperatively, it would have been difficult to reuse code and optimizations across different checks. This would have incentivized having a single overarching "trait is missing" check with a ton of special cases, i.e. complex code with a maintainability hazard.
Checks should be configuration, not code, and that's what cargo-semver-checks does.
It uses a datasource-agnostic query engine called Trustfall to allow writing semver checks as
declarative strongly-typed queries over a schema.
Adding a new semver check is as simple as adding a new file that specifies the query to run and
metadata like the error message to display in case the query finds any results (errors).
This has several advantages:
- It's easy to write more checks or specialize existing ones. Just duplicate an existing
query file and edit it to your liking. The strongly-typed query language doesn't prevent
logic errors (neither does Rust 😅), but like Rust it has a strong tendency
to "work correctly as soon as it compiles."
- Fast performance without complex code. Trustfall enables efficient lazy evaluation of queries
without any cloning of rustdoc JSON data and without unsafe. The obvious way to write queries
is also the fast way.
- Optimizations are decoupled from queries. When a new optimization (e.g. some caching)
is added to cargo-semver-checks (or even Trustfall itself), all queries automatically
benefit from it without needing any changes.
In principle, cargo-semver-checks could be extended to support running custom user-specified checks
on top of the same rustdoc JSON + cargo manifest data it uses today.
Checks are configuration, not code: the custom checks would just be a set of files that
cargo-semver-checks is configured to run.
Similarly, cargo-semver-checks could warn about potentially-undesirable API changes that
may have been done unintentionally, and which could have semver implications without being breaking.
An example is removing the last private field of a pub struct that is not #[non_exhaustive]:
this would have the side-effect of adding to the public API the ability to construct the struct
with a literal. If this change were published accidentally, undoing the change would be breaking
and would require a new major version. More examples of such useful-but-not-semver checks are
here.
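
The private-field example above can be sketched as follows. The `Point` type, its fields, and the `downstream_literal` function are hypothetical names chosen for illustration.

```rust
pub mod baseline {
    /// The private field means code outside this module cannot build a
    /// `Point` with a struct literal; it has to go through `new`.
    pub struct Point {
        pub x: i32,
        pub y: i32,
        _private: (),
    }

    impl Point {
        pub fn new(x: i32, y: i32) -> Self {
            Point { x, y, _private: () }
        }
    }
}

pub mod current {
    /// The last private field is gone and the struct is not
    /// `#[non_exhaustive]`, so struct literals are now public API.
    pub struct Point {
        pub x: i32,
        pub y: i32,
    }

    impl Point {
        pub fn new(x: i32, y: i32) -> Self {
            Point { x, y }
        }
    }
}

/// Stands in for downstream code. This literal compiles against `current`
/// but would be rejected against `baseline` because `_private` is inaccessible.
pub fn downstream_literal() -> current::Point {
    current::Point { x: 1, y: 2 }
}
```

Once a downstream crate relies on the literal, re-adding a private field breaks it, which is why the accidental change is hard to undo without a major version.
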
This crate was intended to be published under the name cargo-semver-check, and may indeed one
day be published under that name. Due to an unfortunate mishap,
it remains cargo-semver-checks for the time being.
The cargo_semver_check name is reserved on crates.io but all its versions
are intentionally yanked. Please use the cargo-semver-checks crate instead.
Running cargo test in this crate for the first time
Testing this crate requires rustdoc JSON output data, which is too large and variable
to check into git. It has to be generated locally before cargo test will succeed,
and will be saved in a gitignored localdata directory in the repo root.
To generate this data, please run ./scripts/regenerate_test_rustdocs.sh.
Checklist:
- Choose an appropriate name for your query. We'll refer to it as <query_name>.
- Add the query file: src/queries/<query_name>.ron.
- Add a <query_name> feature to semver_tests/Cargo.toml.
- Add a <query_name>.rs file in semver_tests/src/test_cases.
- Add code to that file that demonstrates that semver issue: write the "baseline" first,
and then use #[cfg(feature = "<query_name>")] and #[cfg(not(feature = "<query_name>"))] as
necessary to alter that baseline into a shape that causes the semver issue
your query looks for.
- Add test code for false-positives and/or true-but-unintended-positives your query might report.
For example, a true-but-unintended output would be if a query that looks for
removal of public fields were to report that a struct was removed. This is unintended
since it would overwhelm the user with errors, instead of having a separate query that
specifically reports the removal of the struct rather than all its fields separately.
- Add <query_name> to the list of features that need rustdoc data
in scripts/regenerate_test_rustdocs.sh.
- Add the outputs you expect your query to produce over your test case in
a new file: src/test_data/<query_name>.output.run.
- Add <query_name> to the list of queries tested by the query_execution_tests!()
macro near the bottom of src/adapter.rs.
- Re-run ./scripts/regenerate_test_rustdocs.sh to generate the new rustdoc JSON file.
- Run cargo test and ensure your new test appears in the test list and runs correctly.
- Add an include_str!("queries/<query_name>.ron"), line to SemverQuery::all_queries()
in the src/query.rs file, to ensure your query is enabled for use in query runs.
- Whew! You're done. Thanks for your contribution.
- If you have the energy, please try to simplify this process by removing and
automating some of these steps.
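
As a rough sketch of the test-case step in the checklist, a file in semver_tests/src/test_cases might look like the following. The feature name `struct_missing` and all item names are placeholders chosen for illustration; substitute your own <query_name>. The items guarding against unwanted reports are an assumption about what a struct-removal query should ignore.

```rust
/// Baseline shape: this struct exists only while the feature is disabled.
/// Enabling the feature "removes" it, which is the semver issue a
/// struct-removal query is expected to report.
#[cfg(not(feature = "struct_missing"))]
pub struct WillBeRemoved {
    pub value: i64,
}

/// Guard against false positives: this public struct is present in both
/// configurations and should never be reported.
pub struct Unchanged {
    pub value: i64,
}

/// Guard against true-but-unintended positives: removing a private struct
/// is not a public API change, so it should not be reported either.
#[cfg(not(feature = "struct_missing"))]
#[allow(dead_code)]
struct PrivateWillBeRemoved;
```

Generating rustdoc JSON once without the feature and once with it gives the baseline and current inputs for the query.
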
Notes:
- If using it on a massive codebase (multiple hundreds of thousands of lines of Rust),
the queries may be a bit slow: there is some O(n^2) scaling for n items in a few places that
I haven't had time to optimize down to O(n) yet. Apologies! I have temporarily prioritized
features over speed, and the runtime will improve significantly with a small amount of extra work.
- No false positives: Currently, all queries report constructive proof of semver violations:
there are no false positives. They always list a file name and line number for the baseline item
that could not be found in the new code.
- There are false negatives: This tool is a work-in-progress, and cannot check all kinds of
semver violations yet. Just because it doesn't find any semver issues doesn't mean
they don't exist.