Crates.io | determinator |
lib.rs | determinator |
version | 0.12.0 |
source | src |
created_at | 2020-12-02 23:55:12.613743 |
updated_at | 2023-06-26 01:42:36.049071 |
description | Figure out which packages changed between two commits to a workspace. |
homepage | |
repository | https://github.com/guppy-rs/guppy |
max_upload_size | |
id | 319090 |
size | 111,562 |
Figure out what packages in a Rust workspace changed between two commits.
A typical continuous integration system runs every build and every test on every pull request or proposed change. In large monorepos, most proposed changes have no effect on most packages. A target determinator decides, given a proposed change, which packages may have had changes to them.
The determinator is designed to be used in the Diem Core workspace, which is one such monorepo.
use determinator::{Determinator, rules::DeterminatorRules};
use guppy::{CargoMetadata, graph::DependencyDirection};
use std::path::Path;
// guppy accepts `cargo metadata` JSON output. Use a pre-existing fixture for these examples.
let old_metadata = CargoMetadata::parse_json(include_str!("../../../fixtures/guppy/metadata_guppy_78cb7e8.json")).unwrap();
let old = old_metadata.build_graph().unwrap();
let new_metadata = CargoMetadata::parse_json(include_str!("../../../fixtures/guppy/metadata_guppy_869476c.json")).unwrap();
let new = new_metadata.build_graph().unwrap();
let mut determinator = Determinator::new(&old, &new);
// The determinator supports custom rules read from a TOML file.
let rules = DeterminatorRules::parse(include_str!("../../../fixtures/guppy/path-rules.toml")).unwrap();
determinator.set_rules(&rules).unwrap();
// The determinator expects a list of changed files to be passed in.
determinator.add_changed_paths(vec!["guppy/src/lib.rs", "tools/determinator/README.md"]);
let determinator_set = determinator.compute();
// determinator_set.affected_set contains the workspace packages directly or indirectly affected
// by the change.
for package in determinator_set.affected_set.packages(DependencyDirection::Forward) {
    println!("affected: {}", package.name());
}
A Rust package can behave differently if one or more of the following change:
Cargo.toml of the package.
The determinator gathers data from several sources, and processes it through guppy, to figure out which packages need to be re-tested.
The determinator takes as input a list of file changes between two revisions. For each file provided:
The list of file changes can be obtained from a source control system such as Git. This crate provides a helper which simplifies the process of enumerating file lists while handling some gnarly edge cases. For more information, see the documentation for Utf8Paths0.
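One edge case of this kind (an assumption about what such a helper has to deal with) is that file names may contain spaces or newlines, which makes newline-separated lists ambiguous. A common workaround is to ask Git for NUL-separated output, for example with `git diff -z --name-only`. The sketch below shows only the splitting step, using the standard library; `split_nul_paths` is a hypothetical helper for illustration and is not how Utf8Paths0 is implemented.

```rust
/// Splits NUL-separated path output (the format produced by
/// `git diff -z --name-only`) into individual paths.
///
/// Hypothetical helper for illustration; not part of the determinator API.
fn split_nul_paths(buf: &str) -> Vec<&str> {
    buf.split('\0')
        // Every entry is terminated by NUL, so the final piece is empty.
        .filter(|path| !path.is_empty())
        .collect()
}
```

Because the separator is NUL rather than a newline, a file name that itself contains a newline still comes back as a single path.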
These simple rules may need to be customized for particular scenarios (e.g. to ignore certain files, or mark a package changed if a file outside of it changes). For those situations, the determinator lets you specify custom rules. See the Customizing behavior section below for more.
A dependency is assumed to have changed if one or more of the following change:
The determinator runs Cargo build simulations on every package in the workspace. For each package, the determinator figures out whether any of its dependencies (including feature sets) have changed. These simulations are done with:
If any of these simulated builds indicates that a workspace package has had any dependency changes, then it is marked changed.
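The comparison step can be pictured with a small standard-library sketch. It assumes each simulated build has already been reduced to a set of resolved dependencies, and it reports which simulations differ between the old and new revisions. The type aliases, the `changed_simulations` function and the simulation labels are all hypothetical; the real determinator performs this work through guppy.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// A resolved dependency in one simulated build: (name, version, enabled features).
type ResolvedDep = (String, String, BTreeSet<String>);

/// Simulated build results for one package, keyed by a label for the simulation.
type Simulations = BTreeMap<String, BTreeSet<ResolvedDep>>;

/// Returns the labels of simulations whose resolved dependencies differ
/// between the old and new revisions, including simulations that exist
/// on only one side.
fn changed_simulations(old: &Simulations, new: &Simulations) -> Vec<String> {
    let labels: BTreeSet<&String> = old.keys().chain(new.keys()).collect();
    labels
        .into_iter()
        .filter(|label| old.get(*label) != new.get(*label))
        .cloned()
        .collect()
}

/// Convenience constructor for a resolved dependency (illustrative only).
fn dep(name: &str, version: &str, features: &[&str]) -> ResolvedDep {
    (
        name.to_string(),
        version.to_string(),
        features.iter().map(|f| f.to_string()).collect(),
    )
}
```

In this model, a package is marked changed as soon as `changed_simulations` returns a non-empty list for it. Note that a dependency whose version is unchanged but whose feature set differs still counts as a change.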
The environment of a build or test run is anything not part of the source code that may influence it. This includes but is not limited to:
By default, the determinator assumes that the environment stays the same between runs.
To represent changes to the environment, you may need to find ways to represent those changes as files checked into the repository, and add custom rules for them. For example:
Use a rust-toolchain file to represent the version of the Rust compiler. There is a default rule which causes a full run if rust-toolchain changes.
The standard rules followed by the determinator may need to be tweaked in some situations:
For these situations, the determinator allows for custom rules to be specified. The determinator also ships with a default set of rules for common files like .gitignore and rust-toolchain.
For more about custom rules, see the documentation for the rules module.
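To give a feel for what a path rule decides, here is a minimal sketch that models two default behaviors this document mentions: a change to rust-toolchain triggers a full run, and README files are ignored. The `RuleOutcome` type and `apply_rules` function are hypothetical, and matching on the file name is a simplification. This is not the format or matching logic of the rules module.

```rust
/// What a rule decides for a changed file. Hypothetical type for illustration.
#[derive(Debug, PartialEq)]
enum RuleOutcome {
    /// The change is ignored entirely.
    Ignore,
    /// Every package is marked changed (a full run).
    FullRun,
    /// No rule matched; fall back to the standard file-based handling.
    NoMatch,
}

/// Classifies a changed path by its final component.
fn apply_rules(path: &str) -> RuleOutcome {
    // `rsplit` always yields at least one item, so the fallback is never used.
    let file_name = path.rsplit('/').next().unwrap_or(path);
    if file_name == "rust-toolchain" {
        RuleOutcome::FullRun
    } else if file_name.starts_with("README") {
        RuleOutcome::Ignore
    } else {
        RuleOutcome::NoMatch
    }
}
```

A custom rule set extends this kind of table with project-specific entries, such as files that should force a full run.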
While the determinator can bring significant benefits to CI and local workflows, its model is quite different from Cargo's. Please understand these limitations before using the determinator for your project.
For best results, consider doing occasional full runs in addition to determinator-based runs. You may wish to configure your CI system to use the determinator for pull-requests, and also schedule full runs every few hours on the main branch in case the determinator misses something.
The determinator cannot run build scripts.
The standard Cargo method for declaring a dependency on a file or environment variable is to output rerun-if-changed or rerun-if-env-changed instructions in build scripts. These instructions must be duplicated through custom rules.
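One way to find out which rules need duplicating is to capture a build script's output and list the instructions it emits. The sketch below does that with the standard library. `BuildScriptDep` and `build_script_deps` are hypothetical names, and only the single-colon `cargo:` prefix is handled. This helper is not part of the determinator.

```rust
/// A dependency declared by a build script. Hypothetical type for illustration.
#[derive(Debug, PartialEq)]
enum BuildScriptDep<'a> {
    /// A path from a `rerun-if-changed` instruction.
    File(&'a str),
    /// A variable name from a `rerun-if-env-changed` instruction.
    EnvVar(&'a str),
}

/// Extracts `rerun-if-changed` and `rerun-if-env-changed` instructions from
/// captured build script output, skipping every other line.
fn build_script_deps<'a>(output: &'a str) -> Vec<BuildScriptDep<'a>> {
    output
        .lines()
        .filter_map(|line| {
            let line = line.trim();
            if let Some(path) = line.strip_prefix("cargo:rerun-if-changed=") {
                Some(BuildScriptDep::File(path))
            } else if let Some(var) = line.strip_prefix("cargo:rerun-if-env-changed=") {
                Some(BuildScriptDep::EnvVar(var))
            } else {
                None
            }
        })
        .collect()
}
```

Each `File` entry is a candidate for a custom path rule. Each `EnvVar` entry is a reminder that the variable has to be represented by a file checked into the repository before the determinator can see changes to it.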
The determinator doesn't track the include and exclude fields in Cargo.toml.
This is because the determinator's view of what's changed doesn't always align with these fields. For example, packages typically include README files, but the determinator has a default rule to ignore them.
If a package includes a file outside of it, either move it into the package (recommended) or add a custom rule for it. Exclusions may be duplicated as custom rules that cause those files to be ignored.
The determinator may not be able to figure out changes to path dependencies outside the workspace. The determinator relies on metadata to figure out whether a non-workspace dependency has changed. The metadata includes:
crates.io or a revision in a Git repository
This approach works for dependencies on crates.io or other package repositories, because a change to their source code necessarily requires a version change.
This approach also works for Git dependencies. It even works for Git dependencies that aren't pinned to an exact revision in Cargo.toml, because Cargo.lock records exact revisions. For example:
# Specifying this in Cargo.toml...
[dependencies]
rand = { git = "https://github.com/rust-random/rand", branch = "master" }
# ...results in Cargo.lock with:
[[package]]
name = "rand"
version = "0.7.4"
source = "git+https://github.com/rust-random/rand?branch=master#50c34064c80762ddae11447adc6240f42a6bd266"
The hash at the end is the exact Git revision used, and a change to it is recognized by the determinator.
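As an illustration of why this is enough to detect a change, the revision can be read straight out of the source string. The sketch below assumes the `git+<url>#<revision>` shape shown in the example above. `git_revision` is a hypothetical helper, not a function provided by the determinator or guppy.

```rust
/// Returns the pinned revision from a Cargo.lock `source` string of the form
/// `git+<url>#<revision>`, or `None` for non-Git sources.
///
/// Hypothetical helper for illustration only.
fn git_revision(source: &str) -> Option<&str> {
    let rest = source.strip_prefix("git+")?;
    // The revision is everything after the last `#`.
    let (_, revision) = rest.rsplit_once('#')?;
    if revision.is_empty() {
        None
    } else {
        Some(revision)
    }
}
```

Comparing the value returned for the old and new lockfiles tells you whether a branch-tracking Git dependency actually moved between the two commits.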
Where this scheme may not work is with path dependencies, because the files on disk can change without a version bump. cargo build can recognize those changes because it compares mtimes of files on disk, but the determinator cannot do that.
This is not expected to be a problem for most projects that use workspaces. If there's future demand, it would be possible to add support for changes to non-workspace path dependencies if they're in the same repository.
One way to look at the determinator is as a kind of cache invalidation. Viewed through this lens, the main purpose of a build or test system is to cache results, and invalidate those caches based on certain parameters. When the determinator marks a package as changed, it invalidates any cached results for that package.
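The cache-invalidation view can be made concrete with a small sketch. It assumes test results are cached per package name and that the affected set has already been computed. `ResultCache` and `packages_to_run` are hypothetical names; the determinator itself computes the affected set and leaves caching to the surrounding system.

```rust
use std::collections::{BTreeSet, HashMap};

/// Cached outcome of a package's test run (true = passed). Hypothetical type.
type ResultCache = HashMap<String, bool>;

/// Removes cached results for affected packages, then returns the workspace
/// packages that have no cached result and therefore need to be re-run.
fn packages_to_run(
    cache: &mut ResultCache,
    workspace: &[&str],
    affected: &BTreeSet<String>,
) -> Vec<String> {
    // Invalidate: a changed package's cached result can no longer be trusted.
    cache.retain(|name, _| !affected.contains(name));
    workspace
        .iter()
        .filter(|name| !cache.contains_key(**name))
        .map(|name| name.to_string())
        .collect()
}
```

Packages outside the affected set keep their cached results, which is where the savings on large workspaces come from.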
There are several other ways to design caching systems:
sccache and other "bottom-up" hash-based caching build systems.
These other systems end up making different tradeoffs:
sccache requires paths to be exact across machines, and is unable to cache some kinds of Rust artifacts. Also, just like Cargo's caching, there is no way to use it for test results, only for builds.
While the determinator is geared towards test runs, it also works for builds. If you wish to use the determinator for build runs, consider stacking it with another layer of caching:
sccache to get hash-based caching for builds.
This determinator is inspired by, and shares its name with, the target determinator used in Facebook's main source repository.
See the CONTRIBUTING file for how to help out.
This project is available under the terms of either the Apache 2.0 license or the MIT license.