Skip to content

Autoharness Subcommand #3874

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 14 commits into from
Feb 11, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions docs/src/SUMMARY.md
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@
- [Reference](./reference.md)
- [Attributes](./reference/attributes.md)
- [Experimental features](./reference/experimental/experimental-features.md)
- [Automatic Harness Generation](./reference/experimental/autoharness.md)
- [Coverage](./reference/experimental/coverage.md)
- [Stubbing](./reference/experimental/stubbing.md)
- [Contracts](./reference/experimental/contracts.md)
Expand Down
84 changes: 84 additions & 0 deletions docs/src/reference/experimental/autoharness.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,84 @@
# Automatic Harness Generation

Recall the harness for `estimate_size` that we wrote in [First Steps](../../tutorial-first-steps.md):
```rust
{{#include ../../tutorial/first-steps-v1/src/lib.rs:kani}}
```

This harness first declares a local variable `x` using `kani::any()`, then calls `estimate_size` with argument `x`.
Many proof harnesses follow this predictable format—to verify a function `foo`, we create arbitrary values for each of `foo`'s arguments, then call `foo` on those arguments.

The `autoharness` subcommand leverages this observation to automatically generate and run harnesses.
Kani scans the crate for functions whose arguments all implement the `kani::Arbitrary` trait, generates harnesses for them, then runs them.
These harnesses are internal to Kani--i.e., Kani does not make any changes to your source code.

## Usage
Run either:
```
# cargo kani autoharness -Z unstable-options
```
or
```
# kani autoharness -Z unstable-options <FILE>
```

If Kani detects that all of a function `foo`'s arguments implement `kani::Arbitrary`, it will generate and run a `#[kani::proof]` harness, which prints:

```
Autoharness: Checking function foo against all possible inputs...
<VERIFICATION RESULTS>
```

However, if Kani detects that `foo` has a [contract](./contracts.md), it will instead generate a `#[kani::proof_for_contract]` harness and verify the contract:
```
Autoharness: Checking function foo's contract against all possible inputs...
<VERIFICATION RESULTS>
```

Kani generates and runs these harnesses internally—the user only sees the verification results.

The `autoharness` subcommand has options `--include-function` and `--exclude-function` to include and exclude particular functions.
These flags look for partial matches against the fully qualified name of a function.

For example, if a module `my_module` has many functions, but we are only interested in `my_module::foo` and `my_module::bar`, we can run:
```
cargo run autoharness -Z unstable-options --include-function foo include-function bar
```
To exclude `my_module` entirely, run:
```
cargo run autoharness -Z unstable-options --exclude-function my_module
```

## Example
Using the `estimate_size` example from [First Steps](../../tutorial-first-steps.md) again:
```rust
{{#include ../../tutorial/first-steps-v1/src/lib.rs:code}}
```

We get:

```
# cargo kani autoharness -Z unstable-options
Autoharness: Checking function estimate_size against all possible inputs...
RESULTS:
Check 3: estimate_size.assertion.1
- Status: FAILURE
- Description: "Oh no, a failing corner case!"
[...]

Verification failed for - estimate_size
Complete - 0 successfully verified functions, 1 failures, 1 total.
```

## Request for comments
This feature is experimental and is therefore subject to change.
If you have ideas for improving the user experience of this feature,
please add them to [this GitHub issue](https://github.com/model-checking/kani/issues/3832).

## Limitations
Kani will only generate an automatic harness for a function if it can determine that all of the function's arguments implement Arbitrary.
It does not attempt to derive/implement Arbitrary for any types, even if those types could implement Arbitrary.

If a function contains a loop with a loop contract, Kani will detect the presence of a loop contract and verify that contract.
If, however, the loop does not have a contract, then there is currently no way to specify an unwinding bound for the function, meaning that Kani may hang as it tries to unwind the loop.
We recommend using the `--exclude-function` option to exclude any functions that have this issue (or `--harness-timeout` to bail after attempting verification for some amount of time).
9 changes: 9 additions & 0 deletions kani-compiler/src/args.rs
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,9 @@ pub enum ReachabilityType {
PubFns,
/// Start the cross-crate reachability analysis from all *test* (i.e. `#[test]`) harnesses in the local crate.
Tests,
/// Start the cross-crate reachability analysis from all functions in the local crate.
/// Currently, this mode is only used for automatic harness generation.
AllFns,
}

/// Command line arguments that this instance of the compiler run was called
Expand Down Expand Up @@ -88,6 +91,12 @@ pub struct Arguments {
/// Print the final LLBC file to stdout.
#[clap(long)]
pub print_llbc: bool,
/// If we are running the autoharness subcommand, the functions to autoharness
#[arg(long = "autoharness-include-function", num_args(1))]
pub autoharness_included_functions: Vec<String>,
/// If we are running the autoharness subcommand, the functions to exclude from autoverification
#[arg(long = "autoharness-exclude-function", num_args(1))]
pub autoharness_excluded_functions: Vec<String>,
}

#[derive(Debug, Clone, Copy, AsRefStr, EnumString, VariantNames, PartialEq, Eq)]
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -253,7 +253,7 @@ impl CodegenBackend for LlbcCodegenBackend {
units.store_modifies(&modifies_instances);
units.write_metadata(&queries, tcx);
}
ReachabilityType::Tests => todo!(),
ReachabilityType::Tests | ReachabilityType::AllFns => todo!(),
ReachabilityType::None => {}
ReachabilityType::PubFns => {
let unit = CodegenUnit::default();
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -264,7 +264,7 @@ impl CodegenBackend for GotocCodegenBackend {
let reachability = queries.args().reachability_analysis;
let mut results = GotoCodegenResults::new(tcx, reachability);
match reachability {
ReachabilityType::Harnesses => {
ReachabilityType::AllFns | ReachabilityType::Harnesses => {
let mut units = CodegenUnits::new(&queries, tcx);
let mut modifies_instances = vec![];
let mut loop_contracts_instances = vec![];
Expand Down Expand Up @@ -375,7 +375,9 @@ impl CodegenBackend for GotocCodegenBackend {
// Print compilation report.
results.print_report(tcx);

if reachability != ReachabilityType::Harnesses {
if reachability != ReachabilityType::Harnesses
&& reachability != ReachabilityType::AllFns
{
// In a workspace, cargo seems to be using the same file prefix to build a crate that is
// a package lib and also a dependency of another package.
// To avoid overriding the metadata for its verification, we skip this step when
Expand Down
185 changes: 161 additions & 24 deletions kani-compiler/src/kani_middle/codegen_units.rs
Original file line number Diff line number Diff line change
Expand Up @@ -8,8 +8,11 @@
//! according to their stub configuration.

use crate::args::ReachabilityType;
use crate::kani_middle::attributes::is_proof_harness;
use crate::kani_middle::metadata::{gen_contracts_metadata, gen_proof_metadata};
use crate::kani_middle::attributes::{KaniAttributes, is_proof_harness};
use crate::kani_middle::kani_functions::{KaniIntrinsic, KaniModel};
use crate::kani_middle::metadata::{
gen_automatic_proof_metadata, gen_contracts_metadata, gen_proof_metadata,
};
use crate::kani_middle::reachability::filter_crate_items;
use crate::kani_middle::resolve::expect_resolve_fn;
use crate::kani_middle::stubbing::{check_compatibility, harness_stub_map};
Expand All @@ -20,8 +23,8 @@ use rustc_middle::ty::TyCtxt;
use rustc_session::config::OutputType;
use rustc_smir::rustc_internal;
use stable_mir::CrateDef;
use stable_mir::mir::mono::Instance;
use stable_mir::ty::{FnDef, IndexedVal, RigidTy, TyKind};
use stable_mir::mir::{TerminatorKind, mono::Instance};
use stable_mir::ty::{FnDef, GenericArgKind, GenericArgs, IndexedVal, RigidTy, TyKind};
use std::collections::{BTreeMap, BTreeSet, HashMap, HashSet};
use std::fs::File;
use std::io::BufWriter;
Expand Down Expand Up @@ -57,26 +60,66 @@ pub struct CodegenUnit {
impl CodegenUnits {
pub fn new(queries: &QueryDb, tcx: TyCtxt) -> Self {
let crate_info = CrateInfo { name: stable_mir::local_crate().name.as_str().into() };
if queries.args().reachability_analysis == ReachabilityType::Harnesses {
let base_filepath = tcx.output_filenames(()).path(OutputType::Object);
let base_filename = base_filepath.as_path();
let harnesses = filter_crate_items(tcx, |_, instance| is_proof_harness(tcx, instance));
let all_harnesses = harnesses
.into_iter()
.map(|harness| {
let metadata = gen_proof_metadata(tcx, harness, &base_filename);
(harness, metadata)
})
.collect::<HashMap<_, _>>();
let base_filepath = tcx.output_filenames(()).path(OutputType::Object);
let base_filename = base_filepath.as_path();
let args = queries.args();
match args.reachability_analysis {
ReachabilityType::Harnesses => {
let all_harnesses = get_all_manual_harnesses(tcx, base_filename);
// Even if no_stubs is empty we still need to store rustc metadata.
let units = group_by_stubs(tcx, &all_harnesses);
validate_units(tcx, &units);
debug!(?units, "CodegenUnits::new");
CodegenUnits { units, harness_info: all_harnesses, crate_info }
}
ReachabilityType::AllFns => {
let mut all_harnesses = get_all_manual_harnesses(tcx, base_filename);
let mut units = group_by_stubs(tcx, &all_harnesses);
validate_units(tcx, &units);

// Even if no_stubs is empty we still need to store rustc metadata.
let units = group_by_stubs(tcx, &all_harnesses);
validate_units(tcx, &units);
debug!(?units, "CodegenUnits::new");
CodegenUnits { units, harness_info: all_harnesses, crate_info }
} else {
// Leave other reachability type handling as is for now.
CodegenUnits { units: vec![], harness_info: HashMap::default(), crate_info }
let kani_fns = queries.kani_functions();
let kani_harness_intrinsic =
kani_fns.get(&KaniIntrinsic::AutomaticHarness.into()).unwrap();
let kani_any_inst = kani_fns.get(&KaniModel::Any.into()).unwrap();

let verifiable_fns = filter_crate_items(tcx, |_, instance: Instance| -> bool {
// If the user specified functions to include or exclude, only allow instances matching those filters.
let user_included = if !args.autoharness_included_functions.is_empty() {
fn_list_contains_instance(&instance, &args.autoharness_included_functions)
} else if !args.autoharness_excluded_functions.is_empty() {
!fn_list_contains_instance(&instance, &args.autoharness_excluded_functions)
} else {
true
};
user_included
&& is_eligible_for_automatic_harness(tcx, instance, *kani_any_inst)
});
let automatic_harnesses = get_all_automatic_harnesses(
tcx,
verifiable_fns,
*kani_harness_intrinsic,
base_filename,
);
// We generate one contract harness per function under contract, so each harness is in its own unit,
// and these harnesses have no stubs.
units.extend(
automatic_harnesses
.keys()
.map(|harness| CodegenUnit {
harnesses: vec![*harness],
stubs: HashMap::default(),
})
.collect::<Vec<_>>(),
);
all_harnesses.extend(automatic_harnesses);
// No need to validate the units again because validation only checks stubs, and we haven't added any stubs.
debug!(?units, "CodegenUnits::new");
CodegenUnits { units, harness_info: all_harnesses, crate_info }
}
_ => {
// Leave other reachability type handling as is for now.
CodegenUnits { units: vec![], harness_info: HashMap::default(), crate_info }
}
}
}

Expand All @@ -94,7 +137,15 @@ impl CodegenUnits {
/// We flag that the harness contains usage of loop contracts.
pub fn store_loop_contracts(&mut self, harnesses: &[Harness]) {
for harness in harnesses {
self.harness_info.get_mut(harness).unwrap().has_loop_contracts = true;
let metadata = self.harness_info.get_mut(harness).unwrap();
metadata.has_loop_contracts = true;
// If we're generating this harness automatically and we encounter a loop contract,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this necessary? Functions with loop contracts can be verified with #[kani::proof] (e.g. see https://github.com/model-checking/kani/blob/main/tests/expected/loop-contract/count_zero.rs).

Copy link
Contributor Author

@carolynzech carolynzech Feb 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. I made the mistake of assuming that since we call them loop "contracts," they were composable with function contracts, and that does not appear to be the case.

I tried this example:

#![feature(stmt_expr_attributes)]
#![feature(proc_macro_hygiene)]

#[kani::requires(true)]
fn simple_loop_with_loop_contracts() {
    let mut x: u64 = kani::any_where(|i| *i >= 1);

    #[kani::loop_invariant(x >= 1)]
    while x > 1 {
        x = x - 1;
    }

    assert!(x == 1);
}


#[kani::proof_for_contract(simple_loop_with_loop_contracts)]
fn foo() {
    simple_loop_with_loop_contracts()
}

and the loop invariant is ignored, so the loop unwinds forever.

The autoharness generation feature shows the same behavior, i.e. given:

#![feature(proc_macro_hygiene)]
#![feature(stmt_expr_attributes)]

#[kani::requires(true)]
fn has_loop_contract() {
    let mut x: u8 = kani::any_where(|i| *i >= 2);

    #[kani::loop_invariant(x >= 2)]
    while x > 2 {
        x = x - 1;
    }

    assert!(x == 2);
}

the loop invariant is ignored.

This was an oversight on my part; I should have had test coverage for this loop contract / function contract combination case. That being said, I think we should clarify this in our documentation. @qinheping, I recommend updating the loop contracts reference chapter to clarify that the provided simple_loop_with_loop_contracts should be have a #[kani::proof] attribute--there's currently no attribute right now, so the example as given will just say "No proofs found to verify." I would also add an item to the limitations section that function contracts and loop contracts are not composable--i.e., if you try to prove your function contract, your loop invariant will be ignored.

I need to think more about what we should do if we encounter a function with both loop contracts and function contracts--I suppose we should generate both a #[kani::proof] and a #[kani::proof_for_contract] (to prove the loop invariant and the other contracts, respectively).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I'm at it, I'll implement the timeout/default unwind for loops without contracts that we discussed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the loop invariant is ignored.

Even with -Zloop-contracts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

// ensure that the HarnessKind is updated to be a contract harness
// targeting the function to verify.
if metadata.is_automatically_generated {
metadata.attributes.kind =
HarnessKind::ProofForContract { target_fn: metadata.pretty_name.clone() }
}
}
}

Expand Down Expand Up @@ -252,3 +303,89 @@ fn apply_transitivity(tcx: TyCtxt, harness: Harness, stubs: Stubs) -> Stubs {
}
new_stubs
}

/// Fetch all manual harnesses (i.e., functions provided by the user) and generate their metadata
fn get_all_manual_harnesses(
tcx: TyCtxt,
base_filename: &Path,
) -> HashMap<Harness, HarnessMetadata> {
let harnesses = filter_crate_items(tcx, |_, instance| is_proof_harness(tcx, instance));
harnesses
.into_iter()
.map(|harness| {
let metadata = gen_proof_metadata(tcx, harness, &base_filename);
(harness, metadata)
})
.collect::<HashMap<_, _>>()
}

/// For each function eligible for automatic verification,
/// generate a harness Instance for it, then generate its metadata.
/// Note that the body of each harness instance is still the dummy body of `kani_harness_intrinsic`;
/// the AutomaticHarnessPass will later transform the bodies of these instances to actually verify the function.
fn get_all_automatic_harnesses(
tcx: TyCtxt,
verifiable_fns: Vec<Instance>,
kani_harness_intrinsic: FnDef,
base_filename: &Path,
) -> HashMap<Harness, HarnessMetadata> {
verifiable_fns
.into_iter()
.map(|fn_to_verify| {
// Set the generic arguments of the harness to be the function it is verifying
// so that later, in AutomaticHarnessPass, we can retrieve the function to verify
// and generate the harness body accordingly.
let harness = Instance::resolve(
kani_harness_intrinsic,
&GenericArgs(vec![GenericArgKind::Type(fn_to_verify.ty())]),
)
.unwrap();
let metadata = gen_automatic_proof_metadata(tcx, &base_filename, &fn_to_verify);
(harness, metadata)
})
.collect::<HashMap<_, _>>()
}

/// Determine whether `instance` is eligible for automatic verification.
fn is_eligible_for_automatic_harness(tcx: TyCtxt, instance: Instance, any_inst: FnDef) -> bool {
// `instance` is ineligble if it is a harness or has an nonexistent/empty body
if is_proof_harness(tcx, instance) || !instance.has_body() {
return false;
}
let body = instance.body().unwrap();

// `instance` is ineligble if it is an associated item of a Kani trait implementation,
// or part of Kani contract instrumentation.
// FIXME -- find a less hardcoded way of checking the former condition (perhaps upstream PR to StableMIR).
if instance.name().contains("kani::Arbitrary")
|| instance.name().contains("kani::Invariant")
|| KaniAttributes::for_instance(tcx, instance)
.fn_marker()
.is_some_and(|m| m.as_str() == "kani_contract_mode")
{
return false;
}

// Each non-generic argument of `instance`` must implement Arbitrary.
body.arg_locals().iter().all(|arg| {
let kani_any_body =
Instance::resolve(any_inst, &GenericArgs(vec![GenericArgKind::Type(arg.ty)]))
.unwrap()
.body()
.unwrap();
if let TerminatorKind::Call { func, .. } = &kani_any_body.blocks[0].terminator.kind {
if let Some((def, args)) = func.ty(body.arg_locals()).unwrap().kind().fn_def() {
return Instance::resolve(def, &args).is_ok();
}
}
false
})
}

/// Return whether the name of `instance` is included in `fn_list`.
/// If `exact = true`, then `instance`'s exact, fully-qualified name must be in `fn_list`; otherwise, `fn_list`
/// can have a substring of `instance`'s name.
fn fn_list_contains_instance(instance: &Instance, fn_list: &[String]) -> bool {
let pretty_name = instance.name();
fn_list.contains(&pretty_name) || fn_list.iter().any(|fn_name| pretty_name.contains(fn_name))
}
2 changes: 2 additions & 0 deletions kani-compiler/src/kani_middle/kani_functions.rs
Original file line number Diff line number Diff line change
Expand Up @@ -48,6 +48,8 @@ pub enum KaniIntrinsic {
CheckedAlignOf,
#[strum(serialize = "CheckedSizeOfIntrinsic")]
CheckedSizeOf,
#[strum(serialize = "AutomaticHarnessIntrinsic")]
AutomaticHarness,
#[strum(serialize = "IsInitializedIntrinsic")]
IsInitialized,
#[strum(serialize = "ValidValueIntrinsic")]
Expand Down
Loading
Loading