Move content into new RFC file

adpaco-aws · adpaco-aws · commit ae64a17b36a0 · 2024-04-26T14:26:17.000Z
diff --git a/rfc/src/rfcs/0008-line-coverage.md b/rfc/src/rfcs/0008-line-coverage.md
@@ -1,7 +1,7 @@
 - **Feature Name:** Line coverage (`line-coverage`)
 - **Feature Request Issue:** <https://github.com/model-checking/kani/issues/2610>
 - **RFC PR:** <https://github.com/model-checking/kani/pull/2609>
-- **Status:** Unstable
+- **Status:** Cancelled
 - **Version:** 0
 - **Proof-of-concept:** <https://github.com/model-checking/kani/pull/2609> (Kani) + <https://github.com/model-checking/kani-vscode-extension/pull/122> (Kani VS Code Extension)
 
diff --git a/rfc/src/rfcs/0010-source-coverage.md b/rfc/src/rfcs/0010-source-coverage.md
@@ -0,0 +1,300 @@
+- **Feature Name:** Source-based code coverage (`source-coverage`)
+- **Feature Request Issue:** <https://github.com/model-checking/kani/issues/2640>
+- **RFC PR:** <https://github.com/model-checking/kani/pull/3143>
+- **Status:** Under Review
+- **Version:** 2
+- **Proof-of-concept:** <https://github.com/model-checking/kani/pull/3119> (Kani) + <https://github.com/model-checking/kani/pull/3121> (`kani-cov`)
+
+-------------------
+
+## Summary
+
+A source-based code coverage feature for Kani built on top of Rust's coverage instrumentation.
+
+## User Impact
+
+Currently, users can't easily obtain verification-based coverage reports in Kani.
+Generally speaking, these reports show which parts of the code under verification are covered and which are not.
+Because of that, users rely on these reports to ensure that their harnesses are sound
+---that is, that properties are checked for the entire body of code they're expecting to cover.
+
+Moreover, some users prefer using coverage information for harness development and debugging.
+That's because coverage information provides users with a more familiar way to interpret verification results.
+
+We expect users to employ this coverage-related option at several stages of a verification effort:
+ * **Learning:** New users are more familiar with coverage reports than property-based results.
+ * **Development:** Some users prefer coverage results to property-based results since they are easier to interpret.
+ * **CI Integration**: Users may want to enforce a minimum percentage of code coverage for new contributions.
+ * **Debugging:** Users may find coverage reports particularly helpful when inputs are over-constrained (missing some corner cases).
+ * **Evaluation:** Users can easily evaluate where and when more verification work is needed (some projects aim for 100% coverage).
+
+Moreover, adding this option directly to Kani, instead of relying on other tools, is likely to:
+ 1. Increase the speed of development
+ 2. Improve testing for coverage features
+
+This translates into faster and more reliable coverage options for users.
+
+### Update: from line to source coverage
+
+In the previous version of this RFC, we introduced and made available a line coverage option in Kani.
+This option has since allowed us to gather more data about the expectations for a coverage option in Kani.
+
+For example, the line coverage output we produced wasn't easy to interpret without knowing some implementation details.
+Aside from that, the feature requested in [#2795](https://github.com/model-checking/kani/issues/2795)
+points to the need for coverage-specific tooling in Kani.
+Nevertheless, as captured in [#2640](https://github.com/model-checking/kani/issues/2640),
+source-based coverage results provide the clearest and most precise coverage information.
+
+In this RFC, we propose an integration with [Rust's source-based code coverage instrumentation](https://doc.rust-lang.org/rustc/instrument-coverage.html).
+The integration would allow us to report source-based code coverage results from Kani.
+Also, we propose adding a new user-facing, coverage-focused tool called `kani-cov`.
+The tool would allow users to process coverage results generated by Kani and produce
+coverage artifacts such as summaries and reports according to their preferences.
+In the [next section](#user-experience), we'll explain in more detail how we
+expect `kani-cov` to assist with coverage-related tasks.
+
+With these changes, we expect our coverage options to become more flexible, precise and efficient.
+In the [last section](#future-possibilities) of this RFC,
+we'll also discuss the requirements for a potential integration of this coverage feature with the LLVM toolchain.
+
+## User Experience
+
+The proposed coverage workflow reproduces that of the most popular coverage frameworks.
+First, let's delve into the LLVM coverage workflow, followed by an explanation of our proposal.
+
+### The LLVM code coverage workflow
+
+The LLVM project is home to one of the most popular code coverage frameworks.
+The workflow associated with the LLVM framework is described in the documentation for [source-based code coverage](https://clang.llvm.org/docs/SourceBasedCodeCoverage.html)[^note-source], but we briefly describe it here to better relate it to our proposal.
+
+In short, the LLVM code coverage workflow follows three steps:
+ 1. **Compiling with coverage enabled.** This causes the compiler to generate an instrumented program.
+ 2. **Running the instrumented program.** This generates binary-encoded `.profraw` files.
+ 3. **Using tools to aggregate and export coverage information into other formats.**
+
+When working in a `cargo` project, step 1 can be done through this command:
+
+```sh
+RUSTFLAGS='-Cinstrument-coverage' cargo build
+```
+
+The same flag must be used for step 2:
+
+```sh
+RUSTFLAGS='-Cinstrument-coverage' cargo run
+```
+
+This should populate the directory with at least one `.profraw` file.
+Each `.profraw` file corresponds to a specific source code file in your project.
+
+At this point, we'll have produced the artifacts that we generally require for the LLVM tools:
+ 1. **The instrumented binary** which, in addition to the instrumented program,
+ contains additional information (e.g., the coverage mappings) required to
+ interpret the profiling results.
+ 2. **The `.profraw` files** which essentially include the profiling results
+ (counter and expression values) for each function of the corresponding source
+ code file.
+
+For step 3, the commands will depend on what kind of results we want.
+Most likely we will have to merge the `.profraw` files and produce a `.profdata` file as follows:
+
+```sh
+llvm-profdata merge -sparse *.profraw -o output.profdata
+```
+
+Then, we can use a command such as
+
+```sh
+llvm-cov show target/debug/binary --instr-profile=output.profdata --show-line-counts-or-regions
+```
+
+to visualize the code coverage through the terminal as in the image:
+
+![Source-based code coverage with `llvm-cov`](https://github.com/model-checking/kani/assets/73246657/4f8a973d-8977-4c0b-822d-e73ed6d223aa)
+
+or the command
+
+```sh
+llvm-cov report target/debug/binary --instr-profile=output.profdata --show-region-summary
+```
+
+to produce coverage summaries like this:
+
+```
+Filename                                             Regions    Missed Regions     Cover   Functions  Missed Functions  Executed       Lines      Missed Lines     Cover    Branches   Missed Branches     Cover
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+/long/long/path/to/my/project/binary/src/main.rs           9                 3    66.67%           3                 1    66.67%          14                 4    71.43%           0                 0         -
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+TOTAL                                                      9                 3    66.67%           3                 1    66.67%          14                 4    71.43%           0                 0         -
+```
+
+[^note-source]: The LLVM project refers to their own coverage feature as "source-based code coverage".
+It's not rare to see the term "region coverage" being used instead to refer to the same thing.
+That's because LLVM's source-based code coverage feature can report coverage for code regions,
+but other coverage frameworks don't support the concept of code regions.
+
+### The Kani coverage workflow
+
+The proposed Kani coverage workflow imitates the LLVM coverage workflow as much as possible.
+
+The two main components of the Kani coverage workflow will be the following:
+ 1. A new subcommand `cov` that drives the coverage workflow in Kani and
+    produces machine-readable coverage results.
+ 2. A new tool `kani-cov` that consumes the machine-readable coverage results
+    emitted by Kani to produce human-readable results in the desired format(s).
+
+Therefore, the first part of the coverage workflow will be managed by Kani.
+Then, in the other part, we will use the `kani-cov` tool to produce the coverage
+output(s) we're interested in.
+
+In the following, we describe each one of these components in more detail.
+
+#### The `-cov` option
+
+The coverage workflow will be kicked off through a new `-cov` option:
+
+```sh
+cargo kani -cov
+```
+
+The main difference with respect to the regular verification workflow is that,
+at the end of the verification-based coverage run, Kani will generate two types
+of files:
+ - A single `.kanimap` file for the project. This file will contain the
+ coverage mappings for the project's source code.
+ - One `.kaniraw` file for each harness. This file will contain the
+ verification-based results for the coverage-oriented properties corresponding
+ to a given harness.
+
+Note that `.kaniraw` files correspond to `.profraw` files in the LLVM coverage
+workflow. Similarly, the `.kanimap` file corresponds to the coverage-related
+information that's embedded into the project's binaries in the LLVM coverage
+workflow.[^note-kanimap]
+
+The files will be written into a new timestamped directory associated with the
+coverage run. By default, the path to this directory will be printed to standard
+output. For example, the [draft implementation](https://github.com/model-checking/kani/pull/3119)
+writes the coverage files into the `target/kani/<target_triple>/cov/` directory.
+
+Users aren't expected to read the information in any of these files.
+Therefore, there's no need to restrict their format.
+The [draft implementation](https://github.com/model-checking/kani/pull/3119)
+uses the JSON format but we might consider using a binary format if it doesn't
+scale.
+
+[^note-kanimap]: Note that the `.kanimap` generation isn't implemented in [#3119](https://github.com/model-checking/kani/pull/3119).
+The [draft implementation of `kani-cov`](https://github.com/model-checking/kani/pull/3121)
+simply reads the source files referred to by the code coverage checks, but it
+doesn't get information about code trimmed out by the MIR linker.
+
+#### The `kani-cov` tool
+
+The `kani-cov` tool will be used to process coverage information generated by
+Kani and produce coverage outputs as indicated by the user.
+Hence, the `kani-cov` tool corresponds to the set of LLVM tools
+(`llvm-profdata`, `llvm-cov`, etc.) that are used to produce coverage outputs
+through the LLVM coverage workflow.
+
+In contrast to LLVM, we'll have a single tool for all Kani coverage-related needs.
+We suggest that the tool initially offers three subcommands[^note-export]:
+ - `merge`: to combine the coverage results of one or more `.kaniraw` files into
+ a single `.kanicov` file, which will be required for the other subcommands.
+ - `report`: to display a summary of the coverage results.
+ - `show`: to produce source-based code coverage reports in human-readable formats (e.g., HTML).
+
+
+Let's assume that we've run `cargo kani cov` and generated coverage files in the `my-coverage` folder.
+Then, we'd use `kani-cov` as follows to combine the coverage results[^note-exclude] for all harnesses:
+
+```sh
+kani-cov merge my-coverage/*.kaniraw -o my-coverage.kanicov
+```
+
+Let's say the user is first interested in reading a coverage summary through the terminal.
+They will have to use the `report` command for that:
+
+```sh
+kani-cov report my-coverage/default.kanimap -instr-profile=my-coverage.kanicov --show-summary
+```
+
+The command could print a coverage summary like:
+
+```
+| Filename | Regions | Missed Regions | Cover | Functions | ...
+| -------- | ------- | -------------- | ----- | --------- | ...
+| main.rs  |       9 |              3 | 66.67 |         3 | ...
+[...]
+```
+
+Now, let's say the user wants to produce an HTML report of the coverage results.
+They will have to use the `show` command for that:
+
+```sh
+kani-cov show my-coverage/default.kanimap -format=html -instr-profile=my-coverage.kanicov -o coverage-report
+```
+
+This time, the command will generate a `coverage-report` folder including a
+browsable HTML webpage that highlights the regions covered in the source
+according to the coverage results in `my-coverage.kanicov`.
+
+[^note-export]: The `llvm-cov` tool includes the option [`gcov`](https://llvm.org/docs/CommandGuide/llvm-cov.html#llvm-cov-gcov) to export into GCC's coverage format [Gcov](https://en.wikipedia.org/wiki/Gcov),
+and the option [`export`](https://llvm.org/docs/CommandGuide/llvm-cov.html#llvm-cov-export) to export into the LCOV format.
+I'd strongly recommend against adding format-specific options to `kani-cov`
+unless there are technical reasons to do so.
+
+[^note-exclude]: Options to exclude certain coverage results (e.g., from the standard library) will likely be part of this option.
+
+#### Integration with the Kani VS Code Extension
+
+We will update the coverage feature of the
+[Kani VS Code Extension](https://github.com/model-checking/kani-vscode-extension)
+to follow this new coverage workflow.
+In other words, the extension will first run Kani with the `-cov` option and
+use `kani-cov` to produce a `.kanicov` file with the coverage results.
+The extension will consume the source-based code coverage results and
+highlight region coverage in the source code seen from VS Code.
+
+We could also consider other coverage-related features in order to enhance the
+experience through the Kani VS Code Extension. For example, we could
+automatically show the percentage of covered regions in the status bar by
+additionally extracting a summary of the coverage results.
+
+Finally, we could also consider an integration with other code coverage tools.
+For example, if we wanted to integrate with the VS Code extensions
+[Code Coverage](https://marketplace.visualstudio.com/items?itemName=markis.code-coverage) or
+[Coverage Gutters](https://marketplace.visualstudio.com/items?itemName=ryanluker.vscode-coverage-gutters),
+we would only need to extend `kani-cov` to export coverage results to the LCOV
+format or integrate Kani with LLVM tools as discussed in [Integration with LLVM](#integration-with-llvm).
+
+## Detailed Design
+
+THIS SECTION INTENTIONALLY LEFT BLANK.
+
+## Rationale and alternatives
+
+### Other coverage implementations
+
+In a previous version of this feature, we used an ad-hoc coverage implementation.
+In addition to being very inefficient[^note-benchmarks], the line-based coverage
+results were not trivial to interpret by users.
+At the moment, the only other implementation available is an unstable, GCC-compatible
+one based on the Gcov format. The Gcov format is line-based, so it's not able
+to report region coverage results.
+In other words, it's neither as advanced nor as precise as the source-based implementation.
+
+[^note-benchmarks]: Actual performance benchmarks to follow in [#3119](https://github.com/model-checking/kani/pull/3119).
+
+## Open questions
+
+ - Do we want to instrument dependencies by default? Preliminary benchmarking results show a slowdown of 100% or greater.
+ More evaluations are required to determine how we handle instrumentation for dependencies, and what options we might want
+ to provide to users.
+ - How do we handle features/options for `kani-cov`? In particular, do we need more details in this RFC?
+
+## Future possibilities
+
+### Integration with LLVM
+
+We don't pursue an integration with the LLVM framework in this RFC.
+We recommend against doing so at this time due to various technical limitations.
+In a future revision, I'll explain these limitations and how we can take steps to overcome them.
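
To make the `merge` and `report` steps proposed for `kani-cov` more concrete, here is a minimal sketch in Rust. It is not taken from the draft implementation: the `HarnessResult` type, the `merge` and `summarize` functions, and the rule that a region counts as covered when at least one harness covers it are all assumptions made for illustration. The RFC doesn't specify the contents of `.kaniraw` or `.kanicov` files or how results are combined.

```rust
use std::collections::BTreeMap;

/// Hypothetical per-harness result (stand-in for one `.kaniraw` file):
/// region id -> whether the harness covered that region.
type HarnessResult = BTreeMap<u32, bool>;

/// Combine per-harness results into one map (stand-in for a `.kanicov` file).
/// Assumed semantics: a region is covered if any harness covers it.
fn merge(results: &[HarnessResult]) -> BTreeMap<u32, bool> {
    let mut merged = BTreeMap::new();
    for result in results {
        for (&region, &covered) in result {
            let entry = merged.entry(region).or_insert(false);
            *entry = *entry || covered;
        }
    }
    merged
}

/// Compute (regions, missed regions, cover percentage) for a merged result.
fn summarize(merged: &BTreeMap<u32, bool>) -> (usize, usize, f64) {
    let regions = merged.len();
    let missed = merged.values().filter(|&&covered| !covered).count();
    let cover = if regions == 0 {
        0.0
    } else {
        100.0 * (regions - missed) as f64 / regions as f64
    };
    (regions, missed, cover)
}

fn main() {
    // Two hypothetical harnesses over the same 9 regions of `main.rs`.
    let h1: HarnessResult = (0..9).map(|r| (r, r < 4)).collect();
    let h2: HarnessResult = (0..9).map(|r| (r, (2..6).contains(&r))).collect();

    let (regions, missed, cover) = summarize(&merge(&[h1, h2]));
    println!("| main.rs | {regions:>7} | {missed:>14} | {cover:.2} |");
}
```

With these made-up inputs, the two harnesses together cover 6 of 9 regions, so the summary has the same shape as the example row in the RFC (9 regions, 3 missed, 66.67% cover). A real tool would additionally need the coverage mappings from the `.kanimap` file to relate region ids to source locations.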