|
| 1 | +- **Feature Name:** Source-based code coverage (`source-coverage`) |
| 2 | +- **Feature Request Issue:** <https://github.com/model-checking/kani/issues/2640> |
| 3 | +- **RFC PR:** <https://github.com/model-checking/kani/pull/3143> |
| 4 | +- **Status:** Under Review |
| 5 | +- **Version:** 2 |
| 6 | +- **Proof-of-concept:** <https://github.com/model-checking/kani/pull/3119> (Kani) + <https://github.com/model-checking/kani/pull/3121> (`kani-cov`) |
| 7 | + |
| 8 | +------------------- |
| 9 | + |
| 10 | +## Summary |
| 11 | + |
| 12 | +A source-based code coverage feature for Kani built on top of Rust's coverage instrumentation. |
| 13 | + |
| 14 | +## User Impact |
| 15 | + |
| 16 | +Nowadays, users can't easily obtain verification-based coverage reports in Kani. |
| 17 | +Generally speaking, these reports show which parts of the code under verification are covered and which are not. |
| 18 | +Because of that, users rely on these reports to ensure that their harnesses are sound |
| 19 | +---that is, that properties are checked for the entire body of code they're expecting to cover. |
| 20 | + |
| 21 | +Moreover, some users prefer using coverage information for harness development and debugging. |
| 22 | +That's because coverage information provides users with more familiar way to interpret verification results. |
| 23 | + |
| 24 | +As mentioned earlier, we expect users to employ this coverage-related option on several stages of a verification effort: |
| 25 | + * **Learning:** New users are more familiar with coverage reports than property-based results. |
| 26 | + * **Development:** Some users prefer coverage results to property-based results since they are easier to interpret. |
| 27 | + * **CI Integration**: Users may want to enforce a minimum percentage of code coverage for new contributions. |
| 28 | + * **Debugging:** Users may find coverage reports particularly helpful when inputs are over-constrained (missing some corner cases). |
| 29 | + * **Evaluation:** Users can easily evaluate where and when more verification work is needed (some projects aim for 100% coverage). |
| 30 | + |
| 31 | +Moreover, adding this option directly to Kani, instead of relying on another tools, is likely to: |
| 32 | + 1. Increase the speed of development |
| 33 | + 2. Improve testing for coverage features |
| 34 | + |
| 35 | +Which translates into faster and more reliable coverage options for users. |
| 36 | + |
| 37 | +### Update: from line to source coverage |
| 38 | + |
| 39 | +In the previous version of this RFC, we introduced and made available a line coverage option in Kani. |
| 40 | +This option has since then allowed us to gather more data around the expectations for a coverage option in Kani. |
| 41 | + |
| 42 | +For example, the line coverage output we produced wasn't easy to interpret without knowing some implementation details. |
| 43 | +Aside from that, the feature requested in [#2795](https://github.com/model-checking/kani/issues/2795) |
| 44 | +alludes to the need of providing coverage-specific tooling in Kani. |
| 45 | +Nevertheless, as captured in [#2640](https://github.com/model-checking/kani/issues/2640), |
| 46 | +source-based coverage results provide the clearest and most precise coverage information. |
| 47 | + |
| 48 | +In this RFC, we propose an integration with [Rust's source-based code coverage instrumentation](https://doc.rust-lang.org/rustc/instrument-coverage.html). |
| 49 | +The integration would allow us to report source-based code coverage results from Kani. |
| 50 | +Also, we propose adding a new user-facing, coverage-focused tool called `kani-cov`. |
| 51 | +The tool would allow users to process coverage results generated by Kani and produce |
| 52 | +coverage artifacts such as summaries and reports according to their preferences. |
| 53 | +In the [next section](#user-experience), we'll explain in more detail how we |
| 54 | +expect `kani-cov` to assist with coverage-related tasks. |
| 55 | + |
| 56 | +With these changes, we expect our coverage options to become more flexible, precise and efficient. |
| 57 | +In the [last section](#future-possibilities) of this RFC, |
| 58 | +we'll also discuss the requirements for a potential integration of this coverage feature with the LLVM toolchain. |
| 59 | + |
| 60 | +## User Experience |
| 61 | + |
| 62 | +The proposed coverage workflow reproduces that of the most popular coverage frameworks. |
| 63 | +First, let's delve into the LLVM coverage workflow, followed by an explanation of our proposal. |
| 64 | + |
| 65 | +### The LLVM code coverage workflow |
| 66 | + |
| 67 | +The LLVM project is home to one of the most popular code coverage frameworks. |
| 68 | +The workflow associated to the LLVM framework is described in the documentation for [source-based code coverage](https://clang.llvm.org/docs/SourceBasedCodeCoverage.html)[^note-source], but we briefly describe it here to better relate it with our proposal. |
| 69 | + |
| 70 | +In short, the LLVM code coverage workflow follows three steps: |
| 71 | + 1. **Compiling with coverage enabled.** This causes the compiler to generate an instrumented program. |
| 72 | + 2. **Running the instrumented program.** This generates binary-encoded `.profraw` files. |
| 73 | + 3. **Using tools to aggregate and export coverage information into other formats.** |
| 74 | + |
| 75 | +When working in a `cargo` project, step 1 can be done through this command: |
| 76 | + |
| 77 | +```sh |
| 78 | +RUSTFLAGS='-Cinstrument-coverage' cargo build |
| 79 | +``` |
| 80 | + |
| 81 | +The same flag must to be used for step 2: |
| 82 | + |
| 83 | +```sh |
| 84 | +RUSTFLAGS='-Cinstrument-coverage' cargo run |
| 85 | +``` |
| 86 | + |
| 87 | +This should populate the directory with at least one `.profraw` file. |
| 88 | +Each `.profraw` file corresponds to a specific source code file in your project. |
| 89 | + |
| 90 | +At this point, we'll have produced the artifacts that we generally require for the LLVM tools: |
| 91 | + 1. **The instrumented binary** which, in addition to the instrumented program, |
| 92 | + contains additional information (e.g., the coverage mappings) required to |
| 93 | + interpret the profiling results. |
| 94 | + 2. **The `.profraw` files** which essentially includes the profiling results |
| 95 | + (counter and expression values) for each function of the corresponding source |
| 96 | + code file. |
| 97 | + |
| 98 | +For step 3, the commands will depend on what kind of results we want. |
| 99 | +Most likely we will have to merge the `.profraw` files and produce a `.profdata` file as follows: |
| 100 | + |
| 101 | +```sh |
| 102 | +llvm-profdata merge -sparse *.profraw -o output.profdata |
| 103 | +``` |
| 104 | + |
| 105 | +Then, we can use a command such as |
| 106 | + |
| 107 | +```sh |
| 108 | +llvm-cov show target/debug/binary —instr-profile=output.profdata -show-line-counts-or-regions |
| 109 | +``` |
| 110 | + |
| 111 | +to visualize the code coverage through the terminal as in the image: |
| 112 | + |
| 113 | + |
| 114 | + |
| 115 | +or the command |
| 116 | + |
| 117 | +```sh |
| 118 | +llvm-cov report target/debug/binary --instr-profile=output.profdata --show-region-summary |
| 119 | +``` |
| 120 | + |
| 121 | +to produce coverage summaries like this: |
| 122 | + |
| 123 | +``` |
| 124 | +Filename Regions Missed Regions Cover Functions Missed Functions Executed Lines Missed Lines Cover Branches Missed Branches Cover |
| 125 | +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| 126 | +/long/long/path/to/my/project/binary/src/main.rs 9 3 66.67% 3 1 66.67% 14 4 71.43% 0 0 - |
| 127 | +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| 128 | +TOTAL 9 3 66.67% 3 1 66.67% 14 4 71.43% 0 0 - |
| 129 | +``` |
| 130 | + |
| 131 | +[^note-source]: The LLVM project refers to their own coverage feature as "source-based code coverage". |
| 132 | +It's not rare to see the term "region coverage" being used instead to refer to the same thing. |
| 133 | +That's because LLVM's source-based code coverage feature can report coverage for code regions, |
| 134 | +but other coverage frameworks don't support the concept of code regions. |
| 135 | + |
| 136 | +### The Kani coverage workflow |
| 137 | + |
| 138 | +The proposed Kani coverage workflow imitates the LLVM coverage workflow as much as possible. |
| 139 | + |
| 140 | +The two main components of the Kani coverage workflow will be the following: |
| 141 | + 1. A new subcommand `cov` that drives the coverage workflow in Kani and |
| 142 | + produces machine-readable coverage results. |
| 143 | + 2. A new tool `kani-cov` that consumes the machine-readable coverage results |
| 144 | + emitted by Kani to produce human-readable results in the desired format(s). |
| 145 | + |
| 146 | +Therefore, the first part of the coverage workflow will be managed by Kani. |
| 147 | +Then, in the other part, we will use the `kani-cov` tool to produce the coverage |
| 148 | +output(s) we're interested in. |
| 149 | + |
| 150 | +In the following, we describe each one of these components in more detail. |
| 151 | + |
| 152 | +#### The `-cov` option |
| 153 | + |
| 154 | +The coverage workflow will be kicked off through a new `-cov` option: |
| 155 | + |
| 156 | +```sh |
| 157 | +cargo kani -cov |
| 158 | +``` |
| 159 | + |
| 160 | +The main difference with respect to the regular verification workflow is that, |
| 161 | +at the end of the verification-based coverage run, Kani will generate two types |
| 162 | +of files: |
| 163 | + - One single file `.kanimap` file for the project. This file will contain the |
| 164 | + coverage mappings for the project's source code. |
| 165 | + - One `.kaniraw` file for each harness. This file will contain the |
| 166 | + verification-based results for the coverage-oriented properties corresponding |
| 167 | + to a given harness. |
| 168 | + |
| 169 | +Note that `.kaniraw` files correspond to `.profraw` files in the LLVM coverage |
| 170 | +workflow. Similarly, the `.kanimap` file corresponds to the coverage-related |
| 171 | +information that's embedded into the project's binaries in the LLVM coverage |
| 172 | +workflow.[^note-kanimap] |
| 173 | + |
| 174 | +The files will be written into a new timestamped directory associated with the |
| 175 | +coverage run. The path to this directory will be printed to standard output in |
| 176 | +by default. For example, the [draft implementation](https://github.com/model-checking/kani/pull/3119) |
| 177 | +writes the coverage files into the `target/kani/<target_triple>/cov/` directory. |
| 178 | + |
| 179 | +Users aren't expected to read the information in any of these files. |
| 180 | +Therefore, there's no need to restrict their format. |
| 181 | +The [draft implementation](https://github.com/model-checking/kani/pull/3119) |
| 182 | +uses the JSON format but we might consider using a binary format if it doesn't |
| 183 | +scale. |
| 184 | + |
| 185 | +[^note-kanimap]: Note that the `.kanimap` generation isn't implemented in [#3119](https://github.com/model-checking/kani/pull/3119). |
| 186 | +The [draft implementation of `kani-cov`](https://github.com/model-checking/kani/pull/3121) |
| 187 | +simply reads the source files referred to by the code coverage checks, but it |
| 188 | +doesn't get information about code trimmed out by the MIR linker. |
| 189 | + |
| 190 | +#### The `kani-cov` tool |
| 191 | + |
| 192 | +The `kani-cov` tool will be used to process coverage information generated by |
| 193 | +Kani and produce coverage outputs as indicated by the user. |
| 194 | +Hence, the `kani-cov` tool corresponds to the set of LLVM tools |
| 195 | +(`llvm-profdata`, `llvm-cov`, etc.) that are used to produce coverage outputs |
| 196 | +through the LLVM coverage workflow. |
| 197 | + |
| 198 | +In contrast to LLVM, we'll have a single tool for all Kani coverage-related needs. |
| 199 | +We suggest that the tool initially offers three subcommands[^note-export]: |
| 200 | + - `merge`: to combine the coverage results of one or more `.kaniraw` files into |
| 201 | + a single `.kanicov` file, which will be required for the other subcommands. |
| 202 | + - `report`: to display a summary of the coverage results. |
| 203 | + - `show`: to produce source-based code coverage reports in human-readable formats (e.g., HTML). |
| 204 | + |
| 205 | + |
| 206 | +Let's assume that we've run `cargo kani cov` and generated coverage files in the `my-coverage` folder. |
| 207 | +Then, we'd use `kani-cov` as follows to combine the coverage results[^note-exclude] for all harnesses: |
| 208 | + |
| 209 | +```sh |
| 210 | +kani-cov merge my-coverage/*.kaniraw -o my-coverage.kanicov |
| 211 | +``` |
| 212 | + |
| 213 | +Let's say the user is first interested in reading a coverage summary through the terminal. |
| 214 | +They will have to use the `report` command for that: |
| 215 | + |
| 216 | +```sh |
| 217 | +kani-cov report my-coverage/default.kanimap -instr-profile=my-coverage.kanicov --show-summary |
| 218 | +``` |
| 219 | + |
| 220 | +The command could print a coverage summary like: |
| 221 | + |
| 222 | +``` |
| 223 | +| Filename | Regions | Missed Regions | Cover | Functions | ... |
| 224 | +| -------- | ------- | -------------- | ----- | --------- | ... |
| 225 | +| main.rs | 9 | 3 | 66.67 | 3 | ... |
| 226 | +[...] |
| 227 | +``` |
| 228 | + |
| 229 | +Now, let's say the user wants to produce an HTML report of the coverage results. |
| 230 | +They will have to use the `show` command for that: |
| 231 | + |
| 232 | +```sh |
| 233 | +kani-cov show my-coverage/default.kanimap -format=html -instr-profile=my-coverage.kanicov -o coverage-report |
| 234 | +``` |
| 235 | + |
| 236 | +This time, the command will generate a `coverage-report` folder including a |
| 237 | +browsable HTML webpage that highlights the regions covered in the source |
| 238 | +according to the coverage results in `my-coverage.kanicov`. |
| 239 | + |
| 240 | +[^note-export]: The `llvm-cov` tool includes the option [`gcov`](https://llvm.org/docs/CommandGuide/llvm-cov.html#llvm-cov-gcov) to export into GCC's coverage format [Gcov](https://en.wikipedia.org/wiki/Gcov), |
| 241 | +and the option [`export`](https://llvm.org/docs/CommandGuide/llvm-cov.html#llvm-cov-export) to export into the LCOV format. |
| 242 | +I'd strongly recommend against adding format-specific options to `kani-cov` |
| 243 | +unless there are technical reasons to do so. |
| 244 | + |
| 245 | +[^note-exclude]: Options to exclude certain coverage results (e.g, from the standard library) will likely be part of this option. |
| 246 | + |
| 247 | +#### Integration with the Kani VS Code Extension |
| 248 | + |
| 249 | +We will update the coverage feature of the |
| 250 | +[Kani VS Code Extension](https://github.com/model-checking/kani-vscode-extension) |
| 251 | +to follow this new coverage workflow. |
| 252 | +In other words, the extension will first run Kani with the `-cov` option and |
| 253 | +use `kani-cov` to produce a `.kanicov` file with the coverage results. |
| 254 | +The extension will consume the source-based code coverage results and |
| 255 | +highlight region coverage in the source code seen from VS Code. |
| 256 | + |
| 257 | +We could also consider other coverage-related features in order to enhance the |
| 258 | +experience through the Kani VS Code Extension. For example, we could |
| 259 | +automatically show the percentage of covered regions in the status bar by |
| 260 | +additionally extracting a summary of the coverage results. |
| 261 | + |
| 262 | +Finally, we could also consider an integration with other code coverage tools. |
| 263 | +For example, if we wanted to integrate with the VS Code extensions |
| 264 | +[Code Coverage](https://marketplace.visualstudio.com/items?itemName=markis.code-coverage) or |
| 265 | +[Coverage Gutters](https://marketplace.visualstudio.com/items?itemName=ryanluker.vscode-coverage-gutters), |
| 266 | +we would only need to extend `kani-cov` to export coverage results to the LCOV |
| 267 | +format or integrate Kani with LLVM tools as discussed in [Integration with LLVM](#integration-with-llvm). |
| 268 | + |
| 269 | +## Detailed Design |
| 270 | + |
| 271 | +THIS SECTION INTENTIONALLY LEFT BLANK. |
| 272 | + |
| 273 | +## Rationale and alternatives |
| 274 | + |
| 275 | +### Other coverage implementations |
| 276 | + |
| 277 | +In a previous version of this feature, we used an ad-hoc coverage implementation. |
| 278 | +In addition to being very inefficient[^note-benchmarks], the line-based coverage |
| 279 | +results were not trivial to interpret by users. |
| 280 | +At the moment, there's only another unstable, GCC-compatible code coverage implementation |
| 281 | +based on the Gcov format. The Gcov format is line-based so it's not able |
| 282 | +to report region coverage results. |
| 283 | +In other words, it's not as advanced nor precise as the source-based implementation. |
| 284 | + |
| 285 | +[^note-benchmarks]: Actual performance benchmarks to follow in [#3119](https://github.com/model-checking/kani/pull/3119). |
| 286 | + |
| 287 | +## Open questions |
| 288 | + |
| 289 | + - Do we want to instrument dependencies by default? Preliminary benchmarking results show a slowdown of 100% and greater. |
| 290 | + More evaluations are required to determine how we handle instrumentation for dependencies, and what options we might want |
| 291 | + to provide to users. |
| 292 | + - How do we handle features/options for `kani-cov`? In particular, do we need more details in this RFC? |
| 293 | + |
| 294 | +## Future possibilities |
| 295 | + |
| 296 | +### Integration with LLVM |
| 297 | + |
| 298 | +We don't pursue an integration with the LLVM framework in this RFC. |
| 299 | +We recommend against doing so at this time due to various technical limitations. |
| 300 | +In a future revision, I'll explain these limitations and how we can make steps to overcome them. |
0 commit comments