GSoC 2020 Project Idea: Improve CVE Binary Tool Output #267

Closed
@terriko

The CVE Binary Tool team is hoping to participate in Google Summer of Code (GSoC) under the Python Software Foundation umbrella. You can read all about what this means at http://python-gsoc.org/. This issue, and any others tagged 'gsoc', are not generally available bugs; they relate to project ideas for GSoC.

Project Idea: Improve CVE Binary Tool Output

Project description: The CVE Binary Tool has a couple of issues related to output that could be combined into a single project:

#262 - Machine readable output. Currently the CVE Binary Tool prints information about CVEs found to the console. We'd like it to be easier for machines to parse. That issue talks about doing it in a CSV (comma-separated value) format. Once that works, you might also want to get it working with JSON or even provide prettier HTML reports with colours and additional data.
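
A minimal sketch of what machine-readable output could look like, assuming each finding is a dict with the product, version, cve_number and severity fields that the tool already prints. The function names and the sample row (including its placeholder CVE number) are hypothetical and not part of cve-bin-tool.

```python
import csv
import io
import json

# Column order mirrors the fields the console output already shows.
FIELDS = ["product", "version", "cve_number", "severity"]


def findings_to_csv(findings):
    """Render a list of finding dicts as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(findings)
    return buffer.getvalue()


def findings_to_json(findings):
    """Render the same findings as a JSON array."""
    return json.dumps(findings, indent=2)


# Hypothetical sample data; the CVE number is a placeholder, not a real entry.
sample = [
    {
        "product": "curl",
        "version": "7.32.0",
        "cve_number": "CVE-0000-0001",
        "severity": "HIGH",
    },
]
csv_text = findings_to_csv(sample)
json_text = findings_to_json(sample)
```

Keeping the findings as plain dicts means the CSV and JSON writers (and any later HTML report) can share one data structure.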

#332 - Generate full reports with CVE descriptions, etc. We don't currently store these in the database and probably don't want to for speed/space reasons, so you'd have to grab them from the JSON. This was a feature that used to exist in cve-bin-tool before it was open sourced. The idea, I believe, is that you'd have something you could easily attach to an email or send in a meeting agenda so decisions could be made about prioritizing fixes. In practice it wasn't getting used much, which is why it wound up dropped before release, but it could still be useful for folk who need more info to send to their colleagues.
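
One way to pull descriptions out of the JSON at report time, sketched under the assumption that the data follows the NVD JSON 1.1 feed layout (`CVE_Items`, then `cve`, then `description` and `description_data`). The helper name, the inline sample feed and its placeholder CVE number are hypothetical.

```python
import json


def cve_descriptions(feed, wanted):
    """Map each wanted CVE id to its English description, if the feed has one."""
    wanted = set(wanted)
    found = {}
    for item in feed.get("CVE_Items", []):
        cve = item.get("cve", {})
        cve_id = cve.get("CVE_data_meta", {}).get("ID")
        if cve_id not in wanted:
            continue
        for entry in cve.get("description", {}).get("description_data", []):
            if entry.get("lang") == "en":
                found[cve_id] = entry.get("value", "")
                break
    return found


# Tiny stand-in for a downloaded feed file; real feeds are much larger.
sample_feed = json.loads(
    """
    {
      "CVE_Items": [
        {
          "cve": {
            "CVE_data_meta": {"ID": "CVE-0000-0001"},
            "description": {
              "description_data": [
                {"lang": "en", "value": "Placeholder description."}
              ]
            }
          }
        }
      ]
    }
    """
)
descriptions = cve_descriptions(sample_feed, ["CVE-0000-0001"])
```

Because only the CVEs that were actually found get looked up, the descriptions never need to live in the database.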

#413 - The csv2cve utility currently outputs a bare list of CVE numbers, while the main cve-bin-tool outputs product, version, cve_number, severity. I think it would be nice if csv2cve did the same or, perhaps even better, used vendor, product, version, cve_number, severity, since the vendor information is easily available.

Some older output issues that are now resolved (but might be interesting reading for the type of output fixes we want):
#182 - Unify logging vs verbose/quiet flags. Currently the CVE binary tool has both print and log statements. We'd like to switch everything to use the log system. (Solved in #276 -- thanks @PrajwalM2212)

#197 - Improve NVD output so error messages go to stderr instead of stdout. Solving #182 will probably solve this one, but as an easier bug, you could start by switching all the print statements containing errors to print to stderr. (Solved in #276 -- thanks @PrajwalM2212)

#286 - Bring back the --quiet flag (or make the equivalent --log command work like the --quiet flag did) (Solved in #290 -- thanks @PrajwalM2212)
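
The logging approach from these resolved issues can be sketched with the standard library alone. This is an illustration, not the code merged in #276 or #290; the logger name, the `-q/--quiet` wiring and the choice to show only errors in quiet mode are assumptions.

```python
import argparse
import logging
import sys


def build_logger(quiet=False):
    """Return a logger that writes to stderr; quiet mode shows errors only."""
    logger = logging.getLogger("cve_bin_tool_sketch")  # hypothetical name
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    handler = logging.StreamHandler(sys.stderr)
    handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
    logger.addHandler(handler)
    logger.setLevel(logging.ERROR if quiet else logging.INFO)
    logger.propagate = False
    return logger


def parse_args(argv):
    """Parse a hypothetical command line with a single --quiet flag."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-q", "--quiet", action="store_true", help="only show errors")
    return parser.parse_args(argv)


args = parse_args(["--quiet"])
log = build_logger(quiet=args.quiet)
log.info("scan progress")  # suppressed in quiet mode
log.error("could not read NVD data")  # still reaches stderr
```

Routing everything through one logger keeps stdout free for the actual results, which matters once the output is meant to be parsed by other programs.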

Skills: Python, git; experience with common output formats like JSON and CSV is a bonus

Difficulty level: Intermediate

Related Readings/Links: How to add new checkers

Potential mentors: @terriko @pdxjohnny

Getting Started: The Python Software Foundation requires that all students submit a code sample as part of their application.

One possible good first pull request for this project: Fixing the "critical" output to be "warning" output in #306

If the bugs above are already resolved, try adding a test! There are two types of easy tests you might want to try first: a CVE mapping test and a CVE file test. Note: the way we add tests has changed recently, so please make sure to read the instructions!

Here are the file mapping test instructions, cut and pasted:

To make the basic test suite run quickly, we use "faked" binary files to test the CVE mappings. However, we want to be able to test real files to test that the signatures work on real-world data.
We have a function that takes a URL, a package name, and a version, then downloads the file, runs the scanner against it, and makes sure it is the file that you've specified. But we need more tests!

  • Existing tests are in test/
  • You can see the scanner tests in 'test/test_scanner.py'
  • To add a new test, find an appropriate publicly available file (Linux distribution packages and public releases of the package itself are ideal). You should add the details of the new test case in the @pytest.mark.parametrize decorator of the test_files test
  • Make sure to hide it behind the LONG_TESTS flag so we aren't doing a huge number of downloads for every test suite run
    @pytest.mark.parametrize(
        "url, filename, package, version",
        list(
            itertools.chain(
                [
                    (
                        "https://archives.fedoraproject.org/pub/archive/fedora/linux"
                        "/releases/20/Everything/x86_64/os/Packages/c/",
                        "curl-7.32.0-3.fc20.x86_64.rpm",
                        "curl",
                        "7.32.0",
                    ),
                    (
                        "http://mirror.centos.org/centos/7/os/x86_64/Packages/",
                        "expat-2.1.0-10.el7_3.i686.rpm",
                        "expat",
                        "2.1.0",
                    ),
                    (
                        "http://http.us.debian.org/debian/pool/main/e/expat/",
                        "libexpat1_2.2.0-2+deb9u3_amd64.deb",
                        "expat",
                        "2.2.0",
                    ),
                    (
                        "http://archive.ubuntu.com/ubuntu/pool/universe/f/ffmpeg/",
                        "ffmpeg_4.1.1-1_amd64.deb",
                        "ffmpeg",
                        "4.1.1",
                    ),
                    # ..... (remaining test cases elided)
                ]
            )
        ),
    )
    @unittest.skipUnless(os.getenv("LONG_TESTS") == "1", "Skipping long tests")
    def test_files(self, url, filename, package, version):
        self._file_test(url, filename, package, version)

Ideally, we should have at least one such test for each checker, and it would be nice to have some different sources for each as well. For example, for packages available in common Linux distributions, we might want to have one from Fedora, one from Debian, and one direct from upstream to show that we detect all those versions.
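
To see which packages already have file tests from more than one source, the parametrize entries can be grouped by package. A rough sketch, assuming test cases are `(url, filename, package, version)` tuples like the ones in the excerpt; the helper name is hypothetical and this is not part of the test suite.

```python
from collections import defaultdict
from urllib.parse import urlparse


def sources_by_package(cases):
    """Group the download hosts of the test cases by package name."""
    hosts = defaultdict(set)
    for url, _filename, package, _version in cases:
        hosts[package].add(urlparse(url).hostname)
    return dict(hosts)


# Three of the cases from the excerpt, copied as plain tuples.
cases = [
    (
        "https://archives.fedoraproject.org/pub/archive/fedora/linux"
        "/releases/20/Everything/x86_64/os/Packages/c/",
        "curl-7.32.0-3.fc20.x86_64.rpm",
        "curl",
        "7.32.0",
    ),
    (
        "http://mirror.centos.org/centos/7/os/x86_64/Packages/",
        "expat-2.1.0-10.el7_3.i686.rpm",
        "expat",
        "2.1.0",
    ),
    (
        "http://http.us.debian.org/debian/pool/main/e/expat/",
        "libexpat1_2.2.0-2+deb9u3_amd64.deb",
        "expat",
        "2.2.0",
    ),
]
coverage = sources_by_package(cases)
# Packages that so far have a test from only one host.
single_source = sorted(pkg for pkg, found in coverage.items() if len(found) < 2)
```

Here `single_source` lists the packages that would benefit most from an extra test case from another distribution or from upstream.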

Extra credit: Got your test working and want to try something more? You can also try adding a checker before the project starts. See the related readings above for instructions.

Labels: gsoc (Tasks related to our participation in Google Summer of Code)
