Skip to content

Add ML-DSA test vectors (FIPS 204 standard). #146

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

gendx
Copy link

@gendx gendx commented May 28, 2025

This PR adds test vectors for ML-DSA, for the FIPS 204 standard. It's a restart of #112 without the pre-standard versions (round 3, ipd).

These tests aim to cover the following cases:

  • Baseline (signing a "Hello world" message).
  • Keys and signatures of the wrong length.
  • Signature with bit flips.
  • Signature hints with a non-canonical encoding:
    • hints that aren't sorted,
    • hints are not strictly sorted (i.e. the same hint is repeated),
    • too many hints (potentially causing a buffer overflow),
    • non-zero padding after the hints.
  • Secret keys with the s1 or s2 vectors out of the [-eta, eta] range.
  • Public key with the t1 component set to zero (allowing trivial forgeries, but the verification algorithm should still accept signatures for this key).
  • Boundary conditions in the signature rejection loop (aiming to detect incorrect comparisons):
    • z_max equals gamma1 - beta - 1 and gamma1 - beta,
    • r0_max equals gamma2 - beta - 1 and gamma2 - beta,
    • h_ones equals omega and omega + 1,
    • in the case of ML-DSA-44, |ct0|_max equals gamma2 - 1 and gamma2.
  • A "large" number of SHAKE bytes and of SHAKE blocks generated in the expand_a, expand_s, rej_ntt_poly and rej_bounded_poly functions.
  • Boundary conditions in arithmetic functions:
    • power_2_round function: when the remainder (found in t0) is equal to 4096 or -4095,
    • decompose (via high_bits or low_bits): when the condition r_plus - r_0 = q - 1 happens.

Beyond the baseline API, also covered are:

  • The context value: empty (default), regular length, or context too long.

What isn't covered:

  • The randomized variant (only the deterministic variant is covered).
  • The pre-hased variant "HashML-DSA" (only the regular variant is covered).

Notable difference(s) from the previous pull request (#112):

  • The algorithm now includes the level (e.g. "algorithm": "ML-DSA-44") to make it easier to consume without looking at file names.
  • The 32-byte seed used to generate the private key is included (privateSeed field) when meaningful and known. It's notably absent for:
    • cases that check for an invalid encoding of the private key (e.g. wrong size, out-of-range values),
    • test vectors of ML-DSA-44 where no seed is known (by construction) for the boundary condition of |ct0|_max (note that it's possible to generate similar vectors with a known seed but using a lot of brute-force).

@cpu
Copy link
Member

cpu commented May 28, 2025

@gendx Was it intentional to put the test vectors in the schema dir? I think we'd probably want them in testvectors_v1 (?) or another top-level tests directory.

@gendx gendx mentioned this pull request May 28, 2025
@gendx
Copy link
Author

gendx commented May 28, 2025

@cpu No that was a typo. Moved them to the suitable directory now :)

@cpu cpu self-requested a review June 3, 2025 17:26
Copy link
Member

@cpu cpu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't done a deep dive on the test vector content yet, but had some feedback on the schemas and splitting these up further based on which have seeds available. I also filed #147 to remind myself to put more of this into docs.

Thanks for picking this up again. I hope we can get it merged sooner than later.

"$ref": "#/definitions/MlDsaSignTestVector"
}
}
}
Copy link
Member

@cpu cpu Jun 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For each object in your new schemas can you please add additionalProperties: false to avoid allowing extra unspec'd fields, and a required list with the required fields:

      "required": [ ... ],
      "additionalProperties": false

"privateSeed": {
"type": "string",
"format": "HexBytes",
"description": "[optional] 32-byte seed that generated the private key (absent if unknown or if the private key is malformed)"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's split the tests that don't have a seed from the ones that do at the schema/testcase level instead of making it an optional field in one common test type.

"definitions": {
"MlDsaSignTestGroup": {
"type": "object",
"properties": {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also add a source property ref'ing the common.json definition:

        "source": {
          "$ref": "common.json#/definitions/Source"
        },

Comment on lines +74 to +77
"generatorVersion": {
"type": "string",
"description": "the version of the test vectors."
},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be marked deprecated "deprecated": true to match the other schemas. I think it's probably premature to avoid including it all-together (?)

@@ -0,0 +1,98 @@
{
"type": "object",
"definitions": {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think all of my comments from the other schema file also apply here but I won't duplicate them :-)

"sig": {
"type": "string",
"format": "HexBytes",
"description": "The encoded signature (empty in case of failure)"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe clearer as:

Suggested change
"description": "The encoded signature (empty in case of failure)"
"description": "The encoded signature (empty in case of expected failure)"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants