Nunchaku uses pytest as its testing framework.
After installing nunchaku as described in the README, you can install the test dependencies with:
pip install -r tests/requirements.txt
You can then run the tests with:
HF_TOKEN=$YOUR_HF_TOKEN pytest -v tests/flux/test_flux_memory.py
HF_TOKEN=$YOUR_HF_TOKEN pytest -v tests/flux --ignore=tests/flux/test_flux_memory.py
HF_TOKEN=$YOUR_HF_TOKEN pytest -v tests/sana
Note: $YOUR_HF_TOKEN refers to your Hugging Face access token, required to download models and datasets. You can create one at huggingface.co/settings/tokens. If you've already logged in using huggingface-cli login, you can skip setting this environment variable.
Some tests generate images using the original 16-bit models. You can cache these results to speed up future test runs by setting the environment variable NUNCHAKU_TEST_CACHE_ROOT. If not set, the images will be saved in test_results/ref.
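The fallback described above could be resolved along the following lines. This is a minimal sketch, not the code used by the test suite: the helper name resolve_ref_root is hypothetical, and only the variable name NUNCHAKU_TEST_CACHE_ROOT and the default test_results/ref are taken from this guide.

```python
import os


def resolve_ref_root(env=None):
    """Return the directory for cached 16-bit reference images (hypothetical helper)."""
    env = os.environ if env is None else env
    # Use the cache root when it is set and non-empty;
    # otherwise fall back to the default location named in this guide.
    return env.get("NUNCHAKU_TEST_CACHE_ROOT") or os.path.join("test_results", "ref")
```

Pointing the variable at a persistent directory outside the repository checkout should let the reference images survive between test runs.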
When adding a new feature, please include corresponding test cases in the tests directory. Please avoid modifying existing tests.
To test visual output correctness, you can:
- Generate reference images: Use the original 16-bit model to produce a small number of reference images (e.g., 4).
- Generate comparison images: Run your method using the same inputs and seeds to ensure deterministic outputs. You can control the seed by setting the generator parameter in the diffusers pipeline.
- Compute similarity: Evaluate the similarity between your outputs and the reference images using the LPIPS metric. Use the compute_lpips function provided in tests/flux/utils.py:
    lpips = compute_lpips(dir1, dir2)
  Here, dir1 should point to the directory containing the reference images, and dir2 should contain the images generated by your method.
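To make the seeding step concrete, here is a minimal sketch of how the reference run and the comparison run could share seeds. The helper names reference_seeds and make_generators are hypothetical and not part of the nunchaku test utilities. The sketch assumes PyTorch is installed and that a diffusers pipeline has been loaded elsewhere.

```python
def reference_seeds(num_images=4, base_seed=0):
    """Seeds shared by the reference run and the comparison run (hypothetical helper)."""
    return [base_seed + i for i in range(num_images)]


def make_generators(seeds, device="cpu"):
    """Build one seeded torch.Generator per image (hypothetical helper)."""
    import torch  # imported lazily so the seed logic works without PyTorch installed

    # manual_seed returns the generator itself, so it can be used inline.
    return [torch.Generator(device=device).manual_seed(seed) for seed in seeds]


# Usage with a diffusers pipeline `pipe` (assumed to be loaded elsewhere):
# seeds = reference_seeds()
# for seed, gen in zip(seeds, make_generators(seeds)):
#     image = pipe(prompt, generator=gen).images[0]
#     image.save(f"{out_dir}/{seed}.png")
```

Running the same loop once with the 16-bit model and once with your method, writing to two different directories, yields the dir1 and dir2 inputs for compute_lpips.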
To pass the test, the LPIPS score must be below a predefined threshold, typically 0.3. We recommend first running the comparison locally to observe the LPIPS value, and then setting the threshold slightly above that value to allow for minor variations. Since the test is based on a small sample of images, slight fluctuations are expected; a margin of +0.04 is generally sufficient.
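The threshold rule above amounts to simple arithmetic, sketched below. The function names suggest_threshold and passes are hypothetical; only the 0.04 margin and the "score below threshold" rule come from this guide.

```python
def suggest_threshold(observed_lpips, margin=0.04):
    """Suggest a test threshold slightly above a locally observed LPIPS value."""
    # Round to two decimals so the threshold is easy to read in a test file.
    return round(observed_lpips + margin, 2)


def passes(lpips, threshold):
    """A test passes when the LPIPS score is strictly below the threshold."""
    return lpips < threshold
```

For example, a locally observed LPIPS of 0.21 would give a threshold of 0.25 under this rule.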
This contribution guide is adapted from SGLang. We thank them for the inspiration.