ClearML logging of visualization in RewardTrainer evaluation #3602

ioverho · 2025-06-16T14:05:56Z

What does this PR do?

Adds very basic support for logging the output of RewardTrainer.visualize_samples with ClearML.

It also adds the reward margin to this output and lets the user set how many samples are visualized through the training arguments.

Added the num_print_samples parameter, which controls the number of samples printed during evaluation. Setting it to 0 skips printing altogether.

Checks for num_print_samples in the function signature first, then falls back to the value in the trainer args.
Adds the margin to the table if it is available.
Reports the table to ClearML, if it is available, using the new functions imported from trainer.utils.
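The three steps above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the helper names (resolve_num_print_samples, build_table, report_to_clearml), the column names, and the treatment of missing or negative values are all assumptions made for the example. The only ClearML calls used are Task.current_task() and Logger.report_table(), which is documented to accept either a DataFrame or a list of rows.

```python
import importlib.util
from typing import List, Optional, Sequence


def resolve_num_print_samples(explicit: Optional[int], from_args: Optional[int], total: int) -> int:
    """Pick the sample count: call argument first, then the trainer args value."""
    chosen = explicit if explicit is not None else from_args
    if chosen is None or chosen < 0:
        # Assumption: a missing or negative value means "show every sample".
        return total
    return min(chosen, total)  # 0 skips printing altogether


def build_table(chosen: Sequence[str], rejected: Sequence[str],
                margins: Optional[Sequence[float]], limit: int) -> List[list]:
    """Return a header row plus up to `limit` data rows; add a margin column if available."""
    header = ["chosen_text", "rejected_text"]  # hypothetical column names
    if margins is not None:
        header.append("margin")
    rows = [header]
    for i in range(min(limit, len(chosen))):
        row = [chosen[i], rejected[i]]
        if margins is not None:
            row.append(round(margins[i], 4))
        rows.append(row)
    return rows


def report_to_clearml(table: List[list], title: str = "completions") -> bool:
    """Send the table to ClearML only when it is installed and a task is active."""
    if importlib.util.find_spec("clearml") is None:
        return False  # ClearML is not installed, so there is nothing to report to
    from clearml import Task

    task = Task.current_task()
    if task is None or len(table) < 2:
        return False  # no running task, or only a header row
    task.get_logger().report_table(title=title, series="eval", iteration=0, table_plot=table)
    return True
```

With these helpers, an evaluation step would call resolve_num_print_samples, pass the result as limit to build_table, and hand the table to report_to_clearml, which does nothing when ClearML is missing or no task has been initialized.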

None, I just use ClearML instead of wandb or comet-ml, and I wanted the visualizations logged.

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline, Pull Request section?
Was this discussed/approved via a GitHub issue? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes?
Did you write any new necessary tests?

ioverho and others added 3 commits June 16, 2025 15:52

added clearml logging to RewardTrainer

553bf23

Merge branch 'huggingface:main' into clearml_reward_trainer

f8b18a8

check if clearml is available

bd591d3

ioverho marked this pull request as ready for review June 16, 2025 14:56