Description
I am training a reward model with the Llama model, but the output dimensions of rewards_j and rewards_k are different, so the loss cannot be computed and a size-mismatch error is raised:
File "examples/stack_llama/scripts/reward_modeling_sutpc.py", line 300, in <module>
trainer.train(script_args.resume_from_checkpoint)
File "/opt/conda/lib/python3.8/site-packages/transformers/trainer.py", line 1662, in train
return inner_training_loop(
File "/opt/conda/lib/python3.8/site-packages/transformers/trainer.py", line 1929, in _inner_training_loop
tr_loss_step = self.training_step(model, inputs)
File "/opt/conda/lib/python3.8/site-packages/transformers/trainer.py", line 2699, in training_step
loss = self.compute_loss(model, inputs)
File "examples/stack_llama/scripts/reward_modeling_sutpc.py", line 284, in compute_loss
loss = -nn.functional.logsigmoid(rewards_j - rewards_k).mean()
RuntimeError: The size of tensor a (445) must match the size of tensor b (281) at non-singleton dimension 1
reward_modeling.py was used with only a slight modification: AutoTokenizer was replaced with LlamaTokenizer.
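For context, the failing line computes the standard pairwise reward loss, which only works when the model emits one scalar score per sequence, so that rewards_j and rewards_k share the same shape (typically `(batch_size, 1)`). The sizes 445 and 281 in the error look like sequence lengths, which suggests the model is returning per-token outputs rather than one score per sequence. Below is a minimal, self-contained sketch of the intended setup; the checkpoint path, max length, and example strings are placeholders and not taken from the issue, and this is a guess at the cause, not a confirmed fix.

```python
import torch
import torch.nn as nn
from transformers import LlamaTokenizer, AutoModelForSequenceClassification

# Placeholder checkpoint path (assumption, not from the issue). Any Llama
# checkpoint loaded with a sequence-classification head and num_labels=1
# should yield one scalar reward per sequence.
model_name = "path/to/llama-checkpoint"

tokenizer = LlamaTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # Llama has no pad token by default

model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=1)
model.config.pad_token_id = tokenizer.pad_token_id

# Chosen ("j") and rejected ("k") answers for the same prompt (toy examples).
chosen = ["Question: ...\n\nAnswer: a helpful answer"]
rejected = ["Question: ...\n\nAnswer: a less helpful answer"]

# Padding both sides to the same fixed length keeps the batches aligned even
# when the raw token counts differ (e.g. 445 vs 281 in the reported error).
enc_j = tokenizer(chosen, padding="max_length", truncation=True, max_length=512, return_tensors="pt")
enc_k = tokenizer(rejected, padding="max_length", truncation=True, max_length=512, return_tensors="pt")

with torch.no_grad():
    rewards_j = model(**enc_j)[0]  # expected shape: (batch_size, 1)
    rewards_k = model(**enc_k)[0]  # expected shape: (batch_size, 1)

# Pairwise loss from the traceback: it requires rewards_j and rewards_k to
# have identical shapes; a per-token output breaks this with a size mismatch.
loss = -nn.functional.logsigmoid(rewards_j - rewards_k).mean()
print(rewards_j.shape, rewards_k.shape, loss.item())
```

If the model in reward_modeling_sutpc.py is instead returning logits over the vocabulary (a causal-LM head), switching to a sequence-classification head with num_labels=1, or pooling the per-token outputs into a single scalar per sequence, would make the two reward tensors the same shape.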