NaN loss after several training iterations #33
Comments
Hi, did you change the hyperparameters for training, like the number of iterations or the batch size? If you change the batch size, you need to adapt the learning rate / number of iterations accordingly.
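For reference, here is a minimal sketch of that linear scaling rule. The reference numbers are assumptions based on the default 1x schedule (16 images per batch, BASE_LR 0.02, MAX_ITER 90000); check your own config for the actual values.

```python
# Hedged sketch of the linear scaling rule: scale the learning rate with the
# batch size and stretch the schedule so the same number of images is seen.
# The reference values below are assumed from the default 1x schedule.
REF_BATCH_SIZE = 16      # SOLVER.IMS_PER_BATCH in the reference config
REF_BASE_LR = 0.02       # SOLVER.BASE_LR
REF_MAX_ITER = 90_000    # SOLVER.MAX_ITER

my_batch_size = 2        # e.g. a single GPU holding 2 images

scale = my_batch_size / REF_BATCH_SIZE
base_lr = REF_BASE_LR * scale            # smaller batch -> smaller learning rate
max_iter = int(REF_MAX_ITER / scale)     # smaller batch -> more iterations

print(f"SOLVER.BASE_LR {base_lr}  SOLVER.MAX_ITER {max_iter}")
# -> SOLVER.BASE_LR 0.0025  SOLVER.MAX_ITER 720000
```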
Hi, all the losses keep updating normally after I set a smaller learning rate for the model. It seems that your suggestion works. Many thanks!
Great, thanks! I'll update the README with more details.
Hi, because the objects in my own dataset are small, I made two changes. After training for a few iterations, all the losses in this model become NaN.
@guanbin1994 Actually, ideally we would like to remove
I just use Res101 and FPN, so what should I do?
You should keep the same
I also had this problem and solved it. However, the above changes made the expected training time (i.e.,
❓ Questions and Help
Hi. I am training a Mask R-CNN model with e2e_faster_rcnn_R_50_FPN_1x.yaml on the COCO dataset. However, after a few iterations (460), all the losses in this model become NaN. I have checked the default number of classes in the config file, and the value (81) seems correct for COCO. Can anyone give me some suggestions on how to solve this? Many thanks!
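For what it's worth, a quick way to sanity-check the class count and solver settings before training is to load the config through the repo's yacs wrapper. A rough sketch follows; the field names are taken from maskrcnn-benchmark's defaults, and the config path is assumed to be relative to your checkout.

```python
# Rough sketch: print the settings most often involved in NaN losses.
# Assumes a maskrcnn-benchmark checkout with the standard configs/ directory.
from maskrcnn_benchmark.config import cfg

cfg.merge_from_file("configs/e2e_faster_rcnn_R_50_FPN_1x.yaml")

# 80 COCO classes + 1 background class = 81
print("NUM_CLASSES:", cfg.MODEL.ROI_BOX_HEAD.NUM_CLASSES)

# If any of these were changed, the others usually need to change with them
print("IMS_PER_BATCH:", cfg.SOLVER.IMS_PER_BATCH)
print("BASE_LR:", cfg.SOLVER.BASE_LR)
print("MAX_ITER:", cfg.SOLVER.MAX_ITER)
print("STEPS:", cfg.SOLVER.STEPS)
```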