Commit 73bee9a

Revert batch size change in SSD TL3 back to 64 due to convergence problem
Signed-off-by: Janusz Lisiecki <[email protected]>
1 parent: 89026ae

File tree

1 file changed, 1 insertion(+), 1 deletion(-)
qa/TL3_SSD_convergence/test_pytorch.sh

Lines changed: 1 addition & 1 deletion
@@ -42,7 +42,7 @@ export NCCL_NVLS_ENABLE=0
 
 # Prevent OOM due to fragmentation on 16G machines
 export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:4096
-torchrun --nproc_per_node=${NUM_GPUS} main.py --backbone resnet50 --warmup 300 --bs 256 --eval-batch-size 8 --data /coco --data ${DATA_DIR} --data_pipeline dali --target 0.25 2>&1 | tee $LOG
+torchrun --nproc_per_node=${NUM_GPUS} main.py --backbone resnet50 --warmup 300 --bs 64 --eval-batch-size 8 --data /coco --data ${DATA_DIR} --data_pipeline dali --target 0.25 2>&1 | tee $LOG
 ((IS_TMP_DIR)) && rm -rf ${DATA_DIR}
 
 RET=${PIPESTATUS[0]}
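A note on the last context line of the hunk: because the torchrun command is piped through tee to capture the log, a plain $? afterwards would report tee's exit status rather than the training run's, which is why the script reads PIPESTATUS[0]. A minimal sketch of that bash pattern, with run_training as a hypothetical stand-in for the actual torchrun invocation:

    # run_training is a hypothetical placeholder for the torchrun command above
    run_training 2>&1 | tee "$LOG"   # after this pipeline, $? holds tee's status
    RET=${PIPESTATUS[0]}             # PIPESTATUS[0] holds run_training's exit code
    exit $RET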
