Does truncated_bptt_steps have any effect on the eval/test phases? #6483
Unanswered
colllin
asked this question in
Lightning Trainer API: Trainer, LightningModule, LightningDataModule
Replies: 0 comments
I'm using the `auto_scale_batch_size` feature of the Trainer and specifying the flag `truncated_bptt_steps=32`. What I'm seeing is that the batch size scaler settles on `batch_size=16`, but then during the training phase, GPU memory usage sits at ~2.2 GB on a device with ~12 GB of GPU memory. My dataset's sequence length (time dimension) is currently configured to 256. It seems like the selected `batch_size` might be governed by a non-truncated sequence in the validation loop.

Am I right about what's going on? If so, would it be reasonable to also implement `truncated_bptt_steps` so that it takes effect in the evaluation loop? And if so, should I contribute this by copying the training-side implementation and tests?