What batch size number other than 1024 have you tried when training a DeiT or ViT model? #1608
Unanswered
Phuoc-Hoan-Le
asked this question in
Q&A
Replies: 0 comments
What batch sizes other than 1024 have you tried when training a DeiT or ViT model? In the DeiT paper (https://arxiv.org/abs/2012.12877), the authors used a batch size of 1024 and mentioned that the learning rate should be scaled according to the batch size.
However, I was wondering whether anyone here has experience training, or has successfully trained, a DeiT model with a batch size smaller than 512. If so, what accuracy did you achieve?
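For reference, here is a minimal sketch of how I understand the learning-rate scaling mentioned above. It assumes the linear scaling rule, with a base learning rate of 5e-4 defined at a reference batch size of 512, which is my reading of the DeiT recipe and not something I have verified against the training code. The function name `scaled_lr` and its parameter names are made up for illustration.

```python
def scaled_lr(batch_size, base_lr=5e-4, ref_batch_size=512):
    """Linearly scale the learning rate with the total batch size.

    Assumption: base_lr=5e-4 at ref_batch_size=512 (my reading of the
    DeiT paper's recipe). batch_size is the total across all GPUs.
    """
    return base_lr * batch_size / ref_batch_size


if __name__ == "__main__":
    # Compare the paper's batch size of 1024 with smaller settings.
    for bs in (1024, 512, 256, 128):
        print(f"batch_size={bs:4d}  lr={scaled_lr(bs):.6f}")
```

Under these assumptions, halving the batch size halves the learning rate, so a batch size of 256 would train with a learning rate of 2.5e-4. Whether accuracy holds up at such small batch sizes is exactly what I am asking about.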