
norm_act_layer altered training dynamics for mobilenetv2_120d ? #1447

Answered by rwightman
AffineParameter asked this question in Q&A

@AffineParameter please see #1444 and #1254. Does that answer the issue (i.e. are you using sync BN)? In general I would avoid SyncBN unless you really need it (i.e. you are down at very low batch sizes, like < 16).

The torch native sync BN conversion hack does not work with norm + act layers, so I've added a timm version (it works with native AMP + SyncBN, but I haven't added support for APEX).
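To illustrate the kind of failure described above, here is a minimal sketch, assuming a fused norm + act layer implemented as a `BatchNorm2d` subclass. `ToyBatchNormAct2d` is a hypothetical stand-in written for this example, not timm's actual layer. The native converter rebuilds any `_BatchNorm` instance as a plain `SyncBatchNorm`, so the subclass's `forward` (and with it the activation) is dropped without any error:

```python
import torch.nn as nn
import torch.nn.functional as F


class ToyBatchNormAct2d(nn.BatchNorm2d):
    """Hypothetical fused norm + act layer: BatchNorm2d followed by ReLU."""

    def forward(self, x):
        # Normalise with the parent BatchNorm2d, then apply the activation.
        return F.relu(super().forward(x))


model = nn.Sequential(nn.Conv2d(3, 8, 3), ToyBatchNormAct2d(8))

# Native conversion: every _BatchNorm instance is replaced by nn.SyncBatchNorm.
converted = nn.SyncBatchNorm.convert_sync_batchnorm(model)

# The fused layer is now a plain SyncBatchNorm, so the ReLU is no longer applied.
print(type(converted[1]).__name__)                  # SyncBatchNorm
print(isinstance(converted[1], ToyBatchNormAct2d))  # False
```

A converter that is aware of the fused layer type would have to rebuild it as a sync-capable norm + act layer instead. To my knowledge, recent timm releases expose such a helper as `timm.layers.convert_sync_batchnorm`, but check the version you have installed.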

Answer selected by AffineParameter