This repository was archived by the owner on Jul 1, 2023. It is now read-only.
Batchnorm changes: fix axis handling and drop workaround for AD crasher #1
The `axis` argument in the batch normalization layers in tf.keras and tf.layers refers to the second of the two axes that should be normalized over (see https://fenghz.github.io/images/2018-4-15/Batch_Norm_Picture.png), and defaults to the last axis (as it typically represents channels), while the first axis is always `0`. We should match those semantics. We can also drop an AD workaround, enabling correct inference behavior.
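To make the intended axis semantics concrete, here is a minimal NumPy sketch. It is not the code in this PR. It assumes the usual Keras reading, in which `axis` names the feature (channel) axis and the batch statistics are reduced over every other axis. The function name `batch_norm`, the `epsilon` value, and the example shapes are hypothetical, and the learned scale and offset are left out for brevity.

```python
import numpy as np


def batch_norm(x, axis=-1, epsilon=1e-3):
    """Normalize `x` with batch statistics (hypothetical helper, not the PR's API).

    `axis` is the feature axis and defaults to the last one. Mean and variance
    are computed over all remaining axes, so each feature gets its own statistics.
    """
    x = np.asarray(x, dtype=np.float64)
    axis = axis % x.ndim  # map a negative axis such as -1 to its positive index
    reduce_axes = tuple(i for i in range(x.ndim) if i != axis)
    mean = x.mean(axis=reduce_axes, keepdims=True)
    variance = x.var(axis=reduce_axes, keepdims=True)
    return (x - mean) / np.sqrt(variance + epsilon)


# Channels-last input: (batch, height, width, channels).
rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=(8, 4, 4, 3))
y = batch_norm(x)  # axis defaults to the last (channel) axis

# Channels-first layout of the same data: the feature axis is now 1.
x_channels_first = np.moveaxis(x, -1, 1)
y_channels_first = batch_norm(x_channels_first, axis=1)
```

Under these assumptions, both layouts produce the same normalized values once the channel axis is moved back, and each channel of `y` has a mean close to zero.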