Summary:
As titled. The new `_safe_softmax` function is meant to avoid NaN issues, mostly in training. For inference we shouldn't need it, so we swap it with the regular softmax, which prevents the decomposition that introduces the unsupported ops (`eq`, `logical_not` and `any`). See https://www.internalfb.com/code/fbsource/fbcode/caffe2/torch/_decomp/decompositions.py?lines=425.
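For illustration, a minimal sketch of what such a swap could look like as an FX graph pass. The function name and the choice of `aten.softmax.int` as the replacement target are assumptions for the example, not necessarily the exact code in this diff:

```python
# Illustrative only: swap aten._safe_softmax for the regular softmax on an
# FX graph module, so the eq/logical_not/any decomposition never fires.
import torch
from torch.fx import GraphModule


def replace_safe_softmax_with_softmax(gm: GraphModule) -> GraphModule:
    for node in gm.graph.nodes:
        if (
            node.op == "call_function"
            and node.target == torch.ops.aten._safe_softmax.default
        ):
            # Both overloads take (input, dim, dtype=None), so args/kwargs
            # carry over unchanged; only the call target changes.
            node.target = torch.ops.aten.softmax.int
    gm.recompile()
    return gm
```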
Note that this needed some changes to `run_and_verify`, since we now need some aten IR changes. I will fix that in another diff, where `run_and_verify` will use a nop quantizer instead. This way the code path will be the same for fp32 and quantized. But let's make CI green first!
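As a hedged sketch of that follow-up, a nop quantizer under the PT2E `Quantizer` API could simply annotate nothing; the class name and its wiring into `run_and_verify` are assumptions here:

```python
# Hypothetical nop quantizer: annotates nothing, so prepare/convert leave
# the graph untouched and fp32 can share the quantized code path.
from torch.ao.quantization.quantizer import Quantizer
from torch.fx import GraphModule


class NopQuantizer(Quantizer):
    def annotate(self, model: GraphModule) -> GraphModule:
        # No nodes are annotated, so no observers or q/dq pairs get inserted.
        return model

    def validate(self, model: GraphModule) -> None:
        # Nothing to validate for the nop case.
        pass
```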
We will also need to formalize better how to apply passes on the initial graph module (aten IR passes, as opposed to edge IR passes). It seems lifted constants and other things like that can create issues, but unless we see errors, let's wait until the IR changes from PT/ET land first.
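To make the distinction concrete, the intended ordering would be roughly the following. This is a sketch under the assumption that an aten IR pass mutates the exported graph module in place before edge conversion; `replace_safe_softmax_with_softmax` is the illustrative pass from above, and the exact API for registering such passes is precisely what remains to be formalized:

```python
# Rough sketch of where aten IR passes would sit relative to edge IR passes.
import torch
from executorch.exir import to_edge


class TinyModel(torch.nn.Module):
    def forward(self, x):
        return torch.nn.functional.softmax(x, dim=-1)


ep = torch.export.export(TinyModel(), (torch.randn(2, 8),))
# Aten IR pass: applied to the initial exported graph module, in place
# (a no-op here if the traced graph contains no _safe_softmax node).
replace_safe_softmax_with_softmax(ep.graph_module)
# Edge IR passes only apply after this conversion point.
edge_program = to_edge(ep)
```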
Reviewed By: hsharma35
Differential Revision: D61639074