Fix executorch kv cache incompatibility with to_executorch lowering #7279
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/7279
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit 6fe376d with merge base 25d8f15.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Force-pushed from d538d43 to ee2eb15
Force-pushed from ee2eb15 to 46ea733
Force-pushed from 5dcb8f7 to f723fe1
Force-pushed from f723fe1 to 9e68531
Force-pushed from 693bbbc to 0597d3a
Force-pushed from 0597d3a to 8145cda
exir/emit/_emitter.py (Outdated)

    if initialize_buffer:
        assert is_mutable_buffer
        spec.const = True
    else:
        spec.const = not (is_user_input or is_mutable_buffer)
Please add unit tests for this logic: tests that would have broken before this fix and would have caught this kv cache incompatibility.
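As a rough illustration of the kind of test being requested, here is a minimal sketch (not the PR's actual test; the module, names, and the final verification step are assumptions) that lowers a module with a non-zero-initialized mutable buffer, where a regression in the spec.const handling above would surface as a wrong first output:

```python
import torch
from torch.export import export
from executorch.exir import to_edge


class TinyKVCache(torch.nn.Module):
    """Toy stand-in for a kv cache: a mutable buffer initialized to ones."""

    def __init__(self):
        super().__init__()
        # Initialized to ones so that a buffer silently reset to zeros during
        # to_executorch lowering would change the observable output.
        self.register_buffer("cache", torch.ones(4, 8))

    def forward(self, x):
        self.cache.add_(x)  # in-place mutation makes this a mutable buffer
        return self.cache.clone()


example_inputs = (torch.zeros(4, 8),)
program = to_edge(export(TinyKVCache(), example_inputs)).to_executorch()

# A test along these lines would then load program.buffer with the runtime
# bindings and assert that the first forward returns ones (the initialized
# state), which is the behavior that broke before this fix.
```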
@dvorjackz has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Force-pushed from 228fe5c to 93f99ad
Thanks for adding the emitter tests!
    # Test that the mutable buffer is uninitialized and starts with default zeros.
    torch.allclose(
        method_regular.execute((example_inputs))[0],
        torch.ones(10, dtype=torch.int64),
Should this be zeros, based on the comment? If not, please update the comment to clarify why this is ones. And if it should be zeros, did this test fail?
Oh, it's because in the forward of the model we do self.cache_pos += 1. I'll specify this in the comment.
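For readers following the thread, a small eager-mode sketch (hypothetical shapes and names) of why the test observes ones rather than zeros: the buffer starts zero-initialized, but the model's forward increments it before producing the output.

```python
import torch


class PosModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # Mutable buffer that starts with default zeros.
        self.register_buffer("cache_pos", torch.zeros(10, dtype=torch.int64))

    def forward(self, x):
        self.cache_pos += 1  # the increment discussed above
        return self.cache_pos + x


out = PosModel()(torch.zeros(10, dtype=torch.int64))
# After a single forward, the zero-initialized buffer has been incremented once.
assert torch.equal(out, torch.ones(10, dtype=torch.int64))
```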
@dvorjackz has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Force-pushed from 4035d22 to 1bdc50b
Force-pushed from 1bdc50b to 6fe376d
@dvorjackz has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Summary
Fixes the Llama 3.2 vision text decoder prefill issue by marking the kv cache as an initialized mutable buffer in a custom pass.
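For context, a hedged sketch of what such a pass could look like; the class name, the buffer-matching heuristic, and the et_init_buffer metadata key are assumptions for illustration, not necessarily the implementation in this PR:

```python
from torch.fx.passes.infra.pass_base import PassResult

from executorch.exir.pass_base import ExportPass


class TagCacheBuffersAsInitialized(ExportPass):
    """Sketch of a pass that flags kv cache buffers as initialized mutable buffers."""

    def __init__(self, buffer_substrings=("k_cache", "v_cache", "cache_pos")):
        super().__init__()
        self.buffer_substrings = buffer_substrings

    def call(self, graph_module):
        for node in graph_module.graph.nodes:
            # Placeholders backed by the cache buffers get a metadata flag that
            # the emitter could read when deciding to materialize initial data.
            if node.op == "placeholder" and any(
                s in str(node.target) for s in self.buffer_substrings
            ):
                node.meta["et_init_buffer"] = True  # assumed metadata key
        return PassResult(graph_module, True)
```

If this sketch were used, the pass would presumably be supplied to to_executorch via ExecutorchBackendConfig(passes=[...]) so it runs before emission; the exact wiring in the PR may differ.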
Test plan