fix: return the out tensor rather than the function's return value #2361
This PR fixes a bug in the output value from flash attention. Currently, when using flash attention v1, the `attention` function returns the `[0]`-indexed element of the tuple returned by `flash_attn_cuda.fwd`. That tensor does not have the correct shape or values; the correct return value is the `out` tensor.
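
For concreteness, a minimal sketch of the shape of the fix, assuming a wrapper along the lines described above (the function name `attention` and the forwarding of the remaining kernel arguments via `*fwd_args` are illustrative placeholders, not the actual diff):

```python
import flash_attn_cuda  # flash-attn v1 CUDA extension


def attention(q, k, v, out, *fwd_args):
    # Before the fix, the wrapper returned the [0]-indexed element of the
    # tuple returned by the kernel, which does not hold the attention output:
    #     return flash_attn_cuda.fwd(q, k, v, out, *fwd_args)[0]
    # The kernel writes the attention result into the pre-allocated `out`
    # tensor, so that is the tensor the wrapper should return.
    flash_attn_cuda.fwd(q, k, v, out, *fwd_args)
    return out
```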