Replies: 2 comments 3 replies
-
So, if it doesn't lead to a performance improvement, why would we want to change the code? The proposed version is 2-3 times longer than the original and, at least to my taste, harder to read.
-
Oh, I see. Thanks. That makes a lot of sense. Have you tried using the Xcode profiler for Metal? I'm new to Xcode myself, so I don't know how to set it up.
-
I took a swing at converting some code from float to float4, etc., here:
https://github.com/sroussey/llama.cpp/pull/1/files
But the speed seems to be the same. @ikawrakow, do you have any thoughts?
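
For anyone following along, here is a rough sketch of the kind of float to float4 rewrite being discussed. It is not the code from the linked PR: it is plain C++ rather than Metal, and `Float4`, `dot_scalar`, and `dot_float4` are hypothetical names made up for illustration. Both functions perform the same multiplies and adds; the second only groups them four at a time, which is why such a change is not guaranteed to run faster on its own.

```cpp
#include <cstddef>

// Hypothetical stand-in for Metal's built-in float4 type (assumption for this sketch).
struct Float4 {
    float x, y, z, w;
};

// Scalar version: one multiply-add per loop iteration.
float dot_scalar(const float* a, const float* b, std::size_t n) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}

// float4-style version: four independent accumulators, reduced at the end.
// Assumes n is a multiple of 4; a real kernel would also handle the remainder.
float dot_float4(const float* a, const float* b, std::size_t n) {
    Float4 acc{0.0f, 0.0f, 0.0f, 0.0f};
    for (std::size_t i = 0; i < n; i += 4) {
        acc.x += a[i + 0] * b[i + 0];
        acc.y += a[i + 1] * b[i + 1];
        acc.z += a[i + 2] * b[i + 2];
        acc.w += a[i + 3] * b[i + 3];
    }
    // Horizontal reduction of the four lanes into a single value.
    return (acc.x + acc.y) + (acc.z + acc.w);
}
```

The float4 version is noticeably longer for the same result, and because it sums in a different order, its output can differ from the scalar version in the last bits for general floating-point inputs.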