-
Notifications
You must be signed in to change notification settings - Fork 12.2k
avx on Core i5-2400 #562
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
avx on Core i5-2400 #562
Conversation
Allows to compile on i5 2400 und Ubuntu LTS 22.04.
The spec sheet for i5-2400 says that AVX should be supported on that processor? It's probably AVX2 what you want to disable, but keep AVX. mkdir build
cd build
cmake -DLLAMA_AVX2=OFF ..
cmake --build . --config Release Okay thats for cmake, it seems you are using make. However , your processor supports AVX but does not support F16C. It doesn't work with AVX enabled and F16C disabled? |
If there are any cases where F16C is used without checking for the flag, that should be fixed instead. |
I think #define GGML_F32Cx8_STORE(x, y) _mm_storeu_si128((__m128i *)(x), _mm256_cvtps_ph(y, 0)) If I am not mistaken |
You are correct. #define GGML_F32Cx8_LOAD(x) _mm256_cvtph_ps(_mm_loadu_si128((__m128i *)(x)))
both are F16C/CVT16 instructions. using the SSE versions instead would fix that #define GGML_F32Cx4_LOAD(x) __sse_f16x4_load(x)
#define GGML_F32Cx4_STORE(x, y) __sse_f16x4_store(x, y) |
Superseded by #563 |
* .sh script V1 * koboldcpp.sh polish * koboldcpp.sh dist generator * Include html's in dist * RWKV in Linux Dist * Lower dependency requirements * Eliminate wget dependency * More distinct binary name I know its technically amd64, but I don't want to cause confusion among nvidia users. * Use System OpenCL Unsure how this will behave in the pyinstaller build, but pocl ended up CPU only. With a bit of luck the pyinstaller uses the one from the actual system if compiled in a system without opencl, while conda now includes it for that specific system. * Add cblas dependency Missing this causes compile failures on some system's * ICD workaround Ideally we find a better solution, but conda forces ICD and needs this for the successful compile. However, pyinstaller then embeds the ICD causing it to be limited to the system it was compiled for. By temporarily removing the ICD pyinstaller can't find it and everything remains functional. Ideally we do this on a pyinstaller level, but I could not find any good options to do so yet. * Fix & Nocuda --------- Co-authored-by: root <root@DESKTOP-DQ1QRAG>
Good evening,
I needed to deactivate avx compile flag for my Core i5-2400 to get it compile on Linux. Somebody else did implement the missing fp16 functions as code, but I was unable to find the right place for it in the macro hell.
I think this would be even better than just deactivating avx but I cannot find this code/comment anymore.
Would be cool, if your code would run on my CPU.
Thanks + best regards, Rainer