avx on Core i5-2400 #562

kaufmannr · 2023-03-27T19:29:06Z

Good evening,

I had to disable the AVX compile flag to get the project to compile on Linux for my Core i5-2400. Somebody else implemented the missing fp16 functions in code, but I could not find the right place for that in the macro hell.
I think that would be even better than just disabling AVX, but I cannot find that code/comment anymore.

It would be cool if your code ran on my CPU.

Thanks + best regards, Rainer

Allows compiling on an i5-2400 and Ubuntu 22.04 LTS.

anzz1 · 2023-03-27T19:33:48Z

The spec sheet for the i5-2400 says that AVX should be supported on that processor.

It's probably AVX2 that you want to disable, while keeping AVX.
Try:

mkdir build
cd build
cmake -DLLAMA_AVX2=OFF ..
cmake --build . --config Release

Okay, that's for CMake; it seems you are using make.

However, your processor supports AVX but does not support F16C.
You are doing it the wrong way around: checking for F16C first and only then for AVX?
Is there a problem with checking them separately, as the code currently does?
You should be able to keep AVX enabled and disable only F16C, so that the AVX code paths stay in use.

Doesn't it work with AVX enabled and F16C disabled?
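
To see which of these features a given set of compiler flags actually enables, a small check of the predefined macros can help. This is only a sketch: the function names below are hypothetical and not part of ggml, and it assumes GCC or Clang, which define `__AVX__`, `__AVX2__` and `__F16C__` when the matching `-m` flag (or a `-march` value that implies it) is active.

```c
#include <stdio.h>

/* Each function reports whether the compiler predefined the macro,
 * i.e. whether the feature was enabled at COMPILE time. This says
 * nothing about what the CPU running the binary supports. */
int built_with_avx(void) {
#if defined(__AVX__)
    return 1;
#else
    return 0;
#endif
}

int built_with_avx2(void) {
#if defined(__AVX2__)
    return 1;
#else
    return 0;
#endif
}

int built_with_f16c(void) {
#if defined(__F16C__)
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: print the three flags on one line. */
void print_simd_flags(void) {
    printf("AVX=%d AVX2=%d F16C=%d\n",
           built_with_avx(), built_with_avx2(), built_with_f16c());
}
```

Calling `print_simd_flags()` from a program built with plain `-mavx` (and no `-march` override) would be expected to report AVX enabled with AVX2 and F16C disabled, which is the combination an i5-2400 needs.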

slaren · 2023-03-27T19:39:58Z

If there are any cases where F16C is used without checking for the flag, that should be fixed instead.

slaren · 2023-03-27T19:43:08Z

I think ggml.c:1131 may be the culprit:

#define GGML_F32Cx8_STORE(x, y) _mm_storeu_si128((__m128i *)(x), _mm256_cvtps_ph(y, 0))

If I am not mistaken _mm256_cvtps_ph is only available with F16C, but this code only checks for AVX.
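
As a rough illustration of what guarding the store could look like, the sketch below uses the F16C intrinsic only when both `__AVX__` and `__F16C__` are defined and otherwise falls back to a scalar conversion. The names `fp32_to_fp16_scalar` and `store_f32x8_as_f16` are hypothetical, and this is not necessarily how the actual fix is written.

```c
#include <stdint.h>
#include <string.h>

#if defined(__AVX__) && defined(__F16C__)
#include <immintrin.h>
#endif

/* Hypothetical scalar fallback: convert one float to IEEE half bits,
 * rounding to nearest even (the mode an imm8 of 0 selects in
 * _mm256_cvtps_ph). */
uint16_t fp32_to_fp16_scalar(float f) {
    uint32_t x;
    memcpy(&x, &f, sizeof x);
    uint16_t sign = (uint16_t)((x >> 16) & 0x8000u);
    uint32_t mag  = x & 0x7FFFFFFFu;
    uint32_t e    = mag >> 23;          /* biased float exponent */
    uint32_t mant = mag & 0x007FFFFFu;

    if (mag > 0x7F800000u) return (uint16_t)(sign | 0x7E00u); /* NaN */
    if (e >= 143) return (uint16_t)(sign | 0x7C00u); /* |f| >= 65536 or inf */
    if (e < 102)  return sign;          /* |f| < 2^-25 rounds to zero */

    uint32_t half, rem, tie;
    if (e >= 113) {                     /* result is a normal half */
        half = ((e - 112) << 10) | (mant >> 13);
        rem  = mant & 0x1FFFu;
        tie  = 0x1000u;
    } else {                            /* result is a subnormal half */
        uint32_t shift = 126 - e;       /* 14..24 */
        mant |= 0x00800000u;            /* restore the implicit leading 1 */
        half = mant >> shift;
        rem  = mant & ((1u << shift) - 1u);
        tie  = 1u << (shift - 1);
    }
    /* Round to nearest, ties to even; a carry may bump the exponent. */
    if (rem > tie || (rem == tie && (half & 1u))) half++;
    return (uint16_t)(sign | half);
}

/* Store 8 floats as 8 halves; F16C is used only if it was compiled in. */
void store_f32x8_as_f16(uint16_t *dst, const float *src) {
#if defined(__AVX__) && defined(__F16C__)
    _mm_storeu_si128((__m128i *)dst,
                     _mm256_cvtps_ph(_mm256_loadu_ps(src), 0));
#else
    for (int i = 0; i < 8; i++) dst[i] = fp32_to_fp16_scalar(src[i]);
#endif
}
```

The point of the guard is that a build with AVX but without F16C never references `_mm256_cvtps_ph` at all, so it compiles on CPUs such as the i5-2400.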

anzz1 · 2023-03-27T19:48:48Z

> I think ggml.c:1131 may be the culprit:
> #define GGML_F32Cx8_STORE(x, y) _mm_storeu_si128((__m128i *)(x), _mm256_cvtps_ph(y, 0))
> If I am not mistaken _mm256_cvtps_ph is only available with F16C, but this code only checks for AVX.

You are correct.
The load macro has the same problem:

#define GGML_F32Cx8_LOAD(x)     _mm256_cvtph_ps(_mm_loadu_si128((__m128i *)(x)))

_mm256_cvtph_ps() = VCVTPH2PS
_mm256_cvtps_ph() = VCVTPS2PH

Both are F16C/CVT16 instructions and should be guarded with a __F16C__ check.

Using the SSE versions instead would fix that:

#define GGML_F32Cx4_LOAD(x)     __sse_f16x4_load(x)
#define GGML_F32Cx4_STORE(x, y) __sse_f16x4_store(x, y)
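
For the load direction, a fallback along these lines might look as follows. This is a sketch under assumptions: `fp16_to_fp32_scalar` and `load_f16x8_as_f32` are hypothetical names, and the real `__sse_f16x4_load` helper may be implemented differently.

```c
#include <stdint.h>
#include <string.h>

#if defined(__AVX__) && defined(__F16C__)
#include <immintrin.h>
#endif

/* Hypothetical scalar fallback: expand IEEE half bits to a float.
 * Every half value is exactly representable as a float. */
float fp16_to_fp32_scalar(uint16_t h) {
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp  = (h >> 10) & 0x1Fu;
    uint32_t mant = h & 0x3FFu;
    uint32_t bits;
    float f;

    if (exp == 0) {
        /* Zero or subnormal half: value is mant * 2^-24. */
        float mag = (float)mant * (1.0f / 16777216.0f);
        return (h & 0x8000u) ? -mag : mag;
    }
    if (exp == 0x1Fu) {
        bits = sign | 0x7F800000u | (mant << 13);   /* inf or NaN */
    } else {
        bits = sign | ((exp + 112u) << 23) | (mant << 13); /* rebias 15 -> 127 */
    }
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Load 8 halves as 8 floats; F16C is used only if it was compiled in. */
void load_f16x8_as_f32(const uint16_t *src, float *dst) {
#if defined(__AVX__) && defined(__F16C__)
    _mm256_storeu_ps(dst,
                     _mm256_cvtph_ps(_mm_loadu_si128((const __m128i *)src)));
#else
    for (int i = 0; i < 8; i++) dst[i] = fp16_to_fp32_scalar(src[i]);
#endif
}
```

Both branches should give identical results for finite inputs, so the guard changes only which instructions are emitted, not the values computed.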

slaren · 2023-03-27T21:39:17Z

Superseded by #563


embedded-IoT-kaufmann added 5 commits March 25, 2023 16:54

Enable avx for Linux only if also fp16c available.

2279cd2

Allows compiling on an i5-2400 and Ubuntu 22.04 LTS.

Merge branch 'ggerganov:master' into master

098eb92

Merge branch 'ggerganov:master' into master

7a7e0ac

Merge branch 'ggerganov:master' into master

8e1fb49

Merge branch 'ggerganov:master' into master

926e49e

slaren mentioned this pull request Mar 27, 2023

Fix usage of F16C intrinsics in AVX code #563

Merged

slaren closed this Mar 27, 2023
