Fix regression of model loading performance when using mlock. #2204

spencersutton · 2023-07-12T22:29:57Z

Performance of subsequent runs when using mlock has regressed significantly since 2347463.

It appears that changing the mmap flag from MAP_SHARED to MAP_PRIVATE causes the model to be loaded into memory fresh each time. This PR changes llama_mmap to take a new parameter called has_lora: if true, mmap is called with MAP_PRIVATE and PROT_READ | PROT_WRITE; otherwise it is called with MAP_SHARED and PROT_READ.
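For illustration, the flag selection described above could look roughly like the sketch below. This is not the PR's actual diff: the names `MmapFlags`, `choose_mmap_flags`, and `map_model` are hypothetical, and the code assumes a POSIX system where `mmap` is available.

```cpp
#include <sys/mman.h>  // mmap, PROT_READ, PROT_WRITE, MAP_SHARED, MAP_PRIVATE
#include <cstddef>     // size_t

// Hypothetical helper type (not from the PR): the protection and
// visibility arguments that will be passed to mmap.
struct MmapFlags {
    int prot;
    int flags;
};

// Pick the mmap arguments based on whether a LoRA adapter will be applied.
inline MmapFlags choose_mmap_flags(bool has_lora) {
    if (has_lora) {
        // Private, writable mapping: writes are copy-on-write and never
        // reach the model file on disk.
        return { PROT_READ | PROT_WRITE, MAP_PRIVATE };
    }
    // Read-only shared mapping: the combination the PR description
    // restores for the non-LoRA case.
    return { PROT_READ, MAP_SHARED };
}

// Map `size` bytes of an already-open file descriptor `fd`.
// Returns MAP_FAILED on error, exactly as mmap does.
inline void * map_model(int fd, size_t size, bool has_lora) {
    const MmapFlags f = choose_mmap_flags(has_lora);
    return mmap(nullptr, size, f.prot, f.flags, fd, 0);
}
```

A caller would pass `has_lora = true` only when an adapter is going to modify the mapped weights, so the common non-LoRA path keeps the read-only shared mapping.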

Caveats:

- I've only tested this on macOS.
- I haven't extensively tested actually using a LoRA.
- Maybe use_mmap/use_mlock/has_lora should be combined into an enum.
- There is probably a better name for the parameter than has_lora.

howard0su · 2023-07-12T23:50:28Z

I will revert the change. I didn't test mlock before I committed.

ggerganov · 2023-07-14T17:25:07Z

Do we still want this PR after the revert?

howard0su · 2023-07-17T09:45:41Z

No.

mofosyne · 2024-05-10T13:08:25Z

Closing. #2206 appears to be the fix that renders this PR obsolete.

spencersutton added 3 commits July 12, 2023 17:56

Set different mmap flags for lora/non-lora

f6c4e8d

Rename parameter

c14cde1

Change comment

421cc6c

spencersutton changed the title ~~Fix model loading performance when using mlock.~~ Fix regression of model loading performance when using mlock. Jul 12, 2023

Use loader field for clarity.

7e01bc5

howard0su mentioned this pull request Jul 12, 2023

Revert "Support using mmap when applying LoRA (#2095)" #2206

Merged

mofosyne added the performance, bugfix, and Review Complexity : Medium labels May 10, 2024

mofosyne closed this May 10, 2024
