Skip to content

Fix regression of model loading performance when using mlock. #2204

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 4 commits into from

Conversation

spencersutton
Copy link
Contributor

@spencersutton spencersutton commented Jul 12, 2023

Performance of subsequent runs when using mlock suffered a large regression from 2347463.

It appears that changing the mmap flag from MAP_SHARED to MAP_PRIVATE causes the model to be loaded into memory fresh each time. This PR changes llama_mmap to take a new parameter called has_lora which if true calls mmap with MAP_PRIVATE and PROT_READ | PROT_WRITE otherwise uses MAP_SHARED and PROT_READ.

Caveats:

  • I've only tested this on mac OS
  • I haven't tested extensively actually using a LoRa
  • Maybe use_mmap/use_mlock/has_lora should be combined into an enum.
  • There is probably a better name for the parameter than has_lora.

@spencersutton spencersutton changed the title Fix model loading performance when using mlock. Fix regression of model loading performance when using mlock. Jul 12, 2023
@howard0su
Copy link
Collaborator

I will revert the change. I didn't test the mlock before I commit.

@ggerganov
Copy link
Member

Do we still want this PR after the revert?

@howard0su
Copy link
Collaborator

No.

@mofosyne mofosyne added performance Speed related topics bugfix fixes an issue or bug Review Complexity : Medium Generally require more time to grok but manageable by beginner to medium expertise level labels May 10, 2024
@mofosyne mofosyne closed this May 10, 2024
@mofosyne
Copy link
Collaborator

closing. Observed #2206 as the fix that renders this PR obsolete.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugfix fixes an issue or bug performance Speed related topics Review Complexity : Medium Generally require more time to grok but manageable by beginner to medium expertise level
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants