Commit 337546e

surenbaghdasaryan authored and torvalds committed
mm/oom_kill.c: prevent a race between process_mrelease and exit_mmap
Race between process_mrelease and exit_mmap, where free_pgtables is called while __oom_reap_task_mm is in progress, leads to kernel crash during pte_offset_map_lock call. oom-reaper avoids this race by setting MMF_OOM_VICTIM flag and causing exit_mmap to take and release mmap_write_lock, blocking it until oom-reaper releases mmap_read_lock.

Reusing MMF_OOM_VICTIM for process_mrelease would be the simplest way to fix this race, however that would be considered a hack. Fix this race by elevating mm->mm_users and preventing exit_mmap from executing until process_mrelease is finished. Patch slightly refactors the code to adapt for a possible mmget_not_zero failure.

This fix has considerable negative impact on process_mrelease performance and will likely need later optimization.

Link: https://lkml.kernel.org/r/[email protected]
Fixes: 884a7e5 ("mm: introduce process_mrelease system call")
Signed-off-by: Suren Baghdasaryan <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Roman Gushchin <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Christian Brauner <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Jann Horn <[email protected]>
Cc: Shakeel Butt <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Christian Brauner <[email protected]>
Cc: Florian Weimer <[email protected]>
Cc: Jan Engelhardt <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
1 parent eac96c3 commit 337546e
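
The commit message turns on the difference between pinning the mm structure (mmgrab/mmdrop, counted in mm_count) and pinning the address space itself (mmget_not_zero/mmput, counted in mm_users). The following is a minimal userspace sketch of that distinction, not kernel code: `struct model_mm` and every `model_*` function are hypothetical stand-ins introduced for illustration, and the teardown flag only marks the point where exit_mmap would run.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct mm_struct; not kernel code. */
struct model_mm {
	atomic_int mm_users;          /* users of the address space */
	atomic_int mm_count;          /* references to the structure itself */
	bool address_space_torn_down; /* set where exit_mmap would have run */
	bool struct_freed;            /* set where the structure would be freed */
};

static void model_mm_init(struct model_mm *mm)
{
	atomic_init(&mm->mm_users, 1); /* the owning task */
	atomic_init(&mm->mm_count, 1); /* all users together hold one reference */
	mm->address_space_torn_down = false;
	mm->struct_freed = false;
}

/* Models mmgrab(): pins the structure only, not the address space. */
static void model_mmgrab(struct model_mm *mm)
{
	atomic_fetch_add(&mm->mm_count, 1);
}

/* Models mmdrop(): the last structure reference frees the structure. */
static void model_mmdrop(struct model_mm *mm)
{
	assert(atomic_load(&mm->mm_count) > 0);
	if (atomic_fetch_sub(&mm->mm_count, 1) == 1)
		mm->struct_freed = true;
}

/* Models mmget_not_zero(): take a user reference unless teardown has begun. */
static bool model_mmget_not_zero(struct model_mm *mm)
{
	int old = atomic_load(&mm->mm_users);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&mm->mm_users, &old, old + 1))
			return true;
	}
	return false;
}

/* Models mmput(): the last user tears down the address space. */
static void model_mmput(struct model_mm *mm)
{
	assert(atomic_load(&mm->mm_users) > 0);
	if (atomic_fetch_sub(&mm->mm_users, 1) == 1) {
		mm->address_space_torn_down = true;
		model_mmdrop(mm); /* drop the reference held on behalf of users */
	}
}
```

In this model, a holder that only called `model_mmgrab` still sees the address space torn down as soon as the owning task calls `model_mmput`, which corresponds to the window the commit closes. A holder that succeeded in `model_mmget_not_zero` delays teardown until its own `model_mmput`, which is also why the patched syscall must handle the case where the get fails.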

1 file changed: +12 -11 lines changed

mm/oom_kill.c

Lines changed: 12 additions & 11 deletions
@@ -1150,7 +1150,7 @@ SYSCALL_DEFINE2(process_mrelease, int, pidfd, unsigned int, flags)
 	struct task_struct *task;
 	struct task_struct *p;
 	unsigned int f_flags;
-	bool reap = true;
+	bool reap = false;
 	struct pid *pid;
 	long ret = 0;
 
@@ -1177,15 +1177,15 @@ SYSCALL_DEFINE2(process_mrelease, int, pidfd, unsigned int, flags)
 		goto put_task;
 	}
 
-	mm = p->mm;
-	mmgrab(mm);
-
-	/* If the work has been done already, just exit with success */
-	if (test_bit(MMF_OOM_SKIP, &mm->flags))
-		reap = false;
-	else if (!task_will_free_mem(p)) {
-		reap = false;
-		ret = -EINVAL;
+	if (mmget_not_zero(p->mm)) {
+		mm = p->mm;
+		if (task_will_free_mem(p))
+			reap = true;
+		else {
+			/* Error only if the work has not been done already */
+			if (!test_bit(MMF_OOM_SKIP, &mm->flags))
+				ret = -EINVAL;
+		}
 	}
 	task_unlock(p);
 
@@ -1201,7 +1201,8 @@ SYSCALL_DEFINE2(process_mrelease, int, pidfd, unsigned int, flags)
 	mmap_read_unlock(mm);
 
 drop_mm:
-	mmdrop(mm);
+	if (mm)
+		mmput(mm);
 put_task:
 	put_task_struct(task);
 put_pid:
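
The branch structure added in the second hunk can be summarized as a small decision table. The sketch below restates only what that hunk shows, as a pure userspace function. `struct mrelease_decision` and `model_mrelease_decide` are hypothetical names, and the three boolean inputs stand in for the results of mmget_not_zero(p->mm), task_will_free_mem(p) and test_bit(MMF_OOM_SKIP, &mm->flags). It does not model the reaping step itself or anything outside the hunk.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical summary of the state the patched hunk leaves behind. */
struct mrelease_decision {
	bool reap;              /* proceed to the reaping step */
	long ret;               /* value the syscall would return so far */
	bool holds_mm_user_ref; /* true if drop_mm must call mmput() */
};

/*
 * Inputs stand in for:
 *   got_mm_user_ref : mmget_not_zero(p->mm) succeeded
 *   will_free_mem   : task_will_free_mem(p)
 *   oom_skip_set    : test_bit(MMF_OOM_SKIP, &mm->flags)
 */
static struct mrelease_decision
model_mrelease_decide(bool got_mm_user_ref, bool will_free_mem,
		      bool oom_skip_set)
{
	/* Defaults match the patched initializers: reap = false, ret = 0. */
	struct mrelease_decision d = {
		.reap = false,
		.ret = 0,
		.holds_mm_user_ref = false,
	};

	if (got_mm_user_ref) {
		d.holds_mm_user_ref = true;
		if (will_free_mem)
			d.reap = true;
		else if (!oom_skip_set)
			d.ret = -EINVAL; /* error only if the work is not already done */
	}

	/* In this hunk, reaping and an error return never occur together. */
	assert(!(d.reap && d.ret != 0));
	return d;
}
```

When the reference cannot be taken, the sketch leaves `reap` false and `ret` at 0 and records that no reference is held, which matches the `if (mm)` guard the third hunk adds before mmput.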
