Commit 38e0885

lorenzo-stoakes authored and torvalds committed
mm: check VMA flags to avoid invalid PROT_NONE NUMA balancing
The NUMA balancing logic uses an arch-specific PROT_NONE page table flag
defined by pte_protnone() or pmd_protnone() to mark PTEs or huge page
PMDs respectively as requiring balancing upon a subsequent page fault.
User-defined PROT_NONE memory regions which also have this flag set will
not normally invoke the NUMA balancing code as do_page_fault() will send
a segfault to the process before handle_mm_fault() is even called.

However if access_remote_vm() is invoked to access a PROT_NONE region of
memory, handle_mm_fault() is called via faultin_page() and
__get_user_pages() without any access checks being performed, meaning
the NUMA balancing logic is incorrectly invoked on a non-NUMA memory
region.

A simple means of triggering this problem is to access PROT_NONE mmap'd
memory using /proc/self/mem which reliably results in the NUMA handling
functions being invoked when CONFIG_NUMA_BALANCING is set.

This issue was reported in bugzilla (issue 99101) which includes some
simple repro code.

There are BUG_ON() checks in do_numa_page() and do_huge_pmd_numa_page()
added at commit c0e7cad to avoid accidentally provoking strange
behaviour by attempting to apply NUMA balancing to pages that are in
fact PROT_NONE. The BUG_ON()'s are consistently triggered by the repro.

This patch moves the PROT_NONE check into mm/memory.c rather than
invoking BUG_ON() as faulting in these pages via faultin_page() is a
valid reason for reaching the NUMA check with the PROT_NONE page table
flag set and is therefore not always a bug.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=99101
Reported-by: Trevor Saunders <[email protected]>
Signed-off-by: Lorenzo Stoakes <[email protected]>
Acked-by: Rik van Riel <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Mel Gorman <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
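The access pattern described above (PROT_NONE mmap'd memory reached through /proc/self/mem) can be sketched in userspace roughly as follows. This is not the repro attached to the bugzilla report; the function name read_prot_none_via_proc_mem() is hypothetical, and whether the call actually reaches the NUMA handling code depends on the kernel version and on CONFIG_NUMA_BALANCING. On a kernel that still carries the BUG_ON() checks it may crash, so it should only be tried on a disposable test machine.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: map one anonymous page with PROT_NONE, then read
 * from it through /proc/self/mem instead of dereferencing the pointer.
 * A direct load would segfault; the /proc/self/mem read goes through the
 * kernel's remote-access path instead.
 *
 * Returns the number of bytes read, or -1 if any step failed.
 */
ssize_t read_prot_none_via_proc_mem(void)
{
	long page = sysconf(_SC_PAGESIZE);
	char buf[16];
	ssize_t n = -1;
	char *p;
	int fd;

	if (page <= 0)
		return -1;

	p = mmap(NULL, (size_t)page, PROT_NONE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	fd = open("/proc/self/mem", O_RDONLY);
	if (fd >= 0) {
		/* The file offset is the virtual address to access. */
		n = pread(fd, buf, sizeof(buf), (off_t)(uintptr_t)p);
		close(fd);
	}

	munmap(p, (size_t)page);
	return n;
}
```

A caller would simply invoke read_prot_none_via_proc_mem() and inspect the return value; a failure to open /proc/self/mem (for example in a restricted container) is reported as -1 rather than treated as an error in the sketch itself.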
1 parent 831e45d commit 38e0885

2 files changed, +7 -8 lines changed


mm/huge_memory.c

Lines changed: 0 additions & 3 deletions
@@ -1138,9 +1138,6 @@ int do_huge_pmd_numa_page(struct fault_env *fe, pmd_t pmd)
 	bool was_writable;
 	int flags = 0;
 
-	/* A PROT_NONE fault should not end up here */
-	BUG_ON(!(vma->vm_flags & (VM_READ | VM_EXEC | VM_WRITE)));
-
 	fe->ptl = pmd_lock(vma->vm_mm, fe->pmd);
 	if (unlikely(!pmd_same(pmd, *fe->pmd)))
 		goto out_unlock;

mm/memory.c

Lines changed: 7 additions & 5 deletions
@@ -3351,9 +3351,6 @@ static int do_numa_page(struct fault_env *fe, pte_t pte)
 	bool was_writable = pte_write(pte);
 	int flags = 0;
 
-	/* A PROT_NONE fault should not end up here */
-	BUG_ON(!(vma->vm_flags & (VM_READ | VM_EXEC | VM_WRITE)));
-
 	/*
 	 * The "pte" at this point cannot be used safely without
 	 * validation through pte_unmap_same(). It's of NUMA type but
@@ -3458,6 +3455,11 @@ static int wp_huge_pmd(struct fault_env *fe, pmd_t orig_pmd)
 	return VM_FAULT_FALLBACK;
 }
 
+static inline bool vma_is_accessible(struct vm_area_struct *vma)
+{
+	return vma->vm_flags & (VM_READ | VM_EXEC | VM_WRITE);
+}
+
 /*
  * These routines also need to handle stuff like marking pages dirty
  * and/or accessed for architectures that don't do it in hardware (most
@@ -3524,7 +3526,7 @@ static int handle_pte_fault(struct fault_env *fe)
 	if (!pte_present(entry))
 		return do_swap_page(fe, entry);
 
-	if (pte_protnone(entry))
+	if (pte_protnone(entry) && vma_is_accessible(fe->vma))
 		return do_numa_page(fe, entry);
 
 	fe->ptl = pte_lockptr(fe->vma->vm_mm, fe->pmd);
@@ -3590,7 +3592,7 @@ static int __handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
 
 		barrier();
 		if (pmd_trans_huge(orig_pmd) || pmd_devmap(orig_pmd)) {
-			if (pmd_protnone(orig_pmd))
+			if (pmd_protnone(orig_pmd) && vma_is_accessible(vma))
 				return do_huge_pmd_numa_page(&fe, orig_pmd);
 
 			if ((fe.flags & FAULT_FLAG_WRITE) &&
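
The effect of the new condition in handle_pte_fault() and __handle_mm_fault() can be modelled outside the kernel as a small truth table. The sketch below is a userspace illustration only: struct model_vma and the model_* functions are hypothetical stand-ins, and the three flag values are assumed to mirror the low VM_READ, VM_WRITE and VM_EXEC bits from include/linux/mm.h.

```c
#include <stdbool.h>

/* Permission bits, assumed to match the kernel's low VM_* flag values. */
#define VM_READ  0x1UL
#define VM_WRITE 0x2UL
#define VM_EXEC  0x4UL

/* Hypothetical stand-in for struct vm_area_struct; only the flags matter. */
struct model_vma {
	unsigned long vm_flags;
};

/* Same test as the patch's vma_is_accessible(): any of R/W/X set. */
bool model_vma_is_accessible(const struct model_vma *vma)
{
	return (vma->vm_flags & (VM_READ | VM_EXEC | VM_WRITE)) != 0;
}

/*
 * Models the patched condition: a protnone entry is routed to the NUMA
 * handler only when the VMA itself is accessible. A protnone entry in a
 * user-requested PROT_NONE VMA falls through to the normal fault path.
 */
bool model_takes_numa_path(bool entry_is_protnone,
			   const struct model_vma *vma)
{
	return entry_is_protnone && model_vma_is_accessible(vma);
}
```

With this model, a protnone entry in a readable VMA is treated as a NUMA hinting fault, while the same entry in a VMA with no permission bits set is not, which is the case the removed BUG_ON() checks used to trip on.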
