Commit 35e3517

MiaoheLin authored and akpm00 committed
fork: defer linking file vma until vma is fully initialized

Thorvald reported a WARNING [1]. And the root cause is below race:

 CPU 1                                  CPU 2
 fork                                   hugetlbfs_fallocate
  dup_mmap                               hugetlbfs_punch_hole
   i_mmap_lock_write(mapping);
   vma_interval_tree_insert_after -- Child vma is visible through i_mmap tree.
   i_mmap_unlock_write(mapping);
   hugetlb_dup_vma_private -- Clear vma_lock outside i_mmap_rwsem!
                                         i_mmap_lock_write(mapping);
                                         hugetlb_vmdelete_list
                                          vma_interval_tree_foreach
                                           hugetlb_vma_trylock_write -- Vma_lock is cleared.
   tmp->vm_ops->open -- Alloc new vma_lock outside i_mmap_rwsem!
                                           hugetlb_vma_unlock_write -- Vma_lock is assigned!!!
                                         i_mmap_unlock_write(mapping);

hugetlb_dup_vma_private() and hugetlb_vm_op_open() are called outside
i_mmap_rwsem lock while vma lock can be used in the same time. Fix this
by deferring linking file vma until vma is fully initialized. Those vmas
should be initialized first before they can be used.

Link: https://lkml.kernel.org/r/[email protected]
Fixes: 8d9bfb2 ("hugetlb: add vma based lock for pmd sharing")
Signed-off-by: Miaohe Lin <[email protected]>
Reported-by: Thorvald Natvig <[email protected]>
Closes: https://lore.kernel.org/linux-mm/20240129161735.6gmjsswx62o4pbja@revolver/T/ [1]
Reviewed-by: Jane Chu <[email protected]>
Cc: Christian Brauner <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Kent Overstreet <[email protected]>
Cc: Liam R. Howlett <[email protected]>
Cc: Mateusz Guzik <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Cc: Miaohe Lin <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Peng Zhang <[email protected]>
Cc: Tycho Andersen <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
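
The interleaving described in the commit message can be replayed in a small single-threaded model. This is only a sketch, not kernel code: every `toy_*` name and both `*_order_warns()` functions are hypothetical stand-ins, the "lock" is a plain flag, and the comments mark where the real code would make the vma visible through the i_mmap tree. It assumes, as the race diagram implies, that the trylock step is a successful no-op when no lock is attached and that unlocking a lock that was never taken is what produces the WARNING.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy per-vma lock; "held" models the lock being write-locked. */
struct toy_lock { bool held; };

/* Toy vma: lock == NULL models a cleared vma_lock. */
struct toy_vma {
	struct toy_lock *lock;
	struct toy_lock storage;	/* backing store handed out by toy_open() */
};

/* Models hugetlb_dup_vma_private(): drop the pointer copied from the parent. */
static void toy_dup_private(struct toy_vma *v) { v->lock = NULL; }

/* Models vm_ops->open(): give the child its own, unlocked lock. */
static void toy_open(struct toy_vma *v)
{
	assert(v->lock == NULL);	/* open runs after the private data was cleared */
	v->storage.held = false;
	v->lock = &v->storage;
}

/* Models hugetlb_vma_trylock_write(): a no-op when no lock is attached. */
static void toy_trylock(struct toy_vma *v)
{
	if (v->lock)
		v->lock->held = true;
}

/* Models hugetlb_vma_unlock_write(); returns true if it releases a
 * lock that was never taken, which is the condition that would WARN. */
static bool toy_unlock(struct toy_vma *v)
{
	bool warn = false;

	if (v->lock) {
		warn = !v->lock->held;
		v->lock->held = false;
	}
	return warn;
}

/* Old order: link first, so the walker interleaves with initialization. */
static bool old_order_warns(void)
{
	struct toy_lock parent_lock = { .held = false };
	struct toy_vma child = { .lock = &parent_lock };

	/* CPU 1: child becomes visible in the i_mmap tree here. */
	toy_dup_private(&child);	/* CPU 1: lock cleared */
	toy_trylock(&child);		/* CPU 2: sees no lock, takes nothing */
	toy_open(&child);		/* CPU 1: new lock appears */
	return toy_unlock(&child);	/* CPU 2: unlocks a lock it never took */
}

/* New order: finish initialization, then link. */
static bool new_order_warns(void)
{
	struct toy_lock parent_lock = { .held = false };
	struct toy_vma child = { .lock = &parent_lock };

	toy_dup_private(&child);	/* CPU 1 */
	toy_open(&child);		/* CPU 1 */
	/* CPU 1: child becomes visible only now. */
	toy_trylock(&child);		/* CPU 2: takes the child's lock */
	return toy_unlock(&child);	/* CPU 2: balanced unlock */
}
```

In this model `old_order_warns()` reports the unbalanced unlock and `new_order_warns()` does not, because the walker can no longer observe the vma between the clear and the re-allocation of its lock.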
1 parent 1f73784 commit 35e3517

1 file changed: +17 -16 lines changed

kernel/fork.c

Lines changed: 17 additions & 16 deletions
@@ -714,6 +714,23 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
 		} else if (anon_vma_fork(tmp, mpnt))
 			goto fail_nomem_anon_vma_fork;
 		vm_flags_clear(tmp, VM_LOCKED_MASK);
+		/*
+		 * Copy/update hugetlb private vma information.
+		 */
+		if (is_vm_hugetlb_page(tmp))
+			hugetlb_dup_vma_private(tmp);
+
+		/*
+		 * Link the vma into the MT. After using __mt_dup(), memory
+		 * allocation is not necessary here, so it cannot fail.
+		 */
+		vma_iter_bulk_store(&vmi, tmp);
+
+		mm->map_count++;
+
+		if (tmp->vm_ops && tmp->vm_ops->open)
+			tmp->vm_ops->open(tmp);
+
 		file = tmp->vm_file;
 		if (file) {
 			struct address_space *mapping = file->f_mapping;
@@ -730,25 +747,9 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
 			i_mmap_unlock_write(mapping);
 		}
 
-		/*
-		 * Copy/update hugetlb private vma information.
-		 */
-		if (is_vm_hugetlb_page(tmp))
-			hugetlb_dup_vma_private(tmp);
-
-		/*
-		 * Link the vma into the MT. After using __mt_dup(), memory
-		 * allocation is not necessary here, so it cannot fail.
-		 */
-		vma_iter_bulk_store(&vmi, tmp);
-
-		mm->map_count++;
 		if (!(tmp->vm_flags & VM_WIPEONFORK))
 			retval = copy_page_range(tmp, mpnt);
 
-		if (tmp->vm_ops && tmp->vm_ops->open)
-			tmp->vm_ops->open(tmp);
-
 		if (retval) {
 			mpnt = vma_next(&vmi);
 			goto loop_out;
