Commit e46e7b7

gormanm authored and torvalds committed
mm, page_alloc: recalculate the preferred zoneref if the context can ignore memory policies
The optimistic fast path may use cpuset_current_mems_allowed instead of a NULL nodemask supplied by the caller for cpuset allocations. The preferred zone is calculated on this basis for statistic purposes and as a starting point in the zonelist iterator.

However, if the context can ignore memory policies due to being atomic or being able to ignore watermarks then the starting point in the zonelist iterator is no longer correct. This patch resets the zonelist iterator in the allocator slowpath if the context can ignore memory policies. This will alter the zone used for statistics but only after it is known that it makes sense for that context. Resetting it before entering the slowpath would potentially allow an ALLOC_CPUSET allocation to be accounted for against the wrong zone.

Note that while nodemask is not explicitly set to the original nodemask, it would only have been overwritten if cpuset_enabled() and it was reset before the slowpath was entered.

Link: http://lkml.kernel.org/r/[email protected]
Fixes: c33d6c0 ("mm, page_alloc: avoid looking up the first zone in a zonelist twice")
Signed-off-by: Mel Gorman <[email protected]>
Reported-by: Geert Uytterhoeven <[email protected]>
Tested-by: Geert Uytterhoeven <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
1 parent 0d0bd89 commit e46e7b7

1 file changed, 16 additions and 7 deletions
mm/page_alloc.c

Lines changed: 16 additions & 7 deletions
@@ -3604,6 +3604,17 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	 */
 	alloc_flags = gfp_to_alloc_flags(gfp_mask);
 
+	/*
+	 * Reset the zonelist iterators if memory policies can be ignored.
+	 * These allocations are high priority and system rather than user
+	 * orientated.
+	 */
+	if ((alloc_flags & ALLOC_NO_WATERMARKS) || !(alloc_flags & ALLOC_CPUSET)) {
+		ac->zonelist = node_zonelist(numa_node_id(), gfp_mask);
+		ac->preferred_zoneref = first_zones_zonelist(ac->zonelist,
+					ac->high_zoneidx, ac->nodemask);
+	}
+
 	/* This is the last chance, in general, before the goto nopage. */
 	page = get_page_from_freelist(gfp_mask, order,
 				alloc_flags & ~ALLOC_NO_WATERMARKS, ac);
@@ -3612,12 +3623,6 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 
 	/* Allocate without watermarks if the context allows */
 	if (alloc_flags & ALLOC_NO_WATERMARKS) {
-		/*
-		 * Ignore mempolicies if ALLOC_NO_WATERMARKS on the grounds
-		 * the allocation is high priority and these type of
-		 * allocations are system rather than user orientated
-		 */
-		ac->zonelist = node_zonelist(numa_node_id(), gfp_mask);
 		page = get_page_from_freelist(gfp_mask, order,
 						ALLOC_NO_WATERMARKS, ac);
 		if (page)
@@ -3816,7 +3821,11 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order,
 	/* Dirty zone balancing only done in the fast path */
 	ac.spread_dirty_pages = (gfp_mask & __GFP_WRITE);
 
-	/* The preferred zone is used for statistics later */
+	/*
+	 * The preferred zone is used for statistics but crucially it is
+	 * also used as the starting point for the zonelist iterator. It
+	 * may get reset for allocations that ignore memory policies.
+	 */
 	ac.preferred_zoneref = first_zones_zonelist(ac.zonelist,
 					ac.high_zoneidx, ac.nodemask);
 	if (!ac.preferred_zoneref) {
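
The effect of the slowpath change can be modelled outside the kernel. The following is a minimal user-space sketch, not kernel code: `struct zoneref_model`, `can_ignore_mempolicy()`, `first_usable_zoneref()` and `slowpath_start()` are hypothetical names introduced here for illustration, the flag values are placeholders (the kernel defines the real ones in mm/internal.h), and the nodemask is reduced to a single-word bitmask. It only aims to show why a starting point computed under a cpuset mask can be stale once the context is allowed to ignore memory policies.

```c
#include <stdbool.h>
#include <stddef.h>

/* Placeholder flag values for this sketch only (assumption). */
#define ALLOC_NO_WATERMARKS 0x04u
#define ALLOC_CPUSET        0x40u

/* Hypothetical, simplified stand-in for one zonelist entry. */
struct zoneref_model {
	int node;     /* NUMA node that owns the zone */
	int zone_idx; /* zone type index */
};

/* Mirrors the condition added to the slowpath by this patch. */
static bool can_ignore_mempolicy(unsigned int alloc_flags)
{
	return (alloc_flags & ALLOC_NO_WATERMARKS) ||
	       !(alloc_flags & ALLOC_CPUSET);
}

/*
 * Simplified model of choosing the iterator starting point: return the
 * index of the first entry at or below high_zoneidx whose node is in
 * nodemask, or -1 if there is none. A NULL nodemask allows every node.
 */
static int first_usable_zoneref(const struct zoneref_model *zl, size_t n,
				int high_zoneidx,
				const unsigned long *nodemask)
{
	for (size_t i = 0; i < n; i++) {
		if (zl[i].zone_idx > high_zoneidx)
			continue;
		if (nodemask && !(*nodemask & (1UL << zl[i].node)))
			continue;
		return (int)i;
	}
	return -1;
}

/*
 * Model of the patched slowpath: recompute the starting point from the
 * caller's nodemask when policies can be ignored, otherwise keep the
 * value the fast path calculated.
 */
static int slowpath_start(const struct zoneref_model *zl, size_t n,
			  int high_zoneidx,
			  const unsigned long *caller_nodemask,
			  unsigned int alloc_flags, int fastpath_start)
{
	if (can_ignore_mempolicy(alloc_flags))
		return first_usable_zoneref(zl, n, high_zoneidx,
					    caller_nodemask);
	return fastpath_start;
}
```

As a usage example, take a four-entry zonelist covering node 0 and then node 1, and a cpuset mask that allows only node 1. The fast path starting index is 2. Passing that index to `slowpath_start()` with a NULL caller nodemask and `ALLOC_CPUSET` clear (an atomic context in this model) yields 0, while passing it with `ALLOC_CPUSET` set keeps 2.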
