Commit 97a16fc

gormanm authored and torvalds committed
mm, page_alloc: only enforce watermarks for order-0 allocations
The primary purpose of watermarks is to ensure that reclaim can always make forward progress in PF_MEMALLOC context (kswapd and direct reclaim). These assume that order-0 allocations are all that is necessary for forward progress.

High-order watermarks serve a different purpose. Kswapd had no high-order awareness before they were introduced (https://lkml.kernel.org/r/[email protected]). This was particularly important when there were high-order atomic requests. The watermarks both gave kswapd awareness and made a reserve for those atomic requests.

There are two important side-effects of this. The most important is that a non-atomic high-order request can fail even though free pages are available and the order-0 watermarks are ok. The second is that high-order watermark checks are expensive as the free list counts up to the requested order must be examined.

With the introduction of MIGRATE_HIGHATOMIC it is no longer necessary to have high-order watermarks. Kswapd and compaction still need high-order awareness, which is handled by checking that at least one suitable high-order page is free.

With the patch applied, there was little difference in the allocation failure rates as the atomic reserves are small relative to the number of allocation attempts. The expected impact is that there will never be an allocation failure report that shows suitable pages on the free lists.

The one potential side-effect of this is that in a vanilla kernel, the watermark checks may have kept a free page for an atomic allocation. Now, we are 100% relying on the HighAtomic reserves and an early allocation to have allocated them. If the first high-order atomic allocation occurs after the system is already heavily fragmented then it'll fail.
[[email protected]: simplify __zone_watermark_ok(), per Vlastimil]
Signed-off-by: Mel Gorman <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Acked-by: Johannes Weiner <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Vitaly Wool <[email protected]>
Cc: Rik van Riel <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
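The difference the patch makes can be illustrated with a toy model (hypothetical and heavily simplified, not kernel code: `nr_free[o]` stands in for `z->free_area[o].nr_free`, and the highatomic reserve, `lowmem_reserve`, CMA, and migratetype handling are all ignored). The old check discounted pages below the requested order and halved the watermark at each step; the new check enforces only the order-0 watermark and then looks for at least one free page of a suitable order.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_ORDER 11 /* matches the kernel's default configuration */

/* Pre-patch model: walk the orders below the request, discounting
 * pages too small to satisfy it and halving the watermark per order. */
static bool old_watermark_ok(const unsigned long nr_free[MAX_ORDER],
                             unsigned int order, long free_pages, long min)
{
    for (unsigned int o = 0; o < order; o++) {
        /* At the next order, this order's pages become unavailable. */
        free_pages -= (long)(nr_free[o] << o);
        min >>= 1; /* require fewer higher-order pages to be free */
        if (free_pages <= min)
            return false;
    }
    return true;
}

/* Post-patch model: check the order-0 watermark, then require at
 * least one free page of a suitable order. */
static bool new_watermark_ok(const unsigned long nr_free[MAX_ORDER],
                             unsigned int order, long free_pages, long min)
{
    if (free_pages <= min)
        return false; /* order-0 watermark not met; nothing can proceed */
    if (order == 0)
        return true;  /* order-0 request: the watermark is fine */
    for (unsigned int o = order; o < MAX_ORDER; o++)
        if (nr_free[o] != 0)
            return true; /* one suitable high-order page exists */
    return false;
}
```

With 100 free order-0 pages and a single free order-3 block (so `free_pages` = 108 and a watermark of 50), the old model rejects an order-3 request: discounting the order-0 pages leaves only 8 free pages against a halved watermark of 25. The new model accepts it, since the order-0 watermark is met and a suitable block exists. This is exactly the "allocation failure despite suitable pages on the free lists" scenario the commit message describes.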
1 parent 0aaa29a commit 97a16fc

1 file changed: +39 −14 lines


mm/page_alloc.c

Lines changed: 39 additions & 14 deletions
@@ -2322,16 +2322,18 @@ static inline bool should_fail_alloc_page(gfp_t gfp_mask, unsigned int order)
 #endif /* CONFIG_FAIL_PAGE_ALLOC */
 
 /*
- * Return true if free pages are above 'mark'. This takes into account the order
- * of the allocation.
+ * Return true if free base pages are above 'mark'. For high-order checks it
+ * will return true of the order-0 watermark is reached and there is at least
+ * one free page of a suitable size. Checking now avoids taking the zone lock
+ * to check in the allocation paths if no pages are free.
  */
 static bool __zone_watermark_ok(struct zone *z, unsigned int order,
 			unsigned long mark, int classzone_idx, int alloc_flags,
 			long free_pages)
 {
 	long min = mark;
 	int o;
-	long free_cma = 0;
+	const int alloc_harder = (alloc_flags & ALLOC_HARDER);
 
 	/* free_pages may go negative - that's OK */
 	free_pages -= (1 << order) - 1;
@@ -2344,30 +2346,53 @@ static bool __zone_watermark_ok(struct zone *z, unsigned int order,
 	 * the high-atomic reserves. This will over-estimate the size of the
 	 * atomic reserve but it avoids a search.
 	 */
-	if (likely(!(alloc_flags & ALLOC_HARDER)))
+	if (likely(!alloc_harder))
 		free_pages -= z->nr_reserved_highatomic;
 	else
 		min -= min / 4;
 
 #ifdef CONFIG_CMA
 	/* If allocation can't use CMA areas don't use free CMA pages */
 	if (!(alloc_flags & ALLOC_CMA))
-		free_cma = zone_page_state(z, NR_FREE_CMA_PAGES);
+		free_pages -= zone_page_state(z, NR_FREE_CMA_PAGES);
 #endif
 
-	if (free_pages - free_cma <= min + z->lowmem_reserve[classzone_idx])
+	/*
+	 * Check watermarks for an order-0 allocation request. If these
+	 * are not met, then a high-order request also cannot go ahead
+	 * even if a suitable page happened to be free.
+	 */
+	if (free_pages <= min + z->lowmem_reserve[classzone_idx])
 		return false;
-	for (o = 0; o < order; o++) {
-		/* At the next order, this order's pages become unavailable */
-		free_pages -= z->free_area[o].nr_free << o;
 
-		/* Require fewer higher order pages to be free */
-		min >>= 1;
+	/* If this is an order-0 request then the watermark is fine */
+	if (!order)
+		return true;
+
+	/* For a high-order request, check at least one suitable page is free */
+	for (o = order; o < MAX_ORDER; o++) {
+		struct free_area *area = &z->free_area[o];
+		int mt;
+
+		if (!area->nr_free)
+			continue;
+
+		if (alloc_harder)
+			return true;
 
-		if (free_pages <= min)
-			return false;
+		for (mt = 0; mt < MIGRATE_PCPTYPES; mt++) {
+			if (!list_empty(&area->free_list[mt]))
+				return true;
+		}
+
+#ifdef CONFIG_CMA
+		if ((alloc_flags & ALLOC_CMA) &&
+		    !list_empty(&area->free_list[MIGRATE_CMA])) {
+			return true;
+		}
+#endif
 	}
-	return true;
+	return false;
 }
 
 bool zone_watermark_ok(struct zone *z, unsigned int order, unsigned long mark,
