Commit 79f5f8f

osalvadorvilardaga authored and torvalds committed
mm,hwpoison: rework soft offline for in-use pages
This patch changes the way we set and handle in-use poisoned pages. Until
now, poisoned pages were released to the buddy allocator, trusting that
the checks that take place at allocation time would act as a safety net
and would skip that page.

This has proved to be wrong, as we have some pfn walkers out there, like
compaction, that only care whether the page is in a buddy freelist.

Although this might not be the only user, having poisoned pages in the
buddy allocator seems a bad idea, as we should only have free pages that
are ready and meant to be used as such.

Before explaining the approach taken, let us break down the kinds of pages
we can soft offline.

- Anonymous THP (after the split, they end up being 4K pages)
- Hugetlb
- Order-0 pages (that can be either migrated or invalidated)

* Normal pages (order-0 and anon-THP)

  - If they are clean and unmapped page cache pages, we invalidate
    them by means of invalidate_inode_page().
  - If they are mapped/dirty, we do the isolate-and-migrate dance.

Either way, do not call put_page directly from those paths. Instead, we
keep the page and send it to page_handle_poison to perform the right
handling.

page_handle_poison sets the HWPoison flag and does the last put_page.

Down the chain, we placed a check for HWPoison pages in
free_pages_prepare that just skips any poisoned page, so those pages
do not end up in any pcplist/freelist.

After that, we set the refcount on the page to 1 and we increment
the poisoned pages counter.

If we see that the check in free_pages_prepare creates trouble, we can
always do what we do for free pages:

  - wait until the page hits buddy's freelists
  - take it off, and flag it

The downside of the above approach is that we could race with an
allocation, so by the time we want to take the page off the buddy, the
page has already been allocated and we cannot soft offline it.
But the user could always retry it.

* Hugetlb pages

  - We isolate-and-migrate them

After the migration has been successful, we call dissolve_free_huge_page,
and we set HWPoison on the page if we succeed.
Hugetlb has a slightly different handling though.

While for non-hugetlb pages we cared about closing the race with an
allocation, doing so for hugetlb pages requires quite some additional
and intrusive code (we would need to hook in free_huge_page and some other
places).
So I decided to not make the code overly complicated and just fail
normally if the page was allocated in the meantime.

We can always build on top of this.

As a bonus, because of the way we now handle in-use pages, we no longer
need the put-as-isolation-migratetype dance, which was guarding against
poisoned pages ending up in pcplists.

Signed-off-by: Oscar Salvador <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Naoya Horiguchi <[email protected]>
Cc: "Aneesh Kumar K.V" <[email protected]>
Cc: Aneesh Kumar K.V <[email protected]>
Cc: Aristeu Rozanski <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Dmitry Yakunin <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Qian Cai <[email protected]>
Cc: Tony Luck <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
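
The in-use path described in the message (set the flag, drop the last reference, let free_pages_prepare refuse the page, then pin it again) can be modelled in userspace. What follows is only a rough sketch under stated assumptions: struct toy_page and every toy_* name are hypothetical stand-ins invented for illustration, there is no locking, and only order-0 pages are modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page: only what this sketch needs. */
struct toy_page {
	int refcount;      /* models the page reference count */
	bool hwpoison;     /* models PG_hwpoison */
	bool on_freelist;  /* set once the page reaches a pcplist/buddy list */
};

/* Models the num_poisoned_pages counter. */
static long toy_num_poisoned_pages;

/*
 * Models the new check in free_pages_prepare() for order-0 pages:
 * returning false means the page must not be handed to any freelist.
 */
static bool toy_free_pages_prepare(const struct toy_page *page)
{
	return !page->hwpoison;
}

/* Models put_page(): dropping the last reference enters the free path. */
static void toy_put_page(struct toy_page *page)
{
	assert(page->refcount > 0);
	if (--page->refcount == 0 && toy_free_pages_prepare(page))
		page->on_freelist = true;
}

/* Models page_handle_poison(page, release) as shown in the diff. */
static void toy_page_handle_poison(struct toy_page *page, bool release)
{
	page->hwpoison = true;       /* flag first, so the free path sees it */
	if (release)
		toy_put_page(page);  /* last put: refused by the free path */
	page->refcount++;            /* refcount ends up at 1 */
	toy_num_poisoned_pages++;
}
```

In this model, a page whose only remaining reference is ours ends up flagged, with a refcount of 1, and never reaches a freelist, which is the property the message argues for.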
1 parent 06be6ff commit 79f5f8f

4 files changed, 28 additions and 70 deletions

include/linux/page-flags.h

Lines changed: 0 additions & 5 deletions
@@ -431,14 +431,9 @@ PAGEFLAG_FALSE(Uncached)
 PAGEFLAG(HWPoison, hwpoison, PF_ANY)
 TESTSCFLAG(HWPoison, hwpoison, PF_ANY)
 #define __PG_HWPOISON (1UL << PG_hwpoison)
-extern bool set_hwpoison_free_buddy_page(struct page *page);
 extern bool take_page_off_buddy(struct page *page);
 #else
 PAGEFLAG_FALSE(HWPoison)
-static inline bool set_hwpoison_free_buddy_page(struct page *page)
-{
-	return 0;
-}
 #define __PG_HWPOISON 0
 #endif

mm/memory-failure.c

Lines changed: 14 additions & 29 deletions
@@ -65,9 +65,11 @@ int sysctl_memory_failure_recovery __read_mostly = 1;

 atomic_long_t num_poisoned_pages __read_mostly = ATOMIC_LONG_INIT(0);

-static void page_handle_poison(struct page *page)
+static void page_handle_poison(struct page *page, bool release)
 {
 	SetPageHWPoison(page);
+	if (release)
+		put_page(page);
 	page_ref_inc(page);
 	num_poisoned_pages_inc();
 }
@@ -1765,19 +1767,13 @@ static int soft_offline_huge_page(struct page *page, int flags)
 			ret = -EIO;
 	} else {
 		/*
-		 * We set PG_hwpoison only when the migration source hugepage
-		 * was successfully dissolved, because otherwise hwpoisoned
-		 * hugepage remains on free hugepage list, then userspace will
-		 * find it as SIGBUS by allocation failure. That's not expected
-		 * in soft-offlining.
+		 * We set PG_hwpoison only when we were able to take the page
+		 * off the buddy.
 		 */
-		ret = dissolve_free_huge_page(page);
-		if (!ret) {
-			if (set_hwpoison_free_buddy_page(page))
-				num_poisoned_pages_inc();
-			else
-				ret = -EBUSY;
-		}
+		if (!dissolve_free_huge_page(page) && take_page_off_buddy(page))
+			page_handle_poison(page, false);
+		else
+			ret = -EBUSY;
 	}
 	return ret;
 }
@@ -1812,10 +1808,8 @@ static int __soft_offline_page(struct page *page, int flags)
 	 * would need to fix isolation locking first.
 	 */
 	if (ret == 1) {
-		put_page(page);
 		pr_info("soft_offline: %#lx: invalidated\n", pfn);
-		SetPageHWPoison(page);
-		num_poisoned_pages_inc();
+		page_handle_poison(page, true);
 		return 0;
 	}

@@ -1846,7 +1840,9 @@ static int __soft_offline_page(struct page *page, int flags)
 		list_add(&page->lru, &pagelist);
 		ret = migrate_pages(&pagelist, new_page, NULL, MPOL_MF_MOVE_ALL,
 					MIGRATE_SYNC, MR_MEMORY_FAILURE);
-		if (ret) {
+		if (!ret) {
+			page_handle_poison(page, true);
+		} else {
 			if (!list_empty(&pagelist))
 				putback_movable_pages(&pagelist);

@@ -1865,27 +1861,16 @@ static int __soft_offline_page(struct page *page, int flags)
 static int soft_offline_in_use_page(struct page *page, int flags)
 {
 	int ret;
-	int mt;
 	struct page *hpage = compound_head(page);

 	if (!PageHuge(page) && PageTransHuge(hpage))
 		if (try_to_split_thp_page(page, "soft offline") < 0)
 			return -EBUSY;

-	/*
-	 * Setting MIGRATE_ISOLATE here ensures that the page will be linked
-	 * to free list immediately (not via pcplist) when released after
-	 * successful page migration. Otherwise we can't guarantee that the
-	 * page is really free after put_page() returns, so
-	 * set_hwpoison_free_buddy_page() highly likely fails.
-	 */
-	mt = get_pageblock_migratetype(page);
-	set_pageblock_migratetype(page, MIGRATE_ISOLATE);
 	if (PageHuge(page))
 		ret = soft_offline_huge_page(page, flags);
 	else
 		ret = __soft_offline_page(page, flags);
-	set_pageblock_migratetype(page, mt);
 	return ret;
 }

@@ -1894,7 +1879,7 @@ static int soft_offline_free_page(struct page *page)
 	int rc = -EBUSY;

 	if (!dissolve_free_huge_page(page) && take_page_off_buddy(page)) {
-		page_handle_poison(page);
+		page_handle_poison(page, false);
 		rc = 0;
 	}

mm/migrate.c

Lines changed: 3 additions & 8 deletions
@@ -1223,16 +1223,11 @@ static int unmap_and_move(new_page_t get_new_page,
 	 * we want to retry.
 	 */
 	if (rc == MIGRATEPAGE_SUCCESS) {
-		put_page(page);
-		if (reason == MR_MEMORY_FAILURE) {
+		if (reason != MR_MEMORY_FAILURE)
 			/*
-			 * Set PG_HWPoison on just freed page
-			 * intentionally. Although it's rather weird,
-			 * it's how HWPoison flag works at the moment.
+			 * We release the page in page_handle_poison.
 			 */
-			if (set_hwpoison_free_buddy_page(page))
-				num_poisoned_pages_inc();
-		}
+			put_page(page);
 	} else {
 		if (rc != -EAGAIN) {
 			if (likely(!__PageMovable(page))) {

mm/page_alloc.c

Lines changed: 11 additions & 28 deletions
@@ -1174,6 +1174,17 @@ static __always_inline bool free_pages_prepare(struct page *page,

 	trace_mm_page_free(page, order);

+	if (unlikely(PageHWPoison(page)) && !order) {
+		/*
+		 * Do not let hwpoison pages hit pcplists/buddy
+		 * Untie memcg state and reset page's owner
+		 */
+		if (memcg_kmem_enabled() && PageKmemcg(page))
+			__memcg_kmem_uncharge_page(page, order);
+		reset_page_owner(page, order);
+		return false;
+	}
+
 	/*
 	 * Check tail pages before head page information is cleared to
 	 * avoid checking PageCompound for order-0 pages.
@@ -8844,32 +8855,4 @@ bool take_page_off_buddy(struct page *page)
 	spin_unlock_irqrestore(&zone->lock, flags);
 	return ret;
 }
-
-/*
- * Set PG_hwpoison flag if a given page is confirmed to be a free page. This
- * test is performed under the zone lock to prevent a race against page
- * allocation.
- */
-bool set_hwpoison_free_buddy_page(struct page *page)
-{
-	struct zone *zone = page_zone(page);
-	unsigned long pfn = page_to_pfn(page);
-	unsigned long flags;
-	unsigned int order;
-	bool hwpoisoned = false;
-
-	spin_lock_irqsave(&zone->lock, flags);
-	for (order = 0; order < MAX_ORDER; order++) {
-		struct page *page_head = page - (pfn & ((1 << order) - 1));
-
-		if (PageBuddy(page_head) && page_order(page_head) >= order) {
-			if (!TestSetPageHWPoison(page))
-				hwpoisoned = true;
-			break;
-		}
-	}
-	spin_unlock_irqrestore(&zone->lock, flags);
-
-	return hwpoisoned;
-}
 #endif
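
The removed set_hwpoison_free_buddy_page() decided whether a page was free by walking every order and checking whether the naturally aligned block head covering the pfn was a buddy page of at least that order. The index arithmetic is easy to misread, so here is a small userspace sketch of it. The toy_* names, the flat array standing in for the memory map, and the constants are all assumptions made for illustration, and the zone lock is not modelled.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_ORDER 4   /* hypothetical, far smaller than the kernel's */
#define TOY_NR_PAGES  16

/* Hypothetical stand-in for struct page. */
struct toy_page {
	bool buddy;          /* models PageBuddy(): head of a free block */
	unsigned int order;  /* models page_order(): valid when buddy is set */
};

/* pfn of the aligned block head that would contain `pfn` at `order`. */
static unsigned long toy_block_head_pfn(unsigned long pfn, unsigned int order)
{
	return pfn - (pfn & ((1UL << order) - 1));
}

/* Mirrors the removed loop: is `pfn` covered by some free buddy block? */
static bool toy_page_is_free(const struct toy_page *map, unsigned long pfn)
{
	assert(pfn < TOY_NR_PAGES);
	for (unsigned int order = 0; order < TOY_MAX_ORDER; order++) {
		const struct toy_page *head = &map[toy_block_head_pfn(pfn, order)];

		/* The head's block must be large enough to reach `pfn`. */
		if (head->buddy && head->order >= order)
			return true;
	}
	return false;
}
```

With an order-2 free block headed at pfn 8, pfns 8 through 11 are reported free in this model, while pfn 13 is not: its order-3 head is pfn 8, but that block only spans order 2.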
