Commit 60f272f

pizhenwei authored and akpm00 committed
mm/memory-failure.c: move clear_hwpoisoned_pages
Patch series "memory-failure: fix hwpoison_filter", v2.

As is well known, the memory failure mechanism handles memory-corruption events and tries to send SIGBUS to the user process that uses the corrupted page. In the virtualization case, QEMU catches the SIGBUS and tries to inject an MCE into the guest, and the guest handles the memory failure again. Thus the guest sees only minimal effect from the hardware memory corruption.

The further steps I'm working on:

1. Modify the code to decrease poisoned pages in a single place ("mm/memory-failure.c: simplify num_poisoned_pages_dec" in this series).

2. Use page_handle_poison() to handle SetPageHWPoison() and num_poisoned_pages_inc() together. It would be best to call num_poisoned_pages_inc() in a single place too.

3. Introduce a memory failure notifier list in memory-failure.c: notify the corrupted PFN to whoever registers on this list. Once [1] and [2] are complete, [3] becomes quite easy (just call the notifier list after increasing the poisoned page count).

4. Introduce a memory-recover VQ for the memory balloon device, and register on the memory failure notifier list. While the guest kernel handles a memory failure, the balloon device gets notified through the list and tells the host to recover the corrupted PFN (GPA) via the new VQ.

5. The host side remaps the corrupted page (HVA) and tells the guest side to unpoison the PFN (GPA). The guest then fixes the corrupted page (GPA) dynamically.

This patch (of 5):

clear_hwpoisoned_pages() clears the HWPoison flag and decreases the number of poisoned pages; this actually works as part of memory failure. Move this function from sparse.c to memory-failure.c, so that sparse.c finally contains no CONFIG_MEMORY_FAILURE.

Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: zhenwei pi <[email protected]>
Acked-by: Naoya Horiguchi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
1 parent cd8c1fd commit 60f272f

File tree

3 files changed: +32 additions, −27 deletions

mm/internal.h

Lines changed: 11 additions & 0 deletions
@@ -711,6 +711,9 @@ static inline int find_next_best_node(int node, nodemask_t *used_node_mask)
 }
 #endif
 
+/*
+ * mm/memory-failure.c
+ */
 extern int hwpoison_filter(struct page *p);
 
 extern u32 hwpoison_filter_dev_major;
@@ -720,6 +723,14 @@ extern u64 hwpoison_filter_flags_value;
 extern u64 hwpoison_filter_memcg;
 extern u32 hwpoison_filter_enable;
 
+#ifdef CONFIG_MEMORY_FAILURE
+void clear_hwpoisoned_pages(struct page *memmap, int nr_pages);
+#else
+static inline void clear_hwpoisoned_pages(struct page *memmap, int nr_pages)
+{
+}
+#endif
+
 extern unsigned long __must_check vm_mmap_pgoff(struct file *, unsigned long,
 	unsigned long, unsigned long,
 	unsigned long, unsigned long);

mm/memory-failure.c

Lines changed: 21 additions & 0 deletions
@@ -2392,3 +2392,24 @@ int soft_offline_page(unsigned long pfn, int flags)
 
 	return ret;
 }
+
+void clear_hwpoisoned_pages(struct page *memmap, int nr_pages)
+{
+	int i;
+
+	/*
+	 * A further optimization is to have per section refcounted
+	 * num_poisoned_pages.  But that would need more space per memmap, so
+	 * for now just do a quick global check to speed up this routine in the
+	 * absence of bad pages.
+	 */
+	if (atomic_long_read(&num_poisoned_pages) == 0)
+		return;
+
+	for (i = 0; i < nr_pages; i++) {
+		if (PageHWPoison(&memmap[i])) {
+			num_poisoned_pages_dec();
+			ClearPageHWPoison(&memmap[i]);
+		}
+	}
+}

mm/sparse.c

Lines changed: 0 additions & 27 deletions
@@ -922,33 +922,6 @@ int __meminit sparse_add_section(int nid, unsigned long start_pfn,
 	return 0;
 }
 
-#ifdef CONFIG_MEMORY_FAILURE
-static void clear_hwpoisoned_pages(struct page *memmap, int nr_pages)
-{
-	int i;
-
-	/*
-	 * A further optimization is to have per section refcounted
-	 * num_poisoned_pages.  But that would need more space per memmap, so
-	 * for now just do a quick global check to speed up this routine in the
-	 * absence of bad pages.
-	 */
-	if (atomic_long_read(&num_poisoned_pages) == 0)
-		return;
-
-	for (i = 0; i < nr_pages; i++) {
-		if (PageHWPoison(&memmap[i])) {
-			num_poisoned_pages_dec();
-			ClearPageHWPoison(&memmap[i]);
-		}
-	}
-}
-#else
-static inline void clear_hwpoisoned_pages(struct page *memmap, int nr_pages)
-{
-}
-#endif
-
 void sparse_remove_section(struct mem_section *ms, unsigned long pfn,
 		unsigned long nr_pages, unsigned long map_offset,
 		struct vmem_altmap *altmap)