
Commit ac90c56

cmzxoakpm00
authored and committed
mm/ksm: refactor out try_to_merge_with_zero_page()
Patch series "mm/ksm: cmp_and_merge_page() optimizations and cleanup", v2.

This series mainly optimizes cmp_and_merge_page() to have more efficient separate code flows for ksm pages and non-ksm anon pages:

- ksm page: obviously doesn't need to calculate the checksum.
- anon page: doesn't need to search the stable tree if it is changing fast, and tries to merge with the zero page before searching for a ksm page on the stable tree.

Please see patch-2 for details.

Patch-3 is a cleanup, and also a little optimization, of the chain()/chain_prune() interfaces, which made stable_tree_search()/stable_tree_insert() overly complex.

I have done simple testing using "hackbench -g 1 -l 300000" (maybe I need to use a better workload) on my machine, and have seen a small CPU usage decrease for ksmd and some improvement in cmp_and_merge_page() latency: the latency of cmp_and_merge_page() when handling non-ksm anon pages has been improved.

This patch (of 3):

In preparation for later changes, refactor out a new function called try_to_merge_with_zero_page(), which tries to merge the page with the zero page.

Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Chengming Zhou <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Stefan Roesch <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
1 parent 003af99 commit ac90c56

File tree

2 files changed: +40 -31 lines

mm/hugetlb.c

Lines changed: 0 additions & 1 deletion

@@ -2666,7 +2666,6 @@ static int gather_surplus_pages(struct hstate *h, long delta)
 retry:
        spin_unlock_irq(&hugetlb_lock);
        for (i = 0; i < needed; i++) {
-               folio = NULL;
                for_each_node_mask(node, cpuset_current_mems_allowed) {
                        if (!mbind_nodemask || node_isset(node, *mbind_nodemask)) {
                                folio = alloc_surplus_hugetlb_folio(h, htlb_alloc_mask(h),

mm/ksm.c

Lines changed: 40 additions & 30 deletions
@@ -1527,6 +1527,44 @@ static int try_to_merge_one_page(struct vm_area_struct *vma,
        return err;
 }

+/*
+ * This function returns 0 if the pages were merged or if they are
+ * no longer merging candidates (e.g., VMA stale), -EFAULT otherwise.
+ */
+static int try_to_merge_with_zero_page(struct ksm_rmap_item *rmap_item,
+                                      struct page *page)
+{
+       struct mm_struct *mm = rmap_item->mm;
+       int err = -EFAULT;
+
+       /*
+        * Same checksum as an empty page. We attempt to merge it with the
+        * appropriate zero page if the user enabled this via sysfs.
+        */
+       if (ksm_use_zero_pages && (rmap_item->oldchecksum == zero_checksum)) {
+               struct vm_area_struct *vma;
+
+               mmap_read_lock(mm);
+               vma = find_mergeable_vma(mm, rmap_item->address);
+               if (vma) {
+                       err = try_to_merge_one_page(vma, page,
+                                       ZERO_PAGE(rmap_item->address));
+                       trace_ksm_merge_one_page(
+                               page_to_pfn(ZERO_PAGE(rmap_item->address)),
+                               rmap_item, mm, err);
+               } else {
+                       /*
+                        * If the vma is out of date, we do not need to
+                        * continue.
+                        */
+                       err = 0;
+               }
+               mmap_read_unlock(mm);
+       }
+
+       return err;
+}
+
 /*
  * try_to_merge_with_ksm_page - like try_to_merge_two_pages,
  * but no new kernel page is allocated: kpage must already be a ksm page.
@@ -2302,7 +2340,6 @@ static void stable_tree_append(struct ksm_rmap_item *rmap_item,
  */
 static void cmp_and_merge_page(struct page *page, struct ksm_rmap_item *rmap_item)
 {
-       struct mm_struct *mm = rmap_item->mm;
        struct ksm_rmap_item *tree_rmap_item;
        struct page *tree_page = NULL;
        struct ksm_stable_node *stable_node;
@@ -2371,36 +2408,9 @@ static void cmp_and_merge_page(struct page *page, struct ksm_rmap_item *rmap_ite
                return;
        }

-       /*
-        * Same checksum as an empty page. We attempt to merge it with the
-        * appropriate zero page if the user enabled this via sysfs.
-        */
-       if (ksm_use_zero_pages && (checksum == zero_checksum)) {
-               struct vm_area_struct *vma;
+       if (!try_to_merge_with_zero_page(rmap_item, page))
+               return;

-               mmap_read_lock(mm);
-               vma = find_mergeable_vma(mm, rmap_item->address);
-               if (vma) {
-                       err = try_to_merge_one_page(vma, page,
-                                       ZERO_PAGE(rmap_item->address));
-                       trace_ksm_merge_one_page(
-                               page_to_pfn(ZERO_PAGE(rmap_item->address)),
-                               rmap_item, mm, err);
-               } else {
-                       /*
-                        * If the vma is out of date, we do not need to
-                        * continue.
-                        */
-                       err = 0;
-               }
-               mmap_read_unlock(mm);
-               /*
-                * In case of failure, the page was not really empty, so we
-                * need to continue. Otherwise we're done.
-                */
-               if (!err)
-                       return;
-       }
        tree_rmap_item =
                unstable_tree_search_insert(rmap_item, page, &tree_page);
        if (tree_rmap_item) {
