Commit 1c610d5

aryabinin authored and torvalds committed
mm/vmscan: wake up flushers for legacy cgroups too
Commit 726d061 ("mm: vmscan: kick flushers when we encounter dirty
pages on the LRU") added flusher invocation to shrink_inactive_list()
when many dirty pages on the LRU are encountered.

However, shrink_inactive_list() doesn't wake up flushers for legacy
cgroup reclaim, so the next commit bbef938 ("mm: vmscan: remove old
flusher wakeup from direct reclaim path") removed the only source of
flusher wakeups in the legacy mem cgroup reclaim path.

This leads to premature OOM if there are too many dirty pages in the
cgroup:

    # mkdir /sys/fs/cgroup/memory/test
    # echo $$ > /sys/fs/cgroup/memory/test/tasks
    # echo 50M > /sys/fs/cgroup/memory/test/memory.limit_in_bytes
    # dd if=/dev/zero of=tmp_file bs=1M count=100
    Killed

    dd invoked oom-killer: gfp_mask=0x14000c0(GFP_KERNEL), nodemask=(null), order=0, oom_score_adj=0

    Call Trace:
     dump_stack+0x46/0x65
     dump_header+0x6b/0x2ac
     oom_kill_process+0x21c/0x4a0
     out_of_memory+0x2a5/0x4b0
     mem_cgroup_out_of_memory+0x3b/0x60
     mem_cgroup_oom_synchronize+0x2ed/0x330
     pagefault_out_of_memory+0x24/0x54
     __do_page_fault+0x521/0x540
     page_fault+0x45/0x50

    Task in /test killed as a result of limit of /test
    memory: usage 51200kB, limit 51200kB, failcnt 73
    memory+swap: usage 51200kB, limit 9007199254740988kB, failcnt 0
    kmem: usage 296kB, limit 9007199254740988kB, failcnt 0
    Memory cgroup stats for /test: cache:49632KB rss:1056KB rss_huge:0KB shmem:0KB
            mapped_file:0KB dirty:49500KB writeback:0KB swap:0KB inactive_anon:0KB
            active_anon:1168KB inactive_file:24760KB active_file:24960KB unevictable:0KB
    Memory cgroup out of memory: Kill process 3861 (bash) score 88 or sacrifice child
    Killed process 3876 (dd) total-vm:8484kB, anon-rss:1052kB, file-rss:1720kB, shmem-rss:0kB
    oom_reaper: reaped process 3876 (dd), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB

Wake up flushers in legacy cgroup reclaim too.

Link: http://lkml.kernel.org/r/[email protected]
Fixes: bbef938 ("mm: vmscan: remove old flusher wakeup from direct reclaim path")
Signed-off-by: Andrey Ryabinin <[email protected]>
Tested-by: Shakeel Butt <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

1 parent f59f1ca commit 1c610d5

1 file changed, +16 -15 lines changed

mm/vmscan.c

Lines changed: 16 additions & 15 deletions
@@ -1779,6 +1779,20 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
 	if (stat.nr_writeback && stat.nr_writeback == nr_taken)
 		set_bit(PGDAT_WRITEBACK, &pgdat->flags);
 
+	/*
+	 * If dirty pages are scanned that are not queued for IO, it
+	 * implies that flushers are not doing their job. This can
+	 * happen when memory pressure pushes dirty pages to the end of
+	 * the LRU before the dirty limits are breached and the dirty
+	 * data has expired. It can also happen when the proportion of
+	 * dirty pages grows not through writes but through memory
+	 * pressure reclaiming all the clean cache. And in some cases,
+	 * the flushers simply cannot keep up with the allocation
+	 * rate. Nudge the flusher threads in case they are asleep.
+	 */
+	if (stat.nr_unqueued_dirty == nr_taken)
+		wakeup_flusher_threads(WB_REASON_VMSCAN);
+
 	/*
 	 * Legacy memcg will stall in page writeback so avoid forcibly
 	 * stalling here.
@@ -1791,22 +1805,9 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
 		if (stat.nr_dirty && stat.nr_dirty == stat.nr_congested)
 			set_bit(PGDAT_CONGESTED, &pgdat->flags);
 
-		/*
-		 * If dirty pages are scanned that are not queued for IO, it
-		 * implies that flushers are not doing their job. This can
-		 * happen when memory pressure pushes dirty pages to the end of
-		 * the LRU before the dirty limits are breached and the dirty
-		 * data has expired. It can also happen when the proportion of
-		 * dirty pages grows not through writes but through memory
-		 * pressure reclaiming all the clean cache. And in some cases,
-		 * the flushers simply cannot keep up with the allocation
-		 * rate. Nudge the flusher threads in case they are asleep, but
-		 * also allow kswapd to start writing pages during reclaim.
-		 */
-		if (stat.nr_unqueued_dirty == nr_taken) {
-			wakeup_flusher_threads(WB_REASON_VMSCAN);
+		/* Allow kswapd to start writing pages during reclaim. */
+		if (stat.nr_unqueued_dirty == nr_taken)
 			set_bit(PGDAT_DIRTY, &pgdat->flags);
-		}
 
 		/*
 		 * If kswapd scans pages marked marked for immediate
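
The effect of moving the wakeup can be sketched as a small user-space model. This is not kernel code: `struct scan_result`, the three helper functions, and the `guarded_block_runs` flag are hypothetical names introduced for illustration. The flag stands in for whatever condition guards the block that legacy memcg reclaim skips (the diff shows only that block's comment and indentation, not the condition itself). Only the comparison `nr_unqueued_dirty == nr_taken` is taken from the diff.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two counters the diff compares. */
struct scan_result {
	unsigned long nr_taken;          /* pages isolated from the LRU */
	unsigned long nr_unqueued_dirty; /* dirty pages not yet queued for IO */
};

/*
 * Before the patch: the flusher wakeup lived inside the guarded block,
 * so it was never reached when that block was skipped (the legacy
 * memcg case described in the commit message).
 */
static bool wakes_flushers_before(const struct scan_result *s,
				  bool guarded_block_runs)
{
	if (!guarded_block_runs)
		return false;
	return s->nr_unqueued_dirty == s->nr_taken;
}

/*
 * After the patch: the wakeup is hoisted above the guarded block, so
 * the decision no longer depends on whether that block runs.
 */
static bool wakes_flushers_after(const struct scan_result *s,
				 bool guarded_block_runs)
{
	(void)guarded_block_runs; /* deliberately ignored after the move */
	return s->nr_unqueued_dirty == s->nr_taken;
}

/*
 * Unchanged by the patch: setting PGDAT_DIRTY stays inside the guarded
 * block, under the same comparison.
 */
static bool sets_pgdat_dirty(const struct scan_result *s,
			     bool guarded_block_runs)
{
	return guarded_block_runs && s->nr_unqueued_dirty == s->nr_taken;
}
```

With a scan in which every isolated page is dirty and unqueued (for example `nr_taken = 32`, `nr_unqueued_dirty = 32`) and `guarded_block_runs = false`, `wakes_flushers_before()` returns false while `wakes_flushers_after()` returns true. In this model, that difference is the behavioural change the commit makes for legacy cgroup reclaim; the `PGDAT_DIRTY` decision is the same before and after.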
