
Commit 172b06c

rgushchin authored and gregkh committed
mm: slowly shrink slabs with a relatively small number of objects
9092c71 ("mm: use sc->priority for slab shrink targets") changed the way the target slab pressure is calculated and made it priority-based:

    delta = freeable >> priority;
    delta *= 4;
    do_div(delta, shrinker->seeks);

The problem is that at the default priority (which is 12) no pressure is applied at all if the number of potentially reclaimable objects is less than 4096 (1<<12).

This causes the last objects on the slab caches of no-longer-used cgroups to (almost) never get reclaimed. It's obviously a waste of memory.

It can be especially painful if these stale objects are holding a reference to a dying cgroup. Slab LRU lists are reparented on memcg offlining, but the corresponding objects still hold a reference to the dying cgroup. If we never scan these objects, the dying cgroup can't go away. Most likely, the parent cgroup has no directly charged objects, only leftover objects from dying child cgroups, so it can easily hold references to hundreds of dying cgroups.

If there are no big spikes in memory pressure, and new memory cgroups are created and destroyed periodically, this causes the number of dying cgroups to grow steadily, producing a slow-ish and hard-to-detect memory "leak". It's not a real leak, as the memory can eventually be reclaimed, but in real life that reclaim simply doesn't happen. I've seen hosts with a steadily climbing number of dying cgroups that showed no sign of decline for months, even though the hosts were loaded with a production workload.

It is an obvious waste of memory, and to prevent it, let's apply a minimal amount of pressure even on small shrinker lists: if there are freeable objects, scan at least min(freeable, scan_batch) objects.

This fix significantly improves the chances of a dying cgroup being reclaimed, and together with some previous patches stops the steady growth in the number of dying cgroups on some of our hosts.

Link: http://lkml.kernel.org/r/[email protected]
Fixes: 9092c71 ("mm: use sc->priority for slab shrink targets")
Signed-off-by: Roman Gushchin <[email protected]>
Acked-by: Rik van Riel <[email protected]>
Cc: Josef Bacik <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Shakeel Butt <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Vladimir Davydov <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
1 parent 3bf181b commit 172b06c

File tree

1 file changed: +11 lines, -0 lines


mm/vmscan.c

Lines changed: 11 additions & 0 deletions
@@ -476,6 +476,17 @@ static unsigned long do_shrink_slab(struct shrink_control *shrinkctl,
 	delta = freeable >> priority;
 	delta *= 4;
 	do_div(delta, shrinker->seeks);
+
+	/*
+	 * Make sure we apply some minimal pressure on default priority
+	 * even on small cgroups. Stale objects are not only consuming memory
+	 * by themselves, but can also hold a reference to a dying cgroup,
+	 * preventing it from being reclaimed. A dying cgroup with all
+	 * corresponding structures like per-cpu stats and kmem caches
+	 * can be really big, so it may lead to a significant waste of memory.
+	 */
+	delta = max_t(unsigned long long, delta, min(freeable, batch_size));
+
 	total_scan += delta;
 	if (total_scan < 0) {
 		pr_err("shrink_slab: %pF negative objects to delete nr=%ld\n",
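
To make the arithmetic from the commit message concrete, here is a minimal userspace sketch (not kernel code) of the pressure calculation before and after this patch. It assumes the defaults used by mm/vmscan.c at the time, a batch_size of 128 (SHRINK_BATCH) and shrinker->seeks of 2 (DEFAULT_SEEKS); the helper names delta_before() and delta_after() are invented for illustration only.

/*
 * Userspace sketch of the do_shrink_slab() delta calculation,
 * assuming batch_size = 128 and shrinker->seeks = 2.
 */
#include <stdio.h>

#define BATCH_SIZE 128   /* stands in for SHRINK_BATCH */
#define SEEKS      2     /* stands in for DEFAULT_SEEKS */

/* Pressure before the patch: freeable < (1 << priority) yields 0. */
static unsigned long long delta_before(unsigned long freeable, int priority)
{
	unsigned long long delta = freeable >> priority;

	delta *= 4;
	delta /= SEEKS;          /* do_div(delta, shrinker->seeks) */
	return delta;
}

/* Pressure after the patch: floored at min(freeable, batch_size). */
static unsigned long long delta_after(unsigned long freeable, int priority)
{
	unsigned long long delta = delta_before(freeable, priority);
	unsigned long floor = freeable < BATCH_SIZE ? freeable : BATCH_SIZE;

	return delta > floor ? delta : floor;
}

int main(void)
{
	/* A nearly dead cgroup with 500 freeable objects, default priority 12. */
	printf("before: %llu objects scanned\n", delta_before(500, 12)); /* 0   */
	printf("after:  %llu objects scanned\n", delta_after(500, 12));  /* 128 */
	return 0;
}

At the default priority of 12, a cgroup with 500 freeable objects previously received zero scan pressure (500 >> 12 == 0), so its stale objects were never reclaimed; with the floor of min(freeable, batch_size) it now gets a scan target of 128 objects per pass, which is enough to eventually drain small, stale caches.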
