author		Vlastimil Babka <vbabka@suse.cz>	2022-01-07 11:13:28 +0100
committer	Vlastimil Babka <vbabka@suse.cz>	2022-01-07 11:13:28 +0100
commit		9d6c59c1c0d62a314a2b46839699b200cccd2d08 (patch)
tree		b198ed2a4f2f6c050eb7e0d0225d5e4b19b570a7 /mm/memcontrol.c
parent		eb52c0fc2331f8ad1f5f9fd79ba9ce90681ed50b (diff)
parent		b01af5c0b0414f96e6c3891e704d1c40faa18813 (diff)
Merge branch 'for-5.17/struct-slab' into for-linus
Series "Separate struct slab from struct page" v4
This originally started as an offshoot of the folio work by Matthew. One of the
more complex parts of the struct page definition is the part used by the slab
allocators. It would be good for the MM in general if struct slab were its own
data type, and it would also help to prevent tail pages from slipping in anywhere.
As Matthew requested in his proof of concept series, I have taken over the
development of this series, so it's a mix of patches from him (often modified
by me) and my own.
One big difference is the use of coccinelle to perform the relatively trivial
parts of the conversions automatically and all at once, instead of a larger
number of smaller, incrementally reviewable steps. Thanks to Julia Lawall and
Luis Chamberlain for all their help!
Another notable difference (based also on review feedback) is that I don't
represent large kmalloc allocations with a struct slab, since they are not really
slabs but use the page allocator directly. When going from an object address to a
struct slab, the code first tests the folio's slab flag, and only if it is set does
it convert to a struct slab. This makes the struct slab type stronger.
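For illustration, that lookup pattern could be sketched as below, using only
helpers that appear in this series and in the diff further down (virt_to_folio(),
folio_test_slab(), folio_slab()); the wrapper name obj_to_slab_checked() is
hypothetical and not something the series adds:

/* Sketch only: convert an object address to a struct slab, refusing anything
 * that is not actually backed by a slab. */
static inline struct slab *obj_to_slab_checked(const void *obj)
{
	struct folio *folio = virt_to_folio(obj);

	/*
	 * Large kmalloc allocations come straight from the page allocator
	 * and have no struct slab, so refuse to convert those.
	 */
	if (!folio_test_slab(folio))
		return NULL;

	return folio_slab(folio);
}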
Finally, although Matthew's version didn't use any of the folio work, the
initial folio support has been merged in the meantime, so my version builds on
top of it where appropriate. This eliminates some of the redundant
compound_head() calls, e.g. when testing the slab flag.
To sum up, after this series the struct page fields used by the slab allocators
are moved from struct page to a new struct slab, which uses the same physical
storage (a rough sketch of this overlay follows below). The availability of the
fields is further distinguished by the selected slab allocator implementation.
The advantages include:
- Similar to folios, if the slab is of order > 0, a struct slab pointer is
always guaranteed to refer to the head page. Additionally, it is guaranteed to
refer to an actual slab page, not a large kmalloc allocation. This removes
uncertainty and potential for bugs.
- It's not possible to accidentally use fields of the slab implementation that's
not configured.
- Other subsystems cannot use slab's fields in struct page anymore (some
existing non-slab usages had to be adjusted in this series), so slab
implementations have more freedom in rearranging them in the struct slab.
Link: https://lore.kernel.org/all/20220104001046.12263-1-vbabka@suse.cz/
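As a rough illustration of the "same physical storage" point above, struct slab
can be thought of as an overlay of struct page whose layout is pinned by
compile-time offset checks. The field list and helper name below are abbreviated
and hypothetical; treat this as a sketch of the idea, not the series' exact
definition:

/* Sketch only: field layout abbreviated, not the definition from the series. */
struct slab {
	unsigned long __page_flags;	/* overlays page->flags */
	/* ... allocator-specific fields (freelist, counters, ...) ... */
};

/* Keep the overlay honest: overlapping fields must stay at the same offsets. */
#define SLAB_MATCH(pg, sl) \
	static_assert(offsetof(struct page, pg) == offsetof(struct slab, sl))
SLAB_MATCH(flags, __page_flags);

/* Converting between the types is then just a cast of the same memory,
 * legitimate only once the slab flag has been checked. */
static inline struct slab *folio_to_slab_sketch(struct folio *folio)
{
	return (struct slab *)folio;
}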
Diffstat (limited to 'mm/memcontrol.c')
-rw-r--r--	mm/memcontrol.c	55
1 file changed, 30 insertions, 25 deletions
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 2ed5f2a0879d..4a7b3ebf8e48 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2816,31 +2816,31 @@ static inline void mod_objcg_mlstate(struct obj_cgroup *objcg,
 	rcu_read_unlock();
 }
 
-int memcg_alloc_page_obj_cgroups(struct page *page, struct kmem_cache *s,
-				 gfp_t gfp, bool new_page)
+int memcg_alloc_slab_cgroups(struct slab *slab, struct kmem_cache *s,
+			     gfp_t gfp, bool new_slab)
 {
-	unsigned int objects = objs_per_slab_page(s, page);
+	unsigned int objects = objs_per_slab(s, slab);
 	unsigned long memcg_data;
 	void *vec;
 
 	gfp &= ~OBJCGS_CLEAR_MASK;
 	vec = kcalloc_node(objects, sizeof(struct obj_cgroup *), gfp,
-			   page_to_nid(page));
+			   slab_nid(slab));
 	if (!vec)
 		return -ENOMEM;
 
 	memcg_data = (unsigned long) vec | MEMCG_DATA_OBJCGS;
-	if (new_page) {
+	if (new_slab) {
 		/*
-		 * If the slab page is brand new and nobody can yet access
-		 * it's memcg_data, no synchronization is required and
-		 * memcg_data can be simply assigned.
+		 * If the slab is brand new and nobody can yet access its
+		 * memcg_data, no synchronization is required and memcg_data can
+		 * be simply assigned.
 		 */
-		page->memcg_data = memcg_data;
-	} else if (cmpxchg(&page->memcg_data, 0, memcg_data)) {
+		slab->memcg_data = memcg_data;
+	} else if (cmpxchg(&slab->memcg_data, 0, memcg_data)) {
 		/*
-		 * If the slab page is already in use, somebody can allocate
-		 * and assign obj_cgroups in parallel. In this case the existing
+		 * If the slab is already in use, somebody can allocate and
+		 * assign obj_cgroups in parallel. In this case the existing
 		 * objcg vector should be reused.
 		 */
 		kfree(vec);
@@ -2865,38 +2865,43 @@ int memcg_alloc_page_obj_cgroups(struct page *page, struct kmem_cache *s,
  */
 struct mem_cgroup *mem_cgroup_from_obj(void *p)
 {
-	struct page *page;
+	struct folio *folio;
 
 	if (mem_cgroup_disabled())
 		return NULL;
 
-	page = virt_to_head_page(p);
+	folio = virt_to_folio(p);
 
 	/*
 	 * Slab objects are accounted individually, not per-page.
 	 * Memcg membership data for each individual object is saved in
-	 * the page->obj_cgroups.
+	 * slab->memcg_data.
 	 */
-	if (page_objcgs_check(page)) {
-		struct obj_cgroup *objcg;
+	if (folio_test_slab(folio)) {
+		struct obj_cgroup **objcgs;
+		struct slab *slab;
 		unsigned int off;
 
-		off = obj_to_index(page->slab_cache, page, p);
-		objcg = page_objcgs(page)[off];
-		if (objcg)
-			return obj_cgroup_memcg(objcg);
+		slab = folio_slab(folio);
+		objcgs = slab_objcgs(slab);
+		if (!objcgs)
+			return NULL;
+
+		off = obj_to_index(slab->slab_cache, slab, p);
+		if (objcgs[off])
+			return obj_cgroup_memcg(objcgs[off]);
 
 		return NULL;
 	}
 
 	/*
-	 * page_memcg_check() is used here, because page_has_obj_cgroups()
-	 * check above could fail because the object cgroups vector wasn't set
-	 * at that moment, but it can be set concurrently.
+	 * page_memcg_check() is used here, because in theory we can encounter
+	 * a folio where the slab flag has been cleared already, but
+	 * slab->memcg_data has not been freed yet
 	 * page_memcg_check(page) will guarantee that a proper memory
 	 * cgroup pointer or NULL will be returned.
 	 */
-	return page_memcg_check(page);
+	return page_memcg_check(folio_page(folio, 0));
 }
 
 __always_inline struct obj_cgroup *get_obj_cgroup_from_current(void)
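For context, a caller of the converted mem_cgroup_from_obj() does not need to
know whether the object came from a slab or directly from the page allocator;
the function distinguishes the two cases internally via folio_test_slab(). The
helper below is purely illustrative and not part of this diff:

/* Hypothetical caller sketch: report whether an object is memcg-charged. */
static void report_obj_memcg(void *obj)
{
	struct mem_cgroup *memcg = mem_cgroup_from_obj(obj);

	if (memcg)
		pr_info("%p is charged to a memory cgroup\n", obj);
	else
		pr_info("%p is not charged to a memory cgroup\n", obj);
}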