<feed xmlns='http://www.w3.org/2005/Atom'>
<title>pm24.git/mm/memory_hotplug.c, branch v6.2-rc3</title>
<subtitle>Unnamed repository; edit this file 'description' to name the repository.
</subtitle>
<id>https://git.kobert.dev/pm24.git/atom?h=v6.2-rc3</id>
<link rel='self' href='https://git.kobert.dev/pm24.git/atom?h=v6.2-rc3'/>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/'/>
<updated>2022-10-03T21:03:04Z</updated>
<entry>
<title>mm: add pageblock_aligned() macro</title>
<updated>2022-10-03T21:03:04Z</updated>
<author>
<name>Kefeng Wang</name>
<email>wangkefeng.wang@huawei.com</email>
</author>
<published>2022-09-07T06:08:44Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=ee0913c4719610204315a0d8a35122c6233249e0'/>
<id>urn:sha1:ee0913c4719610204315a0d8a35122c6233249e0</id>
<content type='text'>
Add pageblock_aligned() and use it to simplify code.

Link: https://lkml.kernel.org/r/20220907060844.126891-3-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang &lt;wangkefeng.wang@huawei.com&gt;
Acked-by: Mike Rapoport &lt;rppt@linux.ibm.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Oscar Salvador &lt;osalvador@suse.de&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: fix null-ptr-deref in kswapd_is_running()</title>
<updated>2022-09-12T03:26:04Z</updated>
<author>
<name>Kefeng Wang</name>
<email>wangkefeng.wang@huawei.com</email>
</author>
<published>2022-08-27T11:19:59Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=b4a0215e11dcfe23a48c65c6d6c82c0c2c551a48'/>
<id>urn:sha1:b4a0215e11dcfe23a48c65c6d6c82c0c2c551a48</id>
<content type='text'>
kswapd_run/stop() will set pgdat-&gt;kswapd to NULL, which could race with
kswapd_is_running() in kcompactd(),

kswapd_run/stop()                       kcompactd()
                                          kswapd_is_running()
  pgdat-&gt;kswapd // error or nomal ptr
                                          verify pgdat-&gt;kswapd
                                            // load non-NULL
pgdat-&gt;kswapd
  pgdat-&gt;kswapd = NULL
                                          task_is_running(pgdat-&gt;kswapd)
                                            // Null pointer derefence

KASAN reports the null-ptr-deref shown below,

  vmscan: Failed to start kswapd on node 0
  ...
  BUG: KASAN: null-ptr-deref in kcompactd+0x440/0x504
  Read of size 8 at addr 0000000000000024 by task kcompactd0/37

  CPU: 0 PID: 37 Comm: kcompactd0 Kdump: loaded Tainted: G           OE     5.10.60 #1
  Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
  Call trace:
   dump_backtrace+0x0/0x394
   show_stack+0x34/0x4c
   dump_stack+0x158/0x1e4
   __kasan_report+0x138/0x140
   kasan_report+0x44/0xdc
   __asan_load8+0x94/0xd0
   kcompactd+0x440/0x504
   kthread+0x1a4/0x1f0
   ret_from_fork+0x10/0x18

At present kswapd/kcompactd_run() and kswapd/kcompactd_stop() are protected
by mem_hotplug_begin/done(), but without kcompactd(). There is no need to
involve memory hotplug lock in kcompactd(), so let's add a new mutex to
protect pgdat-&gt;kswapd accesses.

Also, because the kcompactd task will check the state of kswapd task, it's
better to call kcompactd_stop() before kswapd_stop() to reduce lock
conflicts.

[akpm@linux-foundation.org: add comments]
Link: https://lkml.kernel.org/r/20220827111959.186838-1-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang &lt;wangkefeng.wang@huawei.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Muchun Song &lt;muchun.song@linux.dev&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: kill is_memblock_offlined()</title>
<updated>2022-09-12T03:26:04Z</updated>
<author>
<name>Kefeng Wang</name>
<email>wangkefeng.wang@huawei.com</email>
</author>
<published>2022-08-27T11:20:43Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=639118d1571f70b1157b4bb5ac574b0ab0f38099'/>
<id>urn:sha1:639118d1571f70b1157b4bb5ac574b0ab0f38099</id>
<content type='text'>
Directly check state of struct memory_block, no need a single function.

Link: https://lkml.kernel.org/r/20220827112043.187028-1-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang &lt;wangkefeng.wang@huawei.com&gt;
Reviewed-by: David Hildenbrand &lt;david@redhat.com&gt;
Reviewed-by: Oscar Salvador &lt;osalvador@suse.de&gt;
Reviewed-by: Anshuman Khandual &lt;anshuman.khandual@arm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: use is_zone_movable_page() helper</title>
<updated>2022-07-30T01:07:20Z</updated>
<author>
<name>Kefeng Wang</name>
<email>wangkefeng.wang@huawei.com</email>
</author>
<published>2022-07-26T13:11:35Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=07252dfea2c7089bca68949710268cbbb0ce509e'/>
<id>urn:sha1:07252dfea2c7089bca68949710268cbbb0ce509e</id>
<content type='text'>
Use is_zone_movable_page() helper to simplify code.

Link: https://lkml.kernel.org/r/20220726131135.146912-1-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang &lt;wangkefeng.wang@huawei.com&gt;
Reviewed-by: David Hildenbrand &lt;david@redhat.com&gt;
Reviewed-by: Pankaj Gupta &lt;pankaj.gupta@amd.com&gt;
Acked-by: Jason Wang &lt;jasowang@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: memory_hotplug: make hugetlb_optimize_vmemmap compatible with memmap_on_memory</title>
<updated>2022-07-04T01:08:49Z</updated>
<author>
<name>Muchun Song</name>
<email>songmuchun@bytedance.com</email>
</author>
<published>2022-06-17T13:56:50Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=66361095129b3b5d065e6c09cf0c085ef4a8c40f'/>
<id>urn:sha1:66361095129b3b5d065e6c09cf0c085ef4a8c40f</id>
<content type='text'>
For now, the feature of hugetlb_free_vmemmap is not compatible with the
feature of memory_hotplug.memmap_on_memory, and hugetlb_free_vmemmap takes
precedence over memory_hotplug.memmap_on_memory.  However, someone wants
to make memory_hotplug.memmap_on_memory takes precedence over
hugetlb_free_vmemmap since memmap_on_memory makes it more likely to
succeed memory hotplug in close-to-OOM situations.  So the decision of
making hugetlb_free_vmemmap take precedence is not wise and elegant.

The proper approach is to have hugetlb_vmemmap.c do the check whether the
section which the HugeTLB pages belong to can be optimized.  If the
section's vmemmap pages are allocated from the added memory block itself,
hugetlb_free_vmemmap should refuse to optimize the vmemmap, otherwise, do
the optimization.  Then both kernel parameters are compatible.  So this
patch introduces VmemmapSelfHosted to mask any non-optimizable vmemmap
pages.  The hugetlb_vmemmap can use this flag to detect if a vmemmap page
can be optimized.

[songmuchun@bytedance.com: walk vmemmap page tables to avoid false-positive]
  Link: https://lkml.kernel.org/r/20220620110616.12056-3-songmuchun@bytedance.com
Link: https://lkml.kernel.org/r/20220617135650.74901-3-songmuchun@bytedance.com
Signed-off-by: Muchun Song &lt;songmuchun@bytedance.com&gt;
Co-developed-by: Oscar Salvador &lt;osalvador@suse.de&gt;
Signed-off-by: Oscar Salvador &lt;osalvador@suse.de&gt;
Acked-by: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Mike Kravetz &lt;mike.kravetz@oracle.com&gt;
Cc: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Cc: Xiongchun Duan &lt;duanxiongchun@bytedance.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: memory_hotplug: enumerate all supported section flags</title>
<updated>2022-07-04T01:08:49Z</updated>
<author>
<name>Muchun Song</name>
<email>songmuchun@bytedance.com</email>
</author>
<published>2022-06-17T13:56:49Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=ed7802dd48f7a507213cbb95bb4c6f1fe134eb5d'/>
<id>urn:sha1:ed7802dd48f7a507213cbb95bb4c6f1fe134eb5d</id>
<content type='text'>
Patch series "make hugetlb_optimize_vmemmap compatible with
memmap_on_memory", v3.

This series makes hugetlb_optimize_vmemmap compatible with
memmap_on_memory.


This patch (of 2):

We are almost running out of section flags, only one bit is available in
the worst case (powerpc with 256k pages).  However, there are still some
free bits (in -&gt;section_mem_map) on other architectures (e.g.  x86_64 has
10 bits available, arm64 has 8 bits available with worst case of 64K
pages).  We have hard coded those numbers in code, it is inconvenient to
use those bits on other architectures except powerpc.  So transfer those
section flags to enumeration to make it easy to add new section flags in
the future.  Also, move SECTION_TAINT_ZONE_DEVICE into the scope of
CONFIG_ZONE_DEVICE to save a bit on non-zone-device case.

[songmuchun@bytedance.com: replace enum with defines per David]
  Link: https://lkml.kernel.org/r/20220620110616.12056-2-songmuchun@bytedance.com
Link: https://lkml.kernel.org/r/20220617135650.74901-1-songmuchun@bytedance.com
Link: https://lkml.kernel.org/r/20220617135650.74901-2-songmuchun@bytedance.com
Signed-off-by: Muchun Song &lt;songmuchun@bytedance.com&gt;
Reviewed-by: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Mike Kravetz &lt;mike.kravetz@oracle.com&gt;
Cc: Oscar Salvador &lt;osalvador@suse.de&gt;
Cc: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Cc: Xiongchun Duan &lt;duanxiongchun@bytedance.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/migration: return errno when isolate_huge_page failed</title>
<updated>2022-07-04T01:08:37Z</updated>
<author>
<name>Miaohe Lin</name>
<email>linmiaohe@huawei.com</email>
</author>
<published>2022-05-30T11:30:15Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=7ce82f4c3f3ead13a9d9498768e3b1a79975c4d8'/>
<id>urn:sha1:7ce82f4c3f3ead13a9d9498768e3b1a79975c4d8</id>
<content type='text'>
We might fail to isolate huge page due to e.g.  the page is under
migration which cleared HPageMigratable.  We should return errno in this
case rather than always return 1 which could confuse the user, i.e.  the
caller might think all of the memory is migrated while the hugetlb page is
left behind.  We make the prototype of isolate_huge_page consistent with
isolate_lru_page as suggested by Huang Ying and rename isolate_huge_page
to isolate_hugetlb as suggested by Muchun to improve the readability.

Link: https://lkml.kernel.org/r/20220530113016.16663-4-linmiaohe@huawei.com
Fixes: e8db67eb0ded ("mm: migrate: move_pages() supports thp migration")
Signed-off-by: Miaohe Lin &lt;linmiaohe@huawei.com&gt;
Suggested-by: Huang Ying &lt;ying.huang@intel.com&gt;
Reported-by: kernel test robot &lt;lkp@intel.com&gt; (build error)
Cc: Alistair Popple &lt;apopple@nvidia.com&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: David Howells &lt;dhowells@redhat.com&gt;
Cc: Mike Kravetz &lt;mike.kravetz@oracle.com&gt;
Cc: Muchun Song &lt;songmuchun@bytedance.com&gt;
Cc: Oscar Salvador &lt;osalvador@suse.de&gt;
Cc: Peter Xu &lt;peterx@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/memory_hotplug: drop 'reason' argument from check_pfn_span()</title>
<updated>2022-06-17T02:48:28Z</updated>
<author>
<name>Anshuman Khandual</name>
<email>anshuman.khandual@arm.com</email>
</author>
<published>2022-05-31T09:04:41Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=943189db4f3ed1445dd630dc0b96e115357c4330'/>
<id>urn:sha1:943189db4f3ed1445dd630dc0b96e115357c4330</id>
<content type='text'>
In check_pfn_span(), a 'reason' string is being used to recreate the
caller function name, while printing the warning message.  It is really
unnecessary as the warning message could just be printed inside the caller
depending on the return code.  Currently there are just two callers for
check_pfn_span() i.e __add_pages() and __remove_pages().  Let's clean this
up.

Link: https://lkml.kernel.org/r/20220531090441.170650-1-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual &lt;anshuman.khandual@arm.com&gt;
Acked-by: Oscar Salvador &lt;osalvador@suse.de&gt;
Reviewed-by: David Hildenbrand &lt;david@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: hugetlb_vmemmap: add hugetlb_optimize_vmemmap sysctl</title>
<updated>2022-05-13T23:48:56Z</updated>
<author>
<name>Muchun Song</name>
<email>songmuchun@bytedance.com</email>
</author>
<published>2022-05-13T23:48:56Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=78f39084b41d287aedb2ea55f2c1895cfa11d61a'/>
<id>urn:sha1:78f39084b41d287aedb2ea55f2c1895cfa11d61a</id>
<content type='text'>
We must add hugetlb_free_vmemmap=on (or "off") to the boot cmdline and
reboot the server to enable or disable the feature of optimizing vmemmap
pages associated with HugeTLB pages.  However, rebooting usually takes a
long time.  So add a sysctl to enable or disable the feature at runtime
without rebooting.  Why we need this?  There are 3 use cases.

1) The feature of minimizing overhead of struct page associated with
   each HugeTLB is disabled by default without passing
   "hugetlb_free_vmemmap=on" to the boot cmdline.  When we (ByteDance)
   deliver the servers to the users who want to enable this feature, they
   have to configure the grub (change boot cmdline) and reboot the
   servers, whereas rebooting usually takes a long time (we have thousands
   of servers).  It's a very bad experience for the users.  So we need a
   approach to enable this feature after rebooting.  This is a use case in
   our practical environment.

2) Some use cases are that HugeTLB pages are allocated 'on the fly'
   instead of being pulled from the HugeTLB pool, those workloads would be
   affected with this feature enabled.  Those workloads could be
   identified by the characteristics of they never explicitly allocating
   huge pages with 'nr_hugepages' but only set 'nr_overcommit_hugepages'
   and then let the pages be allocated from the buddy allocator at fault
   time.  We can confirm it is a real use case from the commit
   099730d67417.  For those workloads, the page fault time could be ~2x
   slower than before.  We suspect those users want to disable this
   feature if the system has enabled this before and they don't think the
   memory savings benefit is enough to make up for the performance drop.

3) If the workload which wants vmemmap pages to be optimized and the
   workload which wants to set 'nr_overcommit_hugepages' and does not want
   the extera overhead at fault time when the overcommitted pages be
   allocated from the buddy allocator are deployed in the same server. 
   The user could enable this feature and set 'nr_hugepages' and
   'nr_overcommit_hugepages', then disable the feature.  In this case, the
   overcommited HugeTLB pages will not encounter the extra overhead at
   fault time.

Link: https://lkml.kernel.org/r/20220512041142.39501-5-songmuchun@bytedance.com
Signed-off-by: Muchun Song &lt;songmuchun@bytedance.com&gt;
Reviewed-by: Mike Kravetz &lt;mike.kravetz@oracle.com&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Luis Chamberlain &lt;mcgrof@kernel.org&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Iurii Zaikin &lt;yzaikin@google.com&gt;
Cc: Oscar Salvador &lt;osalvador@suse.de&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Masahiro Yamada &lt;masahiroy@kernel.org&gt;
Cc: Xiongchun Duan &lt;duanxiongchun@bytedance.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: memory_hotplug: override memmap_on_memory when hugetlb_free_vmemmap=on</title>
<updated>2022-05-13T23:48:56Z</updated>
<author>
<name>Muchun Song</name>
<email>songmuchun@bytedance.com</email>
</author>
<published>2022-05-13T23:48:56Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=6e02c46b4d970f24eb51197d451b00f08a8a4186'/>
<id>urn:sha1:6e02c46b4d970f24eb51197d451b00f08a8a4186</id>
<content type='text'>
Optimizing HugeTLB vmemmap pages is not compatible with allocating memmap
on hot added memory.  If "hugetlb_free_vmemmap=on" and
memory_hotplug.memmap_on_memory" are both passed on the kernel command
line, optimizing hugetlb pages takes precedence.  However, the global
variable memmap_on_memory will still be set to 1, even though we will not
try to allocate memmap on hot added memory.

Also introduce mhp_memmap_on_memory() helper to move the definition of
"memmap_on_memory" to the scope of CONFIG_MHP_MEMMAP_ON_MEMORY.  In the
next patch, mhp_memmap_on_memory() will also be exported to be used in
hugetlb_vmemmap.c.

Link: https://lkml.kernel.org/r/20220512041142.39501-3-songmuchun@bytedance.com
Signed-off-by: Muchun Song &lt;songmuchun@bytedance.com&gt;
Acked-by: Mike Kravetz &lt;mike.kravetz@oracle.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Iurii Zaikin &lt;yzaikin@google.com&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Luis Chamberlain &lt;mcgrof@kernel.org&gt;
Cc: Masahiro Yamada &lt;masahiroy@kernel.org&gt;
Cc: Oscar Salvador &lt;osalvador@suse.de&gt;
Cc: Xiongchun Duan &lt;duanxiongchun@bytedance.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
</feed>
