<feed xmlns='http://www.w3.org/2005/Atom'>
<title>pm24.git/kernel, branch v6.12-rc2</title>
<subtitle>Unnamed repository; edit this file 'description' to name the repository.
</subtitle>
<id>https://git.kobert.dev/pm24.git/atom?h=v6.12-rc2</id>
<link rel='self' href='https://git.kobert.dev/pm24.git/atom?h=v6.12-rc2'/>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/'/>
<updated>2024-10-04T19:11:06Z</updated>
<entry>
<title>Merge tag 'trace-v6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace</title>
<updated>2024-10-04T19:11:06Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-10-04T19:11:06Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=622a3ed1accbb8e008a7247317bf3e8bc1fd7665'/>
<id>urn:sha1:622a3ed1accbb8e008a7247317bf3e8bc1fd7665</id>
<content type='text'>
Pull tracing fixes from Steven Rostedt:

 - Fix tp_printk command line option crashing the kernel

   With the code that can handle a buffer from a previous boot, the
   trace_check_vprintf() needed access to the delta of the address space
   used by the old buffer and the current buffer. To do so, the
   trace_array (tr) parameter was used. But when tp_printk is enabled on
   the kernel command line, no trace buffer is used and the trace event
   is sent directly to printk(). That meant the tr field of the iterator
   descriptor was NULL, and since tp_printk still uses
   trace_check_vprintf() it caused a NULL dereference.

 - Add ptrace.h include to x86 ftrace file for completeness

 - Fix rtla installation when done with out-of-tree build

 - Fix the help messages in rtla that were incorrect

 - Several fixes to fix races with the timerlat and hwlat code

   Several locking issues were discovered with the coordination between
   timerlat kthread creation and hotplug. As timerlat has callbacks from
   hotplug code to start kthreads when CPUs come online. There are also
   locking issues with grabbing the cpu_read_lock() and the locks within
   timerlat.

* tag 'trace-v6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing/hwlat: Fix a race during cpuhp processing
  tracing/timerlat: Fix a race during cpuhp processing
  tracing/timerlat: Drop interface_lock in stop_kthread()
  tracing/timerlat: Fix duplicated kthread creation due to CPU online/offline
  x86/ftrace: Include &lt;asm/ptrace.h&gt;
  rtla: Fix the help text in osnoise and timerlat top tools
  tools/rtla: Fix installation from out-of-tree build
  tracing: Fix trace_check_vprintf() when tp_printk is used
</content>
</entry>
<entry>
<title>Merge tag 'slab-for-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab</title>
<updated>2024-10-04T19:05:39Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-10-04T19:05:39Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=f6785e0ccfdfc3d87aa8f1287a49cf8cae111d5f'/>
<id>urn:sha1:f6785e0ccfdfc3d87aa8f1287a49cf8cae111d5f</id>
<content type='text'>
Pull slab fixes from Vlastimil Babka:
 "Fixes for issues introduced in this merge window: kobject memory leak,
  unsupressed warning and possible lockup in new slub_kunit tests,
  misleading code in kvfree_rcu_queue_batch()"

* tag 'slab-for-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
  slub/kunit: skip test_kfree_rcu when the slub kunit test is built-in
  mm, slab: suppress warnings in test_leak_destroy kunit test
  rcu/kvfree: Refactor kvfree_rcu_queue_batch()
  mm, slab: fix use of SLAB_SUPPORTS_SYSFS in kmem_cache_release()
</content>
</entry>
<entry>
<title>Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs</title>
<updated>2024-10-04T16:46:16Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-10-04T16:46:16Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=6cca11958870b9b1d64933ffe1a4c11b0e6e6bbb'/>
<id>urn:sha1:6cca11958870b9b1d64933ffe1a4c11b0e6e6bbb</id>
<content type='text'>
Pull close_range() fix from Al Viro:
 "Fix the logic in descriptor table trimming"

* tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  close_range(): fix the logics in descriptor table trimming
</content>
</entry>
<entry>
<title>sched: psi: fix bogus pressure spikes from aggregation race</title>
<updated>2024-10-03T23:03:16Z</updated>
<author>
<name>Johannes Weiner</name>
<email>hannes@cmpxchg.org</email>
</author>
<published>2024-10-03T11:29:05Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=3840cbe24cf060ea05a585ca497814609f5d47d1'/>
<id>urn:sha1:3840cbe24cf060ea05a585ca497814609f5d47d1</id>
<content type='text'>
Brandon reports sporadic, non-sensical spikes in cumulative pressure
time (total=) when reading cpu.pressure at a high rate. This is due to
a race condition between reader aggregation and tasks changing states.

While it affects all states and all resources captured by PSI, in
practice it most likely triggers with CPU pressure, since scheduling
events are so frequent compared to other resource events.

The race context is the live snooping of ongoing stalls during a
pressure read. The read aggregates per-cpu records for stalls that
have concluded, but will also incorporate ad-hoc the duration of any
active state that hasn't been recorded yet. This is important to get
timely measurements of ongoing stalls. Those ad-hoc samples are
calculated on-the-fly up to the current time on that CPU; since the
stall hasn't concluded, it's expected that this is the minimum amount
of stall time that will enter the per-cpu records once it does.

The problem is that the path that concludes the state uses a CPU clock
read that is not synchronized against aggregators; the clock is read
outside of the seqlock protection. This allows aggregators to race and
snoop a stall with a longer duration than will actually be recorded.

With the recorded stall time being less than the last snapshot
remembered by the aggregator, a subsequent sample will underflow and
observe a bogus delta value, resulting in an erratic jump in pressure.

Fix this by moving the clock read of the state change into the seqlock
protection. This ensures no aggregation can snoop live stalls past the
time that's recorded when the state concludes.

Reported-by: Brandon Duffany &lt;brandon@buildbuddy.io&gt;
Link: https://bugzilla.kernel.org/show_bug.cgi?id=219194
Link: https://lore.kernel.org/lkml/20240827121851.GB438928@cmpxchg.org/
Fixes: df77430639c9 ("psi: Reduce calls to sched_clock() in psi")
Cc: stable@vger.kernel.org
Signed-off-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Reviewed-by: Chengming Zhou &lt;chengming.zhou@linux.dev&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>tracing/hwlat: Fix a race during cpuhp processing</title>
<updated>2024-10-03T20:43:23Z</updated>
<author>
<name>Wei Li</name>
<email>liwei391@huawei.com</email>
</author>
<published>2024-09-24T09:45:14Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=2a13ca2e8abb12ee43ada8a107dadca83f140937'/>
<id>urn:sha1:2a13ca2e8abb12ee43ada8a107dadca83f140937</id>
<content type='text'>
The cpuhp online/offline processing race also exists in percpu-mode hwlat
tracer in theory, apply the fix too. That is:

    T1                       | T2
    [CPUHP_ONLINE]           | cpu_device_down()
     hwlat_hotplug_workfn()  |
                             |     cpus_write_lock()
                             |     takedown_cpu(1)
                             |     cpus_write_unlock()
    [CPUHP_OFFLINE]          |
        cpus_read_lock()     |
        start_kthread(1)     |
        cpus_read_unlock()   |

Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Cc: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Link: https://lore.kernel.org/20240924094515.3561410-5-liwei391@huawei.com
Fixes: ba998f7d9531 ("trace/hwlat: Support hotplug operations")
Signed-off-by: Wei Li &lt;liwei391@huawei.com&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
</content>
</entry>
<entry>
<title>tracing/timerlat: Fix a race during cpuhp processing</title>
<updated>2024-10-03T20:43:22Z</updated>
<author>
<name>Wei Li</name>
<email>liwei391@huawei.com</email>
</author>
<published>2024-09-24T09:45:13Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=829e0c9f0855f26b3ae830d17b24aec103f7e915'/>
<id>urn:sha1:829e0c9f0855f26b3ae830d17b24aec103f7e915</id>
<content type='text'>
There is another found exception that the "timerlat/1" thread was
scheduled on CPU0, and lead to timer corruption finally:

```
ODEBUG: init active (active state 0) object: ffff888237c2e108 object type: hrtimer hint: timerlat_irq+0x0/0x220
WARNING: CPU: 0 PID: 426 at lib/debugobjects.c:518 debug_print_object+0x7d/0xb0
Modules linked in:
CPU: 0 UID: 0 PID: 426 Comm: timerlat/1 Not tainted 6.11.0-rc7+ #45
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
RIP: 0010:debug_print_object+0x7d/0xb0
...
Call Trace:
 &lt;TASK&gt;
 ? __warn+0x7c/0x110
 ? debug_print_object+0x7d/0xb0
 ? report_bug+0xf1/0x1d0
 ? prb_read_valid+0x17/0x20
 ? handle_bug+0x3f/0x70
 ? exc_invalid_op+0x13/0x60
 ? asm_exc_invalid_op+0x16/0x20
 ? debug_print_object+0x7d/0xb0
 ? debug_print_object+0x7d/0xb0
 ? __pfx_timerlat_irq+0x10/0x10
 __debug_object_init+0x110/0x150
 hrtimer_init+0x1d/0x60
 timerlat_main+0xab/0x2d0
 ? __pfx_timerlat_main+0x10/0x10
 kthread+0xb7/0xe0
 ? __pfx_kthread+0x10/0x10
 ret_from_fork+0x2d/0x40
 ? __pfx_kthread+0x10/0x10
 ret_from_fork_asm+0x1a/0x30
 &lt;/TASK&gt;
```

After tracing the scheduling event, it was discovered that the migration
of the "timerlat/1" thread was performed during thread creation. Further
analysis confirmed that it is because the CPU online processing for
osnoise is implemented through workers, which is asynchronous with the
offline processing. When the worker was scheduled to create a thread, the
CPU may has already been removed from the cpu_online_mask during the offline
process, resulting in the inability to select the right CPU:

T1                       | T2
[CPUHP_ONLINE]           | cpu_device_down()
osnoise_hotplug_workfn() |
                         |     cpus_write_lock()
                         |     takedown_cpu(1)
                         |     cpus_write_unlock()
[CPUHP_OFFLINE]          |
    cpus_read_lock()     |
    start_kthread(1)     |
    cpus_read_unlock()   |

To fix this, skip online processing if the CPU is already offline.

Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Cc: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Link: https://lore.kernel.org/20240924094515.3561410-4-liwei391@huawei.com
Fixes: c8895e271f79 ("trace/osnoise: Support hotplug operations")
Signed-off-by: Wei Li &lt;liwei391@huawei.com&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
</content>
</entry>
<entry>
<title>tracing/timerlat: Drop interface_lock in stop_kthread()</title>
<updated>2024-10-03T20:43:22Z</updated>
<author>
<name>Wei Li</name>
<email>liwei391@huawei.com</email>
</author>
<published>2024-09-24T09:45:12Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=b484a02c9cedf8703eff8f0756f94618004bd165'/>
<id>urn:sha1:b484a02c9cedf8703eff8f0756f94618004bd165</id>
<content type='text'>
stop_kthread() is the offline callback for "trace/osnoise:online", since
commit 5bfbcd1ee57b ("tracing/timerlat: Add interface_lock around clearing
of kthread in stop_kthread()"), the following ABBA deadlock scenario is
introduced:

T1                            | T2 [BP]               | T3 [AP]
osnoise_hotplug_workfn()      | work_for_cpu_fn()     | cpuhp_thread_fun()
                              |   _cpu_down()         |   osnoise_cpu_die()
  mutex_lock(&amp;interface_lock) |                       |     stop_kthread()
                              |     cpus_write_lock() |       mutex_lock(&amp;interface_lock)
  cpus_read_lock()            |     cpuhp_kick_ap()   |

As the interface_lock here in just for protecting the "kthread" field of
the osn_var, use xchg() instead to fix this issue. Also use
for_each_online_cpu() back in stop_per_cpu_kthreads() as it can take
cpu_read_lock() again.

Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Cc: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Link: https://lore.kernel.org/20240924094515.3561410-3-liwei391@huawei.com
Fixes: 5bfbcd1ee57b ("tracing/timerlat: Add interface_lock around clearing of kthread in stop_kthread()")
Signed-off-by: Wei Li &lt;liwei391@huawei.com&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
</content>
</entry>
<entry>
<title>tracing/timerlat: Fix duplicated kthread creation due to CPU online/offline</title>
<updated>2024-10-03T20:43:22Z</updated>
<author>
<name>Wei Li</name>
<email>liwei391@huawei.com</email>
</author>
<published>2024-09-24T09:45:11Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=0bb0a5c12ecf36ad561542bbb95f96355e036a02'/>
<id>urn:sha1:0bb0a5c12ecf36ad561542bbb95f96355e036a02</id>
<content type='text'>
osnoise_hotplug_workfn() is the asynchronous online callback for
"trace/osnoise:online". It may be congested when a CPU goes online and
offline repeatedly and is invoked for multiple times after a certain
online.

This will lead to kthread leak and timer corruption. Add a check
in start_kthread() to prevent this situation.

Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Cc: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Link: https://lore.kernel.org/20240924094515.3561410-2-liwei391@huawei.com
Fixes: c8895e271f79 ("trace/osnoise: Support hotplug operations")
Signed-off-by: Wei Li &lt;liwei391@huawei.com&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
</content>
</entry>
<entry>
<title>tracing: Fix trace_check_vprintf() when tp_printk is used</title>
<updated>2024-10-03T20:43:22Z</updated>
<author>
<name>Steven Rostedt</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2024-10-03T14:49:25Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=50a3242d84ee1625b0bfef29b95f935958dccfbe'/>
<id>urn:sha1:50a3242d84ee1625b0bfef29b95f935958dccfbe</id>
<content type='text'>
When the tp_printk kernel command line is used, the trace events go
directly to printk(). It is still checked via the trace_check_vprintf()
function to make sure the pointers of the trace event are legit.

The addition of reading buffers from previous boots required adding a
delta between the addresses of the previous boot and the current boot so
that the pointers in the old buffer can still be used. But this required
adding a trace_array pointer to acquire the delta offsets.

The tp_printk code does not provide a trace_array (tr) pointer, so when
the offsets were examined, a NULL pointer dereference happened and the
kernel crashed.

If the trace_array does not exist, just default the delta offsets to zero,
as that also means the trace event is not being read from a previous boot.

Link: https://lore.kernel.org/all/Zv3z5UsG_jsO9_Tb@aschofie-mobl2.lan/

Cc: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Cc: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Link: https://lore.kernel.org/20241003104925.4e1b1fd9@gandalf.local.home
Fixes: 07714b4bb3f98 ("tracing: Handle old buffer mappings for event strings and functions")
Reported-by: Alison Schofield &lt;alison.schofield@intel.com&gt;
Tested-by: Alison Schofield &lt;alison.schofield@intel.com&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'pull-work.unaligned' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs</title>
<updated>2024-10-02T23:42:28Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-10-02T23:42:28Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=7ec462100ef9142344ddbf86f2c3008b97acddbe'/>
<id>urn:sha1:7ec462100ef9142344ddbf86f2c3008b97acddbe</id>
<content type='text'>
Pull generic unaligned.h cleanups from Al Viro:
 "Get rid of architecture-specific &lt;asm/unaligned.h&gt; includes, replacing
  them with a single generic &lt;linux/unaligned.h&gt; header file.

  It's the second largest (after asm/io.h) class of asm/* includes, and
  all but two architectures actually end up using exact same file.

  Massage the remaining two (arc and parisc) to do the same and just
  move the thing to from asm-generic/unaligned.h to linux/unaligned.h"

[ This is one of those things that we're better off doing outside the
  merge window, and would only cause extra conflict noise if it was in
  linux-next for the next release due to all the trivial #include line
  updates.  Rip off the band-aid.   - Linus ]

* tag 'pull-work.unaligned' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  move asm/unaligned.h to linux/unaligned.h
  arc: get rid of private asm/unaligned.h
  parisc: get rid of private asm/unaligned.h
</content>
</entry>
</feed>
