<feed xmlns='http://www.w3.org/2005/Atom'>
<title>pm24.git/arch/riscv/mm, branch master</title>
<subtitle>Unnamed repository; edit this file 'description' to name the repository.</subtitle>
<id>https://git.kobert.dev/pm24.git/atom/arch/riscv/mm?h=master</id>
<link rel='self' href='https://git.kobert.dev/pm24.git/atom/arch/riscv/mm?h=master'/>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/'/>
<updated>2024-11-07T22:25:15Z</updated>
<entry>
<title>arch: introduce set_direct_map_valid_noflush()</title>
<updated>2024-11-07T22:25:15Z</updated>
<author>
<name>Mike Rapoport (Microsoft)</name>
<email>rppt@kernel.org</email>
</author>
<published>2024-10-23T16:27:08Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=0c6378a71574daa6cd1534ad42a956e3262756c7'/>
<id>urn:sha1:0c6378a71574daa6cd1534ad42a956e3262756c7</id>
<content type='text'>
Add an API that will allow updates of the direct/linear map for a set of
physically contiguous pages.

It will be used in the following patches.

Link: https://lkml.kernel.org/r/20241023162711.2579610-6-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Luis Chamberlain &lt;mcgrof@kernel.org&gt;
Tested-by: kdevops &lt;kdevops@lists.linux.dev&gt;
Cc: Andreas Larsson &lt;andreas@gaisler.com&gt;
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: Ard Biesheuvel &lt;ardb@kernel.org&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Cc: Brian Cain &lt;bcain@quicinc.com&gt;
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: Christophe Leroy &lt;christophe.leroy@csgroup.eu&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: Dinh Nguyen &lt;dinguyen@kernel.org&gt;
Cc: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Cc: Guo Ren &lt;guoren@kernel.org&gt;
Cc: Helge Deller &lt;deller@gmx.de&gt;
Cc: Huacai Chen &lt;chenhuacai@kernel.org&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Johannes Berg &lt;johannes@sipsolutions.net&gt;
Cc: John Paul Adrian Glaubitz &lt;glaubitz@physik.fu-berlin.de&gt;
Cc: Kent Overstreet &lt;kent.overstreet@linux.dev&gt;
Cc: Liam R. Howlett &lt;Liam.Howlett@Oracle.com&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Masami Hiramatsu (Google) &lt;mhiramat@kernel.org&gt;
Cc: Matt Turner &lt;mattst88@gmail.com&gt;
Cc: Max Filippov &lt;jcmvbkbc@gmail.com&gt;
Cc: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Cc: Michal Simek &lt;monstr@monstr.eu&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Palmer Dabbelt &lt;palmer@dabbelt.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Richard Weinberger &lt;richard@nod.at&gt;
Cc: Russell King &lt;linux@armlinux.org.uk&gt;
Cc: Song Liu &lt;song@kernel.org&gt;
Cc: Stafford Horne &lt;shorne@gmail.com&gt;
Cc: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
Cc: Suren Baghdasaryan &lt;surenb@google.com&gt;
Cc: Thomas Bogendoerfer &lt;tsbogend@alpha.franken.de&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Uladzislau Rezki (Sony) &lt;urezki@gmail.com&gt;
Cc: Vineet Gupta &lt;vgupta@kernel.org&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'riscv-for-linus-6.12-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux</title>
<updated>2024-09-24T17:59:17Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-09-24T17:59:17Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=97d8894b6f4c44762fd48f5d29e73358d6181dbb'/>
<id>urn:sha1:97d8894b6f4c44762fd48f5d29e73358d6181dbb</id>
<content type='text'>
Pull RISC-V updates from Palmer Dabbelt:

 - Support using Zkr to seed KASLR

 - Support IPI-triggered CPU backtracing

 - Support for generic CPU vulnerabilities reporting to userspace

 - A few cleanups for missing licenses

 - The size limit on the XIP kernel has been removed

 - Support for tracing userspace stacks

 - Support for the Svvptc extension

 - Various cleanups and fixes throughout the tree

* tag 'riscv-for-linus-6.12-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (47 commits)
  crash: Fix riscv64 crash memory reserve dead loop
  perf/riscv-sbi: Add platform specific firmware event handling
  tools: Optimize ring buffer for riscv
  tools: Add riscv barrier implementation
  RISC-V: Don't have MAX_PHYSMEM_BITS exceed phys_addr_t
  ACPI: NUMA: initialize all values of acpi_early_node_map to NUMA_NO_NODE
  riscv: Enable bitops instrumentation
  riscv: Omit optimized string routines when using KASAN
  ACPI: RISCV: Make acpi_numa_get_nid() to be static
  riscv: Randomize lower bits of stack address
  selftests: riscv: Allow mmap test to compile on 32-bit
  riscv: Make riscv_isa_vendor_ext_andes array static
  riscv: Use LIST_HEAD() to simplify code
  riscv: defconfig: Disable RZ/Five peripheral support
  RISC-V: Implement kgdb_roundup_cpus() to enable future NMI Roundup
  riscv: avoid Imbalance in RAS
  riscv: cacheinfo: Add back init_cache_level() function
  riscv: Remove unused _TIF_WORK_MASK
  drivers/perf: riscv: Remove redundant macro check
  riscv: define ILLEGAL_POINTER_VALUE for 64bit
  ...
</content>
</entry>
<entry>
<title>Merge patch series "Svvptc extension to remove preventive sfence.vma"</title>
<updated>2024-09-16T03:58:24Z</updated>
<author>
<name>Palmer Dabbelt</name>
<email>palmer@rivosinc.com</email>
</author>
<published>2024-09-16T03:16:12Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=7e340f4fad46b766705be96f5d1c764a397a7a36'/>
<id>urn:sha1:7e340f4fad46b766705be96f5d1c764a397a7a36</id>
<content type='text'>
Alexandre Ghiti &lt;alexghiti@rivosinc.com&gt; says:

In RISC-V, after a new mapping is established, a sfence.vma needs to be
emitted for different reasons:

- if the uarch caches invalid entries, we need to invalidate it otherwise
  we would trap on this invalid entry,
- if the uarch does not cache invalid entries, a reordered access could fail
  to see the new mapping and then trap (sfence.vma acts as a fence).

We can actually avoid emitting those (mostly) useless and costly sfence.vma
by handling the traps instead:

- for new kernel mappings: only vmalloc mappings need to be taken care of,
  other new mapping are rare and already emit the required sfence.vma if
  needed.
  That must be achieved very early in the exception path as explained in
  patch 3, and this also fixes our fragile way of dealing with vmalloc faults.

- for new user mappings: Svvptc makes update_mmu_cache() a no-op but we can
  take some gratuitous page faults (which are very unlikely though).

Patch 1 and 2 introduce Svvptc extension probing.

On our uarch that does not cache invalid entries and a 6.5 kernel, the
gains are measurable:

* Kernel boot:                  6%
* ltp - mmapstress01:           8%
* lmbench - lat_pagefault:      20%
* lmbench - lat_mmap:           5%

Here are the corresponding numbers of sfence.vma emitted:

* Ubuntu boot to login:
Before: ~630k sfence.vma
After:  ~200k sfence.vma

* ltp - mmapstress01
Before: ~45k
After:  ~6.3k

* lmbench - lat_pagefault
Before: ~665k
After:   832 (!)

* lmbench - lat_mmap
Before: ~546k
After:   718 (!)

Thanks to Ved and Matt Evans for triggering the discussion that led to
this patchset!

* b4-shazam-merge:
  riscv: Stop emitting preventive sfence.vma for new userspace mappings with Svvptc
  riscv: Stop emitting preventive sfence.vma for new vmalloc mappings
  dt-bindings: riscv: Add Svvptc ISA extension description
  riscv: Add ISA extension parsing for Svvptc

Link: https://lore.kernel.org/r/20240717060125.139416-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt &lt;palmer@rivosinc.com&gt;
</content>
</entry>
<entry>
<title>riscv: Stop emitting preventive sfence.vma for new userspace mappings with Svvptc</title>
<updated>2024-09-15T07:11:05Z</updated>
<author>
<name>Alexandre Ghiti</name>
<email>alexghiti@rivosinc.com</email>
</author>
<published>2024-07-17T06:01:25Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=7a21b2e370dab780ddb3aa80f2a4c8ff97bddccc'/>
<id>urn:sha1:7a21b2e370dab780ddb3aa80f2a4c8ff97bddccc</id>
<content type='text'>
The preventive sfence.vma were emitted because new mappings must be made
visible to the page table walker but Svvptc guarantees that it will
happen within a bounded timeframe, so no need to sfence.vma for the uarchs
that implement this extension, we will then take gratuitous (but very
unlikely) page faults, similarly to x86 and arm64.

This allows to drastically reduce the number of sfence.vma emitted:

* Ubuntu boot to login:
Before: ~630k sfence.vma
After:  ~200k sfence.vma

* ltp - mmapstress01
Before: ~45k
After:  ~6.3k

* lmbench - lat_pagefault
Before: ~665k
After:   832 (!)

* lmbench - lat_mmap
Before: ~546k
After:   718 (!)

Signed-off-by: Alexandre Ghiti &lt;alexghiti@rivosinc.com&gt;
Link: https://lore.kernel.org/r/20240717060125.139416-5-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt &lt;palmer@rivosinc.com&gt;
</content>
</entry>
<entry>
<title>riscv: Stop emitting preventive sfence.vma for new vmalloc mappings</title>
<updated>2024-09-15T07:11:04Z</updated>
<author>
<name>Alexandre Ghiti</name>
<email>alexghiti@rivosinc.com</email>
</author>
<published>2024-07-17T06:01:24Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=503638e0babf364061bc50fca5103b00a56cc50a'/>
<id>urn:sha1:503638e0babf364061bc50fca5103b00a56cc50a</id>
<content type='text'>
In 6.5, we removed the vmalloc fault path because that can't work (see
[1] [2]). Then in order to make sure that new page table entries were
seen by the page table walker, we had to preventively emit a sfence.vma
on all harts [3] but this solution is very costly since it relies on IPI.

And even there, we could end up in a loop of vmalloc faults if a vmalloc
allocation is done in the IPI path (for example if it is traced, see
[4]), which could result in a kernel stack overflow.

Those preventive sfence.vma needed to be emitted because:

- if the uarch caches invalid entries, the new mapping may not be
  observed by the page table walker and an invalidation may be needed.
- if the uarch does not cache invalid entries, a reordered access
  could "miss" the new mapping and traps: in that case, we would actually
  only need to retry the access, no sfence.vma is required.

So this patch removes those preventive sfence.vma and actually handles
the possible (and unlikely) exceptions. And since the kernel stacks
mappings lie in the vmalloc area, this handling must be done very early
when the trap is taken, at the very beginning of handle_exception: this
also rules out the vmalloc allocations in the fault path.

Link: https://lore.kernel.org/linux-riscv/20230531093817.665799-1-bjorn@kernel.org/ [1]
Link: https://lore.kernel.org/linux-riscv/20230801090927.2018653-1-dylan@andestech.com [2]
Link: https://lore.kernel.org/linux-riscv/20230725132246.817726-1-alexghiti@rivosinc.com/ [3]
Link: https://lore.kernel.org/lkml/20200508144043.13893-1-joro@8bytes.org/ [4]
Signed-off-by: Alexandre Ghiti &lt;alexghiti@rivosinc.com&gt;
Reviewed-by: Yunhui Cui &lt;cuiyunhui@bytedance.com&gt;
Link: https://lore.kernel.org/r/20240717060125.139416-4-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt &lt;palmer@rivosinc.com&gt;
</content>
</entry>
<entry>
<title>riscv: Remove redundant restriction on memory size</title>
<updated>2024-09-14T08:08:56Z</updated>
<author>
<name>Stuart Menefy</name>
<email>stuart.menefy@codasip.com</email>
</author>
<published>2024-06-24T12:17:23Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=d6a1928134a1c626ff369129d7a80951b2949a48'/>
<id>urn:sha1:d6a1928134a1c626ff369129d7a80951b2949a48</id>
<content type='text'>
The original reason for reserving the top 4GiB of the direct map
(space for modules/BPF/kernel) hasn't applied since the address
map was reworked for KASAN.

Signed-off-by: Stuart Menefy &lt;stuart.menefy@codasip.com&gt;
Reviewed-by: Alexandre Ghiti &lt;alexghiti@rivosinc.com&gt;
Link: https://lore.kernel.org/r/20240624121723.2186279-1-stuart.menefy@codasip.com
Signed-off-by: Palmer Dabbelt &lt;palmer@rivosinc.com&gt;
</content>
</entry>
<entry>
<title>Merge patch series "remove size limit on XIP kernel"</title>
<updated>2024-09-12T14:23:05Z</updated>
<author>
<name>Palmer Dabbelt</name>
<email>palmer@rivosinc.com</email>
</author>
<published>2024-09-12T14:23:05Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=9ea7b92b77df7d2eee3c31ef4a19f0f12ec74190'/>
<id>urn:sha1:9ea7b92b77df7d2eee3c31ef4a19f0f12ec74190</id>
<content type='text'>
Nam Cao &lt;namcao@linutronix.de&gt; says:

Hi,

For XIP kernel, the writable data section is always at offset specified in
XIP_OFFSET, which is hard-coded to 32MB.

Unfortunately, this means the read-only section (placed before the
writable section) is restricted in size. This causes build failure if the
kernel gets too large.

This series remove the use of XIP_OFFSET one by one, then remove this
macro entirely at the end, with the goal of lifting this size restriction.

Also some cleanup and documentation along the way.

* b4-shazam-merge
  riscv: remove limit on the size of read-only section for XIP kernel
  riscv: drop the use of XIP_OFFSET in create_kernel_page_table()
  riscv: drop the use of XIP_OFFSET in kernel_mapping_va_to_pa()
  riscv: drop the use of XIP_OFFSET in XIP_FIXUP_FLASH_OFFSET
  riscv: drop the use of XIP_OFFSET in XIP_FIXUP_OFFSET
  riscv: replace misleading va_kernel_pa_offset on XIP kernel
  riscv: don't export va_kernel_pa_offset in vmcoreinfo for XIP kernel
  riscv: cleanup XIP_FIXUP macro
  riscv: change XIP's kernel_map.size to be size of the entire kernel
  ...

Link: https://lore.kernel.org/r/cover.1717789719.git.namcao@linutronix.de
Signed-off-by: Palmer Dabbelt &lt;palmer@rivosinc.com&gt;
</content>
</entry>
<entry>
<title>riscv: drop the use of XIP_OFFSET in create_kernel_page_table()</title>
<updated>2024-09-12T14:23:01Z</updated>
<author>
<name>Nam Cao</name>
<email>namcao@linutronix.de</email>
</author>
<published>2024-06-07T20:22:12Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=a7cfb999433ad3a1aa7ca86ecdaf3e061ab7076a'/>
<id>urn:sha1:a7cfb999433ad3a1aa7ca86ecdaf3e061ab7076a</id>
<content type='text'>
XIP_OFFSET is the hard-coded offset of writable data section within the
kernel.

By hard-coding this value, the read-only section of the kernel (which is
placed before the writable data section) is restricted in size.

As a preparation to remove this hard-coded value entirely, stop using
XIP_OFFSET in create_kernel_page_table(). Instead use _sdata and _start to
do the same thing.

Signed-off-by: Nam Cao &lt;namcao@linutronix.de&gt;
Reviewed-by: Alexandre Ghiti &lt;alexghiti@rivosinc.com&gt;
Link: https://lore.kernel.org/r/4ea3f222a7eb9f91c04b155ff2e4d3ef19158acc.1717789719.git.namcao@linutronix.de
Signed-off-by: Palmer Dabbelt &lt;palmer@rivosinc.com&gt;
</content>
</entry>
<entry>
<title>riscv: replace misleading va_kernel_pa_offset on XIP kernel</title>
<updated>2024-09-12T14:22:57Z</updated>
<author>
<name>Nam Cao</name>
<email>namcao@linutronix.de</email>
</author>
<published>2024-06-07T20:22:08Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=5cf089672119808c2f5b7035c91adcc0cc7287e1'/>
<id>urn:sha1:5cf089672119808c2f5b7035c91adcc0cc7287e1</id>
<content type='text'>
On XIP kernel, the name "va_kernel_pa_offset" is misleading: unlike
"normal" kernel, it is not the virtual-physical address offset of kernel
mapping, it is the offset of kernel mapping's first virtual address to
first physical address in DRAM, which is not meaningful because the
kernel's first physical address is not in DRAM.

For XIP kernel, there are 2 different offsets because the read-only part of
the kernel resides in ROM while the rest is in RAM. The offset to ROM is in
kernel_map.va_kernel_xip_pa_offset, while the offset to RAM is not stored
anywhere: it is calculated on-the-fly.

Remove this confusing "va_kernel_pa_offset" and add
"va_kernel_xip_data_pa_offset" as its replacement. This new variable is the
offset of virtual mapping of the kernel's data portion to the corresponding
physical addresses.

With the introduction of this new variable, also rename
va_kernel_xip_pa_offset -&gt; va_kernel_xip_text_pa_offset to make it clear
that this one is about the .text section.

Signed-off-by: Nam Cao &lt;namcao@linutronix.de&gt;
Reviewed-by: Alexandre Ghiti &lt;alexghiti@rivosinc.com&gt;
Link: https://lore.kernel.org/r/84e5d005c1386d88d7b2531e0b6707ec5352ee54.1717789719.git.namcao@linutronix.de
Signed-off-by: Palmer Dabbelt &lt;palmer@rivosinc.com&gt;
</content>
</entry>
<entry>
<title>riscv: Disable preemption while handling PR_RISCV_CTX_SW_FENCEI_OFF</title>
<updated>2024-09-11T03:38:46Z</updated>
<author>
<name>Charlie Jenkins</name>
<email>charlie@rivosinc.com</email>
</author>
<published>2024-09-03T22:52:34Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=7c1e5b9690b0e14acead4ff98d8a6c40f2dff54b'/>
<id>urn:sha1:7c1e5b9690b0e14acead4ff98d8a6c40f2dff54b</id>
<content type='text'>
The icache will be flushed in switch_to() if force_icache_flush is true,
or in flush_icache_deferred() if icache_stale_mask is set. Between
setting force_icache_flush to false and calculating the new
icache_stale_mask, preemption needs to be disabled. There are two
reasons for this:

1. If CPU migration happens between force_icache_flush = false, and the
   icache_stale_mask is set, an icache flush will not be emitted.
2. smp_processor_id() is used in set_icache_stale_mask() to mark the
   current CPU as not needing another flush since a flush will have
   happened either by userspace or by the kernel when performing the
   migration. smp_processor_id() is currently called twice with preemption
   enabled which causes a race condition. It allows
   icache_stale_mask to be populated with inconsistent CPU ids.

Resolve these two issues by setting the icache_stale_mask before setting
force_icache_flush to false, and using get_cpu()/put_cpu() to obtain the
smp_processor_id().

Signed-off-by: Charlie Jenkins &lt;charlie@rivosinc.com&gt;
Fixes: 6b9391b581fd ("riscv: Include riscv_set_icache_flush_ctx prctl")
Link: https://lore.kernel.org/r/20240903-fix_fencei_optimization-v2-1-8025f20171fc@rivosinc.com
Signed-off-by: Palmer Dabbelt &lt;palmer@rivosinc.com&gt;
</content>
</entry>
</feed>
