| author | Michael Ellerman <mpe@ellerman.id.au> | 2023-10-17 23:15:27 +1100 | 
|---|---|---|
| committer | Michael Ellerman <mpe@ellerman.id.au> | 2023-10-18 09:45:52 +1100 | 
| commit | 20045f0155ab79f8beb840022ea86bff46167f79 (patch) | |
| tree | 830909497e3ee36316cdb36c8696c91d4c12ff9b /tools/perf/scripts/python/check-perf-trace.py | |
| parent | ff9e8f41513669e290f6e1904e1bc75950584491 (diff) | |
powerpc/64s/radix: Don't warn on copros in radix__tlb_flush()
Sachin reported a warning when running the inject-ra-err selftest:
  # selftests: powerpc/mce: inject-ra-err
  Disabling lock debugging due to kernel taint
  MCE: CPU19: machine check (Severe)  Real address Load/Store (foreign/control memory) [Not recovered]
  MCE: CPU19: PID: 5254 Comm: inject-ra-err NIP: [0000000010000e48]
  MCE: CPU19: Initiator CPU
  MCE: CPU19: Unknown
  ------------[ cut here ]------------
  WARNING: CPU: 19 PID: 5254 at arch/powerpc/mm/book3s64/radix_tlb.c:1221 radix__tlb_flush+0x160/0x180
  CPU: 19 PID: 5254 Comm: inject-ra-err Kdump: loaded Tainted: G   M        E      6.6.0-rc3-00055-g9ed22ae6be81 #4
  Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1030.20 (NH1030_058) hv:phyp pSeries
  ...
  NIP radix__tlb_flush+0x160/0x180
  LR  radix__tlb_flush+0x104/0x180
  Call Trace:
    radix__tlb_flush+0xf4/0x180 (unreliable)
    tlb_finish_mmu+0x15c/0x1e0
    exit_mmap+0x1a0/0x510
    __mmput+0x60/0x1e0
    exit_mm+0xdc/0x170
    do_exit+0x2bc/0x5a0
    do_group_exit+0x4c/0xc0
    sys_exit_group+0x28/0x30
    system_call_exception+0x138/0x330
    system_call_vectored_common+0x15c/0x2ec
And bisected it to commit e43c0a0c3c28 ("powerpc/64s/radix: combine
final TLB flush and lazy tlb mm shootdown IPIs"), which added a warning
in radix__tlb_flush() if mm->context.copros is still elevated.
However, it's possible for the copros count to be elevated if a process
exits without first closing file descriptors that are associated with a
copro, e.g. VAS.
If the process exits with a VAS file still open, the release callback
is queued up for exit_task_work() via:
  exit_files()
    put_files_struct()
      close_files()
        filp_close()
          fput()
And called via:
  exit_task_work()
    ____fput()
      __fput()
        file->f_op->release(inode, file)
          coproc_release()
            vas_user_win_ops->close_win()
              vas_deallocate_window()
                mm_context_remove_vas_window()
                  mm_context_remove_copro()
But that is after exit_mm() has been called from do_exit() and triggered
the warning.
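The ordering problem can be illustrated with a small user-space model.
This is only a sketch: every model_* name is hypothetical and stands in
for the kernel function named in its comment, and it assumes a single
open VAS window. It records the copros count seen at the final flush,
which runs before the queued release callback drops the count.

```c
#include <stddef.h>

/* Hypothetical user-space model; none of these types are the kernel's. */
struct model_mm {
	int copros;          /* stands in for mm->context.copros */
	int copros_at_flush; /* count observed by the final TLB flush */
};

typedef void (*model_work_fn)(struct model_mm *mm);

struct model_task {
	struct model_mm mm;
	model_work_fn pending[4]; /* stands in for the task_work list */
	size_t npending;
};

/* Stands in for coproc_release() ... mm_context_remove_copro(). */
static void model_release_window(struct model_mm *mm)
{
	mm->copros--;
}

/* Stands in for exit_mm() ... radix__tlb_flush(): sample the count. */
static void model_exit_mm(struct model_task *t)
{
	t->mm.copros_at_flush = t->mm.copros;
}

/* Stands in for exit_files(): fput() only queues the release. */
static void model_exit_files(struct model_task *t)
{
	if (t->mm.copros > 0 && t->npending < 4)
		t->pending[t->npending++] = model_release_window;
}

/* Stands in for exit_task_work(): run the queued callbacks. */
static void model_exit_task_work(struct model_task *t)
{
	for (size_t i = 0; i < t->npending; i++)
		t->pending[i](&t->mm);
	t->npending = 0;
}

/* Same relative order as described above: exit_mm() first, task work last. */
void model_do_exit(struct model_task *t)
{
	model_exit_mm(t);
	model_exit_files(t);
	model_exit_task_work(t);
}
```

Starting the model with copros == 1, the flush observes a count of 1 and
the count only reaches 0 once the task work has run, which is the window
in which the old warning fired.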
Fix it by dropping the warning, and always calling __flush_all_mm().
In the normal case of no copros, that will result in a call to
_tlbiel_pid(mm->context.id, RIC_FLUSH_ALL) just as the current code
does.
If the copros count is elevated then it will cause a global flush, which
should flush translations from any copros. Note that the process table
entry was cleared in arch_exit_mmap(), so copros should not be able to
fetch any new translations.
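The behavioural change can be summarised by the following simplified
model. It is not the kernel code: the sketch_* names are hypothetical,
and the model only considers the copros count, ignoring every other
input the real flush path may look at.

```c
#include <stdbool.h>

/* Hypothetical names; this models the decision, not the kernel code. */
enum sketch_flush {
	SKETCH_FLUSH_LOCAL,  /* stands in for _tlbiel_pid(..., RIC_FLUSH_ALL) */
	SKETCH_FLUSH_GLOBAL, /* stands in for a global flush */
};

struct sketch_mm {
	int copros; /* stands in for mm->context.copros */
};

/* Before the fix: always a local flush, plus a warning if copros > 0. */
enum sketch_flush sketch_flush_before(const struct sketch_mm *mm, bool *warned)
{
	*warned = mm->copros > 0;
	return SKETCH_FLUSH_LOCAL;
}

/* After the fix: no warning; elevated copros selects a global flush. */
enum sketch_flush sketch_flush_after(const struct sketch_mm *mm)
{
	return mm->copros > 0 ? SKETCH_FLUSH_GLOBAL : SKETCH_FLUSH_LOCAL;
}
```

With copros == 0 both versions pick the local flush, so the common case
is unchanged. Only the elevated case differs: a warning before, a global
flush after.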
Fixes: e43c0a0c3c28 ("powerpc/64s/radix: combine final TLB flush and lazy tlb mm shootdown IPIs")
Reported-by: Sachin Sant <sachinp@linux.ibm.com>
Closes: https://lore.kernel.org/all/A8E52547-4BF1-47CE-8AEA-BC5A9D7E3567@linux.ibm.com/
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Tested-by: Sachin Sant <sachinp@linux.ibm.com>
Link: https://msgid.link/20231017121527.1574104-1-mpe@ellerman.id.au
