<feed xmlns='http://www.w3.org/2005/Atom'>
<title>pm24.git/tools/perf/arch/x86/annotate, branch master</title>
<subtitle>Unnamed repository; edit this file 'description' to name the repository.</subtitle>
<id>https://git.kobert.dev/pm24.git/atom/tools/perf/arch/x86/annotate?h=master</id>
<link rel='self' href='https://git.kobert.dev/pm24.git/atom/tools/perf/arch/x86/annotate?h=master'/>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/'/>
<updated>2024-11-09T16:39:13Z</updated>
<entry>
<title>perf disasm: Add e_machine/e_flags to struct arch</title>
<updated>2024-11-09T16:39:13Z</updated>
<author>
<name>Ian Rogers</name>
<email>irogers@google.com</email>
</author>
<published>2024-11-08T23:45:49Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=cd6c9dca9d4bf1d5a9d3606cf5cace513f6dc5ce'/>
<id>urn:sha1:cd6c9dca9d4bf1d5a9d3606cf5cace513f6dc5ce</id>
<content type='text'>
Currently functions like get_dwarf_regnum only work with the host
architecture. Carry the elf machine and flags in struct arch so that
in disassembly these can be used to allow cross platform disassembly.

Reviewed-by: Masami Hiramatsu (Google) &lt;mhiramat@kernel.org&gt;
Signed-off-by: Ian Rogers &lt;irogers@google.com&gt;
Cc: Anup Patel &lt;anup@brainfault.org&gt;
Cc: Yang Jihong &lt;yangjihong@bytedance.com&gt;
Cc: Palmer Dabbelt &lt;palmer@dabbelt.com&gt;
Cc: David S. Miller &lt;davem@davemloft.net&gt;
Cc: Albert Ou &lt;aou@eecs.berkeley.edu&gt;
Cc: Shenlin Liang &lt;liangshenlin@eswincomputing.com&gt;
Cc: Nick Terrell &lt;terrelln@fb.com&gt;
Cc: Guilherme Amadio &lt;amadio@gentoo.org&gt;
Cc: Steinar H. Gunderson &lt;sesse@google.com&gt;
Cc: Changbin Du &lt;changbin.du@huawei.com&gt;
Cc: Alexander Lobakin &lt;aleksander.lobakin@intel.com&gt;
Cc: Przemek Kitszel &lt;przemyslaw.kitszel@intel.com&gt;
Cc: Huacai Chen &lt;chenhuacai@kernel.org&gt;
Cc: Guo Ren &lt;guoren@kernel.org&gt;
Cc: Masahiro Yamada &lt;masahiroy@kernel.org&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Cc: James Clark &lt;james.clark@linaro.org&gt;
Cc: Mike Leach &lt;mike.leach@linaro.org&gt;
Cc: Chen Pei &lt;cp0613@linux.alibaba.com&gt;
Cc: Leo Yan &lt;leo.yan@linux.dev&gt;
Cc: Oliver Upton &lt;oliver.upton@linux.dev&gt;
Cc: Aditya Gupta &lt;adityag@linux.ibm.com&gt;
Cc: Kajol Jain &lt;kjain@linux.ibm.com&gt;
Cc: Athira Rajeev &lt;atrajeev@linux.vnet.ibm.com&gt;
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-riscv@lists.infradead.org
Cc: Bibo Mao &lt;maobibo@loongson.cn&gt;
Cc: John Garry &lt;john.g.garry@oracle.com&gt;
Cc: Atish Patra &lt;atishp@rivosinc.com&gt;
Cc: Dima Kogan &lt;dima@secretsauce.net&gt;
Cc: Paul Walmsley &lt;paul.walmsley@sifive.com&gt;
Cc: Dr. David Alan Gilbert &lt;linux@treblig.org&gt;
Cc: linux-csky@vger.kernel.org
Link: https://lore.kernel.org/r/20241108234606.429459-5-irogers@google.com
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf build: Rename HAVE_DWARF_SUPPORT to HAVE_LIBDW_SUPPORT</title>
<updated>2024-10-18T17:17:40Z</updated>
<author>
<name>Ian Rogers</name>
<email>irogers@google.com</email>
</author>
<published>2024-10-17T00:13:53Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=8838abf6261444f7d8047c363f90cafbd2ff32c5'/>
<id>urn:sha1:8838abf6261444f7d8047c363f90cafbd2ff32c5</id>
<content type='text'>
In Makefile.config for unwinding the name dwarf implies either
libunwind or libdw. Make it clearer that HAVE_DWARF_SUPPORT is really
just defined when libdw is present by renaming to HAVE_LIBDW_SUPPORT.

Signed-off-by: Ian Rogers &lt;irogers@google.com&gt;
Acked-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Tested-by: Leo Yan &lt;leo.yan@arm.com&gt;
Cc: Anup Patel &lt;anup@brainfault.org&gt;
Cc: Yang Jihong &lt;yangjihong@bytedance.com&gt;
Cc: Palmer Dabbelt &lt;palmer@dabbelt.com&gt;
Cc: David S. Miller &lt;davem@davemloft.net&gt;
Cc: Albert Ou &lt;aou@eecs.berkeley.edu&gt;
Cc: Shenlin Liang &lt;liangshenlin@eswincomputing.com&gt;
Cc: Nick Terrell &lt;terrelln@fb.com&gt;
Cc: Guilherme Amadio &lt;amadio@gentoo.org&gt;
Cc: Steinar H. Gunderson &lt;sesse@google.com&gt;
Cc: Changbin Du &lt;changbin.du@huawei.com&gt;
Cc: Alexander Lobakin &lt;aleksander.lobakin@intel.com&gt;
Cc: Przemek Kitszel &lt;przemyslaw.kitszel@intel.com&gt;
Cc: Huacai Chen &lt;chenhuacai@kernel.org&gt;
Cc: Guo Ren &lt;guoren@kernel.org&gt;
Cc: Masahiro Yamada &lt;masahiroy@kernel.org&gt;
Cc: Masami Hiramatsu (Google) &lt;mhiramat@kernel.org&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Cc: James Clark &lt;james.clark@linaro.org&gt;
Cc: Mike Leach &lt;mike.leach@linaro.org&gt;
Cc: Chen Pei &lt;cp0613@linux.alibaba.com&gt;
Cc: Leo Yan &lt;leo.yan@linux.dev&gt;
Cc: Oliver Upton &lt;oliver.upton@linux.dev&gt;
Cc: Aditya Gupta &lt;adityag@linux.ibm.com&gt;
Cc: Kajol Jain &lt;kjain@linux.ibm.com&gt;
Cc: Athira Rajeev &lt;atrajeev@linux.vnet.ibm.com&gt;
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-riscv@lists.infradead.org
Cc: Bibo Mao &lt;maobibo@loongson.cn&gt;
Cc: John Garry &lt;john.g.garry@oracle.com&gt;
Cc: Atish Patra &lt;atishp@rivosinc.com&gt;
Cc: Dima Kogan &lt;dima@secretsauce.net&gt;
Cc: Paul Walmsley &lt;paul.walmsley@sifive.com&gt;
Cc: Dr. David Alan Gilbert &lt;linux@treblig.org&gt;
Cc: linux-csky@vger.kernel.org
Link: https://lore.kernel.org/r/20241017001354.56973-11-irogers@google.com
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf annotate-data: Copy back variable types after move</title>
<updated>2024-08-22T15:38:18Z</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2024-08-21T23:26:28Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=1cfd01eb602d73b92df2ffc24196cd0a3dc3efb2'/>
<id>urn:sha1:1cfd01eb602d73b92df2ffc24196cd0a3dc3efb2</id>
<content type='text'>
In some cases, compilers don't set the location expression in DWARF
precisely.  For instance, it may assign a variable to a register after
copying it from a different register.  Then it should use the register
for the new type but still uses the old register.  This makes hard to
track the type information properly.

This is an example I found in __tcp_transmit_skb().  The first argument
(sk) of this function is a pointer to sock and there's a variable (tp)
for tcp_sock.

  static int __tcp_transmit_skb(struct sock *sk, struct sk_buff *skb,
  				int clone_it, gfp_t gfp_mask, u32 rcv_nxt)
  {
  	...
  	struct tcp_sock *tp;

  	BUG_ON(!skb || !tcp_skb_pcount(skb));
  	tp = tcp_sk(sk);
  	prior_wstamp = tp-&gt;tcp_wstamp_ns;
  	tp-&gt;tcp_wstamp_ns = max(tp-&gt;tcp_wstamp_ns, tp-&gt;tcp_clock_cache);
  	...

So it basically calls tcp_sk(sk) to get the tcp_sock pointer from sk.
But it turned out to be the same value because tcp_sock embeds sock as
the first member.  The sk is located in reg5 (RDI) and tp is in reg3
(RBX).  The offset of tcp_wstamp_ns is 0x748 and tcp_clock_cache is
0x750.  So you need to use RBX (reg3) to access the fields in the
tcp_sock.  But the code used RDI (reg5) as it has the same value.

  $ pahole --hex -C tcp_sock vmlinux | grep -e 748 -e 750
	u64                tcp_wstamp_ns;        /* 0x748   0x8 */
	u64                tcp_clock_cache;      /* 0x750   0x8 */

And this is the disassembly of the part of the function.

  &lt;__tcp_transmit_skb&gt;:
  ...
  44:  mov    %rdi, %rbx
  47:  mov    0x748(%rdi), %rsi
  4e:  mov    0x750(%rdi), %rax
  55:  cmp    %rax, %rsi

Because compiler put the debug info to RBX, it only knows RDI is a
pointer to sock and accessing those two fields resulted in error
due to offset being beyond the type size.

  -----------------------------------------------------------
  find data type for 0x748(reg5) at __tcp_transmit_skb+0x63
  CU for net/ipv4/tcp_output.c (die:0x817f543)
  frame base: cfa=0 fbreg=6
  scope: [1/1] (die:81aac3e)
  bb: [0 - 30]
  var [0] -0x98(stack) type='struct tcp_out_options' size=0x28 (die:0x81af3df)
  var [5] reg8 type='unsigned int' size=0x4 (die:0x8180ed6)
  var [5] reg2 type='unsigned int' size=0x4 (die:0x8180ed6)
  var [5] reg1 type='int' size=0x4 (die:0x818059e)
  var [5] reg4 type='struct sk_buff*' size=0x8 (die:0x8181360)
  var [5] reg5 type='struct sock*' size=0x8 (die:0x8181a0c)                   &lt;&lt;&lt;--- the first argument ('sk' at %RDI)
  mov [19] reg8 -&gt; -0xa8(stack) type='unsigned int' size=0x4 (die:0x8180ed6)
  mov [20] stack canary -&gt; reg0
  mov [29] reg0 -&gt; -0x30(stack) stack canary
  bb: [36 - 3e]
  mov [36] reg4 -&gt; reg15 type='struct sk_buff*' size=0x8 (die:0x8181360)
  bb: [44 - 63]
  mov [44] reg5 -&gt; reg3 type='struct sock*' size=0x8 (die:0x8181a0c)          &lt;&lt;&lt;--- calling tcp_sk()
  var [47] reg3 type='struct tcp_sock*' size=0x8 (die:0x819eead)              &lt;&lt;&lt;--- new variable ('tp' at %RBX)
  var [4e] reg4 type='unsigned long long' size=0x8 (die:0x8180edd)
  mov [58] reg4 -&gt; -0xc0(stack) type='unsigned long long' size=0x8 (die:0x8180edd)
  chk [63] reg5 offset=0x748 ok=1 kind=1 (struct sock*) : offset bigger than size    &lt;&lt;&lt;--- access with old variable
  final result: offset bigger than size

While it's a fault in the compiler, we could work around this issue by
using the type of new variable when it's copied directly.  So I've added
copied_from field in the register state to track those direct register
to register copies.  After that new register gets a new type and the old
register still has the same type, it'll update (copy it back) the type
of the old register.

For example, if we can update type of reg5 at __tcp_transmit_skb+0x47,
we can find the target type of the instruction at 0x63 like below:

  -----------------------------------------------------------
  find data type for 0x748(reg5) at __tcp_transmit_skb+0x63
  ...
  bb: [44 - 63]
  mov [44] reg5 -&gt; reg3 type='struct sock*' size=0x8 (die:0x8181a0c)
  var [47] reg3 type='struct tcp_sock*' size=0x8 (die:0x819eead)
  var [47] copyback reg5 type='struct tcp_sock*' size=0x8 (die:0x819eead)     &lt;&lt;&lt;--- here
  mov [47] 0x748(reg5) -&gt; reg4 type='unsigned long long' size=0x8 (die:0x8180edd)
  mov [4e] 0x750(reg5) -&gt; reg0 type='unsigned long long' size=0x8 (die:0x8180edd)
  mov [58] reg4 -&gt; -0xc0(stack) type='unsigned long long' size=0x8 (die:0x8180edd)
  chk [63] reg5 offset=0x748 ok=1 kind=1 (struct tcp_sock*) : Good!           &lt;&lt;&lt;--- new type
  found by insn track: 0x748(reg5) type-offset=0x748
  final result:  type='struct tcp_sock' size=0xa98 (die:0x819eeb2)

Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Athira Rajeev &lt;atrajeev@linux.vnet.ibm.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20240821232628.353177-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf annotate-data: Fix percpu pointer check</title>
<updated>2024-08-21T14:30:38Z</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2024-08-21T06:54:08Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=4d6d6e0f61e2267103e9b013d2a82d04ff278127'/>
<id>urn:sha1:4d6d6e0f61e2267103e9b013d2a82d04ff278127</id>
<content type='text'>
In check_matching_type(), it checks the type state of the register in a
wrong order.  When it's the percpu pointer, it should check the type for
the pointer, but it checks the CFA bit first and thought it has no type
in the stack slot.  This resulted in no type info.

  -----------------------------------------------------------
  find data type for 0x28(reg1) at hrtimer_reprogram+0x88
  CU for kernel/time/hrtimer.c (die:0x18f219f)
  frame base: cfa=1 fbreg=7
  ...
  add [72] percpu 0x24500 -&gt; reg1 pointer type='struct hrtimer_cpu_base' size=0x240 (die:0x18f6d46)
  bb: [7a - 7e]
  bb: [80 - 86]                        (here)
  bb: [88 - 88]                         vvv
  chk [88] reg1 offset=0x28 ok=1 kind=4 cfa : no type information
  no type information

Here, instruction at 0x72 found reg1 has a (percpu) pointer and got the
correct type.  But when it checks the final result, it wrongly thought
it was stack variable because it checks the cfa bit first.

After changing the order of state check:
  -----------------------------------------------------------
  find data type for 0x28(reg1) at hrtimer_reprogram+0x88
  CU for kernel/time/hrtimer.c (die:0x18f219f)
  frame base: cfa=1 fbreg=7
  ...                                     (here)
                                        vvvvvvvvvv
  chk [88] reg1 offset=0x28 ok=1 kind=4 percpu ptr : Good!
  found by insn track: 0x28(reg1) type-offset=0x28
  final type: type='struct hrtimer_cpu_base' size=0x240 (die:0x18f6d46)

Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Athira Rajeev &lt;atrajeev@linux.vnet.ibm.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20240821065408.285548-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf annotate-data: Fix missing constant copy</title>
<updated>2024-08-21T14:27:18Z</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2024-08-21T06:54:06Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=922ec313f061cf75157d8e93fe16bc6397f6ea25'/>
<id>urn:sha1:922ec313f061cf75157d8e93fe16bc6397f6ea25</id>
<content type='text'>
I found it missed to copy the immediate constant when it moves the
register value.  This could result in a wrong type inference since the
address for the per-cpu variable would be 0 always.

Fixes: eb9190afaed6afd5 ("perf annotate-data: Handle ADD instructions")
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Athira Rajeev &lt;atrajeev@linux.vnet.ibm.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20240821065408.285548-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf annotate: Add "update_insn_state" callback function to handle arch specific instruction tracking</title>
<updated>2024-07-31T19:12:59Z</updated>
<author>
<name>Athira Rajeev</name>
<email>atrajeev@linux.vnet.ibm.com</email>
</author>
<published>2024-07-18T08:43:45Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=782959ac248ac3cbac80f7476d4e0410662ff400'/>
<id>urn:sha1:782959ac248ac3cbac80f7476d4e0410662ff400</id>
<content type='text'>
Add "update_insn_state" callback to "struct arch" to handle instruction
tracking. Currently updating instruction state is handled by static
function "update_insn_state_x86" which is defined in "annotate-data.c".

Make this as a callback for specific arch and move to archs specific
file "arch/x86/annotate/instructions.c" . This will help to add helper
function for other platforms in file:
"arch/&lt;platform&gt;/annotate/instructions.c" and make changes/updates
easier.

Define callback "update_insn_state" as part of "struct arch", also make
some of the debug functions non-static so that it can be referenced from
other places.

Reviewed-by: Kajol Jain &lt;kjain@linux.ibm.com&gt;
Reviewed-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Signed-off-by: Athira Rajeev &lt;atrajeev@linux.vnet.ibm.com&gt;
Tested-by: Kajol Jain &lt;kjain@linux.ibm.com&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Akanksha J N &lt;akanksha@linux.ibm.com&gt;
Cc: Christophe Leroy &lt;christophe.leroy@csgroup.eu&gt;
Cc: Disha Goel &lt;disgoel@linux.vnet.ibm.com&gt;
Cc: Hari Bathini &lt;hbathini@linux.ibm.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Madhavan Srinivasan &lt;maddy@linux.ibm.com&gt;
Cc: Segher Boessenkool &lt;segher@kernel.crashing.org&gt;
Link: https://lore.kernel.org/lkml/20240718084358.72242-3-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf annotate: Add more x86 mov instruction cases</title>
<updated>2023-09-15T23:46:40Z</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2023-09-08T05:22:16Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=486021e04b24f8dffe54ec282ce5d4cc3e71133f'/>
<id>urn:sha1:486021e04b24f8dffe54ec282ce5d4cc3e71133f</id>
<content type='text'>
Instructions with sign- and zero- extention like movsbl and movzwq were
not handled properly.  As it can check different size suffix (-b, -w, -l
or -q) we can omit that and add the common parts even though some
combinations are not possible.

Reviewed-by: Ian Rogers &lt;irogers@google.com&gt;
Link: https://lore.kernel.org/r/20230908052216.566148-1-namhyung@kernel.org
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf annotate: Remove x86 instructions with suffix</title>
<updated>2023-06-09T13:56:05Z</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2023-05-24T20:50:54Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=b541a91793fe124b199dc734aa5d7712d2993f06'/>
<id>urn:sha1:b541a91793fe124b199dc734aa5d7712d2993f06</id>
<content type='text'>
Now the suffix is handled in the general code.  Let's get rid of them.

Reviewed-by: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Reviewed-by: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20230524205054.3087004-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf annotate: Handle "decq", "incq", "testq", "tzcnt" instructions on x86</title>
<updated>2023-05-15T20:50:01Z</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2023-05-11T06:27:23Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=983034cd0d212b23a63efb48ecc47d55d70ee301'/>
<id>urn:sha1:983034cd0d212b23a63efb48ecc47d55d70ee301</id>
<content type='text'>
I found that the "decq", "incq", "testq", "tzcnt" instructions didn't
parse the operands properly.  Add them to the "x86__instructions" table
to fix the issue.

Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Acked-by: Ian Rogers &lt;irogers@google.com&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20230511062725.514752-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf annotate: Add fusion logic for AMD microarchs</title>
<updated>2021-09-15T20:54:52Z</updated>
<author>
<name>Ravi Bangoria</name>
<email>ravi.bangoria@amd.com</email>
</author>
<published>2021-09-11T04:38:54Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=3149733584c8f0ab828eada539df7aa488c023a9'/>
<id>urn:sha1:3149733584c8f0ab828eada539df7aa488c023a9</id>
<content type='text'>
AMD family 15h and above microarchs fuse a subset of cmp/test/ALU
instructions with branch instructions[1][2]. Add perf annotate
fused instruction support for these microarchs.

Before:
         │       testb  $0x80,0x51(%rax)
         │    ┌──jne    5b3
    0.78 │    │  mov    %r13,%rdi
         │    │→ callq  mark_page_accessed
    1.08 │5b3:└─→mov    0x8(%r13),%rax

After:
         │    ┌──testb  $0x80,0x51(%rax)
         │    ├──jne    5b3
    0.78 │    │  mov    %r13,%rdi
         │    │→ callq  mark_page_accessed
    1.08 │5b3:└─→mov    0x8(%r13),%rax

[1] https://bugzilla.kernel.org/attachment.cgi?id=298553
[2] https://bugzilla.kernel.org/attachment.cgi?id=298555

Committer testing:

On a:

  $ grep -m1 "model name" /proc/cpuinfo
  model name	: AMD Ryzen 9 3900X 12-Core Processor
  $

  Samples: 44K of event 'cycles', 4000 Hz, Event count (approx.): 7533249650
  _int_malloc  /usr/lib64/libc-2.33.so [Percent: local period]
  Percent│    ┌──test   %eax,%eax
         │    ├──jne    884
         │    │↓ jmpq   943
         │    │  nop
         │878:│  add    $0x10,%rdx
    0.64 │    │  add    %eax,%eax
    0.57 │    │↓ je     cc9
    0.77 │884:└─→test   %esi,%eax
         │     ↑ je     878
         │       mov    0x18(%rdx),%r15

Reported-by: Kim Phillips &lt;kim.phillips@amd.com&gt;
Signed-off-by: Ravi Bangoria &lt;ravi.bangoria@amd.com&gt;
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Jin Yao &lt;yao.jin@linux.intel.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Link: https //lore.kernel.org/r/20210911043854.8373-2-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
</feed>
