author | Eric Dumazet <edumazet@google.com> | 2021-11-12 08:19:50 -0800 |
---|---|---|
committer | Borislav Petkov <bp@suse.de> | 2021-12-08 11:26:09 +0100 |
commit | 3411506550b1f714a52b5db087666c08658d2698 (patch) | |
tree | e96cd40dcbad1e8346aa181e51537bcab2ff2a41 /Makefile | |
parent | 0fcfb00b28c0b7884635dacf38e46d60bf3d4eb1 (diff) |
x86/csum: Rewrite/optimize csum_partial()
With more NICs supporting CHECKSUM_COMPLETE and IPv6 being widely
used, csum_partial() is often called on small buffers and consumes
many cycles.
IPv6 header size, for instance, is 40 bytes.
Another thing to consider is that NET_IP_ALIGN is 0 on x86, meaning
that network headers are not word-aligned, unless the driver forces
this.
This means that csum_partial() fetches one u16 to 'align the buffer',
then performs three u64 additions with carry in a loop, then a
remaining u32, then a remaining u16.
With this new version, it loops only over the 64-byte blocks, then
bisects the remainder.
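The "loop over 64-byte blocks, then bisect" structure can be sketched in
portable C as below. This is only an illustration, not the code from the
patch: the names csum_sketch(), add64_carry(), load64() and fold16() are
hypothetical, the 32/16/8-byte split of the remainder is an assumed reading
of "bisected", and the sketch ignores the initial seed and the odd-address
handling that the real csum_partial() has to deal with.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Ones' complement add: fold the carry-out back into bit 0. */
static uint64_t add64_carry(uint64_t a, uint64_t b)
{
    a += b;
    return a + (a < b);
}

/* Unaligned 64-bit load; memcpy keeps it portable. */
static uint64_t load64(const unsigned char *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof(v));
    return v;
}

/* Reduce the 64-bit accumulator to a 16-bit ones' complement sum. */
static uint16_t fold16(uint64_t s)
{
    s = (s & 0xffffffffu) + (s >> 32);
    s = (s & 0xffffffffu) + (s >> 32);
    s = (s & 0xffff) + (s >> 16);
    s = (s & 0xffff) + (s >> 16);
    return (uint16_t)s;
}

/* Hypothetical sketch: sum of native-endian 16-bit words, not inverted. */
uint16_t csum_sketch(const void *buf, size_t len)
{
    const unsigned char *p = buf;
    uint64_t sum = 0;
    int i;

    /* The only loop: one iteration per full 64-byte block. */
    while (len >= 64) {
        for (i = 0; i < 64; i += 8)
            sum = add64_carry(sum, load64(p + i));
        p += 64;
        len -= 64;
    }

    /* Remainder (< 64 bytes): test one bit of len per step, no loop. */
    if (len & 32) {
        for (i = 0; i < 32; i += 8)
            sum = add64_carry(sum, load64(p + i));
        p += 32;
    }
    if (len & 16) {
        sum = add64_carry(sum, load64(p));
        sum = add64_carry(sum, load64(p + 8));
        p += 16;
    }
    if (len & 8) {
        sum = add64_carry(sum, load64(p));
        p += 8;
    }
    if (len & 7) {
        /* Last 1..7 bytes, zero-padded to a full 64-bit word. */
        uint64_t tail = 0;
        memcpy(&tail, p, len & 7);
        sum = add64_carry(sum, tail);
    }
    return fold16(sum);
}
```

For a 40-byte IPv6 header this takes the 32-byte and 8-byte branches and
never enters the loop, which is the small-buffer case the changelog is
concerned with.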
On all of the CPUs tested, csum_partial() cost drops substantially
(by 50 to 80%).
Before:
4.16% [kernel] [k] csum_partial
After:
0.83% [kernel] [k] csum_partial
If run in a loop 1,000,000 times:
Before:
26,922,913 cycles # 3846130.429 GHz
80,302,961 instructions # 2.98 insn per cycle
21,059,816 branches # 3008545142.857 M/sec
2,896 branch-misses # 0.01% of all branches
After:
17,960,709 cycles # 3592141.800 GHz
41,292,805 instructions # 2.30 insn per cycle
11,058,119 branches # 2211623800.000 M/sec
2,997 branch-misses # 0.03% of all branches
[ bp: Massage, merge in subsequent fixes into a single patch:
- um compilation error due to missing load_unaligned_zeropad():
- Reported-by: kernel test robot <lkp@intel.com>
- Link: https://lkml.kernel.org/r/20211118175239.1525650-1-eric.dumazet@gmail.com
- Fix initial seed for odd buffers
- Reported-by: Noah Goldstein <goldstein.w.n@gmail.com>
- Link: https://lkml.kernel.org/r/20211125141817.3541501-1-eric.dumazet@gmail.com
]
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://lore.kernel.org/r/20211112161950.528886-1-eric.dumazet@gmail.com