summaryrefslogtreecommitdiff
path: root/net/ipv6
AgeCommit message (Collapse)Author
2024-10-24Merge tag 'ipsec-2024-10-22' of ↵Paolo Abeni
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2024-10-22 1) Fix routing behavior that relies on L4 information for xfrm encapsulated packets. From Eyal Birger. 2) Remove leftovers of pernet policy_inexact lists. From Florian Westphal. 3) Validate new SA's prefixlen when the selector family is not set from userspace. From Sabrina Dubroca. 4) Fix a kernel-infoleak when dumping an auth algorithm. From Petr Vaganov. Please pull or let me know if there are problems. ipsec-2024-10-22 * tag 'ipsec-2024-10-22' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec: xfrm: fix one more kernel-infoleak in algo dumping xfrm: validate new SA's prefixlen using SA family when sel.family is unset xfrm: policy: remove last remnants of pernet inexact list xfrm: respect ip protocols rules criteria when performing dst lookups xfrm: extract dst lookup parameters into a struct ==================== Link: https://patch.msgid.link/20241022092226.654370-1-steffen.klassert@secunet.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-15udp: Compute L4 checksum as usual when not segmenting the skbJakub Sitnicki
If: 1) the user requested USO, but 2) there is not enough payload for GSO to kick in, and 3) the egress device doesn't offer checksum offload, then we want to compute the L4 checksum in software early on. In the case when we are not taking the GSO path, but it has been requested, the software checksum fallback in skb_segment doesn't get a chance to compute the full checksum, if the egress device can't do it. As a result we end up sending UDP datagrams with only a partial checksum filled in, which the peer will discard. Fixes: 10154dbded6d ("udp: Allow GSO transmit from devices with no checksum offload") Reported-by: Ivan Babrou <ivan@cloudflare.com> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Acked-by: Willem de Bruijn <willemdebruijn.kernel@gmail.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20241011-uso-swcsum-fixup-v2-1-6e1ddc199af9@cloudflare.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-09netfilter: fib: check correct rtable in vrf setupsFlorian Westphal
We need to init l3mdev unconditionally, else main routing table is searched and incorrect result is returned unless strict (iif keyword) matching is requested. Next patch adds a selftest for this. Fixes: 2a8a7c0eaa87 ("netfilter: nft_fib: Fix for rpath check with VRF devices") Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1761 Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-10-03Merge tag 'net-6.12-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from ieee802154, bluetooth and netfilter. Current release - regressions: - eth: mlx5: fix wrong reserved field in hca_cap_2 in mlx5_ifc - eth: am65-cpsw: fix forever loop in cleanup code Current release - new code bugs: - eth: mlx5: HWS, fixed double-free in error flow of creating SQ Previous releases - regressions: - core: avoid potential underflow in qdisc_pkt_len_init() with UFO - core: test for not too small csum_start in virtio_net_hdr_to_skb() - vrf: revert "vrf: remove unnecessary RCU-bh critical section" - bluetooth: - fix uaf in l2cap_connect - fix possible crash on mgmt_index_removed - dsa: improve shutdown sequence - eth: mlx5e: SHAMPO, fix overflow of hd_per_wq - eth: ip_gre: fix drops of small packets in ipgre_xmit Previous releases - always broken: - core: fix gso_features_check to check for both dev->gso_{ipv4_,}max_size - core: fix tcp fraglist segmentation after pull from frag_list - netfilter: nf_tables: prevent nf_skb_duplicated corruption - sctp: set sk_state back to CLOSED if autobind fails in sctp_listen_start - mac802154: fix potential RCU dereference issue in mac802154_scan_worker - eth: fec: restart PPS after link state change" * tag 'net-6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (48 commits) sctp: set sk_state back to CLOSED if autobind fails in sctp_listen_start dt-bindings: net: xlnx,axi-ethernet: Add missing reg minItems doc: net: napi: Update documentation for napi_schedule_irqoff net/ncsi: Disable the ncsi work before freeing the associated structure net: phy: qt2025: Fix warning: unused import DeviceId gso: fix udp gso fraglist segmentation after pull from frag_list bridge: mcast: Fail MDB get request on empty entry vrf: revert "vrf: Remove unnecessary RCU-bh critical section" net: ethernet: ti: am65-cpsw: Fix forever loop in cleanup code net: phy: realtek: Check the index value in led_hw_control_get ppp: do not assume bh is held in ppp_channel_bridge_input() selftests: rds: move include.sh to TEST_FILES net: test for not too small csum_start in virtio_net_hdr_to_skb() net: gso: fix tcp fraglist segmentation after pull from frag_list ipv4: ip_gre: Fix drops of small packets in ipgre_xmit net: stmmac: dwmac4: extend timeout for VLAN Tag register busy bit check net: add more sanity checks to qdisc_pkt_len_init() net: avoid potential underflow in qdisc_pkt_len_init() with UFO net: ethernet: ti: cpsw_ale: Fix warning on some platforms net: microchip: Make FDMA config symbol invisible ...
2024-10-03Merge tag 'nf-24-10-02' of ↵Paolo Abeni
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: 1) Fix incorrect documentation in uapi/linux/netfilter/nf_tables.h regarding flowtable hooks, from Phil Sutter. 2) Fix nft_audit.sh selftests with newer nft binaries, due to different (valid) audit output, also from Phil. 3) Disable BH when duplicating packets via nf_dup infrastructure, otherwise race on nf_skb_duplicated for locally generated traffic. From Eric. 4) Missing return in callback of selftest C program, from zhang jiao. netfilter pull request 24-10-02 * tag 'nf-24-10-02' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: selftests: netfilter: Add missing return value netfilter: nf_tables: prevent nf_skb_duplicated corruption selftests: netfilter: Fix nft_audit.sh for newer nft binaries netfilter: uapi: NFTA_FLOWTABLE_HOOK is NLA_NESTED ==================== Link: https://patch.msgid.link/20241002202421.1281311-1-pablo@netfilter.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-02net: gso: fix tcp fraglist segmentation after pull from frag_listFelix Fietkau
Detect tcp gso fraglist skbs with corrupted geometry (see below) and pass these to skb_segment instead of skb_segment_list, as the first can segment them correctly. Valid SKB_GSO_FRAGLIST skbs - consist of two or more segments - the head_skb holds the protocol headers plus first gso_size - one or more frag_list skbs hold exactly one segment - all but the last must be gso_size Optional datapath hooks such as NAT and BPF (bpf_skb_pull_data) can modify these skbs, breaking these invariants. In extreme cases they pull all data into skb linear. For TCP, this causes a NULL ptr deref in __tcpv4_gso_segment_list_csum at tcp_hdr(seg->next). Detect invalid geometry due to pull, by checking head_skb size. Don't just drop, as this may blackhole a destination. Convert to be able to pass to regular skb_segment. Approach and description based on a patch by Willem de Bruijn. Link: https://lore.kernel.org/netdev/20240428142913.18666-1-shiming.cheng@mediatek.com/ Link: https://lore.kernel.org/netdev/20240922150450.3873767-1-willemdebruijn.kernel@gmail.com/ Fixes: bee88cd5bd83 ("net: add support for segmenting TCP fraglist GSO packets") Cc: stable@vger.kernel.org Signed-off-by: Felix Fietkau <nbd@nbd.name> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20240926085315.51524-1-nbd@nbd.name Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-02move asm/unaligned.h to linux/unaligned.hAl Viro
asm/unaligned.h is always an include of asm-generic/unaligned.h; might as well move that thing to linux/unaligned.h and include that - there's nothing arch-specific in that header. auto-generated by the following: for i in `git grep -l -w asm/unaligned.h`; do sed -i -e "s/asm\/unaligned.h/linux\/unaligned.h/" $i done for i in `git grep -l -w asm-generic/unaligned.h`; do sed -i -e "s/asm-generic\/unaligned.h/linux\/unaligned.h/" $i done git mv include/asm-generic/unaligned.h include/linux/unaligned.h git mv tools/include/asm-generic/unaligned.h tools/include/linux/unaligned.h sed -i -e "/unaligned.h/d" include/asm-generic/Kbuild sed -i -e "s/__ASM_GENERIC/__LINUX/" include/linux/unaligned.h tools/include/linux/unaligned.h
2024-09-27netfilter: nf_tables: prevent nf_skb_duplicated corruptionEric Dumazet
syzbot found that nf_dup_ipv4() or nf_dup_ipv6() could write per-cpu variable nf_skb_duplicated in an unsafe way [1]. Disabling preemption as hinted by the splat is not enough, we have to disable soft interrupts as well. [1] BUG: using __this_cpu_write() in preemptible [00000000] code: syz.4.282/6316 caller is nf_dup_ipv4+0x651/0x8f0 net/ipv4/netfilter/nf_dup_ipv4.c:87 CPU: 0 UID: 0 PID: 6316 Comm: syz.4.282 Not tainted 6.11.0-rc7-syzkaller-00104-g7052622fccb1 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024 Call Trace: <TASK> __dump_stack lib/dump_stack.c:93 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:119 check_preemption_disabled+0x10e/0x120 lib/smp_processor_id.c:49 nf_dup_ipv4+0x651/0x8f0 net/ipv4/netfilter/nf_dup_ipv4.c:87 nft_dup_ipv4_eval+0x1db/0x300 net/ipv4/netfilter/nft_dup_ipv4.c:30 expr_call_ops_eval net/netfilter/nf_tables_core.c:240 [inline] nft_do_chain+0x4ad/0x1da0 net/netfilter/nf_tables_core.c:288 nft_do_chain_ipv4+0x202/0x320 net/netfilter/nft_chain_filter.c:23 nf_hook_entry_hookfn include/linux/netfilter.h:154 [inline] nf_hook_slow+0xc3/0x220 net/netfilter/core.c:626 nf_hook+0x2c4/0x450 include/linux/netfilter.h:269 NF_HOOK_COND include/linux/netfilter.h:302 [inline] ip_output+0x185/0x230 net/ipv4/ip_output.c:433 ip_local_out net/ipv4/ip_output.c:129 [inline] ip_send_skb+0x74/0x100 net/ipv4/ip_output.c:1495 udp_send_skb+0xacf/0x1650 net/ipv4/udp.c:981 udp_sendmsg+0x1c21/0x2a60 net/ipv4/udp.c:1269 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg+0x1a6/0x270 net/socket.c:745 ____sys_sendmsg+0x525/0x7d0 net/socket.c:2597 ___sys_sendmsg net/socket.c:2651 [inline] __sys_sendmmsg+0x3b2/0x740 net/socket.c:2737 __do_sys_sendmmsg net/socket.c:2766 [inline] __se_sys_sendmmsg net/socket.c:2763 [inline] __x64_sys_sendmmsg+0xa0/0xb0 net/socket.c:2763 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f4ce4f7def9 Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f4ce5d4a038 EFLAGS: 00000246 ORIG_RAX: 0000000000000133 RAX: ffffffffffffffda RBX: 00007f4ce5135f80 RCX: 00007f4ce4f7def9 RDX: 0000000000000001 RSI: 0000000020005d40 RDI: 0000000000000006 RBP: 00007f4ce4ff0b76 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000000 R14: 00007f4ce5135f80 R15: 00007ffd4cbc6d68 </TASK> Fixes: d877f07112f1 ("netfilter: nf_tables: add nft_dup expression") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-09-26Merge tag 'nf-24-09-26' of ↵Paolo Abeni
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net v2: with kdoc fixes per Paolo Abeni. The following patchset contains Netfilter fixes for net: Patch #1 and #2 handle an esoteric scenario: Given two tasks sending UDP packets to one another, two packets of the same flow in each direction handled by different CPUs that result in two conntrack objects in NEW state, where reply packet loses race. Then, patch #3 adds a testcase for this scenario. Series from Florian Westphal. 1) NAT engine can falsely detect a port collision if it happens to pick up a reply packet as NEW rather than ESTABLISHED. Add extra code to detect this and suppress port reallocation in this case. 2) To complete the clash resolution in the reply direction, extend conntrack logic to detect clashing conntrack in the reply direction to existing entry. 3) Adds a test case. Then, an assorted list of fixes follow: 4) Add a selftest for tproxy, from Antonio Ojea. 5) Guard ctnetlink_*_size() functions under #if defined(CONFIG_NETFILTER_NETLINK_GLUE_CT) || defined(CONFIG_NF_CONNTRACK_EVENTS) From Andy Shevchenko. 6) Use -m socket --transparent in iptables tproxy documentation. From XIE Zhibang. 7) Call kfree_rcu() when releasing flowtable hooks to address race with netlink dump path, from Phil Sutter. 8) Fix compilation warning in nf_reject with CONFIG_BRIDGE_NETFILTER=n. From Simon Horman. 9) Guard ctnetlink_label_size() under CONFIG_NF_CONNTRACK_EVENTS which is its only user, to address a compilation warning. From Simon Horman. 10) Use rcu-protected list iteration over basechain hooks from netlink dump path. 11) Fix memcg for nf_tables, use GFP_KERNEL_ACCOUNT is not complete. 12) Remove old nfqueue conntrack clash resolution. Instead trying to use same destination address consistently which requires double DNAT, use the existing clash resolution which allows clashing packets go through with different destination. Antonio Ojea originally reported an issue from the postrouting chain, I proposed a fix: https://lore.kernel.org/netfilter-devel/ZuwSwAqKgCB2a51-@calendula/T/ which he reported it did not work for him. 13) Adds a selftest for patch 12. 14) Fixes ipvs.sh selftest. netfilter pull request 24-09-26 * tag 'nf-24-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: selftests: netfilter: Avoid hanging ipvs.sh kselftest: add test for nfqueue induced conntrack race netfilter: nfnetlink_queue: remove old clash resolution logic netfilter: nf_tables: missing objects with no memcg accounting netfilter: nf_tables: use rcu chain hook list iterator from netlink dump path netfilter: ctnetlink: compile ctnetlink_label_size with CONFIG_NF_CONNTRACK_EVENTS netfilter: nf_reject: Fix build warning when CONFIG_BRIDGE_NETFILTER=n netfilter: nf_tables: Keep deleted flowtable hooks until after RCU docs: tproxy: ignore non-transparent sockets in iptables netfilter: ctnetlink: Guard possible unused functions selftests: netfilter: nft_tproxy.sh: add tcp tests selftests: netfilter: add reverse-clash resolution test case netfilter: conntrack: add clash resolution for reverse collisions netfilter: nf_nat: don't try nat source port reallocation for reverse dir clash ==================== Link: https://patch.msgid.link/20240926110717.102194-1-pablo@netfilter.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-09-26netfilter: nf_reject: Fix build warning when CONFIG_BRIDGE_NETFILTER=nSimon Horman
If CONFIG_BRIDGE_NETFILTER is not enabled, which is the case for x86_64 defconfig, then building nf_reject_ipv4.c and nf_reject_ipv6.c with W=1 using gcc-14 results in the following warnings, which are treated as errors: net/ipv4/netfilter/nf_reject_ipv4.c: In function 'nf_send_reset': net/ipv4/netfilter/nf_reject_ipv4.c:243:23: error: variable 'niph' set but not used [-Werror=unused-but-set-variable] 243 | struct iphdr *niph; | ^~~~ cc1: all warnings being treated as errors net/ipv6/netfilter/nf_reject_ipv6.c: In function 'nf_send_reset6': net/ipv6/netfilter/nf_reject_ipv6.c:286:25: error: variable 'ip6h' set but not used [-Werror=unused-but-set-variable] 286 | struct ipv6hdr *ip6h; | ^~~~ cc1: all warnings being treated as errors Address this by reducing the scope of these local variables to where they are used, which is code only compiled when CONFIG_BRIDGE_NETFILTER enabled. Compile tested and run through netfilter selftests. Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Closes: https://lore.kernel.org/netfilter-devel/20240906145513.567781-1-andriy.shevchenko@linux.intel.com/ Signed-off-by: Simon Horman <horms@kernel.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-09-23xfrm: respect ip protocols rules criteria when performing dst lookupsEyal Birger
The series in the "fixes" tag added the ability to consider L4 attributes in routing rules. The dst lookup on the outer packet of encapsulated traffic in the xfrm code was not adapted to this change, thus routing behavior that relies on L4 information is not respected. Pass the ip protocol information when performing dst lookups. Fixes: a25724b05af0 ("Merge branch 'fib_rules-support-sport-dport-and-proto-match'") Signed-off-by: Eyal Birger <eyal.birger@gmail.com> Tested-by: Antony Antony <antony.antony@secunet.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2024-09-23xfrm: extract dst lookup parameters into a structEyal Birger
Preparation for adding more fields to dst lookup functions without changing their signatures. Signed-off-by: Eyal Birger <eyal.birger@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2024-09-22net: ipv6: select DST_CACHE from IPV6_RPL_LWTUNNELThomas Weißschuh
The rpl sr tunnel code contains calls to dst_cache_*() which are only present when the dst cache is built. Select DST_CACHE to build the dst cache, similar to other kconfig options in the same file. Compiling the rpl sr tunnel without DST_CACHE will lead to linker errors. Fixes: a7a29f9c361f ("net: ipv6: add rpl sr tunnel") Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Simon Horman <horms@kernel.org> # build-tested Signed-off-by: David S. Miller <davem@davemloft.net>
2024-09-19netfilter: nf_reject_ipv6: fix nf_reject_ip6_tcphdr_put()Eric Dumazet
syzbot reported that nf_reject_ip6_tcphdr_put() was possibly sending garbage on the four reserved tcp bits (th->res1) Use skb_put_zero() to clear the whole TCP header, as done in nf_reject_ip_tcphdr_put() BUG: KMSAN: uninit-value in nf_reject_ip6_tcphdr_put+0x688/0x6c0 net/ipv6/netfilter/nf_reject_ipv6.c:255 nf_reject_ip6_tcphdr_put+0x688/0x6c0 net/ipv6/netfilter/nf_reject_ipv6.c:255 nf_send_reset6+0xd84/0x15b0 net/ipv6/netfilter/nf_reject_ipv6.c:344 nft_reject_inet_eval+0x3c1/0x880 net/netfilter/nft_reject_inet.c:48 expr_call_ops_eval net/netfilter/nf_tables_core.c:240 [inline] nft_do_chain+0x438/0x22a0 net/netfilter/nf_tables_core.c:288 nft_do_chain_inet+0x41a/0x4f0 net/netfilter/nft_chain_filter.c:161 nf_hook_entry_hookfn include/linux/netfilter.h:154 [inline] nf_hook_slow+0xf4/0x400 net/netfilter/core.c:626 nf_hook include/linux/netfilter.h:269 [inline] NF_HOOK include/linux/netfilter.h:312 [inline] ipv6_rcv+0x29b/0x390 net/ipv6/ip6_input.c:310 __netif_receive_skb_one_core net/core/dev.c:5661 [inline] __netif_receive_skb+0x1da/0xa00 net/core/dev.c:5775 process_backlog+0x4ad/0xa50 net/core/dev.c:6108 __napi_poll+0xe7/0x980 net/core/dev.c:6772 napi_poll net/core/dev.c:6841 [inline] net_rx_action+0xa5a/0x19b0 net/core/dev.c:6963 handle_softirqs+0x1ce/0x800 kernel/softirq.c:554 __do_softirq+0x14/0x1a kernel/softirq.c:588 do_softirq+0x9a/0x100 kernel/softirq.c:455 __local_bh_enable_ip+0x9f/0xb0 kernel/softirq.c:382 local_bh_enable include/linux/bottom_half.h:33 [inline] rcu_read_unlock_bh include/linux/rcupdate.h:908 [inline] __dev_queue_xmit+0x2692/0x5610 net/core/dev.c:4450 dev_queue_xmit include/linux/netdevice.h:3105 [inline] neigh_resolve_output+0x9ca/0xae0 net/core/neighbour.c:1565 neigh_output include/net/neighbour.h:542 [inline] ip6_finish_output2+0x2347/0x2ba0 net/ipv6/ip6_output.c:141 __ip6_finish_output net/ipv6/ip6_output.c:215 [inline] ip6_finish_output+0xbb8/0x14b0 net/ipv6/ip6_output.c:226 NF_HOOK_COND include/linux/netfilter.h:303 [inline] ip6_output+0x356/0x620 net/ipv6/ip6_output.c:247 dst_output include/net/dst.h:450 [inline] NF_HOOK include/linux/netfilter.h:314 [inline] ip6_xmit+0x1ba6/0x25d0 net/ipv6/ip6_output.c:366 inet6_csk_xmit+0x442/0x530 net/ipv6/inet6_connection_sock.c:135 __tcp_transmit_skb+0x3b07/0x4880 net/ipv4/tcp_output.c:1466 tcp_transmit_skb net/ipv4/tcp_output.c:1484 [inline] tcp_connect+0x35b6/0x7130 net/ipv4/tcp_output.c:4143 tcp_v6_connect+0x1bcc/0x1e40 net/ipv6/tcp_ipv6.c:333 __inet_stream_connect+0x2ef/0x1730 net/ipv4/af_inet.c:679 inet_stream_connect+0x6a/0xd0 net/ipv4/af_inet.c:750 __sys_connect_file net/socket.c:2061 [inline] __sys_connect+0x606/0x690 net/socket.c:2078 __do_sys_connect net/socket.c:2088 [inline] __se_sys_connect net/socket.c:2085 [inline] __x64_sys_connect+0x91/0xe0 net/socket.c:2085 x64_sys_call+0x27a5/0x3ba0 arch/x86/include/generated/asm/syscalls_64.h:43 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xcd/0x1e0 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f Uninit was stored to memory at: nf_reject_ip6_tcphdr_put+0x60c/0x6c0 net/ipv6/netfilter/nf_reject_ipv6.c:249 nf_send_reset6+0xd84/0x15b0 net/ipv6/netfilter/nf_reject_ipv6.c:344 nft_reject_inet_eval+0x3c1/0x880 net/netfilter/nft_reject_inet.c:48 expr_call_ops_eval net/netfilter/nf_tables_core.c:240 [inline] nft_do_chain+0x438/0x22a0 net/netfilter/nf_tables_core.c:288 nft_do_chain_inet+0x41a/0x4f0 net/netfilter/nft_chain_filter.c:161 nf_hook_entry_hookfn include/linux/netfilter.h:154 [inline] nf_hook_slow+0xf4/0x400 net/netfilter/core.c:626 nf_hook include/linux/netfilter.h:269 [inline] NF_HOOK include/linux/netfilter.h:312 [inline] ipv6_rcv+0x29b/0x390 net/ipv6/ip6_input.c:310 __netif_receive_skb_one_core net/core/dev.c:5661 [inline] __netif_receive_skb+0x1da/0xa00 net/core/dev.c:5775 process_backlog+0x4ad/0xa50 net/core/dev.c:6108 __napi_poll+0xe7/0x980 net/core/dev.c:6772 napi_poll net/core/dev.c:6841 [inline] net_rx_action+0xa5a/0x19b0 net/core/dev.c:6963 handle_softirqs+0x1ce/0x800 kernel/softirq.c:554 __do_softirq+0x14/0x1a kernel/softirq.c:588 Uninit was stored to memory at: nf_reject_ip6_tcphdr_put+0x2ca/0x6c0 net/ipv6/netfilter/nf_reject_ipv6.c:231 nf_send_reset6+0xd84/0x15b0 net/ipv6/netfilter/nf_reject_ipv6.c:344 nft_reject_inet_eval+0x3c1/0x880 net/netfilter/nft_reject_inet.c:48 expr_call_ops_eval net/netfilter/nf_tables_core.c:240 [inline] nft_do_chain+0x438/0x22a0 net/netfilter/nf_tables_core.c:288 nft_do_chain_inet+0x41a/0x4f0 net/netfilter/nft_chain_filter.c:161 nf_hook_entry_hookfn include/linux/netfilter.h:154 [inline] nf_hook_slow+0xf4/0x400 net/netfilter/core.c:626 nf_hook include/linux/netfilter.h:269 [inline] NF_HOOK include/linux/netfilter.h:312 [inline] ipv6_rcv+0x29b/0x390 net/ipv6/ip6_input.c:310 __netif_receive_skb_one_core net/core/dev.c:5661 [inline] __netif_receive_skb+0x1da/0xa00 net/core/dev.c:5775 process_backlog+0x4ad/0xa50 net/core/dev.c:6108 __napi_poll+0xe7/0x980 net/core/dev.c:6772 napi_poll net/core/dev.c:6841 [inline] net_rx_action+0xa5a/0x19b0 net/core/dev.c:6963 handle_softirqs+0x1ce/0x800 kernel/softirq.c:554 __do_softirq+0x14/0x1a kernel/softirq.c:588 Uninit was created at: slab_post_alloc_hook mm/slub.c:3998 [inline] slab_alloc_node mm/slub.c:4041 [inline] kmem_cache_alloc_node_noprof+0x6bf/0xb80 mm/slub.c:4084 kmalloc_reserve+0x13d/0x4a0 net/core/skbuff.c:583 __alloc_skb+0x363/0x7b0 net/core/skbuff.c:674 alloc_skb include/linux/skbuff.h:1320 [inline] nf_send_reset6+0x98d/0x15b0 net/ipv6/netfilter/nf_reject_ipv6.c:327 nft_reject_inet_eval+0x3c1/0x880 net/netfilter/nft_reject_inet.c:48 expr_call_ops_eval net/netfilter/nf_tables_core.c:240 [inline] nft_do_chain+0x438/0x22a0 net/netfilter/nf_tables_core.c:288 nft_do_chain_inet+0x41a/0x4f0 net/netfilter/nft_chain_filter.c:161 nf_hook_entry_hookfn include/linux/netfilter.h:154 [inline] nf_hook_slow+0xf4/0x400 net/netfilter/core.c:626 nf_hook include/linux/netfilter.h:269 [inline] NF_HOOK include/linux/netfilter.h:312 [inline] ipv6_rcv+0x29b/0x390 net/ipv6/ip6_input.c:310 __netif_receive_skb_one_core net/core/dev.c:5661 [inline] __netif_receive_skb+0x1da/0xa00 net/core/dev.c:5775 process_backlog+0x4ad/0xa50 net/core/dev.c:6108 __napi_poll+0xe7/0x980 net/core/dev.c:6772 napi_poll net/core/dev.c:6841 [inline] net_rx_action+0xa5a/0x19b0 net/core/dev.c:6963 handle_softirqs+0x1ce/0x800 kernel/softirq.c:554 __do_softirq+0x14/0x1a kernel/softirq.c:588 Fixes: c8d7b98bec43 ("netfilter: move nf_send_resetX() code to nf_reject_ipvX modules") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org> Link: https://patch.msgid.link/20240913170615.3670897-1-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-09-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Merge in late fixes to prepare for the 6.12 net-next PR. No conflicts or adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-13ipv6: fib_rules: Add DSCP selector supportIdo Schimmel
Implement support for the new DSCP selector that allows IPv6 FIB rules to match on the entire DSCP field. This is done despite the fact that the above can be achieved using the existing TOS selector, so that user space program will be able to work with IPv4 and IPv6 rules in the same way. Differentiate between both selectors by adding a new bit in the IPv6 FIB rule structure that is only set when the 'FRA_DSCP' attribute is specified by user space. Reject rules that use both selectors. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20240911093748.3662015-4-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-13ipv6: avoid possible NULL deref in rt6_uncached_list_flush_dev()Eric Dumazet
Blamed commit accidentally removed a check for rt->rt6i_idev being NULL, as spotted by syzbot: Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN PTI KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] CPU: 1 UID: 0 PID: 10998 Comm: syz-executor Not tainted 6.11.0-rc6-syzkaller-00208-g625403177711 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024 RIP: 0010:rt6_uncached_list_flush_dev net/ipv6/route.c:177 [inline] RIP: 0010:rt6_disable_ip+0x33e/0x7e0 net/ipv6/route.c:4914 Code: 41 80 3c 04 00 74 0a e8 90 d0 9b f7 48 8b 7c 24 08 48 8b 07 48 89 44 24 10 4c 89 f0 48 c1 e8 03 48 b9 00 00 00 00 00 fc ff df <80> 3c 08 00 74 08 4c 89 f7 e8 64 d0 9b f7 48 8b 44 24 18 49 39 06 RSP: 0018:ffffc900047374e0 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 1ffff1100fdf8f33 RCX: dffffc0000000000 RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffff88807efc78c0 RBP: ffffc900047375d0 R08: 0000000000000003 R09: fffff520008e6e8c R10: dffffc0000000000 R11: fffff520008e6e8c R12: 1ffff1100fdf8f18 R13: ffff88807efc7998 R14: 0000000000000000 R15: ffff88807efc7930 FS: 0000000000000000(0000) GS:ffff8880b8900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020002a80 CR3: 0000000022f62000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> addrconf_ifdown+0x15d/0x1bd0 net/ipv6/addrconf.c:3856 addrconf_notify+0x3cb/0x1020 notifier_call_chain+0x19f/0x3e0 kernel/notifier.c:93 call_netdevice_notifiers_extack net/core/dev.c:2032 [inline] call_netdevice_notifiers net/core/dev.c:2046 [inline] unregister_netdevice_many_notify+0xd81/0x1c40 net/core/dev.c:11352 unregister_netdevice_many net/core/dev.c:11414 [inline] unregister_netdevice_queue+0x303/0x370 net/core/dev.c:11289 unregister_netdevice include/linux/netdevice.h:3129 [inline] __tun_detach+0x6b9/0x1600 drivers/net/tun.c:685 tun_detach drivers/net/tun.c:701 [inline] tun_chr_close+0x108/0x1b0 drivers/net/tun.c:3510 __fput+0x24a/0x8a0 fs/file_table.c:422 task_work_run+0x24f/0x310 kernel/task_work.c:228 exit_task_work include/linux/task_work.h:40 [inline] do_exit+0xa2f/0x27f0 kernel/exit.c:882 do_group_exit+0x207/0x2c0 kernel/exit.c:1031 __do_sys_exit_group kernel/exit.c:1042 [inline] __se_sys_exit_group kernel/exit.c:1040 [inline] __x64_sys_exit_group+0x3f/0x40 kernel/exit.c:1040 x64_sys_call+0x2634/0x2640 arch/x86/include/generated/asm/syscalls_64.h:232 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f1acc77def9 Code: Unable to access opcode bytes at 0x7f1acc77decf. RSP: 002b:00007ffeb26fa738 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f1acc77def9 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000043 RBP: 00007f1acc7dd508 R08: 00007ffeb26f84d7 R09: 0000000000000003 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001 R13: 0000000000000003 R14: 00000000ffffffff R15: 00007ffeb26fa8e0 </TASK> Modules linked in: ---[ end trace 0000000000000000 ]--- RIP: 0010:rt6_uncached_list_flush_dev net/ipv6/route.c:177 [inline] RIP: 0010:rt6_disable_ip+0x33e/0x7e0 net/ipv6/route.c:4914 Code: 41 80 3c 04 00 74 0a e8 90 d0 9b f7 48 8b 7c 24 08 48 8b 07 48 89 44 24 10 4c 89 f0 48 c1 e8 03 48 b9 00 00 00 00 00 fc ff df <80> 3c 08 00 74 08 4c 89 f7 e8 64 d0 9b f7 48 8b 44 24 18 49 39 06 RSP: 0018:ffffc900047374e0 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 1ffff1100fdf8f33 RCX: dffffc0000000000 RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffff88807efc78c0 RBP: ffffc900047375d0 R08: 0000000000000003 R09: fffff520008e6e8c R10: dffffc0000000000 R11: fffff520008e6e8c R12: 1ffff1100fdf8f18 R13: ffff88807efc7998 R14: 0000000000000000 R15: ffff88807efc7930 FS: 0000000000000000(0000) GS:ffff8880b8900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020002a80 CR3: 0000000022f62000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Fixes: e332bc67cf5e ("ipv6: Don't call with rt6_uncached_list_flush_dev") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: David Ahern <dsahern@kernel.org> Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20240913083147.3095442-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-13net: ipv6: rpl_iptunnel: Fix memory leak in rpl_inputJustin Iurman
Free the skb before returning from rpl_input when skb_cow_head() fails. Use a "drop" label and goto instructions. Fixes: a7a29f9c361f ("net: ipv6: add rpl sr tunnel") Signed-off-by: Justin Iurman <justin.iurman@uliege.be> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240911174557.11536-1-justin.iurman@uliege.be Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-11net: support non paged skb fragsMina Almasry
Make skb_frag_page() fail in the case where the frag is not backed by a page, and fix its relevant callers to handle this case. Signed-off-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20240910171458.219195-8-almasrymina@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-06Merge tag 'nf-next-24-09-06' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for net-next: Patch #1 adds ctnetlink support for kernel side filtering for deletions, from Changliang Wu. Patch #2 updates nft_counter support to Use u64_stats_t, from Sebastian Andrzej Siewior. Patch #3 uses kmemdup_array() in all xtables frontends, from Yan Zhen. Patch #4 is a oneliner to use ERR_CAST() in nf_conntrack instead opencoded casting, from Shen Lichuan. Patch #5 removes unused argument in nftables .validate interface, from Florian Westphal. Patch #6 is a oneliner to correct a typo in nftables kdoc, from Simon Horman. Patch #7 fixes missing kdoc in nftables, also from Simon. Patch #8 updates nftables to handle timeout less than CONFIG_HZ. Patch #9 rejects element expiration if timeout is zero, otherwise it is silently ignored. Patch #10 disallows element expiration larger than timeout. Patch #11 removes unnecessary READ_ONCE annotation while mutex is held. Patch #12 adds missing READ_ONCE/WRITE_ONCE annotation in dynset. Patch #13 annotates data-races around element expiration. Patch #14 allocates timeout and expiration in one single set element extension, they are tighly couple, no reason to keep them separated anymore. Patch #15 updates nftables to interpret zero timeout element as never times out. Note that it is already possible to declare sets with elements that never time out but this generalizes to all kind of set with timeouts. Patch #16 supports for element timeout and expiration updates. * tag 'nf-next-24-09-06' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next: netfilter: nf_tables: set element timeout update support netfilter: nf_tables: zero timeout means element never times out netfilter: nf_tables: consolidate timeout extension for elements netfilter: nf_tables: annotate data-races around element expiration netfilter: nft_dynset: annotate data-races around set timeout netfilter: nf_tables: remove annotation to access set timeout while holding lock netfilter: nf_tables: reject expiration higher than timeout netfilter: nf_tables: reject element expiration with no timeout netfilter: nf_tables: elements with timeout below CONFIG_HZ never expire netfilter: nf_tables: Add missing Kernel doc netfilter: nf_tables: Correct spelling in nf_tables.h netfilter: nf_tables: drop unused 3rd argument from validate callback ops netfilter: conntrack: Convert to use ERR_CAST() netfilter: Use kmemdup_array instead of kmemdup for multiple allocation netfilter: nft_counter: Use u64_stats_t for statistic. netfilter: ctnetlink: support CTA_FILTER for flush ==================== Link: https://patch.msgid.link/20240905232920.5481-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-06net/ipv6: make use of the helper macro LIST_HEAD()Hongbo Li
list_head can be initialized automatically with LIST_HEAD() instead of calling INIT_LIST_HEAD(). Here we can simplify the code. Signed-off-by: Hongbo Li <lihongbo22@huawei.com> Link: https://patch.msgid.link/20240904093243.3345012-5-lihongbo22@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR. Conflicts: drivers/net/phy/phy_device.c 2560db6ede1a ("net: phy: Fix missing of_node_put() for leds") 1dce520abd46 ("net: phy: Use for_each_available_child_of_node_scoped()") https://lore.kernel.org/20240904115823.74333648@canb.auug.org.au Adjacent changes: drivers/net/ethernet/xilinx/xilinx_axienet.h drivers/net/ethernet/xilinx/xilinx_axienet_main.c 858430db28a5 ("net: xilinx: axienet: Fix race in axienet_stop") 76abb5d675c4 ("net: xilinx: axienet: Add statistics support") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-05ila: call nf_unregister_net_hooks() soonerEric Dumazet
syzbot found an use-after-free Read in ila_nf_input [1] Issue here is that ila_xlat_exit_net() frees the rhashtable, then call nf_unregister_net_hooks(). It should be done in the reverse way, with a synchronize_rcu(). This is a good match for a pre_exit() method. [1] BUG: KASAN: use-after-free in rht_key_hashfn include/linux/rhashtable.h:159 [inline] BUG: KASAN: use-after-free in __rhashtable_lookup include/linux/rhashtable.h:604 [inline] BUG: KASAN: use-after-free in rhashtable_lookup include/linux/rhashtable.h:646 [inline] BUG: KASAN: use-after-free in rhashtable_lookup_fast+0x77a/0x9b0 include/linux/rhashtable.h:672 Read of size 4 at addr ffff888064620008 by task ksoftirqd/0/16 CPU: 0 UID: 0 PID: 16 Comm: ksoftirqd/0 Not tainted 6.11.0-rc4-syzkaller-00238-g2ad6d23f465a #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024 Call Trace: <TASK> __dump_stack lib/dump_stack.c:93 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:119 print_address_description mm/kasan/report.c:377 [inline] print_report+0x169/0x550 mm/kasan/report.c:488 kasan_report+0x143/0x180 mm/kasan/report.c:601 rht_key_hashfn include/linux/rhashtable.h:159 [inline] __rhashtable_lookup include/linux/rhashtable.h:604 [inline] rhashtable_lookup include/linux/rhashtable.h:646 [inline] rhashtable_lookup_fast+0x77a/0x9b0 include/linux/rhashtable.h:672 ila_lookup_wildcards net/ipv6/ila/ila_xlat.c:132 [inline] ila_xlat_addr net/ipv6/ila/ila_xlat.c:652 [inline] ila_nf_input+0x1fe/0x3c0 net/ipv6/ila/ila_xlat.c:190 nf_hook_entry_hookfn include/linux/netfilter.h:154 [inline] nf_hook_slow+0xc3/0x220 net/netfilter/core.c:626 nf_hook include/linux/netfilter.h:269 [inline] NF_HOOK+0x29e/0x450 include/linux/netfilter.h:312 __netif_receive_skb_one_core net/core/dev.c:5661 [inline] __netif_receive_skb+0x1ea/0x650 net/core/dev.c:5775 process_backlog+0x662/0x15b0 net/core/dev.c:6108 __napi_poll+0xcb/0x490 net/core/dev.c:6772 napi_poll net/core/dev.c:6841 [inline] net_rx_action+0x89b/0x1240 net/core/dev.c:6963 handle_softirqs+0x2c4/0x970 kernel/softirq.c:554 run_ksoftirqd+0xca/0x130 kernel/softirq.c:928 smpboot_thread_fn+0x544/0xa30 kernel/smpboot.c:164 kthread+0x2f0/0x390 kernel/kthread.c:389 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 </TASK> The buggy address belongs to the physical page: page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x64620 flags: 0xfff00000000000(node=0|zone=1|lastcpupid=0x7ff) page_type: 0xbfffffff(buddy) raw: 00fff00000000000 ffffea0000959608 ffffea00019d9408 0000000000000000 raw: 0000000000000000 0000000000000003 00000000bfffffff 0000000000000000 page dumped because: kasan: bad access detected page_owner tracks the page as freed page last allocated via order 3, migratetype Unmovable, gfp_mask 0x52dc0(GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_ZERO), pid 5242, tgid 5242 (syz-executor), ts 73611328570, free_ts 618981657187 set_page_owner include/linux/page_owner.h:32 [inline] post_alloc_hook+0x1f3/0x230 mm/page_alloc.c:1493 prep_new_page mm/page_alloc.c:1501 [inline] get_page_from_freelist+0x2e4c/0x2f10 mm/page_alloc.c:3439 __alloc_pages_noprof+0x256/0x6c0 mm/page_alloc.c:4695 __alloc_pages_node_noprof include/linux/gfp.h:269 [inline] alloc_pages_node_noprof include/linux/gfp.h:296 [inline] ___kmalloc_large_node+0x8b/0x1d0 mm/slub.c:4103 __kmalloc_large_node_noprof+0x1a/0x80 mm/slub.c:4130 __do_kmalloc_node mm/slub.c:4146 [inline] __kmalloc_node_noprof+0x2d2/0x440 mm/slub.c:4164 __kvmalloc_node_noprof+0x72/0x190 mm/util.c:650 bucket_table_alloc lib/rhashtable.c:186 [inline] rhashtable_init_noprof+0x534/0xa60 lib/rhashtable.c:1071 ila_xlat_init_net+0xa0/0x110 net/ipv6/ila/ila_xlat.c:613 ops_init+0x359/0x610 net/core/net_namespace.c:139 setup_net+0x515/0xca0 net/core/net_namespace.c:343 copy_net_ns+0x4e2/0x7b0 net/core/net_namespace.c:508 create_new_namespaces+0x425/0x7b0 kernel/nsproxy.c:110 unshare_nsproxy_namespaces+0x124/0x180 kernel/nsproxy.c:228 ksys_unshare+0x619/0xc10 kernel/fork.c:3328 __do_sys_unshare kernel/fork.c:3399 [inline] __se_sys_unshare kernel/fork.c:3397 [inline] __x64_sys_unshare+0x38/0x40 kernel/fork.c:3397 page last free pid 11846 tgid 11846 stack trace: reset_page_owner include/linux/page_owner.h:25 [inline] free_pages_prepare mm/page_alloc.c:1094 [inline] free_unref_page+0xd22/0xea0 mm/page_alloc.c:2612 __folio_put+0x2c8/0x440 mm/swap.c:128 folio_put include/linux/mm.h:1486 [inline] free_large_kmalloc+0x105/0x1c0 mm/slub.c:4565 kfree+0x1c4/0x360 mm/slub.c:4588 rhashtable_free_and_destroy+0x7c6/0x920 lib/rhashtable.c:1169 ila_xlat_exit_net+0x55/0x110 net/ipv6/ila/ila_xlat.c:626 ops_exit_list net/core/net_namespace.c:173 [inline] cleanup_net+0x802/0xcc0 net/core/net_namespace.c:640 process_one_work kernel/workqueue.c:3231 [inline] process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3312 worker_thread+0x86d/0xd40 kernel/workqueue.c:3390 kthread+0x2f0/0x390 kernel/kthread.c:389 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 Memory state around the buggy address: ffff88806461ff00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff88806461ff80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc >ffff888064620000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ^ ffff888064620080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ffff888064620100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff Fixes: 7f00feaf1076 ("ila: Add generic ILA translation facility") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tom Herbert <tom@herbertland.com> Reviewed-by: Florian Westphal <fw@strlen.de> Link: https://patch.msgid.link/20240904144418.1162839-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-04ipv6: sit: Unmask upper DSCP bits in ipip6_tunnel_bind_dev()Ido Schimmel
Unmask the upper DSCP bits when calling ip_route_output_ports() so that in the future it could perform the FIB lookup according to the full DSCP value. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20240903135327.2810535-5-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-04ip6_tunnel: Unmask upper DSCP bits in ip4ip6_err()Ido Schimmel
Unmask the upper DSCP bits when calling ip_route_output_ports() so that in the future it could perform the FIB lookup according to the full DSCP value. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20240903135327.2810535-4-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-03ioam6: improve checks on user dataJustin Iurman
This patch improves two checks on user data. The first one prevents bit 23 from being set, as specified by RFC 9197 (Sec 4.4.1): Bit 23 Reserved; MUST be set to zero upon transmission and be ignored upon receipt. This bit is reserved to allow for future extensions of the IOAM Trace-Type bit field. The second one checks that the tunnel destination address != IPV6_ADDR_ANY, just like we already do for the tunnel source address. Signed-off-by: Justin Iurman <justin.iurman@uliege.be> Link: https://patch.msgid.link/20240830191919.51439-1-justin.iurman@uliege.be Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-03netdev_features: convert NETIF_F_NETNS_LOCAL to dev->netns_localAlexander Lobakin
"Interface can't change network namespaces" is rather an attribute, not a feature, and it can't be changed via Ethtool. Make it a "cold" private flag instead of a netdev_feature and free one more bit. Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-09-03netdev_features: convert NETIF_F_LLTX to dev->lltxAlexander Lobakin
NETIF_F_LLTX can't be changed via Ethtool and is not a feature, rather an attribute, very similar to IFF_NO_QUEUE (and hot). Free one netdev_features_t bit and make it a "hot" private flag. Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-09-03netfilter: Use kmemdup_array instead of kmemdup for multiple allocationYan Zhen
When we are allocating an array, using kmemdup_array() to take care about multiplication and possible overflows. Also it makes auditing the code easier. Signed-off-by: Yan Zhen <yanzhen@vivo.com> Reviewed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-08-31ipv6: sit: Unmask upper DSCP bits in ipip6_tunnel_xmit()Ido Schimmel
The function calls flowi4_init_output() to initialize an IPv4 flow key with which it then performs a FIB lookup using ip_route_output_flow(). The 'tos' variable with which the TOS value in the IPv4 flow key (flowi4_tos) is initialized contains the full DS field. Unmask the upper DSCP bits so that in the future the FIB lookup could be performed according to the full DSCP value. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-08-30icmp: move icmp_global.credit and icmp_global.stamp to per netns storageEric Dumazet
Host wide ICMP ratelimiter should be per netns, to provide better isolation. Following patch in this series makes the sysctl per netns. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20240829144641.3880376-3-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-30icmp: change the order of rate limitsEric Dumazet
ICMP messages are ratelimited : After the blamed commits, the two rate limiters are applied in this order: 1) host wide ratelimit (icmp_global_allow()) 2) Per destination ratelimit (inetpeer based) In order to avoid side-channels attacks, we need to apply the per destination check first. This patch makes the following change : 1) icmp_global_allow() checks if the host wide limit is reached. But credits are not yet consumed. This is deferred to 3) 2) The per destination limit is checked/updated. This might add a new node in inetpeer tree. 3) icmp_global_consume() consumes tokens if prior operations succeeded. This means that host wide ratelimit is still effective in keeping inetpeer tree small even under DDOS. As a bonus, I removed icmp_global.lock as the fast path can use a lock-free operation. Fixes: c0303efeab73 ("net: reduce cycles spend on ICMP replies that gets rate limited") Fixes: 4cdf507d5452 ("icmp: add a global rate limitation") Reported-by: Keyu Man <keyu.man@email.ucr.edu> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Cc: Jesper Dangaard Brouer <hawk@kernel.org> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20240829144641.3880376-2-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-29tcp: add SO_PEEK_OFF socket option tor TCPv6Jon Maloy
When doing further testing of the recently added SO_PEEK_OFF feature for TCP I realized I had omitted to add it for TCP/IPv6. I do that here. Fixes: 05ea491641d3 ("tcp: add support for SO_PEEK_OFF socket option") Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Stefano Brivio <sbrivio@redhat.com> Tested-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Jon Maloy <jmaloy@redhat.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Link: https://patch.msgid.link/20240828183752.660267-2-jmaloy@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-29net/ipv6: replace deprecated strcpy with strscpyHongbo Li
The deprecated helper strcpy() performs no bounds checking on the destination buffer. This could result in linear overflows beyond the end of the buffer, leading to all kinds of misbehaviors. The safe replacement is strscpy() [1]. Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strcpy [1] Signed-off-by: Hongbo Li <lihongbo22@huawei.com> Link: https://patch.msgid.link/20240828123224.3697672-3-lihongbo22@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-28tcp: annotate data-races around tcptw->tw_rcv_nxtEric Dumazet
No lock protects tcp tw fields. tcptw->tw_rcv_nxt can be changed from twsk_rcv_nxt_update() while other threads might read this field. Add READ_ONCE()/WRITE_ONCE() annotations, and make sure tcp_timewait_state_process() reads tcptw->tw_rcv_nxt only once. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Link: https://patch.msgid.link/20240827015250.3509197-3-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-28tcp: remove volatile qualifier on tw_substateEric Dumazet
Using a volatile qualifier for a specific struct field is unusual. Use instead READ_ONCE()/WRITE_ONCE() where necessary. tcp_timewait_state_process() can change tw_substate while other threads are reading this field. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Link: https://patch.msgid.link/20240827015250.3509197-2-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-26ipv6: avoid indirect calls for SOL_IP socket optionsEric Dumazet
ipv6_setsockopt() can directly call ip_setsockopt() instead of going through udp_prot.setsockopt() ipv6_getsockopt() can directly call ip_getsockopt() instead of going through udp_prot.getsockopt() These indirections predate git history, not sure why they were there. Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20240823140019.3727643-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-26ipv6: mcast: use min() to simplify the codeLi Zetao
When coping sockaddr in ip6_mc_msfget(), the time of copies depends on the minimum value between sl_count and gf_numsrc. Using min() here is very semantic. Signed-off-by: Li Zetao <lizetao1@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240822133908.1042240-7-lizetao1@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-26Merge tag 'nf-next-24-08-23' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following batch contains Netfilter updates for net-next: Patch #1 fix checksum calculation in nfnetlink_queue with SCTP, segment GSO packet since skb_zerocopy() does not support GSO_BY_FRAGS, from Antonio Ojea. Patch #2 extend nfnetlink_queue coverage to handle SCTP packets, from Antonio Ojea. Patch #3 uses consume_skb() instead of kfree_skb() in nfnetlink, from Donald Hunter. Patch #4 adds a dedicate commit list for sets to speed up intra-transaction lookups, from Florian Westphal. Patch #5 skips removal of element from abort path for the pipapo backend, ditching the shadow copy of this datastructure is sufficient. Patch #6 moves nf_ct_netns_get() out of nf_conncount_init() to let users of conncoiunt decide when to enable conntrack, this is needed by openvswitch, from Xin Long. Patch #7 pass context to all nft_parse_register_load() in preparation for the next patch. Patches #8 and #9 reject loads from uninitialized registers from control plane to remove register initialization from datapath. From Florian Westphal. * tag 'nf-next-24-08-23' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next: netfilter: nf_tables: don't initialize registers in nft_do_chain() netfilter: nf_tables: allow loads only when register is initialized netfilter: nf_tables: pass context structure to nft_parse_register_load netfilter: move nf_ct_netns_get out of nf_conncount_init netfilter: nf_tables: do not remove elements if set backend implements .abort netfilter: nf_tables: store new sets in dedicated list netfilter: nfnetlink: convert kfree_skb to consume_skb selftests: netfilter: nft_queue.sh: sctp coverage netfilter: nfnetlink_queue: unbreak SCTP traffic ==================== Link: https://patch.msgid.link/20240822221939.157858-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-23net/ipv6: delete redundant judgment statementsLi Zetao
The initial value of err is -ENOBUFS, and err is guaranteed to be less than 0 before all goto errout. Therefore, on the error path of errout, there is no need to repeatedly judge that err is less than 0, and delete redundant judgments to make the code more concise. Signed-off-by: Li Zetao <lizetao1@huawei.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-08-23ip6mr: delete redundant judgment statementsLi Zetao
The initial value of err is -ENOBUFS, and err is guaranteed to be less than 0 before all goto errout. Therefore, on the error path of errout, there is no need to repeatedly judge that err is less than 0, and delete redundant judgments to make the code more concise. Signed-off-by: Li Zetao <lizetao1@huawei.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-08-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR. No conflicts. Adjacent changes: drivers/net/ethernet/broadcom/bnxt/bnxt.h c948c0973df5 ("bnxt_en: Don't clear ntuple filters and rss contexts during ethtool ops") f2878cdeb754 ("bnxt_en: Add support to call FW to update a VNIC") Link: https://patch.msgid.link/20240822210125.1542769-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-22net: ipv6: ioam6: new feature tunsrcJustin Iurman
This patch provides a new feature (i.e., "tunsrc") for the tunnel (i.e., "encap") mode of ioam6. Just like seg6 already does, except it is attached to a route. The "tunsrc" is optional: when not provided (by default), the automatic resolution is applied. Using "tunsrc" when possible has a benefit: performance. See the comparison: - before (= "encap" mode): https://ibb.co/bNCzvf7 - after (= "encap" mode with "tunsrc"): https://ibb.co/PT8L6yq Signed-off-by: Justin Iurman <justin.iurman@uliege.be> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-22net: ipv6: ioam6: code alignmentJustin Iurman
This patch prepares the next one by correcting the alignment of some lines. Signed-off-by: Justin Iurman <justin.iurman@uliege.be> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-21ipv6: prevent possible UAF in ip6_xmit()Eric Dumazet
If skb_expand_head() returns NULL, skb has been freed and the associated dst/idev could also have been freed. We must use rcu_read_lock() to prevent a possible UAF. Fixes: 0c9f227bee11 ("ipv6: use skb_expand_head in ip6_xmit") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Vasily Averin <vasily.averin@linux.dev> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20240820160859.3786976-4-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-21ipv6: fix possible UAF in ip6_finish_output2()Eric Dumazet
If skb_expand_head() returns NULL, skb has been freed and associated dst/idev could also have been freed. We need to hold rcu_read_lock() to make sure the dst and associated idev are alive. Fixes: 5796015fa968 ("ipv6: allocate enough headroom in ip6_finish_output2()") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Vasily Averin <vasily.averin@linux.dev> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20240820160859.3786976-3-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-21ipv6: prevent UAF in ip6_send_skb()Eric Dumazet
syzbot reported an UAF in ip6_send_skb() [1] After ip6_local_out() has returned, we no longer can safely dereference rt, unless we hold rcu_read_lock(). A similar issue has been fixed in commit a688caa34beb ("ipv6: take rcu lock in rawv6_send_hdrinc()") Another potential issue in ip6_finish_output2() is handled in a separate patch. [1] BUG: KASAN: slab-use-after-free in ip6_send_skb+0x18d/0x230 net/ipv6/ip6_output.c:1964 Read of size 8 at addr ffff88806dde4858 by task syz.1.380/6530 CPU: 1 UID: 0 PID: 6530 Comm: syz.1.380 Not tainted 6.11.0-rc3-syzkaller-00306-gdf6cbc62cc9b #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024 Call Trace: <TASK> __dump_stack lib/dump_stack.c:93 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:119 print_address_description mm/kasan/report.c:377 [inline] print_report+0x169/0x550 mm/kasan/report.c:488 kasan_report+0x143/0x180 mm/kasan/report.c:601 ip6_send_skb+0x18d/0x230 net/ipv6/ip6_output.c:1964 rawv6_push_pending_frames+0x75c/0x9e0 net/ipv6/raw.c:588 rawv6_sendmsg+0x19c7/0x23c0 net/ipv6/raw.c:926 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg+0x1a6/0x270 net/socket.c:745 sock_write_iter+0x2dd/0x400 net/socket.c:1160 do_iter_readv_writev+0x60a/0x890 vfs_writev+0x37c/0xbb0 fs/read_write.c:971 do_writev+0x1b1/0x350 fs/read_write.c:1018 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f936bf79e79 Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f936cd7f038 EFLAGS: 00000246 ORIG_RAX: 0000000000000014 RAX: ffffffffffffffda RBX: 00007f936c115f80 RCX: 00007f936bf79e79 RDX: 0000000000000001 RSI: 0000000020000040 RDI: 0000000000000004 RBP: 00007f936bfe7916 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000000 R14: 00007f936c115f80 R15: 00007fff2860a7a8 </TASK> Allocated by task 6530: kasan_save_stack mm/kasan/common.c:47 [inline] kasan_save_track+0x3f/0x80 mm/kasan/common.c:68 unpoison_slab_object mm/kasan/common.c:312 [inline] __kasan_slab_alloc+0x66/0x80 mm/kasan/common.c:338 kasan_slab_alloc include/linux/kasan.h:201 [inline] slab_post_alloc_hook mm/slub.c:3988 [inline] slab_alloc_node mm/slub.c:4037 [inline] kmem_cache_alloc_noprof+0x135/0x2a0 mm/slub.c:4044 dst_alloc+0x12b/0x190 net/core/dst.c:89 ip6_blackhole_route+0x59/0x340 net/ipv6/route.c:2670 make_blackhole net/xfrm/xfrm_policy.c:3120 [inline] xfrm_lookup_route+0xd1/0x1c0 net/xfrm/xfrm_policy.c:3313 ip6_dst_lookup_flow+0x13e/0x180 net/ipv6/ip6_output.c:1257 rawv6_sendmsg+0x1283/0x23c0 net/ipv6/raw.c:898 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg+0x1a6/0x270 net/socket.c:745 ____sys_sendmsg+0x525/0x7d0 net/socket.c:2597 ___sys_sendmsg net/socket.c:2651 [inline] __sys_sendmsg+0x2b0/0x3a0 net/socket.c:2680 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f Freed by task 45: kasan_save_stack mm/kasan/common.c:47 [inline] kasan_save_track+0x3f/0x80 mm/kasan/common.c:68 kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:579 poison_slab_object+0xe0/0x150 mm/kasan/common.c:240 __kasan_slab_free+0x37/0x60 mm/kasan/common.c:256 kasan_slab_free include/linux/kasan.h:184 [inline] slab_free_hook mm/slub.c:2252 [inline] slab_free mm/slub.c:4473 [inline] kmem_cache_free+0x145/0x350 mm/slub.c:4548 dst_destroy+0x2ac/0x460 net/core/dst.c:124 rcu_do_batch kernel/rcu/tree.c:2569 [inline] rcu_core+0xafd/0x1830 kernel/rcu/tree.c:2843 handle_softirqs+0x2c4/0x970 kernel/softirq.c:554 __do_softirq kernel/softirq.c:588 [inline] invoke_softirq kernel/softirq.c:428 [inline] __irq_exit_rcu+0xf4/0x1c0 kernel/softirq.c:637 irq_exit_rcu+0x9/0x30 kernel/softirq.c:649 instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1043 [inline] sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1043 asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:702 Last potentially related work creation: kasan_save_stack+0x3f/0x60 mm/kasan/common.c:47 __kasan_record_aux_stack+0xac/0xc0 mm/kasan/generic.c:541 __call_rcu_common kernel/rcu/tree.c:3106 [inline] call_rcu+0x167/0xa70 kernel/rcu/tree.c:3210 refdst_drop include/net/dst.h:263 [inline] skb_dst_drop include/net/dst.h:275 [inline] nf_ct_frag6_queue net/ipv6/netfilter/nf_conntrack_reasm.c:306 [inline] nf_ct_frag6_gather+0xb9a/0x2080 net/ipv6/netfilter/nf_conntrack_reasm.c:485 ipv6_defrag+0x2c8/0x3c0 net/ipv6/netfilter/nf_defrag_ipv6_hooks.c:67 nf_hook_entry_hookfn include/linux/netfilter.h:154 [inline] nf_hook_slow+0xc3/0x220 net/netfilter/core.c:626 nf_hook include/linux/netfilter.h:269 [inline] __ip6_local_out+0x6fa/0x800 net/ipv6/output_core.c:143 ip6_local_out+0x26/0x70 net/ipv6/output_core.c:153 ip6_send_skb+0x112/0x230 net/ipv6/ip6_output.c:1959 rawv6_push_pending_frames+0x75c/0x9e0 net/ipv6/raw.c:588 rawv6_sendmsg+0x19c7/0x23c0 net/ipv6/raw.c:926 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg+0x1a6/0x270 net/socket.c:745 sock_write_iter+0x2dd/0x400 net/socket.c:1160 do_iter_readv_writev+0x60a/0x890 Fixes: 0625491493d9 ("ipv6: ip6_push_pending_frames() should increment IPSTATS_MIB_OUTDISCARDS") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20240820160859.3786976-2-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-21ipv6: remove redundant checkXi Huang
err varibale will be set everytime,like -ENOBUFS and in if (err < 0), when code gets into this path. This check will just slowdown the execution and that's all. Signed-off-by: Xi Huang <xuiagnh@gmail.com> Reviewed-by: Florian Westphal <fw@strlen.de> Link: https://patch.msgid.link/20240820115442.49366-1-xuiagnh@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-20ip6_tunnel: Fix broken GROThomas Bogendoerfer
GRO code checks for matching layer 2 headers to see, if packet belongs to the same flow and because ip6 tunnel set dev->hard_header_len this check fails in cases, where it shouldn't. To fix this don't set hard_header_len, but use needed_headroom like ipv4/ip_tunnel.c does. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de> Link: https://patch.msgid.link/20240815151419.109864-1-tbogendoerfer@suse.de Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-20netfilter: nf_tables: pass context structure to nft_parse_register_loadFlorian Westphal
Mechanical transformation, no logical changes intended. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>