<feed xmlns='http://www.w3.org/2005/Atom'>
<title>pm24.git/include/net, branch cpufreq-rust</title>
<subtitle>Unnamed repository; edit this file 'description' to name the repository.</subtitle>
<id>https://git.kobert.dev/pm24.git/atom/include/net?h=cpufreq-rust</id>
<link rel='self' href='https://git.kobert.dev/pm24.git/atom/include/net?h=cpufreq-rust'/>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/'/>
<updated>2025-02-06T02:32:06Z</updated>
<entry>
<title>net: sched: Fix truncation of offloaded action statistics</title>
<updated>2025-02-06T02:32:06Z</updated>
<author>
<name>Ido Schimmel</name>
<email>idosch@nvidia.com</email>
</author>
<published>2025-02-04T12:38:39Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=811b8f534fd85e17077bd2ac0413bcd16cc8fb9b'/>
<id>urn:sha1:811b8f534fd85e17077bd2ac0413bcd16cc8fb9b</id>
<content type='text'>
In case of tc offload, when user space queries the kernel for tc action
statistics, tc will query the offloaded statistics from device drivers.
Among other statistics, drivers are expected to pass the number of
packets that hit the action since the last query as a 64-bit number.

Unfortunately, tc treats the number of packets as a 32-bit number,
leading to truncation and incorrect statistics when the number of
packets since the last query exceeds 0xffffffff:

$ tc -s filter show dev swp2 ingress
filter protocol all pref 1 flower chain 0
filter protocol all pref 1 flower chain 0 handle 0x1
  skip_sw
  in_hw in_hw_count 1
        action order 1: mirred (Egress Redirect to device swp1) stolen
        index 1 ref 1 bind 1 installed 58 sec used 0 sec
        Action statistics:
        Sent 1133877034176 bytes 536959475 pkt (dropped 0, overlimits 0 requeues 0)
[...]

According to the above, 2111-byte packets were redirected which is
impossible as only 64-byte packets were transmitted and the MTU was
1500.

Fix by treating packets as a 64-bit number:

$ tc -s filter show dev swp2 ingress
filter protocol all pref 1 flower chain 0
filter protocol all pref 1 flower chain 0 handle 0x1
  skip_sw
  in_hw in_hw_count 1
        action order 1: mirred (Egress Redirect to device swp1) stolen
        index 1 ref 1 bind 1 installed 61 sec used 0 sec
        Action statistics:
        Sent 1370624380864 bytes 21416005951 pkt (dropped 0, overlimits 0 requeues 0)
[...]

Which shows that only 64-byte packets were redirected (1370624380864 /
21416005951 = 64).

Fixes: 380407023526 ("net/sched: Enable netdev drivers to update statistics of offloaded actions")
Reported-by: Joe Botha &lt;joe@atomic.ac&gt;
Signed-off-by: Ido Schimmel &lt;idosch@nvidia.com&gt;
Reviewed-by: Petr Machata &lt;petrm@nvidia.com&gt;
Reviewed-by: Simon Horman &lt;horms@kernel.org&gt;
Link: https://patch.msgid.link/20250204123839.1151804-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'ipsec-2025-01-27' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec</title>
<updated>2025-01-27T23:15:12Z</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2025-01-27T23:15:11Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=463ec95a162d5036f81831c2cd8d6d661b1e7b9f'/>
<id>urn:sha1:463ec95a162d5036f81831c2cd8d6d661b1e7b9f</id>
<content type='text'>
Steffen Klassert says:

====================
pull request (net): ipsec 2025-01-27

1) Fix incrementing the upper 32 bit sequence numbers for GSO skbs.
   From Jianbo Liu.

2) Fix an out-of-bounds read on xfrm state lookup.
   From Florian Westphal.

3) Fix secpath handling on packet offload mode.
   From Alexandre Cassen.

4) Fix the usage of skb-&gt;sk in the xfrm layer.

5) Don't disable preemption while looking up cache state
   to fix PREEMPT_RT.
   From Sebastian Sewior.

* tag 'ipsec-2025-01-27' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec:
  xfrm: Don't disable preemption while looking up cache state.
  xfrm: Fix the usage of skb-&gt;sk
  xfrm: delete intermediate secpath entry in packet offload mode
  xfrm: state: fix out-of-bounds read during lookup
  xfrm: replay: Fix the update of replay_esn-&gt;oseq_hi for GSO
====================

Link: https://patch.msgid.link/20250127060757.3946314-1-steffen.klassert@secunet.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>net: page_pool: don't try to stash the napi id</title>
<updated>2025-01-27T22:37:41Z</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2025-01-23T23:16:20Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=67e4bb2ced0f2d8fbd414f932daea94ba63ae4c4'/>
<id>urn:sha1:67e4bb2ced0f2d8fbd414f932daea94ba63ae4c4</id>
<content type='text'>
Page ppol tried to cache the NAPI ID in page pool info to avoid
having a dependency on the life cycle of the NAPI instance.
Since commit under Fixes the NAPI ID is not populated until
napi_enable() and there's a good chance that page pool is
created before NAPI gets enabled.

Protect the NAPI pointer with the existing page pool mutex,
the reading path already holds it. napi_id itself we need
to READ_ONCE(), it's protected by netdev_lock() which are
not holding in page pool.

Before this patch napi IDs were missing for mlx5:

 # ./cli.py --spec netlink/specs/netdev.yaml --dump page-pool-get

 [{'id': 144, 'ifindex': 2, 'inflight': 3072, 'inflight-mem': 12582912},
  {'id': 143, 'ifindex': 2, 'inflight': 5568, 'inflight-mem': 22806528},
  {'id': 142, 'ifindex': 2, 'inflight': 5120, 'inflight-mem': 20971520},
  {'id': 141, 'ifindex': 2, 'inflight': 4992, 'inflight-mem': 20447232},
  ...

After:

 [{'id': 144, 'ifindex': 2, 'inflight': 3072, 'inflight-mem': 12582912,
   'napi-id': 565},
  {'id': 143, 'ifindex': 2, 'inflight': 4224, 'inflight-mem': 17301504,
   'napi-id': 525},
  {'id': 142, 'ifindex': 2, 'inflight': 4288, 'inflight-mem': 17563648,
   'napi-id': 524},
  ...

Fixes: 86e25f40aa1e ("net: napi: Add napi_config")
Reviewed-by: Mina Almasry &lt;almasrymina@google.com&gt;
Reviewed-by: Toke Høiland-Jørgensen &lt;toke@redhat.com&gt;
Link: https://patch.msgid.link/20250123231620.1086401-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'net-next-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next</title>
<updated>2025-01-22T16:28:57Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2025-01-22T16:28:57Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=0ad9617c78acbc71373fb341a6f75d4012b01d69'/>
<id>urn:sha1:0ad9617c78acbc71373fb341a6f75d4012b01d69</id>
<content type='text'>
Pull networking updates from Paolo Abeni:
 "This is slightly smaller than usual, with the most interesting work
  being still around RTNL scope reduction.

  Core:

   - More core refactoring to reduce the RTNL lock contention, including
     preparatory work for the per-network namespace RTNL lock, replacing
     RTNL lock with a per device-one to protect NAPI-related net device
     data and moving synchronize_net() calls outside such lock.

   - Extend drop reasons usage, adding net scheduler, AF_UNIX, bridge
     and more specific TCP coverage.

   - Reduce network namespace tear-down time by removing per-subsystems
     synchronize_net() in tipc and sched.

   - Add flow label selector support for fib rules, allowing traffic
     redirection based on such header field.

  Netfilter:

   - Do not remove netdev basechain when last device is gone, allowing
     netdev basechains without devices.

   - Revisit the flowtable teardown strategy, dealing better with fin,
     reset and re-open events.

   - Scale-up IP-vs connection dumping by avoiding linear search on each
     restart.

  Protocols:

   - A significant XDP socket refactor, consolidating and optimizing
     several helpers into the core

   - Better scaling of ICMP rate-limiting, by removing false-sharing in
     inet peers handling.

   - Introduces netlink notifications for multicast IPv4 and IPv6
     address changes.

   - Add ipsec support for IP-TFS/AggFrag encapsulation, allowing
     aggregation and fragmentation of the inner IP.

   - Add sysctl to configure TIME-WAIT reuse delay for TCP sockets, to
     avoid local port exhaustion issues when the average connection
     lifetime is very short.

   - Support updating keys (re-keying) for connections using kernel TLS
     (for TLS 1.3 only).

   - Support ipv4-mapped ipv6 address clients in smc-r v2.

   - Add support for jumbo data packet transmission in RxRPC sockets,
     gluing multiple data packets in a single UDP packet.

   - Support RxRPC RACK-TLP to manage packet loss and retransmission in
     conjunction with the congestion control algorithm.

  Driver API:

   - Introduce a unified and structured interface for reporting PHY
     statistics, exposing consistent data across different H/W via
     ethtool.

   - Make timestamping selectable, allow the user to select the desired
     hwtstamp provider (PHY or MAC) administratively.

   - Add support for configuring a header-data-split threshold (HDS)
     value via ethtool, to deal with partial or buggy H/W
     implementation.

   - Consolidate DSA drivers Energy Efficiency Ethernet support.

   - Add EEE management to phylink, making use of the phylib
     implementation.

   - Add phylib support for in-band capabilities negotiation.

   - Simplify how phylib-enabled mac drivers expose the supported
     interfaces.

  Tests and tooling:

   - Make the YNL tool package-friendly to make it easier to deploy it
     separately from the kernel.

   - Increase TCP selftest coverage importing several packetdrill
     test-cases.

   - Regenerate the ethtool uapi header from the YNL spec, to ease
     maintenance and future development.

   - Add YNL support for decoding the link types used in net self-tests,
     allowing a single build to run both net and drivers/net.

  Drivers:

   - Ethernet high-speed NICs:
      - nVidia/Mellanox (mlx5):
         - add cross E-Switch QoS support
         - add SW Steering support for ConnectX-8
         - implement support for HW-Managed Flow Steering, improving the
           rule deletion/insertion rate
         - support for multi-host LAG
      - Intel (ixgbe, ice, igb):
         - ice: add support for devlink health events
         - ixgbe: add initial support for E610 chipset variant
         - igb: add support for AF_XDP zero-copy
      - Meta:
         - add support for basic RSS config
         - allow changing the number of channels
         - add hardware monitoring support
      - Broadcom (bnxt):
         - implement TCP data split and HDS threshold ethtool support,
           enabling Device Memory TCP.
      - Marvell Octeon:
         - implement egress ipsec offload support for the cn10k family
      - Hisilicon (HIBMC):
         - implement unicast MAC filtering

   - Ethernet NICs embedded and virtual:
      - Convert UDP tunnel drivers to NETDEV_PCPU_STAT_DSTATS, avoiding
        contented atomic operations for drop counters
      - Freescale:
         - quicc: phylink conversion
         - enetc: support Tx and Rx checksum offload and improve TSO
           performances
      - MediaTek:
         - airoha: introduce support for ETS and HTB Qdisc offload
      - Microchip:
         - lan78XX USB: preparation work for phylink conversion
      - Synopsys (stmmac):
         - support DWMAC IP on NXP Automotive SoCs S32G2xx/S32G3xx/S32R45
         - refactor EEE support to leverage the new driver API
         - optimize DMA and cache access to increase raw RX performances
           by 40%
      - TI:
         - icssg-prueth: add multicast filtering support for VLAN
           interface
      - netkit:
         - add ability to configure head/tailroom
      - VXLAN:
         - accepts packets with user-defined reserved bit

   - Ethernet switches:
      - Microchip:
         - lan969x: add RGMII support
         - lan969x: improve TX and RX performance using the FDMA engine
      - nVidia/Mellanox:
         - move Tx header handling to PCI driver, to ease XDP support

   - Ethernet PHYs:
      - Texas Instruments DP83822:
         - add support for GPIO2 clock output
      - Realtek:
         - 8169: add support for RTL8125D rev.b
         - rtl822x: add hwmon support for the temperature sensor
      - Microchip:
         - add support for RDS PTP hardware
         - consolidate periodic output signal generation

   - CAN:
      - several DT-bindings to DT schema conversions
      - tcan4x5x:
         - add HW standby support
         - support nWKRQ voltage selection
      - kvaser:
         - allowing Bus Error Reporting runtime configuration

   - WiFi:
      - the on-going Multi-Link Operation (MLO) effort continues,
        affecting both the stack and in drivers
      - mac80211/cfg80211:
         - Emergency Preparedness Communication Services (EPCS) station
           mode support
         - support for adding and removing station links for MLO
         - add support for WiFi 7/EHT mesh over 320 MHz channels
         - report Tx power info for each link
      - RealTek (rtw88):
         - enable USB Rx aggregation and USB 3 to improve performance
         - LED support
      - RealTek (rtw89):
         - refactor power save to support Multi-Link Operations
         - add support for RTL8922AE-VS variant
      - MediaTek (mt76):
         - single wiphy multiband support (preparation for MLO)
         - p2p device support
         - add TP-Link TXE50UH USB adapter support
      - Qualcomm (ath10k):
         - support for the QCA6698AQ IP core
      - Qualcomm (ath12k):
         - enable MLO for QCN9274

   - Bluetooth:
      - Allow sysfs to trigger hdev reset, to allow recovering devices
        not responsive from user-space
      - MediaTek: add support for MT7922, MT7925, MT7921e devices
      - Realtek: add support for RTL8851BE devices
      - Qualcomm: add support for WCN785x devices
      - ISO: allow BIG re-sync"

* tag 'net-next-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1386 commits)
  net/rose: prevent integer overflows in rose_setsockopt()
  net: phylink: fix regression when binding a PHY
  net: ethernet: ti: am65-cpsw: streamline TX queue creation and cleanup
  net: ethernet: ti: am65-cpsw: streamline RX queue creation and cleanup
  net: ethernet: ti: am65-cpsw: ensure proper channel cleanup in error path
  ipv6: Convert inet6_rtm_deladdr() to per-netns RTNL.
  ipv6: Convert inet6_rtm_newaddr() to per-netns RTNL.
  ipv6: Move lifetime validation to inet6_rtm_newaddr().
  ipv6: Set cfg.ifa_flags before device lookup in inet6_rtm_newaddr().
  ipv6: Pass dev to inet6_addr_add().
  ipv6: Convert inet6_ioctl() to per-netns RTNL.
  ipv6: Hold rtnl_net_lock() in addrconf_init() and addrconf_cleanup().
  ipv6: Hold rtnl_net_lock() in addrconf_dad_work().
  ipv6: Hold rtnl_net_lock() in addrconf_verify_work().
  ipv6: Convert net.ipv6.conf.${DEV}.XXX sysctl to per-netns RTNL.
  ipv6: Add __in6_dev_get_rtnl_net().
  net: stmmac: Drop redundant skb_mark_for_recycle() for SKB frags
  net: mii: Fix the Speed display when the network cable is not connected
  sysctl net: Remove macro checks for CONFIG_SYSCTL
  eth: bnxt: update header sizing defaults
  ...
</content>
</entry>
<entry>
<title>Merge tag 'lsm-pr-20250121' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm</title>
<updated>2025-01-22T04:03:04Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2025-01-22T04:03:04Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=f96a974170b749e3a56844e25b31d46a7233b6f6'/>
<id>urn:sha1:f96a974170b749e3a56844e25b31d46a7233b6f6</id>
<content type='text'>
Pull lsm updates from Paul Moore:

 - Improved handling of LSM "secctx" strings through lsm_context struct

   The LSM secctx string interface is from an older time when only one
   LSM was supported, migrate over to the lsm_context struct to better
   support the different LSMs we now have and make it easier to support
   new LSMs in the future.

   These changes explain the Rust, VFS, and networking changes in the
   diffstat.

 - Only build lsm_audit.c if CONFIG_SECURITY and CONFIG_AUDIT are
   enabled

   Small tweak to be a bit smarter about when we build the LSM's common
   audit helpers.

 - Check for absurdly large policies from userspace in SafeSetID

   SafeSetID policies rules are fairly small, basically just "UID:UID",
   it easy to impose a limit of KMALLOC_MAX_SIZE on policy writes which
   helps quiet a number of syzbot related issues. While work is being
   done to address the syzbot issues through other mechanisms, this is a
   trivial and relatively safe fix that we can do now.

 - Various minor improvements and cleanups

   A collection of improvements to the kernel selftests, constification
   of some function parameters, removing redundant assignments, and
   local variable renames to improve readability.

* tag 'lsm-pr-20250121' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm:
  lockdown: initialize local array before use to quiet static analysis
  safesetid: check size of policy writes
  net: corrections for security_secid_to_secctx returns
  lsm: rename variable to avoid shadowing
  lsm: constify function parameters
  security: remove redundant assignment to return variable
  lsm: Only build lsm_audit.c if CONFIG_SECURITY and CONFIG_AUDIT are set
  selftests: refactor the lsm `flags_overset_lsm_set_self_attr` test
  binder: initialize lsm_context structure
  rust: replace lsm context+len with lsm_context
  lsm: secctx provider check on release
  lsm: lsm_context in security_dentry_init_security
  lsm: use lsm_context in security_inode_getsecctx
  lsm: replace context+len with lsm_context
  lsm: ensure the correct LSM context releaser
</content>
</entry>
<entry>
<title>ipv6: Add __in6_dev_get_rtnl_net().</title>
<updated>2025-01-20T20:16:04Z</updated>
<author>
<name>Kuniyuki Iwashima</name>
<email>kuniyu@amazon.com</email>
</author>
<published>2025-01-15T08:05:58Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=f7a6082b5e4c15f34fd766cf0960f7e082009c54'/>
<id>urn:sha1:f7a6082b5e4c15f34fd766cf0960f7e082009c54</id>
<content type='text'>
We will convert rtnl_lock() with rtnl_net_lock(), and we want to
convert __in6_dev_get() too.

__in6_dev_get() uses rcu_dereference_rtnl(), but as written in its
comment, rtnl_dereference() or rcu_dereference() is preferable.

Let's add __in6_dev_get_rtnl_net() that uses rtnl_net_dereference().

We can add the RCU version helper later if needed.

Signed-off-by: Kuniyuki Iwashima &lt;kuniyu@amazon.com&gt;
Link: https://patch.msgid.link/20250115080608.28127-2-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'nf-next-25-01-19' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next</title>
<updated>2025-01-20T19:59:25Z</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2025-01-20T19:59:25Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=4fd001f5f32deb4c5e9dcd05612f8371b79d380e'/>
<id>urn:sha1:4fd001f5f32deb4c5e9dcd05612f8371b79d380e</id>
<content type='text'>
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following batch contains Netfilter updates for net-next:

1) Unbreak set size settings for rbtree set backend, intervals in
   rbtree are represented as two elements, this detailed is leaked
   to userspace leading to bogus ENOSPC from control plane.

2) Remove dead code in br_netfilter's br_nf_pre_routing_finish()
   due to never matching error when looking up for route,
   from Antoine Tenart.

3) Simplify check for device already in use in flowtable,
   from Phil Sutter.

4) Three patches to restore interface name field in struct nft_hook
   and use it, this is to prepare for wildcard interface support.
   From Phil Sutter.

5) Do not remove netdev basechain when last device is gone, this is
   for consistency with the flowtable behaviour. This allows for netdev
   basechains without devices. Another patch to simplify netdev event
   notifier after this update. Also from Phil.

6) Two patches to add missing spinlock when flowtable updates TCP
   state flags, from Florian Westphal.

7) Simplify __nf_ct_refresh_acct() by removing skbuff parameter,
   also from Florian.

8) Flowtable gc now extends ct timeout for offloaded flow. This
   is to address a possible race that leads to handing over flow
   to classic path with long ct timeouts.

9) Tear down flow if cached rt_mtu is stale, before this patch,
   packet is handed over to classic path but flow entry still remained
   in place.

10) Revisit the flowtable teardown strategy, which was originally
    designed to release flowtable hardware entries early. Add a new
    CLOSING flag that still allows hardware to release entries when
    fin/rst is seen, but keeps the flow entry in place when the
    TCP connection is closed. Release flow after timeout or when a new
    syn packet is seen for TCP reopen scenario.

* tag 'nf-next-25-01-19' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
  netfilter: flowtable: add CLOSING state
  netfilter: flowtable: teardown flow if cached mtu is stale
  netfilter: conntrack: rework offload nf_conn timeout extension logic
  netfilter: conntrack: remove skb argument from nf_ct_refresh
  netfilter: nft_flow_offload: update tcp state flags under lock
  netfilter: nft_flow_offload: clear tcp MAXACK flag before moving to slowpath
  netfilter: nf_tables: Simplify chain netdev notifier
  netfilter: nf_tables: Tolerate chains with no remaining hooks
  netfilter: nf_tables: Compare netdev hooks based on stored name
  netfilter: nf_tables: Use stored ifname in netdev hook dumps
  netfilter: nf_tables: Store user-defined hook ifname
  netfilter: nf_tables: Flowtable hook's pf value never varies
  netfilter: br_netfilter: remove unused conditional and dead code
  netfilter: nf_tables: fix set size with rbtree backend
====================

Link: https://patch.msgid.link/20250119172051.8261-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>net: move HDS config from ethtool state</title>
<updated>2025-01-20T19:44:57Z</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2025-01-19T02:05:11Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=3c836451ca9041cfb32a7d8f59ea15b3b991bbb3'/>
<id>urn:sha1:3c836451ca9041cfb32a7d8f59ea15b3b991bbb3</id>
<content type='text'>
Separate the HDS config from the ethtool state struct.
The HDS config contains just simple parameters, not state.
Having it as a separate struct will make it easier to clone / copy
and also long term potentially make it per-queue.

Reviewed-by: Michael Chan &lt;michael.chan@broadcom.com&gt;
Link: https://patch.msgid.link/20250119020518.1962249-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>af_unix: Set drop reason in unix_dgram_disconnected().</title>
<updated>2025-01-20T19:27:41Z</updated>
<author>
<name>Kuniyuki Iwashima</name>
<email>kuniyu@amazon.com</email>
</author>
<published>2025-01-16T05:34:40Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=b3e365bbf4f47b8f76b25b0fcf3f38916ca53e42'/>
<id>urn:sha1:b3e365bbf4f47b8f76b25b0fcf3f38916ca53e42</id>
<content type='text'>
unix_dgram_disconnected() is called from two places:

  1. when a connect()ed socket dis-connect()s or re-connect()s to
     another socket

  2. when sendmsg() fails because the peer socket that the client
     has connect()ed to has been close()d

Then, the client's recv queue is purged to remove all messages from
the old peer socket.

Let's define a new drop reason for that case.

  # echo 1 &gt; /sys/kernel/tracing/events/skb/kfree_skb/enable

  # python3
  &gt;&gt;&gt; from socket import *
  &gt;&gt;&gt;
  &gt;&gt;&gt; # s1 has a message from s2
  &gt;&gt;&gt; s1, s2 = socketpair(AF_UNIX, SOCK_DGRAM)
  &gt;&gt;&gt; s2.send(b'hello world')
  &gt;&gt;&gt;
  &gt;&gt;&gt; # re-connect() drops the message from s2
  &gt;&gt;&gt; s3 = socket(AF_UNIX, SOCK_DGRAM)
  &gt;&gt;&gt; s3.bind('')
  &gt;&gt;&gt; s1.connect(s3.getsockname())

  # cat /sys/kernel/tracing/trace_pipe
     python3-250 ... kfree_skb: ... location=skb_queue_purge_reason+0xdc/0x110 reason: UNIX_DISCONNECT

Signed-off-by: Kuniyuki Iwashima &lt;kuniyu@amazon.com&gt;
Link: https://patch.msgid.link/20250116053441.5758-8-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>af_unix: Set drop reason in manage_oob().</title>
<updated>2025-01-20T19:27:41Z</updated>
<author>
<name>Kuniyuki Iwashima</name>
<email>kuniyu@amazon.com</email>
</author>
<published>2025-01-16T05:34:38Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=533643b091dd6e246d57caf81e6892fa9cbb1cc9'/>
<id>urn:sha1:533643b091dd6e246d57caf81e6892fa9cbb1cc9</id>
<content type='text'>
AF_UNIX SOCK_STREAM socket supports MSG_OOB.

When OOB data is sent to a socket, recv() will break at that point.

If the next recv() does not have MSG_OOB, the normal data following
the OOB data is returned.

Then, the OOB skb is dropped.

Let's define a new drop reason for that case in manage_oob().

  # echo 1 &gt; /sys/kernel/tracing/events/skb/kfree_skb/enable

  # python3
  &gt;&gt;&gt; from socket import *
  &gt;&gt;&gt; s1, s2 = socketpair(AF_UNIX)
  &gt;&gt;&gt; s1.send(b'a', MSG_OOB)
  &gt;&gt;&gt; s1.send(b'b')
  &gt;&gt;&gt; s2.recv(2)
  b'b'

  # cat /sys/kernel/tracing/trace_pipe
  ...
     python3-223 ... kfree_skb: ... location=unix_stream_read_generic+0x59e/0xc20 reason: UNIX_SKIP_OOB

Signed-off-by: Kuniyuki Iwashima &lt;kuniyu@amazon.com&gt;
Link: https://patch.msgid.link/20250116053441.5758-6-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
</feed>
