summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/intel/igc
AgeCommit message (Collapse)Author
2024-09-06igc: Remove setting of RX software timestampGal Pressman
The responsibility for reporting of RX software timestamp has moved to the core layer (see __ethtool_get_ts_info()), remove usage from the device drivers. Reviewed-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-09-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR. Conflicts: drivers/net/phy/phy_device.c 2560db6ede1a ("net: phy: Fix missing of_node_put() for leds") 1dce520abd46 ("net: phy: Use for_each_available_child_of_node_scoped()") https://lore.kernel.org/20240904115823.74333648@canb.auug.org.au Adjacent changes: drivers/net/ethernet/xilinx/xilinx_axienet.h drivers/net/ethernet/xilinx/xilinx_axienet_main.c 858430db28a5 ("net: xilinx: axienet: Fix race in axienet_stop") 76abb5d675c4 ("net: xilinx: axienet: Add statistics support") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-02igc: Unlock on error in igc_io_resume()Dan Carpenter
Call rtnl_unlock() on this error path, before returning. Fixes: bc23aa949aeb ("igc: Add pcie error handler support") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Gerhard Engleder <gerhard@engleder-embedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-08-30igc: Move the MULTI GBT AN Control Register to _regs fileSasha Neftin
MULTI GBT AN Control Register is IEEE Standard Register 7.32 (not a mask). The right place should be in igc_reg.h file. In accordance with the registers naming convention added IGC_' prefix. Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Avigail Dahan <avigailx.dahan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-08-30igc: Add Energy Efficient Ethernet abilitySasha Neftin
According to the IEEE standard report the EEE ability (registers 7.60 and 7.62) and the EEE Link Partner ability (registers 7.61 and 7.63). Use the kernel's 'ethtool_keee' structure and report EEE link modes. Example: ethtool --show-eee <device> Before: Advertised EEE link modes: Not reported Link partner advertised EEE link modes: Not reported After: Advertised EEE link modes: 100baseT/Full 1000baseT/Full 2500baseT/Full Link partner advertised EEE link modes: 100baseT/Full 1000baseT/Full 2500baseT/Full Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Avigail Dahan <avigailx.dahan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-08-30igc: Get rid of spurious interruptsKurt Kanzenbach
When running the igc with XDP/ZC in busy polling mode with deferral of hard interrupts, interrupts still happen from time to time. That is caused by the igc task watchdog which triggers Rx interrupts periodically. That mechanism has been introduced to overcome skb/memory allocation failures [1]. So the Rx clean functions stop processing the Rx ring in case of such failure. The task watchdog triggers Rx interrupts periodically in the hope that memory became available in the mean time. The current behavior is undesirable for real time applications, because the driver induced Rx interrupts trigger also the softirq processing. However, all real time packets should be processed by the application which uses the busy polling method. Therefore, only trigger the Rx interrupts in case of real allocation failures. Introduce a new flag for signaling that condition. [1] - https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git/commit/?id=3be507547e6177e5c808544bd6a2efa2c7f1d436 Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-08-30igc: Add MQPRIO offload supportKurt Kanzenbach
Add support for offloading MQPRIO. The hardware has four priorities as well as four queues. Each queue must be a assigned with a unique priority. However, the priorities are only considered in TSN Tx mode. There are two TSN Tx modes. In case of MQPRIO the Qbv capability is not required. Therefore, use the legacy TSN Tx mode, which performs strict priority arbitration. Example for mqprio with hardware offload: |tc qdisc replace dev ${INTERFACE} handle 100 parent root mqprio num_tc 4 \ | map 0 0 0 0 0 1 2 3 0 0 0 0 0 0 0 0 \ | queues 1@0 1@1 1@2 1@3 \ | hw 1 The mqprio Qdisc also allows to configure the `preemptible_tcs'. However, frame preemption is not supported yet. Tested on Intel i225 and implemented by following data sheet section 7.5.2, Transmit Scheduling. Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-08-07igc: Fix qbv tx latency by setting gtxoffsetFaizal Rahim
A large tx latency issue was discovered during testing when only QBV was enabled. The issue occurs because gtxoffset was not set when QBV is active, it was only set when launch time is active. The patch "igc: Correct the launchtime offset" only sets gtxoffset when the launchtime_enable field is set by the user. Enabling launchtime_enable ultimately sets the register IGC_TXQCTL_QUEUE_MODE_LAUNCHT (referred to as LaunchT in the SW user manual). Section 7.5.2.6 of the IGC i225/6 SW User Manual Rev 1.2.4 states: "The latency between transmission scheduling (launch time) and the time the packet is transmitted to the network is listed in Table 7-61." However, the patch misinterprets the phrase "launch time" in that section by assuming it specifically refers to the LaunchT register, whereas it actually denotes the generic term for when a packet is released from the internal buffer to the MAC transmit logic. This launch time, as per that section, also implicitly refers to the QBV gate open time, where a packet waits in the buffer for the QBV gate to open. Therefore, latency applies whenever QBV is in use. TSN features such as QBU and QAV reuse QBV, making the latency universal to TSN features. Discussed with i226 HW owner (Shalev, Avi) and we were in agreement that the term "launch time" used in Section 7.5.2.6 is not clear and can be easily misinterpreted. Avi will update this section to: "When TQAVCTRL.TRANSMIT_MODE = TSN, the latency between transmission scheduling and the time the packet is transmitted to the network is listed in Table 7-61." Fix this issue by using igc_tsn_is_tx_mode_in_tsn() as a condition to write to gtxoffset, aligning with the newly updated SW User Manual. Tested: 1. Enrol taprio on talker board base-time 0 cycle-time 1000000 flags 0x2 index 0 cmd S gatemask 0x1 interval1 index 0 cmd S gatemask 0x1 interval2 Note: interval1 = interval for a 64 bytes packet to go through interval2 = cycle-time - interval1 2. Take tcpdump on listener board 3. Use udp tai app on talker to send packets to listener 4. Check the timestamp on listener via wireshark Test Result: 100 Mbps: 113 ~193 ns 1000 Mbps: 52 ~ 84 ns 2500 Mbps: 95 ~ 223 ns Note that the test result is similar to the patch "igc: Correct the launchtime offset". Fixes: 790835fcc0cb ("igc: Correct the launchtime offset") Signed-off-by: Faizal Rahim <faizal.abdul.rahim@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-08-07igc: Fix reset adapter logics when tx mode changeFaizal Rahim
Following the "igc: Fix TX Hang issue when QBV Gate is close" changes, remaining issues with the reset adapter logic in igc_tsn_offload_apply() have been observed: 1. The reset adapter logics for i225 and i226 differ, although they should be the same according to the guidelines in I225/6 HW Design Section 7.5.2.1 on software initialization during tx mode changes. 2. The i225 resets adapter every time, even though tx mode doesn't change. This occurs solely based on the condition igc_is_device_id_i225() when calling schedule_work(). 3. i226 doesn't reset adapter for tsn->legacy tx mode changes. It only resets adapter for legacy->tsn tx mode transitions. 4. qbv_count introduced in the patch is actually not needed; in this context, a non-zero value of qbv_count is used to indicate if tx mode was unconditionally set to tsn in igc_tsn_enable_offload(). This could be replaced by checking the existing register IGC_TQAVCTRL_TRANSMIT_MODE_TSN bit. This patch resolves all issues and enters schedule_work() to reset the adapter only when changing tx mode. It also removes reliance on qbv_count. qbv_count field will be removed in a future patch. Test ran: 1. Verify reset adapter behaviour in i225/6: a) Enrol a new GCL Reset adapter observed (tx mode change legacy->tsn) b) Enrol a new GCL without deleting qdisc No reset adapter observed (tx mode remain tsn->tsn) c) Delete qdisc Reset adapter observed (tx mode change tsn->legacy) 2. Tested scenario from "igc: Fix TX Hang issue when QBV Gate is closed" to confirm it remains resolved. Fixes: 175c241288c0 ("igc: Fix TX Hang issue when QBV Gate is closed") Signed-off-by: Faizal Rahim <faizal.abdul.rahim@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-08-07igc: Fix qbv_config_change_errors logicsFaizal Rahim
When user issues these cmds: 1. Either a) or b) a) mqprio with hardware offload disabled b) taprio with txtime-assist feature enabled 2. etf 3. tc qdisc delete 4. taprio with base time in the past At step 4, qbv_config_change_errors wrongly increased by 1. Excerpt from IEEE 802.1Q-2018 8.6.9.3.1: "If AdminBaseTime specifies a time in the past, and the current schedule is running, then: Increment ConfigChangeError counter" qbv_config_change_errors should only increase if base time is in the past and no taprio is active. In user perspective, taprio was not active when first triggered at step 4. However, i225/6 reuses qbv for etf, so qbv is enabled with a dummy schedule at step 2 where it enters igc_tsn_enable_offload() and qbv_count got incremented to 1. At step 4, it enters igc_tsn_enable_offload() again, qbv_count is incremented to 2. Because taprio is running, tc_setup_type is TC_SETUP_QDISC_ETF and qbv_count > 1, qbv_config_change_errors value got incremented. This issue happens due to reliance on qbv_count field where a non-zero value indicates that taprio is running. But qbv_count increases regardless if taprio is triggered by user or by other tsn feature. It does not align with qbv_config_change_errors expectation where it is only concerned with taprio triggered by user. Fixing this by relocating the qbv_config_change_errors logic to igc_save_qbv_schedule(), eliminating reliance on qbv_count and its inaccuracies from i225/6's multiple uses of qbv feature for other TSN features. The new function created: igc_tsn_is_taprio_activated_by_user() uses taprio_offload_enable field to indicate that the current running taprio was triggered by user, instead of triggered by non-qbv feature like etf. Fixes: ae4fe4698300 ("igc: Add qbv_config_change_errors counter") Signed-off-by: Faizal Rahim <faizal.abdul.rahim@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-08-07igc: Fix packet still tx after gate close by reducing i226 MAC retry bufferFaizal Rahim
Testing uncovered that even when the taprio gate is closed, some packets still transmit. According to i225/6 hardware errata [1], traffic might overflow the planned QBV window. This happens because MAC maintains an internal buffer, primarily for supporting half duplex retries. Therefore, even when the gate closes, residual MAC data in the buffer may still transmit. To mitigate this for i226, reduce the MAC's internal buffer from 192 bytes to the recommended 88 bytes by modifying the RETX_CTL register value. This follows guidelines from: [1] Ethernet Controller I225/I22 Spec Update Rev 2.1 Errata Item 9: TSN: Packet Transmission Might Cross Qbv Window [2] I225/6 SW User Manual Rev 1.2.4: Section 8.11.5 Retry Buffer Control Note that the RETX_CTL register can't be used in TSN mode because half duplex feature cannot coexist with TSN. Test Steps: 1. Send taprio cmd to board A: tc qdisc replace dev enp1s0 parent root handle 100 taprio \ num_tc 4 \ map 3 2 1 0 3 3 3 3 3 3 3 3 3 3 3 3 \ queues 1@0 1@1 1@2 1@3 \ base-time 0 \ sched-entry S 0x07 500000 \ sched-entry S 0x0f 500000 \ flags 0x2 \ txtime-delay 0 Note that for TC3, gate should open for 500us and close for another 500us. 3. Take tcpdump log on Board B. 4. Send udp packets via UDP tai app from Board A to Board B. 5. Analyze tcpdump log via wireshark log on Board B. Ensure that the total time from the first to the last packet received during one cycle for TC3 does not exceed 500us. Fixes: 43546211738e ("igc: Add new device ID's") Signed-off-by: Faizal Rahim <faizal.abdul.rahim@linux.intel.com> Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-07-31igc: Fix double reset adapter triggered from a single taprio cmdFaizal Rahim
Following the implementation of "igc: Add TransmissionOverrun counter" patch, when a taprio command is triggered by user, igc processes two commands: TAPRIO_CMD_REPLACE followed by TAPRIO_CMD_STATS. However, both commands unconditionally pass through igc_tsn_offload_apply() which evaluates and triggers reset adapter. The double reset causes issues in the calculation of adapter->qbv_count in igc. TAPRIO_CMD_REPLACE command is expected to reset the adapter since it activates qbv. It's unexpected for TAPRIO_CMD_STATS to do the same because it doesn't configure any driver-specific TSN settings. So, the evaluation in igc_tsn_offload_apply() isn't needed for TAPRIO_CMD_STATS. To address this, commands parsing are relocated to igc_tsn_enable_qbv_scheduling(). Commands that don't require an adapter reset will exit after processing, thus avoiding igc_tsn_offload_apply(). Fixes: d3750076d464 ("igc: Add TransmissionOverrun counter") Signed-off-by: Faizal Rahim <faizal.abdul.rahim@linux.intel.com> Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20240730173304.865479-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-16Merge tag 'net-next-6.11' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Not much excitement - a handful of large patchsets (devmem among them) did not make it in time. Core & protocols: - Use local_lock in addition to local_bh_disable() to protect per-CPU resources in networking, a step closer for local_bh_disable() not to act as a big lock on PREEMPT_RT - Use flex array for netdevice priv area, ensure its cache alignment - Add a sysctl knob to allow user to specify a default rto_min at socket init time. Bit of a big hammer but multiple companies were independently carrying such patch downstream so clearly it's useful - Support scheduling transmission of packets based on CLOCK_TAI - Un-pin TCP TIMEWAIT timer to avoid it firing on CPUs later cordoned off using cpusets - Support multiple L2TPv3 UDP tunnels using the same 5-tuple address - Allow configuration of multipath hash seed, to both allow synchronizing hashing of two routers, and preventing partial accidental sync - Improve TCP compliance with RFC 9293 for simultaneous connect() - Support sending NAT keepalives in IPsec ESP in UDP states. Userspace IKE daemon had to do this before, but the kernel can better keep track of it - Support sending supervision HSR frames with MAC addresses stored in ProxyNodeTable when RedBox (i.e. HSR-SAN) is enabled - Introduce IPPROTO_SMC for selecting SMC when socket is created - Allow UDP GSO transmit from devices with no checksum offload - openvswitch: add packet sampling via psample, separating the sampled traffic from "upcall" packets sent to user space for forwarding - nf_tables: shrink memory consumption for transaction objects Things we sprinkled into general kernel code: - Power Sequencing subsystem (used by Qualcomm Bluetooth driver for QCA6390) [ Already merged separately - Linus ] - Add IRQ information in sysfs for auxiliary bus - Introduce guard definition for local_lock - Add aligned flavor of __cacheline_group_{begin, end}() markings for grouping fields in structures BPF: - Notify user space (via epoll) when a struct_ops object is getting detached/unregistered - Add new kfuncs for a generic, open-coded bits iterator - Enable BPF programs to declare arrays of kptr, bpf_rb_root, and bpf_list_head - Support resilient split BTF which cuts down on duplication and makes BTF as compact as possible WRT BTF from modules - Add support for dumping kfunc prototypes from BTF which enables both detecting as well as dumping compilable prototypes for kfuncs - riscv64 BPF JIT improvements in particular to add 12-argument support for BPF trampolines and to utilize bpf_prog_pack for the latter - Add the capability to offload the netfilter flowtable in XDP layer through kfuncs Driver API: - Allow users to configure IRQ tresholds between which automatic IRQ moderation can choose - Expand Power Sourcing (PoE) status with power, class and failure reason. Support setting power limits - Track additional RSS contexts in the core, make sure configuration changes don't break them - Support IPsec crypto offload for IPv6 ESP and IPv4 UDP-encapsulated ESP data paths - Support updating firmware on SFP modules Tests and tooling: - mptcp: use net/lib.sh to manage netns - TCP-AO and TCP-MD5: replace debug prints used by tests with tracepoints - openvswitch: make test self-contained (don't depend on OvS CLI tools) Drivers: - Ethernet high-speed NICs: - Broadcom (bnxt): - increase the max total outstanding PTP TX packets to 4 - add timestamping statistics support - implement netdev_queue_mgmt_ops - support new RSS context API - Intel (100G, ice, idpf): - implement FEC statistics and dumping signal quality indicators - support E825C products (with 56Gbps PHYs) - nVidia/Mellanox: - support HW-GRO - mlx4/mlx5: support per-queue statistics via netlink - obey the max number of EQs setting in sub-functions - AMD/Solarflare: - support new RSS context API - AMD/Pensando: - ionic: rework fix for doorbell miss to lower overhead and skip it on new HW - Wangxun: - txgbe: support Flow Director perfect filters - Ethernet NICs consumer, embedded and virtual: - Add driver for Tehuti Networks TN40xx chips - Add driver for Meta's internal NIC chips - Add driver for Ethernet MAC on Airoha EN7581 SoCs - Add driver for Renesas Ethernet-TSN devices - Google cloud vNIC: - flow steering support - Microsoft vNIC: - support page sizes other than 4KB on ARM64 - vmware vNIC: - support latency measurement (update to version 9) - VirtIO net: - support for Byte Queue Limits - support configuring thresholds for automatic IRQ moderation - support for AF_XDP Rx zero-copy - Synopsys (stmmac): - support for STM32MP13 SoC - let platforms select the right PCS implementation - TI: - icssg-prueth: add multicast filtering support - icssg-prueth: enable PTP timestamping and PPS - Renesas: - ravb: improve Rx performance 30-400% by using page pool, theaded NAPI and timer-based IRQ coalescing - ravb: add MII support for R-Car V4M - Cadence (macb): - macb: add ARP support to Wake-On-LAN - Cortina: - use phylib for RX and TX pause configuration - Ethernet switches: - nVidia/Mellanox: - support configuration of multipath hash seed - report more accurate max MTU - use page_pool to improve Rx performance - MediaTek: - mt7530: add support for bridge port isolation - Qualcomm: - qca8k: add support for bridge port isolation - Microchip: - lan9371/2: add 100BaseTX PHY support - NXP: - vsc73xx: implement VLAN operations - Ethernet PHYs: - aquantia: enable support for aqr115c - aquantia: add support for PHY LEDs - realtek: add support for rtl8224 2.5Gbps PHY - xpcs: add memory-mapped device support - add BroadR-Reach link mode and support in Broadcom's PHY driver - CAN: - add document for ISO 15765-2 protocol support - mcp251xfd: workaround for erratum DS80000789E, use timestamps to catch when device returns incorrect FIFO status - WiFi: - mac80211/cfg80211: - parse Transmit Power Envelope (TPE) data in mac80211 instead of in drivers - improvements for 6 GHz regulatory flexibility - multi-link improvements - support multiple radios per wiphy - remove DEAUTH_NEED_MGD_TX_PREP flag - Intel (iwlwifi): - bump FW API to 91 for BZ/SC devices - report 64-bit radiotap timestamp - enable P2P low latency by default - handle Transmit Power Envelope (TPE) advertised by AP - remove support for older FW for new devices - fast resume (keeping the device configured) - mvm: re-enable Multi-Link Operation (MLO) - aggregation (A-MSDU) optimizations - MediaTek (mt76): - mt7925 Multi-Link Operation (MLO) support - Qualcomm (ath10k): - LED support for various chipsets - Qualcomm (ath12k): - remove unsupported Tx monitor handling - support channel 2 in 6 GHz band - support Spatial Multiplexing Power Save (SMPS) in 6 GHz band - supprt multiple BSSID (MBSSID) and Enhanced Multi-BSSID Advertisements (EMA) - support dynamic VLAN - add panic handler for resetting the firmware state - DebugFS support for datapath statistics - WCN7850: support for Wake on WLAN - Microchip (wilc1000): - read MAC address during probe to make it visible to user space - suspend/resume improvements - TI (wl18xx): - support newer firmware versions - RealTek (rtw89): - preparation for RTL8852BE-VT support - Wake on WLAN support for WiFi 6 chips - 36-bit PCI DMA support - RealTek (rtlwifi): - RTL8192DU support - Broadcom (brcmfmac): - Management Frame Protection support (to enable WPA3) - Bluetooth: - qualcomm: use the power sequencer for QCA6390 - btusb: mediatek: add ISO data transmission functions - hci_bcm4377: add BCM4388 support - btintel: add support for BlazarU core - btintel: add support for Whale Peak2 - btnxpuart: add support for AW693 A1 chipset - btnxpuart: add support for IW615 chipset - btusb: add Realtek RTL8852BE support ID 0x13d3:0x3591" * tag 'net-next-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1589 commits) eth: fbnic: Fix spelling mistake "tiggerring" -> "triggering" tcp: Replace strncpy() with strscpy() wifi: ath12k: fix build vs old compiler tcp: Don't access uninit tcp_rsk(req)->ao_keyid in tcp_create_openreq_child(). eth: fbnic: Write the TCAM tables used for RSS control and Rx to host eth: fbnic: Add L2 address programming eth: fbnic: Add basic Rx handling eth: fbnic: Add basic Tx handling eth: fbnic: Add link detection eth: fbnic: Add initial messaging to notify FW of our presence eth: fbnic: Implement Rx queue alloc/start/stop/free eth: fbnic: Implement Tx queue alloc/start/stop/free eth: fbnic: Allocate a netdevice and napi vectors with queues eth: fbnic: Add FW communication mechanism eth: fbnic: Add message parsing for FW messages eth: fbnic: Add register init to set PCIe/Ethernet device config eth: fbnic: Allocate core device specific structures and devlink interface eth: fbnic: Add scaffolding for Meta's NIC driver PCI: Add Meta Platforms vendor ID net/sched: cls_flower: propagate tca[TCA_OPTIONS] to NL_REQ_ATTR_CHECK ...
2024-07-15Merge branch 'net-make-timestamping-selectable'Jakub Kicinski
First part of "net: Make timestamping selectable" from Kory Maincent. Change the driver-facing type already to lower rebasing pain. Link: https://lore.kernel.org/20240709-feature_ptp_netnext-v17-0-b5317f50df2a@bootlin.com/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-15net: Add struct kernel_ethtool_ts_infoKory Maincent
In prevision to add new UAPI for hwtstamp we will be limited to the struct ethtool_ts_info that is currently passed in fixed binary format through the ETHTOOL_GET_TS_INFO ethtool ioctl. It would be good if new kernel code already started operating on an extensible kernel variant of that structure, similar in concept to struct kernel_hwtstamp_config vs struct hwtstamp_config. Since struct ethtool_ts_info is in include/uapi/linux/ethtool.h, here we introduce the kernel-only structure in include/linux/ethtool.h. The manual copy is then made in the function called by ETHTOOL_GET_TS_INFO. Acked-by: Shannon Nelson <shannon.nelson@amd.com> Acked-by: Alexandra Winter <wintera@linux.ibm.com> Signed-off-by: Kory Maincent <kory.maincent@bootlin.com> Link: https://patch.msgid.link/20240709-feature_ptp_netnext-v17-6-b5317f50df2a@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-13Merge tag 'timers-v6.11-rc1' of ↵Thomas Gleixner
https://git.linaro.org/people/daniel.lezcano/linux into timers/core Pull clocksource/event driver updates from Daniel Lezcano: - Remove unnecessary local variables initialization as they will be initialized in the code path anyway right after on the ARM arch timer and the ARM global timer (Li kunyu) - Fix a race condition in the interrupt leading to a deadlock on the SH CMT driver. Note that this fix was not tested on the platform using this timer but the fix seems reasonable enough to be picked confidently (Niklas Söderlund) - Increase the rating of the gic-timer and use the configured width clocksource register on the MIPS architecture (Jiaxun Yang) - Add the DT bindings for the TMU on the Renesas platforms (Geert Uytterhoeven) - Add the DT bindings for the SOPHGO SG2002 clint on RiscV (Thomas Bonnefille) - Add the rtl-otto timer driver along with the DT bindings for the Realtek platform (Chris Packham) Link: https://lore.kernel.org/all/91cd05de-4c5d-4242-a381-3b8a4fe6a2a2@linaro.org
2024-07-11igc: Remove the internal 'eee_advert' fieldSasha Neftin
Since the kernel's 'ethtool_keee' structure is in use, the internal 'eee_advert' field becomes pointless and can be removed. This patch comes to clean up this redundant code. Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com> (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-07-11net: intel: Remove MODULE_AUTHORsTony Nguyen
We are moving away from the Sourceforge email address. Rather than removing or updating the email for the affected entries, remove the MODULE_AUTHOR altogether as its usage is incorrect [1]. Link: https://lore.kernel.org/netdev/20200626115236.7f36d379@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com/ [1] Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Acked-by: Alexander Lobakin <aleksander.lobakin@intel.com> # libeth, libie Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-06-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR. No conflicts, no adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-13Revert "igc: fix a log entry using uninitialized netdev"Sasha Neftin
This reverts commit 86167183a17e03ec77198897975e9fdfbd53cb0b. igc_ptp_init() needs to be called before igc_reset(), otherwise kernel crash could be observed. Following the corresponding discussion [1] and [2] revert this commit. Link: https://lore.kernel.org/all/8fb634f8-7330-4cf4-a8ce-485af9c0a61a@intel.com/ [1] Link: https://lore.kernel.org/all/87o78rmkhu.fsf@intel.com/ [2] Fixes: 86167183a17e ("igc: fix a log entry using uninitialized netdev") Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20240611162456.961631-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-10net: intel: Use *-y instead of *-objs in MakefileAndy Shevchenko
*-objs suffix is reserved rather for (user-space) host programs while usually *-y suffix is used for kernel drivers (although *-objs works for that purpose for now). Let's correct the old usages of *-objs in Makefiles. Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20240607-next-2024-06-03-intel-next-batch-v3-1-d1470cee3347@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-05igc: Fix Energy Efficient Ethernet support declarationSasha Neftin
The commit 01cf893bf0f4 ("net: intel: i40e/igc: Remove setting Autoneg in EEE capabilities") removed SUPPORTED_Autoneg field but left inappropriate ethtool_keee structure initialization. When "ethtool --show <device>" (get_eee) invoke, the 'ethtool_keee' structure was accidentally overridden. Remove the 'ethtool_keee' overriding and add EEE declaration as per IEEE specification that allows reporting Energy Efficient Ethernet capabilities. Examples: Before fix: ethtool --show-eee enp174s0 EEE settings for enp174s0: EEE status: not supported After fix: EEE settings for enp174s0: EEE status: disabled Tx LPI: disabled Supported EEE link modes: 100baseT/Full 1000baseT/Full 2500baseT/Full Fixes: 01cf893bf0f4 ("net: intel: i40e/igc: Remove setting Autoneg in EEE capabilities") Suggested-by: Dima Ruinskiy <dima.ruinskiy@intel.com> Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20240603-net-2024-05-30-intel-net-fixes-v2-6-e3563aa89b0c@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-03igc: Remove convert_art_ns_to_tsc()Thomas Gleixner
The core code now provides a mechanism to convert the ART base clock to the corresponding TSC value without requiring an architecture specific function. Replace the direct conversion by filling in the required data. No functional change intended. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Lakshmi Sowjanya D <lakshmi.sowjanya.d@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20240513103813.5666-5-lakshmi.sowjanya.d@intel.com
2024-05-20Merge tag 'dma-mapping-6.10-2024-05-20' of ↵Linus Torvalds
git://git.infradead.org/users/hch/dma-mapping Pull dma-mapping updates from Christoph Hellwig: - optimize DMA sync calls when they are no-ops (Alexander Lobakin) - fix swiotlb padding for untrusted devices (Michael Kelley) - add documentation for swiotb (Michael Kelley) * tag 'dma-mapping-6.10-2024-05-20' of git://git.infradead.org/users/hch/dma-mapping: dma: fix DMA sync for drivers not calling dma_set_mask*() xsk: use generic DMA sync shortcut instead of a custom one page_pool: check for DMA sync shortcut earlier page_pool: don't use driver-set flags field directly page_pool: make sure frag API fields don't span between cachelines iommu/dma: avoid expensive indirect calls for sync operations dma: avoid redundant calls for sync operations dma: compile-out DMA sync op calls when not used iommu/dma: fix zeroing of bounce buffer padding used by untrusted devices swiotlb: remove alloc_size argument to swiotlb_tbl_map_single() Documentation/core-api: add swiotlb documentation
2024-05-08igc: fix a log entry using uninitialized netdevCorinna Vinschen
During successful probe, igc logs this: [ 5.133667] igc 0000:01:00.0 (unnamed net_device) (uninitialized): PHC added ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ The reason is that igc_ptp_init() is called very early, even before register_netdev() has been called. So the netdev_info() call works on a partially uninitialized netdev. Fix this by calling igc_ptp_init() after register_netdev(), right after the media autosense check, just as in igb. Add a comment, just as in igb. Now the log message is fine: [ 5.200987] igc 0000:01:00.0 eth0: PHC added Signed-off-by: Corinna Vinschen <vinschen@redhat.com> Reviewed-by: Hariprasad Kelam <hkelam@marvell.com> Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-05-08xsk: use generic DMA sync shortcut instead of a custom oneAlexander Lobakin
XSk infra's been using its own DMA sync shortcut to try avoiding redundant function calls. Now that there is a generic one, remove the custom implementation and rely on the generic helpers. xsk_buff_dma_sync_for_cpu() doesn't need the second argument anymore, remove it. Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2024-05-07net: annotate writes on dev->mtu from ndo_change_mtu()Eric Dumazet
Simon reported that ndo_change_mtu() methods were never updated to use WRITE_ONCE(dev->mtu, new_mtu) as hinted in commit 501a90c94510 ("inet: protect against too small mtu values.") We read dev->mtu without holding RTNL in many places, with READ_ONCE() annotations. It is time to take care of ndo_change_mtu() methods to use corresponding WRITE_ONCE() Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Simon Horman <horms@kernel.org> Closes: https://lore.kernel.org/netdev/20240505144608.GB67882@kernel.org/ Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Reviewed-by: Simon Horman <horms@kernel.org> Acked-by: Shannon Nelson <shannon.nelson@amd.com> Link: https://lore.kernel.org/r/20240506102812.3025432-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-26igc: Add Tx hardware timestamp request for AF_XDP zero-copy packetSong Yoong Siang
This patch adds support to per-packet Tx hardware timestamp request to AF_XDP zero-copy packet via XDP Tx metadata framework. Please note that user needs to enable Tx HW timestamp capability via igc_ioctl() with SIOCSHWTSTAMP cmd before sending xsk Tx hardware timestamp request. Same as implementation in RX timestamp XDP hints kfunc metadata, Timer 0 (adjustable clock) is used in xsk Tx hardware timestamp. i225/i226 have four sets of timestamping registers. *skb and *xsk_tx_buffer pointers are used to indicate whether the timestamping register is already occupied. Furthermore, a boolean variable named xsk_pending_ts is used to hold the transmit completion until the tx hardware timestamp is ready. This is because, for i225/i226, the timestamp notification event comes some time after the transmit completion event. The driver will retrigger hardware irq to clean the packet after retrieve the tx hardware timestamp. Besides, xsk_meta is added into struct igc_tx_timestamp_request as a hook to the metadata location of the transmit packet. When the Tx timestamp interrupt is fired, the interrupt handler will copy the value of Tx hwts into metadata location via xsk_tx_metadata_complete(). This patch is tested with tools/testing/selftests/bpf/xdp_hw_metadata on Intel ADL-S platform. Below are the test steps and results. Test Step 1: Run xdp_hw_metadata app ./xdp_hw_metadata <iface> > /dev/shm/result.log Test Step 2: Enable Tx hardware timestamp hwstamp_ctl -i <iface> -t 1 -r 1 Test Step 3: Run ptp4l and phc2sys for time synchronization Test Step 4: Generate UDP packets with 1ms interval for 10s trafgen --dev <iface> '{eth(da=<addr>), udp(dp=9091)}' -t 1ms -n 10000 Test Step 5: Rerun Step 1-3 with 10s iperf3 as background traffic Test Step 6: Rerun Step 1-4 with 10s iperf3 as background traffic Based on iperf3 results below, the impact of holding tx completion to throughput is not observable. Result of last UDP packet (no. 10000) in Step 4: poll: 1 (0) skip=99 fail=0 redir=10000 xsk_ring_cons__peek: 1 0x5640a37972d0: rx_desc[9999]->addr=f2110 addr=f2110 comp_addr=f2110 EoP rx_hash: 0x2049BE1D with RSS type:0x1 HW RX-time: 1679819246792971268 (sec:1679819246.7930) delta to User RX-time sec:0.0000 (14.990 usec) XDP RX-time: 1679819246792981987 (sec:1679819246.7930) delta to User RX-time sec:0.0000 (4.271 usec) No rx_vlan_tci or rx_vlan_proto, err=-95 0x5640a37972d0: ping-pong with csum=ab19 (want 315b) csum_start=34 csum_offset=6 0x5640a37972d0: complete tx idx=9999 addr=f010 HW TX-complete-time: 1679819246793036971 (sec:1679819246.7930) delta to User TX-complete-time sec:0.0001 (77.656 usec) XDP RX-time: 1679819246792981987 (sec:1679819246.7930) delta to User TX-complete-time sec:0.0001 (132.640 usec) HW RX-time: 1679819246792971268 (sec:1679819246.7930) delta to HW TX-complete-time sec:0.0001 (65.703 usec) 0x5640a37972d0: complete rx idx=10127 addr=f2110 Result of iperf3 without tx hwts request in step 5: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 2.74 GBytes 2.36 Gbits/sec 0 sender [ 5] 0.00-10.05 sec 2.74 GBytes 2.34 Gbits/sec receiver Result of iperf3 running parallel with trafgen command in step 6: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 2.74 GBytes 2.36 Gbits/sec 0 sender [ 5] 0.00-10.04 sec 2.74 GBytes 2.34 Gbits/sec receiver Co-developed-by: Lai Peter Jun Ann <jun.ann.lai@intel.com> Signed-off-by: Lai Peter Jun Ann <jun.ann.lai@intel.com> Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20240424210256.3440903-1-anthony.l.nguyen@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-04-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR. Conflicts: drivers/net/ethernet/ti/icssg/icssg_prueth.c net/mac80211/chan.c 89884459a0b9 ("wifi: mac80211: fix idle calculation with multi-link") 87f5500285fb ("wifi: mac80211: simplify ieee80211_assign_link_chanctx()") https://lore.kernel.org/all/20240422105623.7b1fbda2@canb.auug.org.au/ net/unix/garbage.c 1971d13ffa84 ("af_unix: Suppress false-positive lockdep splat for spin_lock() in __unix_gc().") 4090fa373f0e ("af_unix: Replace garbage collection algorithm.") drivers/net/ethernet/ti/icssg/icssg_prueth.c drivers/net/ethernet/ti/icssg/icssg_common.c 4dcd0e83ea1d ("net: ti: icssg-prueth: Fix signedness bug in prueth_init_rx_chns()") e2dc7bfd677f ("net: ti: icssg-prueth: Move common functions into a separate file") No adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-24igc: Fix LED-related deadlock on driver unbindLukas Wunner
Roman reports a deadlock on unplug of a Thunderbolt docking station containing an Intel I225 Ethernet adapter. The root cause is that led_classdev's for LEDs on the adapter are registered such that they're device-managed by the netdev. That results in recursive acquisition of the rtnl_lock() mutex on unplug: When the driver calls unregister_netdev(), it acquires rtnl_lock(), then frees the device-managed resources. Upon unregistering the LEDs, netdev_trig_deactivate() invokes unregister_netdevice_notifier(), which tries to acquire rtnl_lock() again. Avoid by using non-device-managed LED registration. Stack trace for posterity: schedule+0x6e/0xf0 schedule_preempt_disabled+0x15/0x20 __mutex_lock+0x2a0/0x750 unregister_netdevice_notifier+0x40/0x150 netdev_trig_deactivate+0x1f/0x60 [ledtrig_netdev] led_trigger_set+0x102/0x330 led_classdev_unregister+0x4b/0x110 release_nodes+0x3d/0xb0 devres_release_all+0x8b/0xc0 device_del+0x34f/0x3c0 unregister_netdevice_many_notify+0x80b/0xaf0 unregister_netdev+0x7c/0xd0 igc_remove+0xd8/0x1e0 [igc] pci_device_remove+0x3f/0xb0 Fixes: ea578703b03d ("igc: Add support for LEDs on i225/i226") Reported-by: Roman Lozko <lozko.roma@gmail.com> Closes: https://lore.kernel.org/r/CAEhC_B=ksywxCG_+aQqXUrGEgKq+4mqnSV8EBHOKbC3-Obj9+Q@mail.gmail.com/ Reported-by: "Marek Marczykowski-Górecki" <marmarek@invisiblethingslab.com> Closes: https://lore.kernel.org/r/ZhRD3cOtz5i-61PB@mail-itl/ Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: Lukas Wunner <lukas@wunner.de> Cc: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Tested-by: Kurt Kanzenbach <kurt@linutronix.de> # Intel i225 Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20240422204503.225448-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-08igc: Remove redundant runtime resume for ethtool opsBjorn Helgaas
8c5ad0dae93c ("igc: Add ethtool support") added ethtool_ops.begin() and .complete(), which used pm_runtime_get_sync() to resume suspended devices before any ethtool_ops callback and allow suspend after it completed. Subsequently, f32a21376573 ("ethtool: runtime-resume netdev parent before ethtool ioctl ops") added pm_runtime_get_sync() in the dev_ethtool() path, so the device is resumed before any ethtool_ops callback even if the driver didn't supply a .begin() callback. Remove the .begin() and .complete() callbacks, which are now redundant because dev_ethtool() already resumes the device. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-29igc: Refactor runtime power management flowSasha Neftin
Following the corresponding discussion [1] and [2] refactor the 'igc_open' method and avoid taking the rtnl_lock() during the 'igc_resume' method. The rtnl_lock is held by the upper layer and could lead to the deadlock during resuming from a runtime power management flow. Notify the stack of the actual queue counts 'netif_set_real_num_*_queues' outside the '_igc_open' wrapper. This notification doesn't have to be called on each resume. Test: 1. Disconnect the ethernet cable 2. Enable the runtime power management via file system: echo auto > /sys/devices/pci0000\.../power/control 3. Check the device state (lspci -s <device> -vvv | grep -i Status) Link: https://lore.kernel.org/netdev/20231206113934.8d7819857574.I2deb5804 ef1739a2af307283d320ef7d82456494@changeid/#r [1] Link: https://lore.kernel.org/netdev/20211125074949.5f897431@kicinski-fedo ra-pc1c0hjn.dhcp.thefacebook.com/t/ [2] Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-29net: intel: implement modern PM ops declarationsJesse Brandeburg
Switch the Intel networking drivers to use the new power management ops declaration formats and macros, which allows us to drop __maybe_unused, as well as a bunch of ifdef checking CONFIG_PM. This is safe to do because the compiler drops the unused functions, verified by checking for any of the power management function symbols being present in System.map for a build without CONFIG_PM. If a driver has runtime PM, define the ops with pm_ptr(), and if the driver has Simple PM, use pm_sleep_ptr(), as well as the new versions of the macros for declaring the members of the pm_ops structs. Checked with network-enabled allnoconfig, allyesconfig, allmodconfig on x64_64. Reviewed-by: Alan Brady <alan.brady@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-28net: remove gfp_mask from napi_alloc_skb()Jakub Kicinski
__napi_alloc_skb() is napi_alloc_skb() with the added flexibility of choosing gfp_mask. This is a NAPI function, so GFP_ATOMIC is implied. The only practical choice the caller has is whether to set __GFP_NOWARN. But that's a false choice, too, allocation failures in atomic context will happen, and printing warnings in logs, effectively for a packet drop, is both too much and very likely non-actionable. This leads me to a conclusion that most uses of napi_alloc_skb() are simply misguided, and should use __GFP_NOWARN in the first place. We also have a "standard" way of reporting allocation failures via the queue stat API (qstats::rx-alloc-fail). The direct motivation for this patch is that one of the drivers used at Meta calls napi_alloc_skb() (so prior to this patch without __GFP_NOWARN), and the resulting OOM warning is the top networking warning in our fleet. Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20240327040213.3153864-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-25igc: Remove stale comment about Tx timestampingKurt Kanzenbach
The initial igc Tx timestamping implementation used only one register for retrieving Tx timestamps. Commit 3ed247e78911 ("igc: Add support for multiple in-flight TX timestamps") added support for utilizing all four of them e.g., for multiple domain support. Remove the stale comment/FIXME. Fixes: 3ed247e78911 ("igc: Add support for multiple in-flight TX timestamps") Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-11Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Merge in late fixes to prepare for the 6.9 net-next PR. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-07Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR. No conflicts. Adjacent changes: net/core/page_pool_user.c 0b11b1c5c320 ("netdev: let netlink core handle -EMSGSIZE errors") 429679dcf7d9 ("page_pool: fix netlink dump stop/resume") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-06igc: Fix missing time sync eventsVinicius Costa Gomes
Fix "double" clearing of interrupts, which can cause external events or timestamps to be missed. The IGC_TSIRC Time Sync Interrupt Cause register can be cleared in two ways, by either reading it or by writing '1' into the specific cause bit. This is documented in section 8.16.1. The following flow was used: 1. read IGC_TSIRC into 'tsicr'; 2. handle the interrupts present in 'tsirc' and mark them in 'ack'; 3. write 'ack' into IGC_TSICR; As both (1) and (3) will clear the interrupt cause, if the same interrupt happens again between (1) and (3) it will be ignored, causing events to be missed. Remove the extra clear in (3). Fixes: 2c344ae24501 ("igc: Add support for TX timestamping") Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Tested-by: Kurt Kanzenbach <kurt@linutronix.de> # Intel i225 Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-05igc: avoid returning frame twice in XDP_REDIRECTFlorian Kauer
When a frame can not be transmitted in XDP_REDIRECT (e.g. due to a full queue), it is necessary to free it by calling xdp_return_frame_rx_napi. However, this is the responsibility of the caller of the ndo_xdp_xmit (see for example bq_xmit_all in kernel/bpf/devmap.c) and thus calling it inside igc_xdp_xmit (which is the ndo_xdp_xmit of the igc driver) as well will lead to memory corruption. In fact, bq_xmit_all expects that it can return all frames after the last successfully transmitted one. Therefore, break for the first not transmitted frame, but do not call xdp_return_frame_rx_napi in igc_xdp_xmit. This is equally implemented in other Intel drivers such as the igb. There are two alternatives to this that were rejected: 1. Return num_frames as all the frames would have been transmitted and release them inside igc_xdp_xmit. While it might work technically, it is not what the return value is meant to represent (i.e. the number of SUCCESSFULLY transmitted packets). 2. Rework kernel/bpf/devmap.c and all drivers to support non-consecutively dropped packets. Besides being complex, it likely has a negative performance impact without a significant gain since it is anyway unlikely that the next frame can be transmitted if the previous one was dropped. The memory corruption can be reproduced with the following script which leads to a kernel panic after a few seconds. It basically generates more traffic than a i225 NIC can transmit and pushes it via XDP_REDIRECT from a virtual interface to the physical interface where frames get dropped. #!/bin/bash INTERFACE=enp4s0 INTERFACE_IDX=`cat /sys/class/net/$INTERFACE/ifindex` sudo ip link add dev veth1 type veth peer name veth2 sudo ip link set up $INTERFACE sudo ip link set up veth1 sudo ip link set up veth2 cat << EOF > redirect.bpf.c SEC("prog") int redirect(struct xdp_md *ctx) { return bpf_redirect($INTERFACE_IDX, 0); } char _license[] SEC("license") = "GPL"; EOF clang -O2 -g -Wall -target bpf -c redirect.bpf.c -o redirect.bpf.o sudo ip link set veth2 xdp obj redirect.bpf.o cat << EOF > pass.bpf.c SEC("prog") int pass(struct xdp_md *ctx) { return XDP_PASS; } char _license[] SEC("license") = "GPL"; EOF clang -O2 -g -Wall -target bpf -c pass.bpf.c -o pass.bpf.o sudo ip link set $INTERFACE xdp obj pass.bpf.o cat << EOF > trafgen.cfg { /* Ethernet Header */ 0xe8, 0x6a, 0x64, 0x41, 0xbf, 0x46, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, const16(ETH_P_IP), /* IPv4 Header */ 0b01000101, 0, # IPv4 version, IHL, TOS const16(1028), # IPv4 total length (UDP length + 20 bytes (IP header)) const16(2), # IPv4 ident 0b01000000, 0, # IPv4 flags, fragmentation off 64, # IPv4 TTL 17, # Protocol UDP csumip(14, 33), # IPv4 checksum /* UDP Header */ 10, 0, 1, 1, # IP Src - adapt as needed 10, 0, 1, 2, # IP Dest - adapt as needed const16(6666), # UDP Src Port const16(6666), # UDP Dest Port const16(1008), # UDP length (UDP header 8 bytes + payload length) csumudp(14, 34), # UDP checksum /* Payload */ fill('W', 1000), } EOF sudo trafgen -i trafgen.cfg -b3000MB -o veth1 --cpp Fixes: 4ff320361092 ("igc: Add support for XDP_REDIRECT action") Signed-off-by: Florian Kauer <florian.kauer@linutronix.de> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-04eth: igc: remove unused embedded struct net_deviceJakub Kicinski
struct net_device poll_dev in struct igc_q_vector was added in one of the initial commits, but never used. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-04net: adopt skb_network_offset() and similar helpersEric Dumazet
This is a cleanup patch, making code a bit more concise. 1) Use skb_network_offset(skb) in place of (skb_network_header(skb) - skb->data) 2) Use -skb_network_offset(skb) in place of (skb->data - skb_network_header(skb)) 3) Use skb_transport_offset(skb) in place of (skb_transport_header(skb) - skb->data) 4) Use skb_inner_transport_offset(skb) in place of (skb_inner_transport_header(skb) - skb->data) Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Edward Cree <ecree.xilinx@gmail.com> # for sfc Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-28net: intel: igc: Use linkmode helpers for EEEAndrew Lunn
Make use of the existing linkmode helpers for converting PHY EEE register values into links modes, now that ethtool_keee uses link modes, rather than u32 values. Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-28net: intel: i40e/igc: Remove setting Autoneg in EEE capabilitiesAndrew Lunn
Energy Efficient Ethernet should always be negotiated with the link peer. Don't include SUPPORTED_Autoneg in the results of get_eee() for supported, advertised or lp_advertised, since it is assumed. Additionally, ethtool(1) ignores the set bit, and no other driver sets this. Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR. No conflicts. Adjacent changes: net/core/dev.c 9f30831390ed ("net: add rcu safety to rtnl_prop_list_size()") 723de3ebef03 ("net: free altname using an RCU callback") net/unix/garbage.c 11498715f266 ("af_unix: Remove io_uring code for GC.") 25236c91b5ab ("af_unix: Fix task hung while purging oob_skb in GC.") drivers/net/ethernet/renesas/ravb_main.c ed4adc07207d ("net: ravb: Count packets instead of descriptors in GbEth RX path" ) c2da9408579d ("ravb: Add Rx checksum offload support for GbEth") net/mptcp/protocol.c bdd70eb68913 ("mptcp: drop the push_pending field") 28e5c1380506 ("mptcp: annotate lockless accesses around read-mostly fields") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-15igc: Add support for LEDs on i225/i226Kurt Kanzenbach
Add support for LEDs on i225/i226. The LEDs can be controlled via sysfs from user space using the netdev trigger. The LEDs are named as igc-<bus><device>-<led> to be easily identified. Offloading link speed and activity are supported. Other modes are simulated in software by using on/off. Tested on Intel i225. Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20240213184138.1483968-1-anthony.l.nguyen@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-14igc: Remove temporary workaroundSasha Neftin
PHY_CONTROL register works as defined in the IEEE 802.3 specification (IEEE 802.3-2008 22.2.4.1). Tidy up the temporary workaround. User impact: PHY can now be powered down when the ethernet link is down. Testing hints: ip link set down <device> (or just disconnect the ethernet cable). Oldest tested NVM version is: 1045:740. Fixes: 5586838fe9ce ("igc: Add code for PHY support") Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-02-07igc: Unify filtering rule fieldsKurt Kanzenbach
All filtering parameters such as EtherType and VLAN TCI are stored in host byte order except for the VLAN EtherType. Unify it. Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-02-07igc: Use netdev printing functions for flex filtersKurt Kanzenbach
All igc filter implementations use netdev_*() printing functions except for the flex filters. Unify it. Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-02-07igc: Use reverse xmas treeKurt Kanzenbach
Use reverse xmas tree coding style convention in igc_add_flex_filter(). Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-01-31ethtool: add suffix _u32 to legacy bitmap members of struct ethtool_keeeHeiner Kallweit
This is in preparation of using the existing names for linkmode bitmaps. Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>