Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
Mellanox, mlx5 fixes 2020-01-24
This series introduces some fixes to mlx5 driver.
Please pull and let me know if there is any problem.
Merge conflict: once merge with net-next, a contextual conflict will
appear in drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
since the code moved in net-next.
To resolve, just delete ALL of the conflicting hunk from net.
So sorry for the small mess ..
For -stable v5.4:
('net/mlx5: Update the list of the PCI supported devices')
('net/mlx5: Fix lowest FDB pool size')
('net/mlx5e: kTLS, Fix corner-case checks in TX resync flow')
('net/mlx5e: kTLS, Do not send decrypted-marked SKBs via non-accel path')
('net/mlx5: Eswitch, Prevent ingress rate configuration of uplink rep')
('net/mlx5e: kTLS, Remove redundant posts in TX resync flow')
('net/mlx5: DR, Enable counter on non-fwd-dest objects')
('net/mlx5: DR, use non preemptible call to get the current cpu number')
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The cxgb3 driver for "Chelsio T3-based gigabit and 10Gb Ethernet
adapters" implements a custom ioctl as SIOCCHIOCTL/SIOCDEVPRIVATE in
cxgb_extension_ioctl().
One of the subcommands of the ioctl is CHELSIO_GET_MEM, which appears
to read memory directly out of the adapter and return it to userspace.
It's not entirely clear what the contents of the adapter memory
contains, but the assumption is that it shouldn't be accessible to all
users.
So add a CAP_NET_ADMIN check to the CHELSIO_GET_MEM case. Put it after
the is_offload() check, which matches two of the other subcommands in
the same function which also check for is_offload() and CAP_NET_ADMIN.
Found by Ilja by code inspection, not tested as I don't have the
required hardware.
Reported-by: Ilja Van Sprundel <ivansprundel@ioactive.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Before commit 7587935cfa11 ("net: bcmgenet: move NAPI initialization to
ring initialization") moved the code, this used to be
netif_tx_napi_add(), but we lost that small semantic change in the
process, restore that.
Fixes: 7587935cfa11 ("net: bcmgenet: move NAPI initialization to ring initialization")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Doug Berger <opendmb@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use generic device API to get phy mode to fix probe failure
with ACPI based devices.
Signed-off-by: Ajay Gupta <ajayg@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When TCP out-of-order is identified (unexpected tcp seq mismatch), driver
analyzes the packet and decides what handling should it get:
1. go to accelerated path (to be encrypted in HW),
2. go to regular xmit path (send w/o encryption),
3. drop.
Packets marked with skb->decrypted by the TLS stack in the TX flow skips
SW encryption, and rely on the HW offload.
Verify that such packets are never sent un-encrypted on the wire.
Add a WARN to catch such bugs, and prefer dropping the packet in these cases.
Fixes: 46a3ea98074e ("net/mlx5e: kTLS, Enhance TX resync flow")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Reviewed-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
The call to tx_post_resync_params() is done earlier in the flow,
the post of the control WQEs is unnecessarily repeated. Remove it.
Fixes: 700ec4974240 ("net/mlx5e: kTLS, Fix missing SQ edge fill")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Reviewed-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
There are the following cases:
1. Packet ends before start marker: bypass offload.
2. Packet starts before start marker and ends after it: drop,
not supported, breaks contract with kernel.
3. packet ends before tls record info starts: drop,
this packet was already acknowledged and its record info
was released.
Add the above as comment in code.
Mind possible wraparounds of the TCP seq, replace the simple comparison
with a call to the TCP before() method.
In addition, remove logic that handles negative sync_len values,
as it became impossible.
Fixes: d2ead1f360e8 ("net/mlx5e: Add kTLS TX HW offload support")
Fixes: 46a3ea98074e ("net/mlx5e: kTLS, Enhance TX resync flow")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Reviewed-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Currently VF in LEGACY mode are not able to go up. Also in OFFLOADS
mode, when switching to it first time, VF can go up independently to
his representor, which is not expected.
Perform clearing of VF config when switching modes and set link state
to AUTO as default value. Also, when switching to OFFLOADS mode set
link state to DOWN, which allow VF link state to be controlled by its
REP.
Fixes: 1ab2068a4c66 ("net/mlx5: Implement vports admin state backup/restore")
Fixes: 556b9d16d3f5 ("net/mlx5: Clear VF's configuration on disabling SRIOV")
Signed-off-by: Dmytro Linkin <dmitrolin@mellanox.com>
Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Use raw_smp_processor_id instead of smp_processor_id() otherwise we will
get the following trace in debug-kernel:
BUG: using smp_processor_id() in preemptible [00000000] code: devlink
caller is dr_create_cq.constprop.2+0x31d/0x970 [mlx5_core]
Call Trace:
dump_stack+0x9a/0xf0
debug_smp_processor_id+0x1f3/0x200
dr_create_cq.constprop.2+0x31d/0x970
genl_family_rcv_msg+0x5fd/0x1170
genl_rcv_msg+0xb8/0x160
netlink_rcv_skb+0x11e/0x340
Fixes: 297cccebdc5a ("net/mlx5: DR, Expose an internal API to issue RDMA operations")
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Since the implementation relies on limiting the VF transmit rate to
simulate ingress rate limiting, and since either uplink representor or
ecpf are not associated with a VF, we limit the rate limit configuration
for those ports.
Fixes: fcb64c0f5640 ("net/mlx5: E-Switch, add ingress rate support")
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
The current code handles only counters that attached to dest, we still
have the cases where we have counter on non-dest, like over drop etc.
Fixes: 6a48faeeca10 ("net/mlx5: Add direct rule fs_cmd implementation")
Signed-off-by: Hamdan Igbaria <hamdani@mellanox.com>
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Add the upcoming ConnectX-7 device ID.
Fixes: 85327a9c4150 ("net/mlx5: Update the list of the PCI supported devices")
Signed-off-by: Meir Lichtinger <meirl@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
The pool sizes represent the pool sizes in the fw. when we request
a pool size from fw, it will return the next possible group.
We track how many pools the fw has left and start requesting groups
from the big to the small.
When we start request 4k group, which doesn't exists in fw, fw
wants to allocate the next possible size, 64k, but will fail since
its exhausted. The correct smallest pool size in fw is 128 and not 4k.
Fixes: e52c28024008 ("net/mlx5: E-Switch, Add chains and priorities")
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
if seq_file .next fuction does not change position index,
read after some lseek can generate unexpected output.
https://bugzilla.kernel.org/show_bug.cgi?id=206283
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
if seq_file .next fuction does not change position index,
read after some lseek can generate unexpected output.
https://bugzilla.kernel.org/show_bug.cgi?id=206283
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There is a spelling mistake in a hw_dbg message. Fix it.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Section 5.5.3.2 of the datasheet says,
If FIFO Underrun, Byte Count Mismatch, Excessive Collision, or
Excessive Deferral (if enabled) errors occur, transmission ceases.
In this situation, the chip asserts a TXER interrupt rather than TXDN.
But the handler for the TXDN is the only way that the transmit queue
gets restarted. Hence, an aborted transmission can result in a watchdog
timeout.
This problem can be reproduced on congested link, as that can result in
excessive transmitter collisions. Another way to reproduce this is with
a FIFO Underrun, which may be caused by DMA latency.
In event of a TXER interrupt, prevent a watchdog timeout by restarting
transmission.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Section 4.3.1 of the datasheet says,
This bit [TXP] must not be set if a Load CAM operation is in
progress (LCAM is set). The SONIC will lock up if both bits are
set simultaneously.
Testing has shown that the driver sometimes attempts to set LCAM
while TXP is set. Avoid this by waiting for command completion
before and after giving the LCAM command.
After issuing the Load CAM command, poll for !SONIC_CR_LCAM rather than
SONIC_INT_LCD, because the SONIC_CR_TXP bit can't be used until
!SONIC_CR_LCAM.
When in reset mode, take the opportunity to reset the CAM Enable
register.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There are several issues relating to command register usage during
chip initialization.
Firstly, the SONIC sometimes comes out of software reset with the
Start Timer bit set. This gets logged as,
macsonic macsonic eth0: sonic_init: status=24, i=101
Avoid this by giving the Stop Timer command earlier than later.
Secondly, the loop that waits for the Read RRA command to complete has
the break condition inverted. That's why the for loop iterates until
its termination condition. Call the helper for this instead.
Finally, give the Receiver Enable command after clearing interrupts,
not before, to avoid the possibility of losing an interrupt.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Make sure the SONIC's DMA engine is idle before altering the transmit
and receive descriptors. Add a helper for this as it will be needed
again.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
As soon as the driver is finished with a receive buffer it allocs a new
one and overwrites the corresponding RRA entry with a new buffer pointer.
Problem is, the buffer pointer is split across two word-sized registers.
It can't be updated in one atomic store. So this operation races with the
chip while it stores received packets and advances its RRP register.
This could result in memory corruption by a DMA write.
Avoid this problem by adding buffers only at the location given by the
RWP register, in accordance with the National Semiconductor datasheet.
Re-factor this code into separate functions to calculate a RRA pointer
and to update the RWP.
Fixes: efcce839360f ("[PATCH] macsonic/jazzsonic network drivers update")
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
After sonic_tx_timeout() calls sonic_init(), it can happen that
sonic_rx() will subsequently encounter a receive descriptor with no
flags set. Remove the comment that says that this can't happen.
When giving a receive descriptor to the SONIC, clear the descriptor
status field. That way, any rx descriptor with flags set can only be
a newly received packet.
Don't process a descriptor without the LPKT bit set. The buffer is
still in use by the SONIC.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The while loop in sonic_rx() traverses the rx descriptor ring. It stops
when it reaches a descriptor that the SONIC has not used. Each iteration
advances the EOL flag so the SONIC can keep using more descriptors.
Therefore, the while loop has no definite termination condition.
The algorithm described in the National Semiconductor literature is quite
different. It consumes descriptors up to the one with its EOL flag set
(which will also have its "in use" flag set). All freed descriptors are
then returned to the ring at once, by adjusting the EOL flags (and link
pointers).
Adopt the algorithm from datasheet as it's simpler, terminates quickly
and avoids a lot of pointless descriptor EOL flag changes.
Fixes: efcce839360f ("[PATCH] macsonic/jazzsonic network drivers update")
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The SONIC can sometimes advance its rx buffer pointer (RRP register)
without advancing its rx descriptor pointer (CRDA register). As a result
the index of the current rx descriptor may not equal that of the current
rx buffer. The driver mistakenly assumes that they are always equal.
This assumption leads to incorrect packet lengths and possible packet
duplication. Avoid this by calling a new function to locate the buffer
corresponding to a given descriptor.
Fixes: efcce839360f ("[PATCH] macsonic/jazzsonic network drivers update")
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The tx_aborted_errors statistic should count packets flagged with EXD,
EXC, FU, or BCM bits because those bits denote an aborted transmission.
That corresponds to the bitmask 0x0446, not 0x0642. Use macros for these
constants to avoid mistakes. Better to leave out FIFO Underruns (FU) as
there's a separate counter for that purpose.
Don't lump all these errors in with the general tx_errors counter as
that's used for tx timeout events.
On the rx side, don't count RDE and RBAE interrupts as dropped packets.
These interrupts don't indicate a lost packet, just a lack of resources.
When a lack of resources results in a lost packet, this gets reported
in the rx_missed_errors counter (along with RFO events).
Don't double-count rx_frame_errors and rx_crc_errors.
Don't use the general rx_errors counter for events that already have
special counters.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The driver accesses descriptor memory which is simultaneously accessed by
the chip, so the compiler must not be allowed to re-order CPU accesses.
sonic_buf_get() used 'volatile' to prevent that. sonic_buf_put() should
have done so too but was overlooked.
Fixes: efcce839360f ("[PATCH] macsonic/jazzsonic network drivers update")
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The chip can change a packet's descriptor status flags at any time.
However, an active interrupt flag gets cleared rather late. This
allows a race condition that could theoretically lose an interrupt.
Fix this by clearing asserted interrupt flags immediately.
Fixes: efcce839360f ("[PATCH] macsonic/jazzsonic network drivers update")
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The netif_stop_queue() call in sonic_send_packet() races with the
netif_wake_queue() call in sonic_interrupt(). This causes issues
like "NETDEV WATCHDOG: eth0 (macsonic): transmit queue 0 timed out".
Fix this by disabling interrupts when accessing tx_skb[] and next_tx.
Update a comment to clarify the synchronization properties.
Fixes: efcce839360f ("[PATCH] macsonic/jazzsonic network drivers update")
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
As the only 10G PHY interface type defined at the moment the code
was developed was XGMII, although the PHY interface mode used was
not XGMII, XGMII was used in the code to denote 10G. This patch
renames the 10G interface mode to remove the ambiguity.
Signed-off-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When fsl,erratum-a011043 is set, adjust for erratum A011043:
MDIO reads to internal PCS registers may result in having
the MDIO_CFG[MDIO_RD_ER] bit set, even when there is no
error and read data (MDIO_DATA[MDIO_DATA]) is correct.
Software may get false read error when reading internal
PCS registers through MDIO. As a workaround, all internal
MDIO accesses should ignore the MDIO_CFG[MDIO_RD_ER] bit.
Signed-off-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Driver while collecting firmware dump takes longer time to
collect/process some of the firmware dump entries/memories.
Bigger capture masks makes it worse as it results in larger
amount of data being collected and results in CPU soft lockup.
Place cond_resched() in some of the driver flows that are
expectedly time consuming to relinquish the CPU to avoid CPU
soft lockup panic.
Signed-off-by: Shahed Shaikh <shshaikh@marvell.com>
Tested-by: Yonggen Xu <Yonggen.Xu@dell.com>
Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
During reload (or module unload), the router block is de-initialized.
Among other things, this results in the removal of a default multicast
route from each active virtual router (VRF). These default routes are
configured during initialization to trap packets to the CPU. In
Spectrum-2, unlike Spectrum-1, multicast routes are implemented using
ACL rules.
Since the router block is de-initialized before the ACL block, it is
possible that the ACL rules corresponding to the default routes are
deleted while being accessed by the ACL delayed work that queries rules'
activity from the device. This can result in a rare use-after-free [1].
Fix this by protecting the rules list accessed by the delayed work with
a lock. We cannot use a spinlock as the activity read operation is
blocking.
[1]
[ 123.331662] ==================================================================
[ 123.339920] BUG: KASAN: use-after-free in mlxsw_sp_acl_rule_activity_update_work+0x330/0x3b0
[ 123.349381] Read of size 8 at addr ffff8881f3bb4520 by task kworker/0:2/78
[ 123.357080]
[ 123.358773] CPU: 0 PID: 78 Comm: kworker/0:2 Not tainted 5.5.0-rc5-custom-33108-gf5df95d3ef41 #2209
[ 123.368898] Hardware name: Mellanox Technologies Ltd. MSN3700C/VMOD0008, BIOS 5.11 10/10/2018
[ 123.378456] Workqueue: mlxsw_core mlxsw_sp_acl_rule_activity_update_work
[ 123.385970] Call Trace:
[ 123.388734] dump_stack+0xc6/0x11e
[ 123.392568] print_address_description.constprop.4+0x21/0x340
[ 123.403236] __kasan_report.cold.8+0x76/0xb1
[ 123.414884] kasan_report+0xe/0x20
[ 123.418716] mlxsw_sp_acl_rule_activity_update_work+0x330/0x3b0
[ 123.444034] process_one_work+0xb06/0x19a0
[ 123.453731] worker_thread+0x91/0xe90
[ 123.467348] kthread+0x348/0x410
[ 123.476847] ret_from_fork+0x24/0x30
[ 123.480863]
[ 123.482545] Allocated by task 73:
[ 123.486273] save_stack+0x19/0x80
[ 123.490000] __kasan_kmalloc.constprop.6+0xc1/0xd0
[ 123.495379] mlxsw_sp_acl_rule_create+0xa7/0x230
[ 123.500566] mlxsw_sp2_mr_tcam_route_create+0xf6/0x3e0
[ 123.506334] mlxsw_sp_mr_tcam_route_create+0x5b4/0x820
[ 123.512102] mlxsw_sp_mr_table_create+0x3b5/0x690
[ 123.517389] mlxsw_sp_vr_get+0x289/0x4d0
[ 123.521797] mlxsw_sp_fib_node_get+0xa2/0x990
[ 123.526692] mlxsw_sp_router_fib4_event_work+0x54c/0x2d60
[ 123.532752] process_one_work+0xb06/0x19a0
[ 123.537352] worker_thread+0x91/0xe90
[ 123.541471] kthread+0x348/0x410
[ 123.545103] ret_from_fork+0x24/0x30
[ 123.549113]
[ 123.550795] Freed by task 518:
[ 123.554231] save_stack+0x19/0x80
[ 123.557958] __kasan_slab_free+0x125/0x170
[ 123.562556] kfree+0xd7/0x3a0
[ 123.565895] mlxsw_sp_acl_rule_destroy+0x63/0xd0
[ 123.571081] mlxsw_sp2_mr_tcam_route_destroy+0xd5/0x130
[ 123.576946] mlxsw_sp_mr_tcam_route_destroy+0xba/0x260
[ 123.582714] mlxsw_sp_mr_table_destroy+0x1ab/0x290
[ 123.588091] mlxsw_sp_vr_put+0x1db/0x350
[ 123.592496] mlxsw_sp_fib_node_put+0x298/0x4c0
[ 123.597486] mlxsw_sp_vr_fib_flush+0x15b/0x360
[ 123.602476] mlxsw_sp_router_fib_flush+0xba/0x470
[ 123.607756] mlxsw_sp_vrs_fini+0xaa/0x120
[ 123.612260] mlxsw_sp_router_fini+0x137/0x384
[ 123.617152] mlxsw_sp_fini+0x30a/0x4a0
[ 123.621374] mlxsw_core_bus_device_unregister+0x159/0x600
[ 123.627435] mlxsw_devlink_core_bus_device_reload_down+0x7e/0xb0
[ 123.634176] devlink_reload+0xb4/0x380
[ 123.638391] devlink_nl_cmd_reload+0x610/0x700
[ 123.643382] genl_rcv_msg+0x6a8/0xdc0
[ 123.647497] netlink_rcv_skb+0x134/0x3a0
[ 123.651904] genl_rcv+0x29/0x40
[ 123.655436] netlink_unicast+0x4d4/0x700
[ 123.659843] netlink_sendmsg+0x7c0/0xc70
[ 123.664251] __sys_sendto+0x265/0x3c0
[ 123.668367] __x64_sys_sendto+0xe2/0x1b0
[ 123.672773] do_syscall_64+0xa0/0x530
[ 123.676892] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 123.682552]
[ 123.684238] The buggy address belongs to the object at ffff8881f3bb4500
[ 123.684238] which belongs to the cache kmalloc-128 of size 128
[ 123.698261] The buggy address is located 32 bytes inside of
[ 123.698261] 128-byte region [ffff8881f3bb4500, ffff8881f3bb4580)
[ 123.711303] The buggy address belongs to the page:
[ 123.716682] page:ffffea0007ceed00 refcount:1 mapcount:0 mapping:ffff888236403500 index:0x0
[ 123.725958] raw: 0200000000000200 dead000000000100 dead000000000122 ffff888236403500
[ 123.734646] raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000
[ 123.743315] page dumped because: kasan: bad access detected
[ 123.749562]
[ 123.751241] Memory state around the buggy address:
[ 123.756620] ffff8881f3bb4400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 123.764716] ffff8881f3bb4480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 123.772812] >ffff8881f3bb4500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 123.780904] ^
[ 123.785697] ffff8881f3bb4580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 123.793793] ffff8881f3bb4600: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 123.801883] ==================================================================
Fixes: cf7221a4f5a5 ("mlxsw: spectrum_router: Add Multicast routing support for Spectrum-2")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
A queue can't belong to multiple traffic classes. So, reject
any such configuration that results in overlapped queues for a
traffic class.
Fixes: b1396c2bd675 ("cxgb4: parse and configure TC-MQPRIO offload")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
T6 can support 2 egress traffic management channels per port to
double the total number of traffic classes that can be configured.
In this configuration, if the class belongs to the other channel,
then all the queues must be bound again explicitly to the new class,
for the rate limit parameters on the other channel to take effect.
So, always explicitly bind all queues to the port rate limit traffic
class, regardless of the traffic management channel that it belongs
to. Also, only bind queues to port rate limit traffic class, if all
the queues don't already belong to an existing different traffic
class.
Fixes: 4ec4762d8ec6 ("cxgb4: add TC-MATCHALL classifier egress offload")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
DSN read can fail, for example on a kdump kernel without PCIe extended
config space support. If DSN read fails, don't set the
BNXT_FLAG_DSN_VALID flag and continue loading. Check the flag
to see if the stored DSN is valid before using it. Only VF reps
creation should fail without valid DSN.
Fixes: 03213a996531 ("bnxt: move bp->switch_id initialization to PF probe")
Reported-by: Marc Smith <msmith626@gmail.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix bnxt_fltr_match() to match ipv6 source and destination addresses.
The function currently only checks ipv4 addresses and will not work
corrently on ipv6 filters.
Fixes: c0c050c58d84 ("bnxt_en: New Broadcom ethernet driver.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The NTUPLE related firmware commands are sent to the wrong firmware
channel, causing all these commands to fail on new firmware that
supports the new firmware channel. Fix it by excluding the 3
NTUPLE firmware commands from the list for the new firmware channel.
Fixes: 760b6d33410c ("bnxt_en: Add support for 2nd firmware message channel.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We would not be transmitting using the correct SYSTEMPORT transmit queue
during ndo_select_queue() which looks up the internal TX ring map
because while establishing the mapping we would be off by 4, so for
instance, when we populate switch port mappings we would be doing:
switch port 0, queue 0 -> ring index #0
switch port 0, queue 1 -> ring index #1
...
switch port 0, queue 3 -> ring index #3
switch port 1, queue 0 -> ring index #8 (4 + 4 * 1)
...
instead of using ring index #4. This would cause our ndo_select_queue()
to use the fallback queue mechanism which would pick up an incorrect
ring for that switch port. Fix this by using the correct switch queue
number instead of SYSTEMPORT queue number.
Fixes: 25c440704661 ("net: systemport: Simplify queue mapping logic")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When there is not enough memory and napi_alloc_skb() return NULL,
the HNS driver will print error message, and than try again, if
the memory is not enough for a while, huge error message and the
retry operation will cause soft lockup.
When napi_alloc_skb() return NULL because of no memory, we can
get a warn_alloc() call trace, so this patch deletes the error
message. We already use polling mode to handle irq, but the
retry operation will render the polling weight inactive, this
patch just return budget when the rx is not completed to avoid
dead loop.
Fixes: 36eedfde1a36 ("net: hns: Optimize hns_nic_common_poll for better performance")
Fixes: b5996f11ea54 ("net: add Hisilicon Network Subsystem basic ethernet support")
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When building with PROVE_LOCKING=y, lockdep shows the following
dump message.
INFO: trying to register non-static key.
the code is fine but needs lockdep annotation.
turning off the locking correctness validator.
...
Calling device_set_wakeup_enable() directly occurs this issue,
and it isn't necessary for initialization, so this patch creates
internal function __ave_ethtool_set_wol() and replaces with this
in ave_init() and ave_resume().
Fixes: 7200f2e3c9e2 ("net: ethernet: ave: Set initial wol state to disabled")
Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The hardware can not handle short frames below or equal to 32
bytes according to the hardware user manual, and it will trigger
a RAS error when the frame's length is below 33 bytes.
This patch pads the SKB when skb->len is below 33 bytes before
sending it to hardware.
Fixes: 76ad4f0ee747 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When HW does not support perfect filtering the feature will not be
enabled in the net_device. Add a check for this to prevent failures.
Fixes: 1b2250a04c1f ("net: stmmac: selftests: Add tests for VLAN Perfect Filtering")
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When the VLAN ID does not match the expected one it means filter failed
in HW. Fix it.
Fixes: 94e18382003c ("net: stmmac: selftests: Add selftest for VLAN TX Offload")
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Synopsys AXS101 boards do not support unaligned memory loads or stores.
Change the selftests mechanism to explicity:
- Not add extra alignment in TX SKB
- Use the unaligned version of ether_addr_equal()
Fixes: 091810dbded9 ("net: stmmac: Introduce selftests support")
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
mlxsw configures Spectrum in such a way that BUM traffic is passed not
through its nominal traffic class TC, but through its MC counterpart TC+8.
However, when collecting statistics, Qdiscs only look at the nominal TC and
ignore the MC TC.
Add two helpers to compute the value for logical TC from the constituents,
one for backlog, the other for tail drops. Use them throughout instead of
going through the xstats pointer directly.
Counters for TX bytes and packets are deduced from packet priority
counters, and therefore already include BUM traffic. wred_drop counter is
irrelevant on MC TCs, because RED is not enabled on them.
Fixes: 7b8195306694 ("mlxsw: spectrum: Configure MC-aware mode on mlxsw ports")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Per-port counter cache used by Qdiscs is updated periodically, unless the
port is down. The fact that the cache is not updated for down ports is no
problem for most counters, which are relative in nature. However, backlog
is absolute in nature, and if there is a non-zero value in the cache around
the time that the port goes down, that value just stays there. This value
then leaks to offloaded Qdiscs that report non-zero backlog even if
there (obviously) is no traffic.
The HW does not keep backlog of a downed port, so do likewise: as the port
goes down, wipe the backlog value from xstats.
Fixes: 075ab8adaf4e ("mlxsw: spectrum: Collect tclass related stats periodically")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The driver needs to prepend a Tx header to each packet it is
transmitting. The header includes information such as the egress port
and traffic class.
The addition of the header requires the driver to modify the SKB's
header and therefore it must not be shared. Otherwise, we risk hitting
various race conditions.
For example, when a packet is flooded (cloned) by the bridge driver to
two switch ports swp1 and swp2:
t0 - mlxsw_sp_port_xmit() is called for swp1. Tx header is prepended with
swp1's port number
t1 - mlxsw_sp_port_xmit() is called for swp2. Tx header is prepended with
swp2's port number, overwriting swp1's port number
t2 - The device processes data buffer from t0. Packet is transmitted via
swp2
t3 - The device processes data buffer from t1. Packet is transmitted via
swp2
Usually, the device is fast enough and transmits the packet before its
Tx header is overwritten, but this is not the case in emulated
environments.
Fix this by making sure the SKB's header is writable by calling
skb_cow_head(). Since the function ensures we have headroom to push the
Tx header, the check further in the function can be removed.
v2:
* Use skb_cow_head() instead of skb_unshare() as suggested by Jakub
* Remove unnecessary check regarding headroom
Fixes: 31557f0f9755 ("mlxsw: Introduce Mellanox SwitchX-2 ASIC support")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Shalom Toledo <shalomt@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The driver needs to prepend a Tx header to each packet it is
transmitting. The header includes information such as the egress port
and traffic class.
The addition of the header requires the driver to modify the SKB's
header and therefore it must not be shared. Otherwise, we risk hitting
various race conditions.
For example, when a packet is flooded (cloned) by the bridge driver to
two switch ports swp1 and swp2:
t0 - mlxsw_sp_port_xmit() is called for swp1. Tx header is prepended with
swp1's port number
t1 - mlxsw_sp_port_xmit() is called for swp2. Tx header is prepended with
swp2's port number, overwriting swp1's port number
t2 - The device processes data buffer from t0. Packet is transmitted via
swp2
t3 - The device processes data buffer from t1. Packet is transmitted via
swp2
Usually, the device is fast enough and transmits the packet before its
Tx header is overwritten, but this is not the case in emulated
environments.
Fix this by making sure the SKB's header is writable by calling
skb_cow_head(). Since the function ensures we have headroom to push the
Tx header, the check further in the function can be removed.
v2:
* Use skb_cow_head() instead of skb_unshare() as suggested by Jakub
* Remove unnecessary check regarding headroom
Fixes: 56ade8fe3fe1 ("mlxsw: spectrum: Add initial support for Spectrum ASIC")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Shalom Toledo <shalomt@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In commit a72afb6879bb ("mlxsw: Enforce firmware version for
Spectrum-2") I added a required firmware version for Spectrum-2, but
missed the fact that mlxsw_sp2_init() is used by both Spectrum-2 and
Spectrum-3. This means that the same firmware version will be used for
both, which is wrong.
Fix this by creating a new init() callback for Spectrum-3.
Fixes: a72afb6879bb ("mlxsw: Enforce firmware version for Spectrum-2")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Tested-by: Shalom Toledo <shalomt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Page pool API will start syncing (if requested) starting from
page->dma_addr + pool->p.offset. Fix dma sync length in
mvneta_run_xdp since we do not need to account xdp headroom
Fixes: 07e13edbb6a6 ("net: mvneta: get rid of huge dma sync in mvneta_rx_refill")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|