diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2024-11-21 08:28:08 -0800 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2024-11-21 08:28:08 -0800 |
commit | fcc79e1714e8c2b8e216dc3149812edd37884eef (patch) | |
tree | 17a51d29db810b81412be040aaf380936b3261b4 /tools/testing | |
parent | 6e95ef0258ff4ee23ae3b06bf6b00b33dbbd5ef7 (diff) | |
parent | dd7207838d38780b51e4690ee508ab2d5057e099 (diff) |
Merge tag 'net-next-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Paolo Abeni:
"The most significant set of changes is the per netns RTNL. The new
behavior is disabled by default, regression risk should be contained.
Notably the new config knob PTP_1588_CLOCK_VMCLOCK will inherit its
default value from PTP_1588_CLOCK_KVM, as the first is intended to be
a more reliable replacement for the latter.
Core:
- Started a very large, in-progress, effort to make the RTNL lock
scope per network-namespace, thus reducing the lock contention
significantly in the containerized use-case, comprising:
- RCU-ified some relevant slices of the FIB control path
- introduce basic per netns locking helpers
- namespacified the IPv4 address hash table
- remove rtnl_register{,_module}() in favour of
rtnl_register_many()
- refactor rtnl_{new,del,set}link() moving as much validation as
possible out of RTNL lock
- convert all phonet doit() and dumpit() handlers to RCU
- convert IPv4 addresses manipulation to per-netns RTNL
- convert virtual interface creation to per-netns RTNL
the per-netns lock infrastructure is guarded by the
CONFIG_DEBUG_NET_SMALL_RTNL knob, disabled by default ad interim.
- Introduce NAPI suspension, to efficiently switching between busy
polling (NAPI processing suspended) and normal processing.
- Migrate the IPv4 routing input, output and control path from direct
ToS usage to DSCP macros. This is a work in progress to make ECN
handling consistent and reliable.
- Add drop reasons support to the IPv4 rotue input path, allowing
better introspection in case of packets drop.
- Make FIB seqnum lockless, dropping RTNL protection for read access.
- Make inet{,v6} addresses hashing less predicable.
- Allow providing timestamp OPT_ID via cmsg, to correlate TX packets
and timestamps
Things we sprinkled into general kernel code:
- Add small file operations for debugfs, to reduce the struct ops
size.
- Refactoring and optimization for the implementation of page_frag
API, This is a preparatory work to consolidate the page_frag
implementation.
Netfilter:
- Optimize set element transactions to reduce memory consumption
- Extended netlink error reporting for attribute parser failure.
- Make legacy xtables configs user selectable, giving users the
option to configure iptables without enabling any other config.
- Address a lot of false-positive RCU issues, pointed by recent CI
improvements.
BPF:
- Put xsk sockets on a struct diet and add various cleanups. Overall,
this helps to bump performance by 12% for some workloads.
- Extend BPF selftests to increase coverage of XDP features in
combination with BPF cpumap.
- Optimize and homogenize bpf_csum_diff helper for all archs and also
add a batch of new BPF selftests for it.
- Extend netkit with an option to delegate skb->{mark,priority}
scrubbing to its BPF program.
- Make the bpf_get_netns_cookie() helper available also to tc(x) BPF
programs.
Protocols:
- Introduces 4-tuple hash for connected udp sockets, speeding-up
significantly connected sockets lookup.
- Add a fastpath for some TCP timers that usually expires after
close, the socket lock contention.
- Add inbound and outbound xfrm state caches to speed up state
lookups.
- Avoid sending MPTCP advertisements on stale subflows, reducing
risks on loosing them.
- Make neighbours table flushing more scalable, maintaining per
device neigh lists.
Driver API:
- Introduce a unified interface to configure transmission H/W
shaping, and expose it to user-space via generic-netlink.
- Add support for per-NAPI config via netlink. This makes napi
configuration persistent across queues removal and re-creation.
Requires driver updates, currently supported drivers are:
nVidia/Mellanox mlx4 and mlx5, Broadcom brcm and Intel ice.
- Add ethtool support for writing SFP / PHY firmware blocks.
- Track RSS context allocation from ethtool core.
- Implement support for mirroring to DSA CPU port, via TC mirror
offload.
- Consolidate FDB updates notification, to avoid duplicates on
device-specific entries.
- Expose DPLL clock quality level to the user-space.
- Support master-slave PHY config via device tree.
Tests and tooling:
- forwarding: introduce deferred commands, to simplify the cleanup
phase
Drivers:
- Updated several drivers - Amazon vNic, Google vNic, Microsoft vNic,
Intel e1000e and Broadcom Tigon3 - to use netdev-genl to link the
IRQs and queues to NAPI IDs, allowing busy polling and better
introspection.
- Ethernet high-speed NICs:
- nVidia/Mellanox:
- mlx5:
- a large refactor to implement support for cross E-Switch
scheduling
- refactor H/W conter management to let it scale better
- H/W GRO cleanups
- Intel (100G, ice)::
- add support for ethtool reset
- implement support for per TX queue H/W shaping
- AMD/Solarflare:
- implement per device queue stats support
- Broadcom (bnxt):
- improve wildcard l4proto on IPv4/IPv6 ntuple rules
- Marvell Octeon:
- Add representor support for each Resource Virtualization Unit
(RVU) device.
- Hisilicon:
- add support for the BMC Gigabit Ethernet
- IBM (EMAC):
- driver cleanup and modernization
- Cisco (VIC):
- raise the queues number limit to 256
- Ethernet virtual:
- Google vNIC:
- implement page pool support
- macsec:
- inherit lower device's features and TSO limits when
offloading
- virtio_net:
- enable premapped mode by default
- support for XDP socket(AF_XDP) zerocopy TX
- wireguard:
- set the TSO max size to be GSO_MAX_SIZE, to aggregate larger
packets.
- Ethernet NICs embedded and virtual:
- Broadcom ASP:
- enable software timestamping
- Freescale:
- add enetc4 PF driver
- MediaTek: Airoha SoC:
- implement BQL support
- RealTek r8169:
- enable TSO by default on r8168/r8125
- implement extended ethtool stats
- Renesas AVB:
- enable TX checksum offload
- Synopsys (stmmac):
- support header splitting for vlan tagged packets
- move common code for DWMAC4 and DWXGMAC into a separate FPE
module.
- add dwmac driver support for T-HEAD TH1520 SoC
- Synopsys (xpcs):
- driver refactor and cleanup
- TI:
- icssg_prueth: add VLAN offload support
- Xilinx emaclite:
- add clock support
- Ethernet switches:
- Microchip:
- implement support for the lan969x Ethernet switch family
- add LAN9646 switch support to KSZ DSA driver
- Ethernet PHYs:
- Marvel: 88q2x: enable auto negotiation
- Microchip: add support for LAN865X Rev B1 and LAN867X Rev C1/C2
- PTP:
- Add support for the Amazon virtual clock device
- Add PtP driver for s390 clocks
- WiFi:
- mac80211
- EHT 1024 aggregation size for transmissions
- new operation to indicate that a new interface is to be added
- support radio separation of multi-band devices
- move wireless extension spy implementation to libiw
- Broadcom:
- brcmfmac: optional LPO clock support
- Microchip:
- add support for Atmel WILC3000
- Qualcomm (ath12k):
- firmware coredump collection support
- add debugfs support for a multitude of statistics
- Qualcomm (ath5k):
- Arcadyan ARV45XX AR2417 & Gigaset SX76[23] AR241[34]A support
- Realtek:
- rtw88: 8821au and 8812au USB adapters support
- rtw89: add thermal protection
- rtw89: fine tune BT-coexsitence to improve user experience
- rtw89: firmware secure boot for WiFi 6 chip
- Bluetooth
- add Qualcomm WCN785x support for ids Foxconn 0xe0fc/0xe0f3 and
0x13d3:0x3623
- add Realtek RTL8852BE support for id Foxconn 0xe123
- add MediaTek MT7920 support for wireless module ids
- btintel_pcie: add handshake between driver and firmware
- btintel_pcie: add recovery mechanism
- btnxpuart: add GPIO support to power save feature"
* tag 'net-next-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1475 commits)
mm: page_frag: fix a compile error when kernel is not compiled
Documentation: tipc: fix formatting issue in tipc.rst
selftests: nic_performance: Add selftest for performance of NIC driver
selftests: nic_link_layer: Add selftest case for speed and duplex states
selftests: nic_link_layer: Add link layer selftest for NIC driver
bnxt_en: Add FW trace coredump segments to the coredump
bnxt_en: Add a new ethtool -W dump flag
bnxt_en: Add 2 parameters to bnxt_fill_coredump_seg_hdr()
bnxt_en: Add functions to copy host context memory
bnxt_en: Do not free FW log context memory
bnxt_en: Manage the FW trace context memory
bnxt_en: Allocate backing store memory for FW trace logs
bnxt_en: Add a 'force' parameter to bnxt_free_ctx_mem()
bnxt_en: Refactor bnxt_free_ctx_mem()
bnxt_en: Add mem_valid bit to struct bnxt_ctx_mem_type
bnxt_en: Update firmware interface spec to 1.10.3.85
selftests/bpf: Add some tests with sockmap SK_PASS
bpf: fix recursive lock when verdict program return SK_PASS
wireguard: device: support big tcp GSO
wireguard: selftests: load nf_conntrack if not present
...
Diffstat (limited to 'tools/testing')
120 files changed, 8534 insertions, 3065 deletions
diff --git a/tools/testing/selftests/bpf/.gitignore b/tools/testing/selftests/bpf/.gitignore index d45c9a9b304d..c2a1842c3d8b 100644 --- a/tools/testing/selftests/bpf/.gitignore +++ b/tools/testing/selftests/bpf/.gitignore @@ -23,7 +23,6 @@ test_flow_dissector flow_dissector_load test_tcpnotify_user test_libbpf -test_tcp_check_syncookie_user test_sysctl xdping test_cpp diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index b1080284522d..6ad3b1ba1920 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -137,7 +137,6 @@ TEST_PROGS := test_kmod.sh \ test_xdp_vlan_mode_generic.sh \ test_xdp_vlan_mode_native.sh \ test_lwt_ip_encap.sh \ - test_tcp_check_syncookie.sh \ test_tc_tunnel.sh \ test_tc_edt.sh \ test_xdping.sh \ @@ -154,11 +153,23 @@ TEST_PROGS_EXTENDED := with_addr.sh \ # Compile but not part of 'make run_tests' TEST_GEN_PROGS_EXTENDED = \ - flow_dissector_load test_flow_dissector test_tcp_check_syncookie_user \ - test_lirc_mode2_user xdping test_cpp runqslower bench bpf_testmod.ko \ - xskxceiver xdp_redirect_multi xdp_synproxy veristat xdp_hw_metadata \ - xdp_features bpf_test_no_cfi.ko bpf_test_modorder_x.ko \ - bpf_test_modorder_y.ko + bench \ + bpf_testmod.ko \ + bpf_test_modorder_x.ko \ + bpf_test_modorder_y.ko \ + bpf_test_no_cfi.ko \ + flow_dissector_load \ + runqslower \ + test_cpp \ + test_flow_dissector \ + test_lirc_mode2_user \ + veristat \ + xdp_features \ + xdp_hw_metadata \ + xdp_redirect_multi \ + xdp_synproxy \ + xdping \ + xskxceiver TEST_GEN_FILES += liburandom_read.so urandom_read sign-file uprobe_multi @@ -370,7 +381,6 @@ $(OUTPUT)/flow_dissector_load: $(TESTING_HELPERS) $(OUTPUT)/test_maps: $(TESTING_HELPERS) $(OUTPUT)/test_verifier: $(TESTING_HELPERS) $(CAP_HELPERS) $(UNPRIV_HELPERS) $(OUTPUT)/xsk.o: $(BPFOBJ) -$(OUTPUT)/test_tcp_check_syncookie_user: $(NETWORK_HELPERS) BPFTOOL ?= $(DEFAULT_BPFTOOL) $(DEFAULT_BPFTOOL): $(wildcard $(BPFTOOLDIR)/*.[ch] $(BPFTOOLDIR)/Makefile) \ diff --git a/tools/testing/selftests/bpf/network_helpers.h b/tools/testing/selftests/bpf/network_helpers.h index c72c16e1aff8..5764155b6d25 100644 --- a/tools/testing/selftests/bpf/network_helpers.h +++ b/tools/testing/selftests/bpf/network_helpers.h @@ -1,6 +1,7 @@ /* SPDX-License-Identifier: GPL-2.0 */ #ifndef __NETWORK_HELPERS_H #define __NETWORK_HELPERS_H +#include <arpa/inet.h> #include <sys/socket.h> #include <sys/types.h> #include <linux/types.h> diff --git a/tools/testing/selftests/bpf/prog_tests/btf_skc_cls_ingress.c b/tools/testing/selftests/bpf/prog_tests/btf_skc_cls_ingress.c index ef4d6a3ae423..cf15cc3be491 100644 --- a/tools/testing/selftests/bpf/prog_tests/btf_skc_cls_ingress.c +++ b/tools/testing/selftests/bpf/prog_tests/btf_skc_cls_ingress.c @@ -17,32 +17,37 @@ #include "test_progs.h" #include "test_btf_skc_cls_ingress.skel.h" -static struct test_btf_skc_cls_ingress *skel; -static struct sockaddr_in6 srv_sa6; -static __u32 duration; +#define TEST_NS "skc_cls_ingress" -static int prepare_netns(void) +#define BIT(n) (1 << (n)) +#define TEST_MODE_IPV4 BIT(0) +#define TEST_MODE_IPV6 BIT(1) +#define TEST_MODE_DUAL (TEST_MODE_IPV4 | TEST_MODE_IPV6) + +#define SERVER_ADDR_IPV4 "127.0.0.1" +#define SERVER_ADDR_IPV6 "::1" +#define SERVER_ADDR_DUAL "::0" +/* RFC791, 576 for minimal IPv4 datagram, minus 40 bytes of TCP header */ +#define MIN_IPV4_MSS 536 + +static struct netns_obj *prepare_netns(struct test_btf_skc_cls_ingress *skel) { LIBBPF_OPTS(bpf_tc_hook, qdisc_lo, .attach_point = BPF_TC_INGRESS); LIBBPF_OPTS(bpf_tc_opts, tc_attach, .prog_fd = bpf_program__fd(skel->progs.cls_ingress)); + struct netns_obj *ns = NULL; - if (CHECK(unshare(CLONE_NEWNET), "create netns", - "unshare(CLONE_NEWNET): %s (%d)", - strerror(errno), errno)) - return -1; - - if (CHECK(system("ip link set dev lo up"), - "ip link set dev lo up", "failed\n")) - return -1; + ns = netns_new(TEST_NS, true); + if (!ASSERT_OK_PTR(ns, "create and join netns")) + return ns; qdisc_lo.ifindex = if_nametoindex("lo"); if (!ASSERT_OK(bpf_tc_hook_create(&qdisc_lo), "qdisc add dev lo clsact")) - return -1; + goto free_ns; if (!ASSERT_OK(bpf_tc_attach(&qdisc_lo, &tc_attach), "filter add dev lo ingress")) - return -1; + goto free_ns; /* Ensure 20 bytes options (i.e. in total 40 bytes tcp header) for the * bpf_tcp_gen_syncookie() helper. @@ -50,71 +55,142 @@ static int prepare_netns(void) if (write_sysctl("/proc/sys/net/ipv4/tcp_window_scaling", "1") || write_sysctl("/proc/sys/net/ipv4/tcp_timestamps", "1") || write_sysctl("/proc/sys/net/ipv4/tcp_sack", "1")) - return -1; + goto free_ns; + + return ns; - return 0; +free_ns: + netns_free(ns); + return NULL; } -static void reset_test(void) +static void reset_test(struct test_btf_skc_cls_ingress *skel) { + memset(&skel->bss->srv_sa4, 0, sizeof(skel->bss->srv_sa4)); memset(&skel->bss->srv_sa6, 0, sizeof(skel->bss->srv_sa6)); skel->bss->listen_tp_sport = 0; skel->bss->req_sk_sport = 0; skel->bss->recv_cookie = 0; skel->bss->gen_cookie = 0; skel->bss->linum = 0; + skel->bss->mss = 0; } -static void print_err_line(void) +static void print_err_line(struct test_btf_skc_cls_ingress *skel) { if (skel->bss->linum) printf("bpf prog error at line %u\n", skel->bss->linum); } -static void test_conn(void) +static int v6only_true(int fd, void *opts) +{ + int mode = true; + + return setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &mode, sizeof(mode)); +} + +static int v6only_false(int fd, void *opts) { + int mode = false; + + return setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &mode, sizeof(mode)); +} + +static void run_test(struct test_btf_skc_cls_ingress *skel, bool gen_cookies, + int ip_mode) +{ + const char *tcp_syncookies = gen_cookies ? "2" : "1"; int listen_fd = -1, cli_fd = -1, srv_fd = -1, err; - socklen_t addrlen = sizeof(srv_sa6); + struct network_helper_opts opts = { 0 }; + struct sockaddr_storage *addr; + struct sockaddr_in6 srv_sa6; + struct sockaddr_in srv_sa4; + socklen_t addr_len; + int sock_family; + char *srv_addr; int srv_port; - if (write_sysctl("/proc/sys/net/ipv4/tcp_syncookies", "1")) + switch (ip_mode) { + case TEST_MODE_IPV4: + sock_family = AF_INET; + srv_addr = SERVER_ADDR_IPV4; + addr = (struct sockaddr_storage *)&srv_sa4; + addr_len = sizeof(srv_sa4); + break; + case TEST_MODE_IPV6: + opts.post_socket_cb = v6only_true; + sock_family = AF_INET6; + srv_addr = SERVER_ADDR_IPV6; + addr = (struct sockaddr_storage *)&srv_sa6; + addr_len = sizeof(srv_sa6); + break; + case TEST_MODE_DUAL: + opts.post_socket_cb = v6only_false; + sock_family = AF_INET6; + srv_addr = SERVER_ADDR_DUAL; + addr = (struct sockaddr_storage *)&srv_sa6; + addr_len = sizeof(srv_sa6); + break; + default: + PRINT_FAIL("Unknown IP mode %d", ip_mode); return; + } - listen_fd = start_server(AF_INET6, SOCK_STREAM, "::1", 0, 0); - if (CHECK_FAIL(listen_fd == -1)) + if (write_sysctl("/proc/sys/net/ipv4/tcp_syncookies", tcp_syncookies)) return; - err = getsockname(listen_fd, (struct sockaddr *)&srv_sa6, &addrlen); - if (CHECK(err, "getsockname(listen_fd)", "err:%d errno:%d\n", err, - errno)) - goto done; - memcpy(&skel->bss->srv_sa6, &srv_sa6, sizeof(srv_sa6)); - srv_port = ntohs(srv_sa6.sin6_port); + listen_fd = start_server_str(sock_family, SOCK_STREAM, srv_addr, 0, + &opts); + if (!ASSERT_OK_FD(listen_fd, "start server")) + return; - cli_fd = connect_to_fd(listen_fd, 0); - if (CHECK_FAIL(cli_fd == -1)) + err = getsockname(listen_fd, (struct sockaddr *)addr, &addr_len); + if (!ASSERT_OK(err, "getsockname(listen_fd)")) goto done; - srv_fd = accept(listen_fd, NULL, NULL); - if (CHECK_FAIL(srv_fd == -1)) + switch (ip_mode) { + case TEST_MODE_IPV4: + memcpy(&skel->bss->srv_sa4, &srv_sa4, sizeof(srv_sa4)); + srv_port = ntohs(srv_sa4.sin_port); + break; + case TEST_MODE_IPV6: + case TEST_MODE_DUAL: + memcpy(&skel->bss->srv_sa6, &srv_sa6, sizeof(srv_sa6)); + srv_port = ntohs(srv_sa6.sin6_port); + break; + default: goto done; + } - if (CHECK(skel->bss->listen_tp_sport != srv_port || - skel->bss->req_sk_sport != srv_port, - "Unexpected sk src port", - "listen_tp_sport:%u req_sk_sport:%u expected:%u\n", - skel->bss->listen_tp_sport, skel->bss->req_sk_sport, - srv_port)) + cli_fd = connect_to_fd(listen_fd, 0); + if (!ASSERT_OK_FD(cli_fd, "connect client")) goto done; - if (CHECK(skel->bss->gen_cookie || skel->bss->recv_cookie, - "Unexpected syncookie states", - "gen_cookie:%u recv_cookie:%u\n", - skel->bss->gen_cookie, skel->bss->recv_cookie)) + srv_fd = accept(listen_fd, NULL, NULL); + if (!ASSERT_OK_FD(srv_fd, "accept connection")) goto done; - CHECK(skel->bss->linum, "bpf prog detected error", "at line %u\n", - skel->bss->linum); + ASSERT_EQ(skel->bss->listen_tp_sport, srv_port, "listen tp src port"); + + if (!gen_cookies) { + ASSERT_EQ(skel->bss->req_sk_sport, srv_port, + "request socket source port with syncookies disabled"); + ASSERT_EQ(skel->bss->gen_cookie, 0, + "generated syncookie with syncookies disabled"); + ASSERT_EQ(skel->bss->recv_cookie, 0, + "received syncookie with syncookies disabled"); + } else { + ASSERT_EQ(skel->bss->req_sk_sport, 0, + "request socket source port with syncookies enabled"); + ASSERT_NEQ(skel->bss->gen_cookie, 0, + "syncookie properly generated"); + ASSERT_EQ(skel->bss->gen_cookie, skel->bss->recv_cookie, + "matching syncookies on client and server"); + ASSERT_GT(skel->bss->mss, MIN_IPV4_MSS, + "MSS in cookie min value"); + ASSERT_LT(skel->bss->mss, USHRT_MAX, + "MSS in cookie max value"); + } done: if (listen_fd != -1) @@ -125,96 +201,74 @@ done: close(srv_fd); } -static void test_syncookie(void) +static void test_conn_ipv4(struct test_btf_skc_cls_ingress *skel) { - int listen_fd = -1, cli_fd = -1, srv_fd = -1, err; - socklen_t addrlen = sizeof(srv_sa6); - int srv_port; - - /* Enforce syncookie mode */ - if (write_sysctl("/proc/sys/net/ipv4/tcp_syncookies", "2")) - return; - - listen_fd = start_server(AF_INET6, SOCK_STREAM, "::1", 0, 0); - if (CHECK_FAIL(listen_fd == -1)) - return; - - err = getsockname(listen_fd, (struct sockaddr *)&srv_sa6, &addrlen); - if (CHECK(err, "getsockname(listen_fd)", "err:%d errno:%d\n", err, - errno)) - goto done; - memcpy(&skel->bss->srv_sa6, &srv_sa6, sizeof(srv_sa6)); - srv_port = ntohs(srv_sa6.sin6_port); - - cli_fd = connect_to_fd(listen_fd, 0); - if (CHECK_FAIL(cli_fd == -1)) - goto done; - - srv_fd = accept(listen_fd, NULL, NULL); - if (CHECK_FAIL(srv_fd == -1)) - goto done; + run_test(skel, false, TEST_MODE_IPV4); +} - if (CHECK(skel->bss->listen_tp_sport != srv_port, - "Unexpected tp src port", - "listen_tp_sport:%u expected:%u\n", - skel->bss->listen_tp_sport, srv_port)) - goto done; +static void test_conn_ipv6(struct test_btf_skc_cls_ingress *skel) +{ + run_test(skel, false, TEST_MODE_IPV6); +} - if (CHECK(skel->bss->req_sk_sport, - "Unexpected req_sk src port", - "req_sk_sport:%u expected:0\n", - skel->bss->req_sk_sport)) - goto done; +static void test_conn_dual(struct test_btf_skc_cls_ingress *skel) +{ + run_test(skel, false, TEST_MODE_DUAL); +} - if (CHECK(!skel->bss->gen_cookie || - skel->bss->gen_cookie != skel->bss->recv_cookie, - "Unexpected syncookie states", - "gen_cookie:%u recv_cookie:%u\n", - skel->bss->gen_cookie, skel->bss->recv_cookie)) - goto done; +static void test_syncookie_ipv4(struct test_btf_skc_cls_ingress *skel) +{ + run_test(skel, true, TEST_MODE_IPV4); +} - CHECK(skel->bss->linum, "bpf prog detected error", "at line %u\n", - skel->bss->linum); +static void test_syncookie_ipv6(struct test_btf_skc_cls_ingress *skel) +{ + run_test(skel, true, TEST_MODE_IPV6); +} -done: - if (listen_fd != -1) - close(listen_fd); - if (cli_fd != -1) - close(cli_fd); - if (srv_fd != -1) - close(srv_fd); +static void test_syncookie_dual(struct test_btf_skc_cls_ingress *skel) +{ + run_test(skel, true, TEST_MODE_DUAL); } struct test { const char *desc; - void (*run)(void); + void (*run)(struct test_btf_skc_cls_ingress *skel); }; #define DEF_TEST(name) { #name, test_##name } static struct test tests[] = { - DEF_TEST(conn), - DEF_TEST(syncookie), + DEF_TEST(conn_ipv4), + DEF_TEST(conn_ipv6), + DEF_TEST(conn_dual), + DEF_TEST(syncookie_ipv4), + DEF_TEST(syncookie_ipv6), + DEF_TEST(syncookie_dual), }; void test_btf_skc_cls_ingress(void) { + struct test_btf_skc_cls_ingress *skel; + struct netns_obj *ns; int i; skel = test_btf_skc_cls_ingress__open_and_load(); - if (CHECK(!skel, "test_btf_skc_cls_ingress__open_and_load", "failed\n")) + if (!ASSERT_OK_PTR(skel, "test_btf_skc_cls_ingress__open_and_load")) return; for (i = 0; i < ARRAY_SIZE(tests); i++) { if (!test__start_subtest(tests[i].desc)) continue; - if (prepare_netns()) + ns = prepare_netns(skel); + if (!ns) break; - tests[i].run(); + tests[i].run(skel); - print_err_line(); - reset_test(); + print_err_line(skel); + reset_test(skel); + netns_free(ns); } test_btf_skc_cls_ingress__destroy(skel); diff --git a/tools/testing/selftests/bpf/prog_tests/mptcp.c b/tools/testing/selftests/bpf/prog_tests/mptcp.c index d2ca32fa3b21..f8eb7f9d4fd2 100644 --- a/tools/testing/selftests/bpf/prog_tests/mptcp.c +++ b/tools/testing/selftests/bpf/prog_tests/mptcp.c @@ -5,12 +5,17 @@ #include <linux/const.h> #include <netinet/in.h> #include <test_progs.h> +#include <unistd.h> #include "cgroup_helpers.h" #include "network_helpers.h" #include "mptcp_sock.skel.h" #include "mptcpify.skel.h" +#include "mptcp_subflow.skel.h" #define NS_TEST "mptcp_ns" +#define ADDR_1 "10.0.1.1" +#define ADDR_2 "10.0.1.2" +#define PORT_1 10001 #ifndef IPPROTO_MPTCP #define IPPROTO_MPTCP 262 @@ -64,24 +69,6 @@ struct mptcp_storage { char ca_name[TCP_CA_NAME_MAX]; }; -static struct nstoken *create_netns(void) -{ - SYS(fail, "ip netns add %s", NS_TEST); - SYS(fail, "ip -net %s link set dev lo up", NS_TEST); - - return open_netns(NS_TEST); -fail: - return NULL; -} - -static void cleanup_netns(struct nstoken *nstoken) -{ - if (nstoken) - close_netns(nstoken); - - SYS_NOFAIL("ip netns del %s", NS_TEST); -} - static int start_mptcp_server(int family, const char *addr_str, __u16 port, int timeout_ms) { @@ -201,15 +188,15 @@ out: static void test_base(void) { - struct nstoken *nstoken = NULL; + struct netns_obj *netns = NULL; int server_fd, cgroup_fd; cgroup_fd = test__join_cgroup("/mptcp"); if (!ASSERT_GE(cgroup_fd, 0, "test__join_cgroup")) return; - nstoken = create_netns(); - if (!ASSERT_OK_PTR(nstoken, "create_netns")) + netns = netns_new(NS_TEST, true); + if (!ASSERT_OK_PTR(netns, "netns_new")) goto fail; /* without MPTCP */ @@ -232,7 +219,7 @@ with_mptcp: close(server_fd); fail: - cleanup_netns(nstoken); + netns_free(netns); close(cgroup_fd); } @@ -317,21 +304,135 @@ out: static void test_mptcpify(void) { - struct nstoken *nstoken = NULL; + struct netns_obj *netns = NULL; int cgroup_fd; cgroup_fd = test__join_cgroup("/mptcpify"); if (!ASSERT_GE(cgroup_fd, 0, "test__join_cgroup")) return; - nstoken = create_netns(); - if (!ASSERT_OK_PTR(nstoken, "create_netns")) + netns = netns_new(NS_TEST, true); + if (!ASSERT_OK_PTR(netns, "netns_new")) goto fail; ASSERT_OK(run_mptcpify(cgroup_fd), "run_mptcpify"); fail: - cleanup_netns(nstoken); + netns_free(netns); + close(cgroup_fd); +} + +static int endpoint_init(char *flags) +{ + SYS(fail, "ip -net %s link add veth1 type veth peer name veth2", NS_TEST); + SYS(fail, "ip -net %s addr add %s/24 dev veth1", NS_TEST, ADDR_1); + SYS(fail, "ip -net %s link set dev veth1 up", NS_TEST); + SYS(fail, "ip -net %s addr add %s/24 dev veth2", NS_TEST, ADDR_2); + SYS(fail, "ip -net %s link set dev veth2 up", NS_TEST); + if (SYS_NOFAIL("ip -net %s mptcp endpoint add %s %s", NS_TEST, ADDR_2, flags)) { + printf("'ip mptcp' not supported, skip this test.\n"); + test__skip(); + goto fail; + } + + return 0; +fail: + return -1; +} + +static void wait_for_new_subflows(int fd) +{ + socklen_t len; + u8 subflows; + int err, i; + + len = sizeof(subflows); + /* Wait max 5 sec for new subflows to be created */ + for (i = 0; i < 50; i++) { + err = getsockopt(fd, SOL_MPTCP, MPTCP_INFO, &subflows, &len); + if (!err && subflows > 0) + break; + + usleep(100000); /* 0.1s */ + } +} + +static void run_subflow(void) +{ + int server_fd, client_fd, err; + char new[TCP_CA_NAME_MAX]; + char cc[TCP_CA_NAME_MAX]; + unsigned int mark; + socklen_t len; + + server_fd = start_mptcp_server(AF_INET, ADDR_1, PORT_1, 0); + if (!ASSERT_OK_FD(server_fd, "start_mptcp_server")) + return; + + client_fd = connect_to_fd(server_fd, 0); + if (!ASSERT_OK_FD(client_fd, "connect_to_fd")) + goto close_server; + + send_byte(client_fd); + wait_for_new_subflows(client_fd); + + len = sizeof(mark); + err = getsockopt(client_fd, SOL_SOCKET, SO_MARK, &mark, &len); + if (ASSERT_OK(err, "getsockopt(client_fd, SO_MARK)")) + ASSERT_EQ(mark, 0, "mark"); + + len = sizeof(new); + err = getsockopt(client_fd, SOL_TCP, TCP_CONGESTION, new, &len); + if (ASSERT_OK(err, "getsockopt(client_fd, TCP_CONGESTION)")) { + get_msk_ca_name(cc); + ASSERT_STREQ(new, cc, "cc"); + } + + close(client_fd); +close_server: + close(server_fd); +} + +static void test_subflow(void) +{ + struct mptcp_subflow *skel; + struct netns_obj *netns; + int cgroup_fd; + + cgroup_fd = test__join_cgroup("/mptcp_subflow"); + if (!ASSERT_OK_FD(cgroup_fd, "join_cgroup: mptcp_subflow")) + return; + + skel = mptcp_subflow__open_and_load(); + if (!ASSERT_OK_PTR(skel, "skel_open_load: mptcp_subflow")) + goto close_cgroup; + + skel->bss->pid = getpid(); + + skel->links.mptcp_subflow = + bpf_program__attach_cgroup(skel->progs.mptcp_subflow, cgroup_fd); + if (!ASSERT_OK_PTR(skel->links.mptcp_subflow, "attach mptcp_subflow")) + goto skel_destroy; + + skel->links._getsockopt_subflow = + bpf_program__attach_cgroup(skel->progs._getsockopt_subflow, cgroup_fd); + if (!ASSERT_OK_PTR(skel->links._getsockopt_subflow, "attach _getsockopt_subflow")) + goto skel_destroy; + + netns = netns_new(NS_TEST, true); + if (!ASSERT_OK_PTR(netns, "netns_new: mptcp_subflow")) + goto skel_destroy; + + if (endpoint_init("subflow") < 0) + goto close_netns; + + run_subflow(); + +close_netns: + netns_free(netns); +skel_destroy: + mptcp_subflow__destroy(skel); +close_cgroup: close(cgroup_fd); } @@ -341,4 +442,6 @@ void test_mptcp(void) test_base(); if (test__start_subtest("mptcpify")) test_mptcpify(); + if (test__start_subtest("subflow")) + test_subflow(); } diff --git a/tools/testing/selftests/bpf/prog_tests/netns_cookie.c b/tools/testing/selftests/bpf/prog_tests/netns_cookie.c index 71d8f3ba7d6b..ac3c3c097c0e 100644 --- a/tools/testing/selftests/bpf/prog_tests/netns_cookie.c +++ b/tools/testing/selftests/bpf/prog_tests/netns_cookie.c @@ -8,12 +8,16 @@ #define SO_NETNS_COOKIE 71 #endif +#define loopback 1 + static int duration; void test_netns_cookie(void) { + LIBBPF_OPTS(bpf_prog_attach_opts, opta); + LIBBPF_OPTS(bpf_prog_detach_opts, optd); int server_fd = -1, client_fd = -1, cgroup_fd = -1; - int err, val, ret, map, verdict; + int err, val, ret, map, verdict, tc_fd; struct netns_cookie_prog *skel; uint64_t cookie_expected_value; socklen_t vallen = sizeof(cookie_expected_value); @@ -38,36 +42,47 @@ void test_netns_cookie(void) if (!ASSERT_OK(err, "prog_attach")) goto done; + tc_fd = bpf_program__fd(skel->progs.get_netns_cookie_tcx); + err = bpf_prog_attach_opts(tc_fd, loopback, BPF_TCX_INGRESS, &opta); + if (!ASSERT_OK(err, "prog_attach")) + goto done; + server_fd = start_server(AF_INET6, SOCK_STREAM, "::1", 0, 0); if (CHECK(server_fd < 0, "start_server", "errno %d\n", errno)) - goto done; + goto cleanup_tc; client_fd = connect_to_fd(server_fd, 0); if (CHECK(client_fd < 0, "connect_to_fd", "errno %d\n", errno)) - goto done; + goto cleanup_tc; ret = send(client_fd, send_msg, sizeof(send_msg), 0); if (CHECK(ret != sizeof(send_msg), "send(msg)", "ret:%d\n", ret)) - goto done; + goto cleanup_tc; err = bpf_map_lookup_elem(bpf_map__fd(skel->maps.sockops_netns_cookies), &client_fd, &val); if (!ASSERT_OK(err, "map_lookup(sockops_netns_cookies)")) - goto done; + goto cleanup_tc; err = getsockopt(client_fd, SOL_SOCKET, SO_NETNS_COOKIE, &cookie_expected_value, &vallen); if (!ASSERT_OK(err, "getsockopt")) - goto done; + goto cleanup_tc; ASSERT_EQ(val, cookie_expected_value, "cookie_value"); err = bpf_map_lookup_elem(bpf_map__fd(skel->maps.sk_msg_netns_cookies), &client_fd, &val); if (!ASSERT_OK(err, "map_lookup(sk_msg_netns_cookies)")) - goto done; + goto cleanup_tc; ASSERT_EQ(val, cookie_expected_value, "cookie_value"); + ASSERT_EQ(skel->bss->tcx_init_netns_cookie, cookie_expected_value, "cookie_value"); + ASSERT_EQ(skel->bss->tcx_netns_cookie, cookie_expected_value, "cookie_value"); + +cleanup_tc: + err = bpf_prog_detach_opts(tc_fd, loopback, BPF_TCX_INGRESS, &optd); + ASSERT_OK(err, "prog_detach"); done: if (server_fd != -1) diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c index 82bfb266741c..a2041f8e32eb 100644 --- a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c +++ b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c @@ -501,6 +501,58 @@ out: test_sockmap_pass_prog__destroy(skel); } +static void test_sockmap_stream_pass(void) +{ + int zero = 0, sent, recvd; + int verdict, parser; + int err, map; + int c = -1, p = -1; + struct test_sockmap_pass_prog *pass = NULL; + char snd[256] = "0123456789"; + char rcv[256] = "0"; + + pass = test_sockmap_pass_prog__open_and_load(); + verdict = bpf_program__fd(pass->progs.prog_skb_verdict); + parser = bpf_program__fd(pass->progs.prog_skb_parser); + map = bpf_map__fd(pass->maps.sock_map_rx); + + err = bpf_prog_attach(parser, map, BPF_SK_SKB_STREAM_PARSER, 0); + if (!ASSERT_OK(err, "bpf_prog_attach stream parser")) + goto out; + + err = bpf_prog_attach(verdict, map, BPF_SK_SKB_STREAM_VERDICT, 0); + if (!ASSERT_OK(err, "bpf_prog_attach stream verdict")) + goto out; + + err = create_pair(AF_INET, SOCK_STREAM, &c, &p); + if (err) + goto out; + + /* sk_data_ready of 'p' will be replaced by strparser handler */ + err = bpf_map_update_elem(map, &zero, &p, BPF_NOEXIST); + if (!ASSERT_OK(err, "bpf_map_update_elem(p)")) + goto out_close; + + /* + * as 'prog_skb_parser' return the original skb len and + * 'prog_skb_verdict' return SK_PASS, the kernel will just + * pass it through to original socket 'p' + */ + sent = xsend(c, snd, sizeof(snd), 0); + ASSERT_EQ(sent, sizeof(snd), "xsend(c)"); + + recvd = recv_timeout(p, rcv, sizeof(rcv), SOCK_NONBLOCK, + IO_TIMEOUT_SEC); + ASSERT_EQ(recvd, sizeof(rcv), "recv_timeout(p)"); + +out_close: + close(c); + close(p); + +out: + test_sockmap_pass_prog__destroy(pass); +} + static void test_sockmap_skb_verdict_fionread(bool pass_prog) { int err, map, verdict, c0 = -1, c1 = -1, p0 = -1, p1 = -1; @@ -923,6 +975,8 @@ void test_sockmap_basic(void) test_sockmap_progs_query(BPF_SK_SKB_VERDICT); if (test__start_subtest("sockmap skb_verdict shutdown")) test_sockmap_skb_verdict_shutdown(); + if (test__start_subtest("sockmap stream parser and verdict pass")) + test_sockmap_stream_pass(); if (test__start_subtest("sockmap skb_verdict fionread")) test_sockmap_skb_verdict_fionread(true); if (test__start_subtest("sockmap skb_verdict fionread on drop")) diff --git a/tools/testing/selftests/bpf/prog_tests/tc_netkit.c b/tools/testing/selftests/bpf/prog_tests/tc_netkit.c index b9135720024c..151a4210028f 100644 --- a/tools/testing/selftests/bpf/prog_tests/tc_netkit.c +++ b/tools/testing/selftests/bpf/prog_tests/tc_netkit.c @@ -14,7 +14,9 @@ #include "netlink_helpers.h" #include "tc_helpers.h" -#define ICMP_ECHO 8 +#define MARK 42 +#define PRIO 0xeb9f +#define ICMP_ECHO 8 struct icmphdr { __u8 type; @@ -33,7 +35,7 @@ struct iplink_req { }; static int create_netkit(int mode, int policy, int peer_policy, int *ifindex, - bool same_netns) + bool same_netns, int scrub, int peer_scrub) { struct rtnl_handle rth = { .fd = -1 }; struct iplink_req req = {}; @@ -58,6 +60,8 @@ static int create_netkit(int mode, int policy, int peer_policy, int *ifindex, data = addattr_nest(&req.n, sizeof(req), IFLA_INFO_DATA); addattr32(&req.n, sizeof(req), IFLA_NETKIT_POLICY, policy); addattr32(&req.n, sizeof(req), IFLA_NETKIT_PEER_POLICY, peer_policy); + addattr32(&req.n, sizeof(req), IFLA_NETKIT_SCRUB, scrub); + addattr32(&req.n, sizeof(req), IFLA_NETKIT_PEER_SCRUB, peer_scrub); addattr32(&req.n, sizeof(req), IFLA_NETKIT_MODE, mode); addattr_nest_end(&req.n, data); addattr_nest_end(&req.n, linkinfo); @@ -118,9 +122,9 @@ static void destroy_netkit(void) static int __send_icmp(__u32 dest) { + int sock, ret, mark = MARK, prio = PRIO; struct sockaddr_in addr; struct icmphdr icmp; - int sock, ret; ret = write_sysctl("/proc/sys/net/ipv4/ping_group_range", "0 0"); if (!ASSERT_OK(ret, "write_sysctl(net.ipv4.ping_group_range)")) @@ -135,6 +139,15 @@ static int __send_icmp(__u32 dest) if (!ASSERT_OK(ret, "setsockopt(SO_BINDTODEVICE)")) goto out; + ret = setsockopt(sock, SOL_SOCKET, SO_MARK, &mark, sizeof(mark)); + if (!ASSERT_OK(ret, "setsockopt(SO_MARK)")) + goto out; + + ret = setsockopt(sock, SOL_SOCKET, SO_PRIORITY, + &prio, sizeof(prio)); + if (!ASSERT_OK(ret, "setsockopt(SO_PRIORITY)")) + goto out; + memset(&addr, 0, sizeof(addr)); addr.sin_family = AF_INET; addr.sin_addr.s_addr = htonl(dest); @@ -171,7 +184,8 @@ void serial_test_tc_netkit_basic(void) int err, ifindex; err = create_netkit(NETKIT_L2, NETKIT_PASS, NETKIT_PASS, - &ifindex, false); + &ifindex, false, NETKIT_SCRUB_DEFAULT, + NETKIT_SCRUB_DEFAULT); if (err) return; @@ -285,7 +299,8 @@ static void serial_test_tc_netkit_multi_links_target(int mode, int target) int err, ifindex; err = create_netkit(mode, NETKIT_PASS, NETKIT_PASS, - &ifindex, false); + &ifindex, false, NETKIT_SCRUB_DEFAULT, + NETKIT_SCRUB_DEFAULT); if (err) return; @@ -413,7 +428,8 @@ static void serial_test_tc_netkit_multi_opts_target(int mode, int target) int err, ifindex; err = create_netkit(mode, NETKIT_PASS, NETKIT_PASS, - &ifindex, false); + &ifindex, false, NETKIT_SCRUB_DEFAULT, + NETKIT_SCRUB_DEFAULT); if (err) return; @@ -527,7 +543,8 @@ void serial_test_tc_netkit_device(void) int err, ifindex, ifindex2; err = create_netkit(NETKIT_L3, NETKIT_PASS, NETKIT_PASS, - &ifindex, true); + &ifindex, true, NETKIT_SCRUB_DEFAULT, + NETKIT_SCRUB_DEFAULT); if (err) return; @@ -638,7 +655,8 @@ static void serial_test_tc_netkit_neigh_links_target(int mode, int target) int err, ifindex; err = create_netkit(mode, NETKIT_PASS, NETKIT_PASS, - &ifindex, false); + &ifindex, false, NETKIT_SCRUB_DEFAULT, + NETKIT_SCRUB_DEFAULT); if (err) return; @@ -715,7 +733,8 @@ static void serial_test_tc_netkit_pkt_type_mode(int mode) struct bpf_link *link; err = create_netkit(mode, NETKIT_PASS, NETKIT_PASS, - &ifindex, true); + &ifindex, true, NETKIT_SCRUB_DEFAULT, + NETKIT_SCRUB_DEFAULT); if (err) return; @@ -779,3 +798,60 @@ void serial_test_tc_netkit_pkt_type(void) serial_test_tc_netkit_pkt_type_mode(NETKIT_L2); serial_test_tc_netkit_pkt_type_mode(NETKIT_L3); } + +static void serial_test_tc_netkit_scrub_type(int scrub) +{ + LIBBPF_OPTS(bpf_netkit_opts, optl); + struct test_tc_link *skel; + struct bpf_link *link; + int err, ifindex; + + err = create_netkit(NETKIT_L2, NETKIT_PASS, NETKIT_PASS, + &ifindex, false, scrub, scrub); + if (err) + return; + + skel = test_tc_link__open(); + if (!ASSERT_OK_PTR(skel, "skel_open")) + goto cleanup; + + ASSERT_EQ(bpf_program__set_expected_attach_type(skel->progs.tc8, + BPF_NETKIT_PRIMARY), 0, "tc8_attach_type"); + + err = test_tc_link__load(skel); + if (!ASSERT_OK(err, "skel_load")) + goto cleanup; + + assert_mprog_count_ifindex(ifindex, BPF_NETKIT_PRIMARY, 0); + assert_mprog_count_ifindex(ifindex, BPF_NETKIT_PEER, 0); + + ASSERT_EQ(skel->bss->seen_tc8, false, "seen_tc8"); + + link = bpf_program__attach_netkit(skel->progs.tc8, ifindex, &optl); + if (!ASSERT_OK_PTR(link, "link_attach")) + goto cleanup; + + skel->links.tc8 = link; + + assert_mprog_count_ifindex(ifindex, BPF_NETKIT_PRIMARY, 1); + assert_mprog_count_ifindex(ifindex, BPF_NETKIT_PEER, 0); + + tc_skel_reset_all_seen(skel); + ASSERT_EQ(send_icmp(), 0, "icmp_pkt"); + + ASSERT_EQ(skel->bss->seen_tc8, true, "seen_tc8"); + ASSERT_EQ(skel->bss->mark, scrub == NETKIT_SCRUB_NONE ? MARK : 0, "mark"); + ASSERT_EQ(skel->bss->prio, scrub == NETKIT_SCRUB_NONE ? PRIO : 0, "prio"); +cleanup: + test_tc_link__destroy(skel); + + assert_mprog_count_ifindex(ifindex, BPF_NETKIT_PRIMARY, 0); + assert_mprog_count_ifindex(ifindex, BPF_NETKIT_PEER, 0); + destroy_netkit(); +} + +void serial_test_tc_netkit_scrub(void) +{ + serial_test_tc_netkit_scrub_type(NETKIT_SCRUB_DEFAULT); + serial_test_tc_netkit_scrub_type(NETKIT_SCRUB_NONE); +} diff --git a/tools/testing/selftests/bpf/prog_tests/test_csum_diff.c b/tools/testing/selftests/bpf/prog_tests/test_csum_diff.c new file mode 100644 index 000000000000..107b20d43e83 --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/test_csum_diff.c @@ -0,0 +1,408 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright Amazon.com Inc. or its affiliates */ +#include <test_progs.h> +#include "csum_diff_test.skel.h" + +#define BUFF_SZ 512 + +struct testcase { + unsigned long long to_buff[BUFF_SZ / 8]; + unsigned int to_buff_len; + unsigned long long from_buff[BUFF_SZ / 8]; + unsigned int from_buff_len; + unsigned short seed; + unsigned short result; +}; + +#define NUM_PUSH_TESTS 4 + +struct testcase push_tests[NUM_PUSH_TESTS] = { + { + .to_buff = { + 0xdeadbeefdeadbeef, + }, + .to_buff_len = 8, + .from_buff = {}, + .from_buff_len = 0, + .seed = 0, + .result = 0x3b3b + }, + { + .to_buff = { + 0xdeadbeefdeadbeef, + 0xbeefdeadbeefdead, + }, + .to_buff_len = 16, + .from_buff = {}, + .from_buff_len = 0, + .seed = 0x1234, + .result = 0x88aa + }, + { + .to_buff = { + 0xdeadbeefdeadbeef, + 0xbeefdeadbeefdead, + }, + .to_buff_len = 15, + .from_buff = {}, + .from_buff_len = 0, + .seed = 0x1234, +#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__ + .result = 0xcaa9 +#else + .result = 0x87fd +#endif + }, + { + .to_buff = { + 0x327b23c66b8b4567, + 0x66334873643c9869, + 0x19495cff74b0dc51, + 0x625558ec2ae8944a, + 0x46e87ccd238e1f29, + 0x507ed7ab3d1b58ba, + 0x41b71efb2eb141f2, + 0x7545e14679e2a9e3, + 0x5bd062c2515f007c, + 0x4db127f812200854, + 0x1f16e9e80216231b, + 0x66ef438d1190cde7, + 0x3352255a140e0f76, + 0x0ded7263109cf92e, + 0x1befd79f7fdcc233, + 0x6b68079a41a7c4c9, + 0x25e45d324e6afb66, + 0x431bd7b7519b500d, + 0x7c83e4583f2dba31, + 0x62bbd95a257130a3, + 0x628c895d436c6125, + 0x721da317333ab105, + 0x2d1d5ae92443a858, + 0x75a2a8d46763845e, + 0x79838cb208edbdab, + 0x0b03e0c64353d0cd, + 0x54e49eb4189a769b, + 0x2ca8861171f32454, + 0x02901d820836c40e, + 0x081386413a95f874, + 0x7c3dbd3d1e7ff521, + 0x6ceaf087737b8ddc, + 0x4516dde922221a70, + 0x614fd4a13006c83e, + 0x5577f8e1419ac241, + 0x05072367440badfc, + 0x77465f013804823e, + 0x5c482a977724c67e, + 0x5e884adc2463b9ea, + 0x2d51779651ead36b, + 0x153ea438580bd78f, + 0x70a64e2a3855585c, + 0x2a487cb06a2342ec, + 0x725a06fb1d4ed43b, + 0x57e4ccaf2cd89a32, + 0x4b588f547a6d8d3c, + 0x6de91b18542289ec, + 0x7644a45c38437fdb, + 0x684a481a32fff902, + 0x749abb43579478fe, + 0x1ba026fa3dc240fb, + 0x75c6c33a79a1deaa, + 0x70c6a52912e685fb, + 0x374a3fe6520eedd1, + 0x23f9c13c4f4ef005, + 0x275ac794649bb77c, + 0x1cf10fd839386575, + 0x235ba861180115be, + 0x354fe9f947398c89, + 0x741226bb15b5af5c, + 0x10233c990d34b6a8, + 0x615740953f6ab60f, + 0x77ae35eb7e0c57b1, + 0x310c50b3579be4f1, + }, + .to_buff_len = 512, + .from_buff = {}, + .from_buff_len = 0, + .seed = 0xffff, + .result = 0xca45 + }, +}; + +#define NUM_PULL_TESTS 4 + +struct testcase pull_tests[NUM_PULL_TESTS] = { + { + .from_buff = { + 0xdeadbeefdeadbeef, + }, + .from_buff_len = 8, + .to_buff = {}, + .to_buff_len = 0, + .seed = 0, + .result = 0xc4c4 + }, + { + .from_buff = { + 0xdeadbeefdeadbeef, + 0xbeefdeadbeefdead, + }, + .from_buff_len = 16, + .to_buff = {}, + .to_buff_len = 0, + .seed = 0x1234, + .result = 0x9bbd + }, + { + .from_buff = { + 0xdeadbeefdeadbeef, + 0xbeefdeadbeefdead, + }, + .from_buff_len = 15, + .to_buff = {}, + .to_buff_len = 0, + .seed = 0x1234, +#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__ + .result = 0x59be +#else + .result = 0x9c6a +#endif + }, + { + .from_buff = { + 0x327b23c66b8b4567, + 0x66334873643c9869, + 0x19495cff74b0dc51, + 0x625558ec2ae8944a, + 0x46e87ccd238e1f29, + 0x507ed7ab3d1b58ba, + 0x41b71efb2eb141f2, + 0x7545e14679e2a9e3, + 0x5bd062c2515f007c, + 0x4db127f812200854, + 0x1f16e9e80216231b, + 0x66ef438d1190cde7, + 0x3352255a140e0f76, + 0x0ded7263109cf92e, + 0x1befd79f7fdcc233, + 0x6b68079a41a7c4c9, + 0x25e45d324e6afb66, + 0x431bd7b7519b500d, + 0x7c83e4583f2dba31, + 0x62bbd95a257130a3, + 0x628c895d436c6125, + 0x721da317333ab105, + 0x2d1d5ae92443a858, + 0x75a2a8d46763845e, + 0x79838cb208edbdab, + 0x0b03e0c64353d0cd, + 0x54e49eb4189a769b, + 0x2ca8861171f32454, + 0x02901d820836c40e, + 0x081386413a95f874, + 0x7c3dbd3d1e7ff521, + 0x6ceaf087737b8ddc, + 0x4516dde922221a70, + 0x614fd4a13006c83e, + 0x5577f8e1419ac241, + 0x05072367440badfc, + 0x77465f013804823e, + 0x5c482a977724c67e, + 0x5e884adc2463b9ea, + 0x2d51779651ead36b, + 0x153ea438580bd78f, + 0x70a64e2a3855585c, + 0x2a487cb06a2342ec, + 0x725a06fb1d4ed43b, + 0x57e4ccaf2cd89a32, + 0x4b588f547a6d8d3c, + 0x6de91b18542289ec, + 0x7644a45c38437fdb, + 0x684a481a32fff902, + 0x749abb43579478fe, + 0x1ba026fa3dc240fb, + 0x75c6c33a79a1deaa, + 0x70c6a52912e685fb, + 0x374a3fe6520eedd1, + 0x23f9c13c4f4ef005, + 0x275ac794649bb77c, + 0x1cf10fd839386575, + 0x235ba861180115be, + 0x354fe9f947398c89, + 0x741226bb15b5af5c, + 0x10233c990d34b6a8, + 0x615740953f6ab60f, + 0x77ae35eb7e0c57b1, + 0x310c50b3579be4f1, + }, + .from_buff_len = 512, + .to_buff = {}, + .to_buff_len = 0, + .seed = 0xffff, + .result = 0x35ba + }, +}; + +#define NUM_DIFF_TESTS 4 + +struct testcase diff_tests[NUM_DIFF_TESTS] = { + { + .from_buff = { + 0xdeadbeefdeadbeef, + }, + .from_buff_len = 8, + .to_buff = { + 0xabababababababab, + }, + .to_buff_len = 8, + .seed = 0, + .result = 0x7373 + }, + { + .from_buff = { + 0xdeadbeefdeadbeef, + }, + .from_buff_len = 7, + .to_buff = { + 0xabababababababab, + }, + .to_buff_len = 7, + .seed = 0, +#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__ + .result = 0xa673 +#else + .result = 0x73b7 +#endif + }, + { + .from_buff = { + 0, + }, + .from_buff_len = 8, + .to_buff = { + 0xabababababababab, + }, + .to_buff_len = 8, + .seed = 0, + .result = 0xaeae + }, + { + .from_buff = { + 0xdeadbeefdeadbeef + }, + .from_buff_len = 8, + .to_buff = { + 0, + }, + .to_buff_len = 8, + .seed = 0xffff, + .result = 0xc4c4 + }, +}; + +#define NUM_EDGE_TESTS 4 + +struct testcase edge_tests[NUM_EDGE_TESTS] = { + { + .from_buff = {}, + .from_buff_len = 0, + .to_buff = {}, + .to_buff_len = 0, + .seed = 0, + .result = 0 + }, + { + .from_buff = { + 0x1234 + }, + .from_buff_len = 0, + .to_buff = { + 0x1234 + }, + .to_buff_len = 0, + .seed = 0, + .result = 0 + }, + { + .from_buff = {}, + .from_buff_len = 0, + .to_buff = {}, + .to_buff_len = 0, + .seed = 0x1234, + .result = 0x1234 + }, + { + .from_buff = {}, + .from_buff_len = 512, + .to_buff = {}, + .to_buff_len = 0, + .seed = 0xffff, + .result = 0xffff + }, +}; + +static unsigned short trigger_csum_diff(const struct csum_diff_test *skel) +{ + u8 tmp_out[64 << 2] = {}; + u8 tmp_in[64] = {}; + int err; + int pfd; + + LIBBPF_OPTS(bpf_test_run_opts, topts, + .data_in = tmp_in, + .data_size_in = sizeof(tmp_in), + .data_out = tmp_out, + .data_size_out = sizeof(tmp_out), + .repeat = 1, + ); + pfd = bpf_program__fd(skel->progs.compute_checksum); + err = bpf_prog_test_run_opts(pfd, &topts); + if (err) + return -1; + + return skel->bss->result; +} + +static void test_csum_diff(struct testcase *tests, int num_tests) +{ + struct csum_diff_test *skel; + unsigned short got; + int err; + + for (int i = 0; i < num_tests; i++) { + skel = csum_diff_test__open(); + if (!ASSERT_OK_PTR(skel, "csum_diff_test open")) + return; + + skel->rodata->to_buff_len = tests[i].to_buff_len; + skel->rodata->from_buff_len = tests[i].from_buff_len; + + err = csum_diff_test__load(skel); + if (!ASSERT_EQ(err, 0, "csum_diff_test load")) + goto out; + + memcpy(skel->bss->to_buff, tests[i].to_buff, tests[i].to_buff_len); + memcpy(skel->bss->from_buff, tests[i].from_buff, tests[i].from_buff_len); + skel->bss->seed = tests[i].seed; + + got = trigger_csum_diff(skel); + ASSERT_EQ(got, tests[i].result, "csum_diff result"); + + csum_diff_test__destroy(skel); + } + + return; +out: + csum_diff_test__destroy(skel); +} + +void test_test_csum_diff(void) +{ + if (test__start_subtest("csum_diff_push")) + test_csum_diff(push_tests, NUM_PUSH_TESTS); + if (test__start_subtest("csum_diff_pull")) + test_csum_diff(pull_tests, NUM_PULL_TESTS); + if (test__start_subtest("csum_diff_diff")) + test_csum_diff(diff_tests, NUM_DIFF_TESTS); + if (test__start_subtest("csum_diff_edge")) + test_csum_diff(edge_tests, NUM_EDGE_TESTS); +} diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_cpumap_attach.c b/tools/testing/selftests/bpf/prog_tests/xdp_cpumap_attach.c index 481626a875d1..c7f74f068e78 100644 --- a/tools/testing/selftests/bpf/prog_tests/xdp_cpumap_attach.c +++ b/tools/testing/selftests/bpf/prog_tests/xdp_cpumap_attach.c @@ -2,35 +2,41 @@ #include <uapi/linux/bpf.h> #include <linux/if_link.h> #include <test_progs.h> +#include <network_helpers.h> #include "test_xdp_with_cpumap_frags_helpers.skel.h" #include "test_xdp_with_cpumap_helpers.skel.h" #define IFINDEX_LO 1 +#define TEST_NS "cpu_attach_ns" static void test_xdp_with_cpumap_helpers(void) { - struct test_xdp_with_cpumap_helpers *skel; + struct test_xdp_with_cpumap_helpers *skel = NULL; struct bpf_prog_info info = {}; __u32 len = sizeof(info); struct bpf_cpumap_val val = { .qsize = 192, }; - int err, prog_fd, map_fd; + int err, prog_fd, prog_redir_fd, map_fd; + struct nstoken *nstoken = NULL; __u32 idx = 0; + SYS(out_close, "ip netns add %s", TEST_NS); + nstoken = open_netns(TEST_NS); + if (!ASSERT_OK_PTR(nstoken, "open_netns")) + goto out_close; + SYS(out_close, "ip link set dev lo up"); + skel = test_xdp_with_cpumap_helpers__open_and_load(); if (!ASSERT_OK_PTR(skel, "test_xdp_with_cpumap_helpers__open_and_load")) return; - prog_fd = bpf_program__fd(skel->progs.xdp_redir_prog); - err = bpf_xdp_attach(IFINDEX_LO, prog_fd, XDP_FLAGS_SKB_MODE, NULL); + prog_redir_fd = bpf_program__fd(skel->progs.xdp_redir_prog); + err = bpf_xdp_attach(IFINDEX_LO, prog_redir_fd, XDP_FLAGS_SKB_MODE, NULL); if (!ASSERT_OK(err, "Generic attach of program with 8-byte CPUMAP")) goto out_close; - err = bpf_xdp_detach(IFINDEX_LO, XDP_FLAGS_SKB_MODE, NULL); - ASSERT_OK(err, "XDP program detach"); - prog_fd = bpf_program__fd(skel->progs.xdp_dummy_cm); map_fd = bpf_map__fd(skel->maps.cpu_map); err = bpf_prog_get_info_by_fd(prog_fd, &info, &len); @@ -45,6 +51,26 @@ static void test_xdp_with_cpumap_helpers(void) ASSERT_OK(err, "Read cpumap entry"); ASSERT_EQ(info.id, val.bpf_prog.id, "Match program id to cpumap entry prog_id"); + /* send a packet to trigger any potential bugs in there */ + char data[10] = {}; + DECLARE_LIBBPF_OPTS(bpf_test_run_opts, opts, + .data_in = &data, + .data_size_in = 10, + .flags = BPF_F_TEST_XDP_LIVE_FRAMES, + .repeat = 1, + ); + err = bpf_prog_test_run_opts(prog_redir_fd, &opts); + ASSERT_OK(err, "XDP test run"); + + /* wait for the packets to be flushed, then check that redirect has been + * performed + */ + kern_sync_rcu(); + ASSERT_NEQ(skel->bss->redirect_count, 0, "redirected packets"); + + err = bpf_xdp_detach(IFINDEX_LO, XDP_FLAGS_SKB_MODE, NULL); + ASSERT_OK(err, "XDP program detach"); + /* can not attach BPF_XDP_CPUMAP program to a device */ err = bpf_xdp_attach(IFINDEX_LO, prog_fd, XDP_FLAGS_SKB_MODE, NULL); if (!ASSERT_NEQ(err, 0, "Attach of BPF_XDP_CPUMAP program")) @@ -65,6 +91,8 @@ static void test_xdp_with_cpumap_helpers(void) ASSERT_NEQ(err, 0, "Add BPF_XDP program with frags to cpumap entry"); out_close: + close_netns(nstoken); + SYS_NOFAIL("ip netns del %s", TEST_NS); test_xdp_with_cpumap_helpers__destroy(skel); } @@ -111,7 +139,7 @@ out_close: test_xdp_with_cpumap_frags_helpers__destroy(skel); } -void serial_test_xdp_cpumap_attach(void) +void test_xdp_cpumap_attach(void) { if (test__start_subtest("CPUMAP with programs in entries")) test_xdp_with_cpumap_helpers(); diff --git a/tools/testing/selftests/bpf/progs/csum_diff_test.c b/tools/testing/selftests/bpf/progs/csum_diff_test.c new file mode 100644 index 000000000000..9438f1773a58 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/csum_diff_test.c @@ -0,0 +1,42 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright Amazon.com Inc. or its affiliates */ +#include <linux/types.h> +#include <linux/bpf.h> +#include <bpf/bpf_helpers.h> +#include <bpf/bpf_tracing.h> + +#define BUFF_SZ 512 + +/* Will be updated by benchmark before program loading */ +char to_buff[BUFF_SZ]; +const volatile unsigned int to_buff_len = 0; +char from_buff[BUFF_SZ]; +const volatile unsigned int from_buff_len = 0; +unsigned short seed = 0; + +short result; + +char _license[] SEC("license") = "GPL"; + +SEC("tc") +int compute_checksum(void *ctx) +{ + int to_len_half = to_buff_len / 2; + int from_len_half = from_buff_len / 2; + short result2; + + /* Calculate checksum in one go */ + result2 = bpf_csum_diff((void *)from_buff, from_buff_len, + (void *)to_buff, to_buff_len, seed); + + /* Calculate checksum by concatenating bpf_csum_diff()*/ + result = bpf_csum_diff((void *)from_buff, from_buff_len - from_len_half, + (void *)to_buff, to_buff_len - to_len_half, seed); + + result = bpf_csum_diff((void *)from_buff + (from_buff_len - from_len_half), from_len_half, + (void *)to_buff + (to_buff_len - to_len_half), to_len_half, result); + + result = (result == result2) ? result : 0; + + return 0; +} diff --git a/tools/testing/selftests/bpf/progs/mptcp_bpf.h b/tools/testing/selftests/bpf/progs/mptcp_bpf.h new file mode 100644 index 000000000000..3b188ccdcc40 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/mptcp_bpf.h @@ -0,0 +1,42 @@ +/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */ +#ifndef __MPTCP_BPF_H__ +#define __MPTCP_BPF_H__ + +#include "bpf_experimental.h" + +/* list helpers from include/linux/list.h */ +static inline int list_is_head(const struct list_head *list, + const struct list_head *head) +{ + return list == head; +} + +#define list_entry(ptr, type, member) \ + container_of(ptr, type, member) + +#define list_first_entry(ptr, type, member) \ + list_entry((ptr)->next, type, member) + +#define list_next_entry(pos, member) \ + list_entry((pos)->member.next, typeof(*(pos)), member) + +#define list_entry_is_head(pos, head, member) \ + list_is_head(&pos->member, (head)) + +/* small difference: 'can_loop' has been added in the conditions */ +#define list_for_each_entry(pos, head, member) \ + for (pos = list_first_entry(head, typeof(*pos), member); \ + !list_entry_is_head(pos, head, member) && can_loop; \ + pos = list_next_entry(pos, member)) + +/* mptcp helpers from protocol.h */ +#define mptcp_for_each_subflow(__msk, __subflow) \ + list_for_each_entry(__subflow, &((__msk)->conn_list), node) + +static __always_inline struct sock * +mptcp_subflow_tcp_sock(const struct mptcp_subflow_context *subflow) +{ + return subflow->tcp_sock; +} + +#endif diff --git a/tools/testing/selftests/bpf/progs/mptcp_subflow.c b/tools/testing/selftests/bpf/progs/mptcp_subflow.c new file mode 100644 index 000000000000..70302477e326 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/mptcp_subflow.c @@ -0,0 +1,128 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2020, Tessares SA. */ +/* Copyright (c) 2024, Kylin Software */ + +/* vmlinux.h, bpf_helpers.h and other 'define' */ +#include "bpf_tracing_net.h" +#include "mptcp_bpf.h" + +char _license[] SEC("license") = "GPL"; + +char cc[TCP_CA_NAME_MAX] = "reno"; +int pid; + +/* Associate a subflow counter to each token */ +struct { + __uint(type, BPF_MAP_TYPE_HASH); + __uint(key_size, sizeof(__u32)); + __uint(value_size, sizeof(__u32)); + __uint(max_entries, 100); +} mptcp_sf SEC(".maps"); + +SEC("sockops") +int mptcp_subflow(struct bpf_sock_ops *skops) +{ + __u32 init = 1, key, mark, *cnt; + struct mptcp_sock *msk; + struct bpf_sock *sk; + int err; + + if (skops->op != BPF_SOCK_OPS_TCP_CONNECT_CB) + return 1; + + sk = skops->sk; + if (!sk) + return 1; + + msk = bpf_skc_to_mptcp_sock(sk); + if (!msk) + return 1; + + key = msk->token; + cnt = bpf_map_lookup_elem(&mptcp_sf, &key); + if (cnt) { + /* A new subflow is added to an existing MPTCP connection */ + __sync_fetch_and_add(cnt, 1); + mark = *cnt; + } else { + /* A new MPTCP connection is just initiated and this is its primary subflow */ + bpf_map_update_elem(&mptcp_sf, &key, &init, BPF_ANY); + mark = init; + } + + /* Set the mark of the subflow's socket based on appearance order */ + err = bpf_setsockopt(skops, SOL_SOCKET, SO_MARK, &mark, sizeof(mark)); + if (err < 0) + return 1; + if (mark == 2) + err = bpf_setsockopt(skops, SOL_TCP, TCP_CONGESTION, cc, TCP_CA_NAME_MAX); + + return 1; +} + +static int _check_getsockopt_subflow_mark(struct mptcp_sock *msk, struct bpf_sockopt *ctx) +{ + struct mptcp_subflow_context *subflow; + int i = 0; + + mptcp_for_each_subflow(msk, subflow) { + struct sock *ssk; + + ssk = mptcp_subflow_tcp_sock(bpf_core_cast(subflow, + struct mptcp_subflow_context)); + + if (ssk->sk_mark != ++i) { + ctx->retval = -2; + break; + } + } + + return 1; +} + +static int _check_getsockopt_subflow_cc(struct mptcp_sock *msk, struct bpf_sockopt *ctx) +{ + struct mptcp_subflow_context *subflow; + + mptcp_for_each_subflow(msk, subflow) { + struct inet_connection_sock *icsk; + struct sock *ssk; + + ssk = mptcp_subflow_tcp_sock(bpf_core_cast(subflow, + struct mptcp_subflow_context)); + icsk = bpf_core_cast(ssk, struct inet_connection_sock); + + if (ssk->sk_mark == 2 && + __builtin_memcmp(icsk->icsk_ca_ops->name, cc, TCP_CA_NAME_MAX)) { + ctx->retval = -2; + break; + } + } + + return 1; +} + +SEC("cgroup/getsockopt") +int _getsockopt_subflow(struct bpf_sockopt *ctx) +{ + struct bpf_sock *sk = ctx->sk; + struct mptcp_sock *msk; + + if (bpf_get_current_pid_tgid() >> 32 != pid) + return 1; + + if (!sk || sk->protocol != IPPROTO_MPTCP || + (!(ctx->level == SOL_SOCKET && ctx->optname == SO_MARK) && + !(ctx->level == SOL_TCP && ctx->optname == TCP_CONGESTION))) + return 1; + + msk = bpf_core_cast(sk, struct mptcp_sock); + if (msk->pm.subflows != 1) { + ctx->retval = -1; + return 1; + } + + if (ctx->optname == SO_MARK) + return _check_getsockopt_subflow_mark(msk, ctx); + return _check_getsockopt_subflow_cc(msk, ctx); +} diff --git a/tools/testing/selftests/bpf/progs/netns_cookie_prog.c b/tools/testing/selftests/bpf/progs/netns_cookie_prog.c index aeff3a4f9287..c6edf8dbefeb 100644 --- a/tools/testing/selftests/bpf/progs/netns_cookie_prog.c +++ b/tools/testing/selftests/bpf/progs/netns_cookie_prog.c @@ -27,6 +27,8 @@ struct { __type(value, __u64); } sock_map SEC(".maps"); +int tcx_init_netns_cookie, tcx_netns_cookie; + SEC("sockops") int get_netns_cookie_sockops(struct bpf_sock_ops *ctx) { @@ -81,4 +83,12 @@ int get_netns_cookie_sk_msg(struct sk_msg_md *msg) return 1; } +SEC("tcx/ingress") +int get_netns_cookie_tcx(struct __sk_buff *skb) +{ + tcx_init_netns_cookie = bpf_get_netns_cookie(NULL); + tcx_netns_cookie = bpf_get_netns_cookie(skb); + return TCX_PASS; +} + char _license[] SEC("license") = "GPL"; diff --git a/tools/testing/selftests/bpf/progs/test_btf_skc_cls_ingress.c b/tools/testing/selftests/bpf/progs/test_btf_skc_cls_ingress.c index f0759efff6ef..1cd1a1b72cb5 100644 --- a/tools/testing/selftests/bpf/progs/test_btf_skc_cls_ingress.c +++ b/tools/testing/selftests/bpf/progs/test_btf_skc_cls_ingress.c @@ -10,16 +10,18 @@ #endif struct sockaddr_in6 srv_sa6 = {}; +struct sockaddr_in srv_sa4 = {}; __u16 listen_tp_sport = 0; __u16 req_sk_sport = 0; __u32 recv_cookie = 0; __u32 gen_cookie = 0; +__u32 mss = 0; __u32 linum = 0; #define LOG() ({ if (!linum) linum = __LINE__; }) -static void test_syncookie_helper(struct ipv6hdr *ip6h, struct tcphdr *th, - struct tcp_sock *tp, +static void test_syncookie_helper(void *iphdr, int iphdr_size, + struct tcphdr *th, struct tcp_sock *tp, struct __sk_buff *skb) { if (th->syn) { @@ -38,17 +40,18 @@ static void test_syncookie_helper(struct ipv6hdr *ip6h, struct tcphdr *th, return; } - mss_cookie = bpf_tcp_gen_syncookie(tp, ip6h, sizeof(*ip6h), + mss_cookie = bpf_tcp_gen_syncookie(tp, iphdr, iphdr_size, th, 40); if (mss_cookie < 0) { if (mss_cookie != -ENOENT) LOG(); } else { gen_cookie = (__u32)mss_cookie; + mss = mss_cookie >> 32; } } else if (gen_cookie) { /* It was in cookie mode */ - int ret = bpf_tcp_check_syncookie(tp, ip6h, sizeof(*ip6h), + int ret = bpf_tcp_check_syncookie(tp, iphdr, iphdr_size, th, sizeof(*th)); if (ret < 0) { @@ -60,26 +63,58 @@ static void test_syncookie_helper(struct ipv6hdr *ip6h, struct tcphdr *th, } } -static int handle_ip6_tcp(struct ipv6hdr *ip6h, struct __sk_buff *skb) +static int handle_ip_tcp(struct ethhdr *eth, struct __sk_buff *skb) { - struct bpf_sock_tuple *tuple; + struct bpf_sock_tuple *tuple = NULL; + unsigned int tuple_len = 0; struct bpf_sock *bpf_skc; - unsigned int tuple_len; + void *data_end, *iphdr; + struct ipv6hdr *ip6h; + struct iphdr *ip4h; struct tcphdr *th; - void *data_end; + int iphdr_size; data_end = (void *)(long)(skb->data_end); - th = (struct tcphdr *)(ip6h + 1); - if (th + 1 > data_end) - return TC_ACT_OK; - - /* Is it the testing traffic? */ - if (th->dest != srv_sa6.sin6_port) + switch (eth->h_proto) { + case bpf_htons(ETH_P_IP): + ip4h = (struct iphdr *)(eth + 1); + if (ip4h + 1 > data_end) + return TC_ACT_OK; + if (ip4h->protocol != IPPROTO_TCP) + return TC_ACT_OK; + th = (struct tcphdr *)(ip4h + 1); + if (th + 1 > data_end) + return TC_ACT_OK; + /* Is it the testing traffic? */ + if (th->dest != srv_sa4.sin_port) + return TC_ACT_OK; + tuple_len = sizeof(tuple->ipv4); + tuple = (struct bpf_sock_tuple *)&ip4h->saddr; + iphdr = ip4h; + iphdr_size = sizeof(*ip4h); + break; + case bpf_htons(ETH_P_IPV6): + ip6h = (struct ipv6hdr *)(eth + 1); + if (ip6h + 1 > data_end) + return TC_ACT_OK; + if (ip6h->nexthdr != IPPROTO_TCP) + return TC_ACT_OK; + th = (struct tcphdr *)(ip6h + 1); + if (th + 1 > data_end) + return TC_ACT_OK; + /* Is it the testing traffic? */ + if (th->dest != srv_sa6.sin6_port) + return TC_ACT_OK; + tuple_len = sizeof(tuple->ipv6); + tuple = (struct bpf_sock_tuple *)&ip6h->saddr; + iphdr = ip6h; + iphdr_size = sizeof(*ip6h); + break; + default: return TC_ACT_OK; + } - tuple_len = sizeof(tuple->ipv6); - tuple = (struct bpf_sock_tuple *)&ip6h->saddr; if ((void *)tuple + tuple_len > data_end) { LOG(); return TC_ACT_OK; @@ -126,7 +161,7 @@ static int handle_ip6_tcp(struct ipv6hdr *ip6h, struct __sk_buff *skb) listen_tp_sport = tp->inet_conn.icsk_inet.sk.__sk_common.skc_num; - test_syncookie_helper(ip6h, th, tp, skb); + test_syncookie_helper(iphdr, iphdr_size, th, tp, skb); bpf_sk_release(tp); return TC_ACT_OK; } @@ -142,7 +177,6 @@ release: SEC("tc") int cls_ingress(struct __sk_buff *skb) { - struct ipv6hdr *ip6h; struct ethhdr *eth; void *data_end; @@ -152,17 +186,11 @@ int cls_ingress(struct __sk_buff *skb) if (eth + 1 > data_end) return TC_ACT_OK; - if (eth->h_proto != bpf_htons(ETH_P_IPV6)) - return TC_ACT_OK; - - ip6h = (struct ipv6hdr *)(eth + 1); - if (ip6h + 1 > data_end) + if (eth->h_proto != bpf_htons(ETH_P_IP) && + eth->h_proto != bpf_htons(ETH_P_IPV6)) return TC_ACT_OK; - if (ip6h->nexthdr == IPPROTO_TCP) - return handle_ip6_tcp(ip6h, skb); - - return TC_ACT_OK; + return handle_ip_tcp(eth, skb); } char _license[] SEC("license") = "GPL"; diff --git a/tools/testing/selftests/bpf/progs/test_tc_link.c b/tools/testing/selftests/bpf/progs/test_tc_link.c index ab3eae3d6af8..10d825928499 100644 --- a/tools/testing/selftests/bpf/progs/test_tc_link.c +++ b/tools/testing/selftests/bpf/progs/test_tc_link.c @@ -18,6 +18,7 @@ bool seen_tc4; bool seen_tc5; bool seen_tc6; bool seen_tc7; +bool seen_tc8; bool set_type; @@ -25,6 +26,8 @@ bool seen_eth; bool seen_host; bool seen_mcast; +int mark, prio; + SEC("tc/ingress") int tc1(struct __sk_buff *skb) { @@ -100,3 +103,12 @@ out: seen_tc7 = true; return TCX_PASS; } + +SEC("tc/egress") +int tc8(struct __sk_buff *skb) +{ + seen_tc8 = true; + mark = skb->mark; + prio = skb->priority; + return TCX_PASS; +} diff --git a/tools/testing/selftests/bpf/progs/test_tcp_check_syncookie_kern.c b/tools/testing/selftests/bpf/progs/test_tcp_check_syncookie_kern.c deleted file mode 100644 index 6edebce563b5..000000000000 --- a/tools/testing/selftests/bpf/progs/test_tcp_check_syncookie_kern.c +++ /dev/null @@ -1,167 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0 -// Copyright (c) 2018 Facebook -// Copyright (c) 2019 Cloudflare - -#include <string.h> - -#include <linux/bpf.h> -#include <linux/pkt_cls.h> -#include <linux/if_ether.h> -#include <linux/in.h> -#include <linux/ip.h> -#include <linux/ipv6.h> -#include <sys/socket.h> -#include <linux/tcp.h> - -#include <bpf/bpf_helpers.h> -#include <bpf/bpf_endian.h> - -struct { - __uint(type, BPF_MAP_TYPE_ARRAY); - __type(key, __u32); - __type(value, __u32); - __uint(max_entries, 3); -} results SEC(".maps"); - -static __always_inline __s64 gen_syncookie(void *data_end, struct bpf_sock *sk, - void *iph, __u32 ip_size, - struct tcphdr *tcph) -{ - __u32 thlen = tcph->doff * 4; - - if (tcph->syn && !tcph->ack) { - // packet should only have an MSS option - if (thlen != 24) - return 0; - - if ((void *)tcph + thlen > data_end) - return 0; - - return bpf_tcp_gen_syncookie(sk, iph, ip_size, tcph, thlen); - } - return 0; -} - -static __always_inline void check_syncookie(void *ctx, void *data, - void *data_end) -{ - struct bpf_sock_tuple tup; - struct bpf_sock *sk; - struct ethhdr *ethh; - struct iphdr *ipv4h; - struct ipv6hdr *ipv6h; - struct tcphdr *tcph; - int ret; - __u32 key_mss = 2; - __u32 key_gen = 1; - __u32 key = 0; - __s64 seq_mss; - - ethh = data; - if (ethh + 1 > data_end) - return; - - switch (bpf_ntohs(ethh->h_proto)) { - case ETH_P_IP: - ipv4h = data + sizeof(struct ethhdr); - if (ipv4h + 1 > data_end) - return; - - if (ipv4h->ihl != 5) - return; - - tcph = data + sizeof(struct ethhdr) + sizeof(struct iphdr); - if (tcph + 1 > data_end) - return; - - tup.ipv4.saddr = ipv4h->saddr; - tup.ipv4.daddr = ipv4h->daddr; - tup.ipv4.sport = tcph->source; - tup.ipv4.dport = tcph->dest; - - sk = bpf_skc_lookup_tcp(ctx, &tup, sizeof(tup.ipv4), - BPF_F_CURRENT_NETNS, 0); - if (!sk) - return; - - if (sk->state != BPF_TCP_LISTEN) - goto release; - - seq_mss = gen_syncookie(data_end, sk, ipv4h, sizeof(*ipv4h), - tcph); - - ret = bpf_tcp_check_syncookie(sk, ipv4h, sizeof(*ipv4h), - tcph, sizeof(*tcph)); - break; - - case ETH_P_IPV6: - ipv6h = data + sizeof(struct ethhdr); - if (ipv6h + 1 > data_end) - return; - - if (ipv6h->nexthdr != IPPROTO_TCP) - return; - - tcph = data + sizeof(struct ethhdr) + sizeof(struct ipv6hdr); - if (tcph + 1 > data_end) - return; - - memcpy(tup.ipv6.saddr, &ipv6h->saddr, sizeof(tup.ipv6.saddr)); - memcpy(tup.ipv6.daddr, &ipv6h->daddr, sizeof(tup.ipv6.daddr)); - tup.ipv6.sport = tcph->source; - tup.ipv6.dport = tcph->dest; - - sk = bpf_skc_lookup_tcp(ctx, &tup, sizeof(tup.ipv6), - BPF_F_CURRENT_NETNS, 0); - if (!sk) - return; - - if (sk->state != BPF_TCP_LISTEN) - goto release; - - seq_mss = gen_syncookie(data_end, sk, ipv6h, sizeof(*ipv6h), - tcph); - - ret = bpf_tcp_check_syncookie(sk, ipv6h, sizeof(*ipv6h), - tcph, sizeof(*tcph)); - break; - - default: - return; - } - - if (seq_mss > 0) { - __u32 cookie = (__u32)seq_mss; - __u32 mss = seq_mss >> 32; - - bpf_map_update_elem(&results, &key_gen, &cookie, 0); - bpf_map_update_elem(&results, &key_mss, &mss, 0); - } - - if (ret == 0) { - __u32 cookie = bpf_ntohl(tcph->ack_seq) - 1; - - bpf_map_update_elem(&results, &key, &cookie, 0); - } - -release: - bpf_sk_release(sk); -} - -SEC("tc") -int check_syncookie_clsact(struct __sk_buff *skb) -{ - check_syncookie(skb, (void *)(long)skb->data, - (void *)(long)skb->data_end); - return TC_ACT_OK; -} - -SEC("xdp") -int check_syncookie_xdp(struct xdp_md *ctx) -{ - check_syncookie(ctx, (void *)(long)ctx->data, - (void *)(long)ctx->data_end); - return XDP_PASS; -} - -char _license[] SEC("license") = "GPL"; diff --git a/tools/testing/selftests/bpf/progs/test_xdp_with_cpumap_helpers.c b/tools/testing/selftests/bpf/progs/test_xdp_with_cpumap_helpers.c index 20ec6723df18..3619239b01b7 100644 --- a/tools/testing/selftests/bpf/progs/test_xdp_with_cpumap_helpers.c +++ b/tools/testing/selftests/bpf/progs/test_xdp_with_cpumap_helpers.c @@ -12,10 +12,12 @@ struct { __uint(max_entries, 4); } cpu_map SEC(".maps"); +__u32 redirect_count = 0; + SEC("xdp") int xdp_redir_prog(struct xdp_md *ctx) { - return bpf_redirect_map(&cpu_map, 1, 0); + return bpf_redirect_map(&cpu_map, 0, 0); } SEC("xdp") @@ -27,6 +29,9 @@ int xdp_dummy_prog(struct xdp_md *ctx) SEC("xdp/cpumap") int xdp_dummy_cm(struct xdp_md *ctx) { + if (bpf_get_smp_processor_id() == 0) + redirect_count++; + if (ctx->ingress_ifindex == IFINDEX_LO) return XDP_DROP; diff --git a/tools/testing/selftests/bpf/progs/verifier_array_access.c b/tools/testing/selftests/bpf/progs/verifier_array_access.c index 95d7ecc12963..4195aa824ba5 100644 --- a/tools/testing/selftests/bpf/progs/verifier_array_access.c +++ b/tools/testing/selftests/bpf/progs/verifier_array_access.c @@ -368,8 +368,7 @@ __naked void a_read_only_array_2_1(void) r4 = 0; \ r5 = 0; \ call %[bpf_csum_diff]; \ -l0_%=: r0 &= 0xffff; \ - exit; \ +l0_%=: exit; \ " : : __imm(bpf_csum_diff), __imm(bpf_map_lookup_elem), diff --git a/tools/testing/selftests/bpf/progs/xdp_synproxy_kern.c b/tools/testing/selftests/bpf/progs/xdp_synproxy_kern.c index f8f5dc9f72b8..62b8e29ced9f 100644 --- a/tools/testing/selftests/bpf/progs/xdp_synproxy_kern.c +++ b/tools/testing/selftests/bpf/progs/xdp_synproxy_kern.c @@ -21,7 +21,6 @@ #define tcp_flag_word(tp) (((union tcp_word_hdr *)(tp))->words[3]) -#define IP_DF 0x4000 #define IP_MF 0x2000 #define IP_OFFSET 0x1fff @@ -442,7 +441,7 @@ static __always_inline int tcp_lookup(void *ctx, struct header_pointers *hdr, bo /* TCP doesn't normally use fragments, and XDP can't reassemble * them. */ - if ((hdr->ipv4->frag_off & bpf_htons(IP_DF | IP_MF | IP_OFFSET)) != bpf_htons(IP_DF)) + if ((hdr->ipv4->frag_off & bpf_htons(IP_MF | IP_OFFSET)) != 0) return XDP_DROP; tup.ipv4.saddr = hdr->ipv4->saddr; diff --git a/tools/testing/selftests/bpf/test_sockmap.c b/tools/testing/selftests/bpf/test_sockmap.c index 3e02d7267de8..e5c7ecbe57e3 100644 --- a/tools/testing/selftests/bpf/test_sockmap.c +++ b/tools/testing/selftests/bpf/test_sockmap.c @@ -56,6 +56,8 @@ static void running_handler(int a); #define BPF_SOCKHASH_FILENAME "test_sockhash_kern.bpf.o" #define CG_PATH "/sockmap" +#define EDATAINTEGRITY 2001 + /* global sockets */ int s1, s2, c1, c2, p1, p2; int test_cnt; @@ -86,6 +88,10 @@ int ktls; int peek_flag; int skb_use_parser; int txmsg_omit_skb_parser; +int verify_push_start; +int verify_push_len; +int verify_pop_start; +int verify_pop_len; static const struct option long_options[] = { {"help", no_argument, NULL, 'h' }, @@ -418,16 +424,18 @@ static int msg_loop_sendpage(int fd, int iov_length, int cnt, { bool drop = opt->drop_expected; unsigned char k = 0; + int i, j, fp; FILE *file; - int i, fp; file = tmpfile(); if (!file) { perror("create file for sendpage"); return 1; } - for (i = 0; i < iov_length * cnt; i++, k++) - fwrite(&k, sizeof(char), 1, file); + for (i = 0; i < cnt; i++, k = 0) { + for (j = 0; j < iov_length; j++, k++) + fwrite(&k, sizeof(char), 1, file); + } fflush(file); fseek(file, 0, SEEK_SET); @@ -510,42 +518,111 @@ unwind_iov: return -ENOMEM; } -static int msg_verify_data(struct msghdr *msg, int size, int chunk_sz) +/* In push or pop test, we need to do some calculations for msg_verify_data */ +static void msg_verify_date_prep(void) { - int i, j = 0, bytes_cnt = 0; - unsigned char k = 0; + int push_range_end = txmsg_start_push + txmsg_end_push - 1; + int pop_range_end = txmsg_start_pop + txmsg_pop - 1; + + if (txmsg_end_push && txmsg_pop && + txmsg_start_push <= pop_range_end && txmsg_start_pop <= push_range_end) { + /* The push range and the pop range overlap */ + int overlap_len; + + verify_push_start = txmsg_start_push; + verify_pop_start = txmsg_start_pop; + if (txmsg_start_push < txmsg_start_pop) + overlap_len = min(push_range_end - txmsg_start_pop + 1, txmsg_pop); + else + overlap_len = min(pop_range_end - txmsg_start_push + 1, txmsg_end_push); + verify_push_len = max(txmsg_end_push - overlap_len, 0); + verify_pop_len = max(txmsg_pop - overlap_len, 0); + } else { + /* Otherwise */ + verify_push_start = txmsg_start_push; + verify_pop_start = txmsg_start_pop; + verify_push_len = txmsg_end_push; + verify_pop_len = txmsg_pop; + } +} + +static int msg_verify_data(struct msghdr *msg, int size, int chunk_sz, + unsigned char *k_p, int *bytes_cnt_p, + int *check_cnt_p, int *push_p) +{ + int bytes_cnt = *bytes_cnt_p, check_cnt = *check_cnt_p, push = *push_p; + unsigned char k = *k_p; + int i, j; - for (i = 0; i < msg->msg_iovlen; i++) { + for (i = 0, j = 0; i < msg->msg_iovlen && size; i++, j = 0) { unsigned char *d = msg->msg_iov[i].iov_base; /* Special case test for skb ingress + ktls */ if (i == 0 && txmsg_ktls_skb) { if (msg->msg_iov[i].iov_len < 4) - return -EIO; + return -EDATAINTEGRITY; if (memcmp(d, "PASS", 4) != 0) { fprintf(stderr, "detected skb data error with skb ingress update @iov[%i]:%i \"%02x %02x %02x %02x\" != \"PASS\"\n", i, 0, d[0], d[1], d[2], d[3]); - return -EIO; + return -EDATAINTEGRITY; } j = 4; /* advance index past PASS header */ } for (; j < msg->msg_iov[i].iov_len && size; j++) { + if (push > 0 && + check_cnt == verify_push_start + verify_push_len - push) { + int skipped; +revisit_push: + skipped = push; + if (j + push >= msg->msg_iov[i].iov_len) + skipped = msg->msg_iov[i].iov_len - j; + push -= skipped; + size -= skipped; + j += skipped - 1; + check_cnt += skipped; + continue; + } + + if (verify_pop_len > 0 && check_cnt == verify_pop_start) { + bytes_cnt += verify_pop_len; + check_cnt += verify_pop_len; + k += verify_pop_len; + + if (bytes_cnt == chunk_sz) { + k = 0; + bytes_cnt = 0; + check_cnt = 0; + push = verify_push_len; + } + + if (push > 0 && + check_cnt == verify_push_start + verify_push_len - push) + goto revisit_push; + } + if (d[j] != k++) { fprintf(stderr, "detected data corruption @iov[%i]:%i %02x != %02x, %02x ?= %02x\n", i, j, d[j], k - 1, d[j+1], k); - return -EIO; + return -EDATAINTEGRITY; } bytes_cnt++; + check_cnt++; if (bytes_cnt == chunk_sz) { k = 0; bytes_cnt = 0; + check_cnt = 0; + push = verify_push_len; } size--; } } + *k_p = k; + *bytes_cnt_p = bytes_cnt; + *check_cnt_p = check_cnt; + *push_p = push; return 0; } @@ -598,10 +675,14 @@ static int msg_loop(int fd, int iov_count, int iov_length, int cnt, } clock_gettime(CLOCK_MONOTONIC, &s->end); } else { + float total_bytes, txmsg_pop_total, txmsg_push_total; int slct, recvp = 0, recv, max_fd = fd; - float total_bytes, txmsg_pop_total; int fd_flags = O_NONBLOCK; struct timeval timeout; + unsigned char k = 0; + int bytes_cnt = 0; + int check_cnt = 0; + int push = 0; fd_set w; fcntl(fd, fd_flags); @@ -615,12 +696,22 @@ static int msg_loop(int fd, int iov_count, int iov_length, int cnt, * This is really only useful for testing edge cases in code * paths. */ - total_bytes = (float)iov_count * (float)iov_length * (float)cnt; - if (txmsg_apply) + total_bytes = (float)iov_length * (float)cnt; + if (!opt->sendpage) + total_bytes *= (float)iov_count; + if (txmsg_apply) { + txmsg_push_total = txmsg_end_push * (total_bytes / txmsg_apply); txmsg_pop_total = txmsg_pop * (total_bytes / txmsg_apply); - else + } else { + txmsg_push_total = txmsg_end_push * cnt; txmsg_pop_total = txmsg_pop * cnt; + } + total_bytes += txmsg_push_total; total_bytes -= txmsg_pop_total; + if (data) { + msg_verify_date_prep(); + push = verify_push_len; + } err = clock_gettime(CLOCK_MONOTONIC, &s->start); if (err < 0) perror("recv start time"); @@ -693,10 +784,11 @@ static int msg_loop(int fd, int iov_count, int iov_length, int cnt, if (data) { int chunk_sz = opt->sendpage ? - iov_length * cnt : + iov_length : iov_length * iov_count; - errno = msg_verify_data(&msg, recv, chunk_sz); + errno = msg_verify_data(&msg, recv, chunk_sz, &k, &bytes_cnt, + &check_cnt, &push); if (errno) { perror("data verify msg failed"); goto out_errno; @@ -704,7 +796,11 @@ static int msg_loop(int fd, int iov_count, int iov_length, int cnt, if (recvp) { errno = msg_verify_data(&msg_peek, recvp, - chunk_sz); + chunk_sz, + &k, + &bytes_cnt, + &check_cnt, + &push); if (errno) { perror("data verify msg_peek failed"); goto out_errno; @@ -786,8 +882,6 @@ static int sendmsg_test(struct sockmap_options *opt) rxpid = fork(); if (rxpid == 0) { - if (txmsg_pop || txmsg_start_pop) - iov_buf -= (txmsg_pop - txmsg_start_pop + 1); if (opt->drop_expected || txmsg_ktls_skb_drop) _exit(0); @@ -812,7 +906,7 @@ static int sendmsg_test(struct sockmap_options *opt) s.bytes_sent, sent_Bps, sent_Bps/giga, s.bytes_recvd, recvd_Bps, recvd_Bps/giga, peek_flag ? "(peek_msg)" : ""); - if (err && txmsg_cork) + if (err && err != -EDATAINTEGRITY && txmsg_cork) err = 0; exit(err ? 1 : 0); } else if (rxpid == -1) { @@ -1456,8 +1550,8 @@ static void test_send_many(struct sockmap_options *opt, int cgrp) static void test_send_large(struct sockmap_options *opt, int cgrp) { - opt->iov_length = 256; - opt->iov_count = 1024; + opt->iov_length = 8192; + opt->iov_count = 32; opt->rate = 2; test_exec(cgrp, opt); } @@ -1586,17 +1680,19 @@ static void test_txmsg_cork_hangs(int cgrp, struct sockmap_options *opt) static void test_txmsg_pull(int cgrp, struct sockmap_options *opt) { /* Test basic start/end */ + txmsg_pass = 1; txmsg_start = 1; txmsg_end = 2; test_send(opt, cgrp); /* Test >4k pull */ + txmsg_pass = 1; txmsg_start = 4096; txmsg_end = 9182; test_send_large(opt, cgrp); /* Test pull + redirect */ - txmsg_redir = 0; + txmsg_redir = 1; txmsg_start = 1; txmsg_end = 2; test_send(opt, cgrp); @@ -1618,12 +1714,16 @@ static void test_txmsg_pull(int cgrp, struct sockmap_options *opt) static void test_txmsg_pop(int cgrp, struct sockmap_options *opt) { + bool data = opt->data_test; + /* Test basic pop */ + txmsg_pass = 1; txmsg_start_pop = 1; txmsg_pop = 2; test_send_many(opt, cgrp); /* Test pop with >4k */ + txmsg_pass = 1; txmsg_start_pop = 4096; txmsg_pop = 4096; test_send_large(opt, cgrp); @@ -1634,6 +1734,12 @@ static void test_txmsg_pop(int cgrp, struct sockmap_options *opt) txmsg_pop = 2; test_send_many(opt, cgrp); + /* TODO: Test for pop + cork should be different, + * - It makes the layout of the received data difficult + * - It makes it hard to calculate the total_bytes in the recvmsg + * Temporarily skip the data integrity test for this case now. + */ + opt->data_test = false; /* Test pop + cork */ txmsg_redir = 0; txmsg_cork = 512; @@ -1647,16 +1753,21 @@ static void test_txmsg_pop(int cgrp, struct sockmap_options *opt) txmsg_start_pop = 1; txmsg_pop = 2; test_send_many(opt, cgrp); + opt->data_test = data; } static void test_txmsg_push(int cgrp, struct sockmap_options *opt) { + bool data = opt->data_test; + /* Test basic push */ + txmsg_pass = 1; txmsg_start_push = 1; txmsg_end_push = 1; test_send(opt, cgrp); /* Test push 4kB >4k */ + txmsg_pass = 1; txmsg_start_push = 4096; txmsg_end_push = 4096; test_send_large(opt, cgrp); @@ -1667,21 +1778,66 @@ static void test_txmsg_push(int cgrp, struct sockmap_options *opt) txmsg_end_push = 2; test_send_many(opt, cgrp); + /* TODO: Test for push + cork should be different, + * - It makes the layout of the received data difficult + * - It makes it hard to calculate the total_bytes in the recvmsg + * Temporarily skip the data integrity test for this case now. + */ + opt->data_test = false; /* Test push + cork */ txmsg_redir = 0; txmsg_cork = 512; txmsg_start_push = 1; txmsg_end_push = 2; test_send_many(opt, cgrp); + opt->data_test = data; } static void test_txmsg_push_pop(int cgrp, struct sockmap_options *opt) { + /* Test push/pop range overlapping */ + txmsg_pass = 1; txmsg_start_push = 1; txmsg_end_push = 10; txmsg_start_pop = 5; txmsg_pop = 4; test_send_large(opt, cgrp); + + txmsg_pass = 1; + txmsg_start_push = 1; + txmsg_end_push = 10; + txmsg_start_pop = 5; + txmsg_pop = 16; + test_send_large(opt, cgrp); + + txmsg_pass = 1; + txmsg_start_push = 5; + txmsg_end_push = 4; + txmsg_start_pop = 1; + txmsg_pop = 10; + test_send_large(opt, cgrp); + + txmsg_pass = 1; + txmsg_start_push = 5; + txmsg_end_push = 16; + txmsg_start_pop = 1; + txmsg_pop = 10; + test_send_large(opt, cgrp); + + /* Test push/pop range non-overlapping */ + txmsg_pass = 1; + txmsg_start_push = 1; + txmsg_end_push = 10; + txmsg_start_pop = 16; + txmsg_pop = 4; + test_send_large(opt, cgrp); + + txmsg_pass = 1; + txmsg_start_push = 16; + txmsg_end_push = 10; + txmsg_start_pop = 5; + txmsg_pop = 4; + test_send_large(opt, cgrp); } static void test_txmsg_apply(int cgrp, struct sockmap_options *opt) diff --git a/tools/testing/selftests/bpf/test_tcp_check_syncookie.sh b/tools/testing/selftests/bpf/test_tcp_check_syncookie.sh deleted file mode 100755 index b42c24282c25..000000000000 --- a/tools/testing/selftests/bpf/test_tcp_check_syncookie.sh +++ /dev/null @@ -1,85 +0,0 @@ -#!/bin/sh -# SPDX-License-Identifier: GPL-2.0 -# Copyright (c) 2018 Facebook -# Copyright (c) 2019 Cloudflare - -set -eu -readonly NS1="ns1-$(mktemp -u XXXXXX)" - -wait_for_ip() -{ - local _i - printf "Wait for IP %s to become available " "$1" - for _i in $(seq ${MAX_PING_TRIES}); do - printf "." - if ns1_exec ping -c 1 -W 1 "$1" >/dev/null 2>&1; then - echo " OK" - return - fi - sleep 1 - done - echo 1>&2 "ERROR: Timeout waiting for test IP to become available." - exit 1 -} - -get_prog_id() -{ - awk '/ id / {sub(/.* id /, "", $0); print($1)}' -} - -ns1_exec() -{ - ip netns exec ${NS1} "$@" -} - -setup() -{ - ip netns add ${NS1} - ns1_exec ip link set lo up - - ns1_exec sysctl -w net.ipv4.tcp_syncookies=2 - ns1_exec sysctl -w net.ipv4.tcp_window_scaling=0 - ns1_exec sysctl -w net.ipv4.tcp_timestamps=0 - ns1_exec sysctl -w net.ipv4.tcp_sack=0 - - wait_for_ip 127.0.0.1 - wait_for_ip ::1 -} - -cleanup() -{ - ip netns del ns1 2>/dev/null || : -} - -main() -{ - trap cleanup EXIT 2 3 6 15 - setup - - printf "Testing clsact..." - ns1_exec tc qdisc add dev "${TEST_IF}" clsact - ns1_exec tc filter add dev "${TEST_IF}" ingress \ - bpf obj "${BPF_PROG_OBJ}" sec "${CLSACT_SECTION}" da - - BPF_PROG_ID=$(ns1_exec tc filter show dev "${TEST_IF}" ingress | \ - get_prog_id) - ns1_exec "${PROG}" "${BPF_PROG_ID}" - ns1_exec tc qdisc del dev "${TEST_IF}" clsact - - printf "Testing XDP..." - ns1_exec ip link set "${TEST_IF}" xdp \ - object "${BPF_PROG_OBJ}" section "${XDP_SECTION}" - BPF_PROG_ID=$(ns1_exec ip link show "${TEST_IF}" | get_prog_id) - ns1_exec "${PROG}" "${BPF_PROG_ID}" -} - -DIR=$(dirname $0) -TEST_IF=lo -MAX_PING_TRIES=5 -BPF_PROG_OBJ="${DIR}/test_tcp_check_syncookie_kern.bpf.o" -CLSACT_SECTION="tc" -XDP_SECTION="xdp" -BPF_PROG_ID=0 -PROG="${DIR}/test_tcp_check_syncookie_user" - -main diff --git a/tools/testing/selftests/bpf/test_tcp_check_syncookie_user.c b/tools/testing/selftests/bpf/test_tcp_check_syncookie_user.c deleted file mode 100644 index 3844f9b8232a..000000000000 --- a/tools/testing/selftests/bpf/test_tcp_check_syncookie_user.c +++ /dev/null @@ -1,213 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0 -// Copyright (c) 2018 Facebook -// Copyright (c) 2019 Cloudflare - -#include <limits.h> -#include <string.h> -#include <stdlib.h> -#include <unistd.h> - -#include <arpa/inet.h> -#include <netinet/in.h> -#include <sys/types.h> -#include <sys/socket.h> - -#include <bpf/bpf.h> -#include <bpf/libbpf.h> - -#include "cgroup_helpers.h" -#include "network_helpers.h" - -static int get_map_fd_by_prog_id(int prog_id, bool *xdp) -{ - struct bpf_prog_info info = {}; - __u32 info_len = sizeof(info); - __u32 map_ids[1]; - int prog_fd = -1; - int map_fd = -1; - - prog_fd = bpf_prog_get_fd_by_id(prog_id); - if (prog_fd < 0) { - log_err("Failed to get fd by prog id %d", prog_id); - goto err; - } - - info.nr_map_ids = 1; - info.map_ids = (__u64)(unsigned long)map_ids; - - if (bpf_prog_get_info_by_fd(prog_fd, &info, &info_len)) { - log_err("Failed to get info by prog fd %d", prog_fd); - goto err; - } - - if (!info.nr_map_ids) { - log_err("No maps found for prog fd %d", prog_fd); - goto err; - } - - *xdp = info.type == BPF_PROG_TYPE_XDP; - - map_fd = bpf_map_get_fd_by_id(map_ids[0]); - if (map_fd < 0) - log_err("Failed to get fd by map id %d", map_ids[0]); -err: - if (prog_fd >= 0) - close(prog_fd); - return map_fd; -} - -static int run_test(int server_fd, int results_fd, bool xdp) -{ - int client = -1, srv_client = -1; - int ret = 0; - __u32 key = 0; - __u32 key_gen = 1; - __u32 key_mss = 2; - __u32 value = 0; - __u32 value_gen = 0; - __u32 value_mss = 0; - - if (bpf_map_update_elem(results_fd, &key, &value, 0) < 0) { - log_err("Can't clear results"); - goto err; - } - - if (bpf_map_update_elem(results_fd, &key_gen, &value_gen, 0) < 0) { - log_err("Can't clear results"); - goto err; - } - - if (bpf_map_update_elem(results_fd, &key_mss, &value_mss, 0) < 0) { - log_err("Can't clear results"); - goto err; - } - - client = connect_to_fd(server_fd, 0); - if (client == -1) - goto err; - - srv_client = accept(server_fd, NULL, 0); - if (srv_client == -1) { - log_err("Can't accept connection"); - goto err; - } - - if (bpf_map_lookup_elem(results_fd, &key, &value) < 0) { - log_err("Can't lookup result"); - goto err; - } - - if (value == 0) { - log_err("Didn't match syncookie: %u", value); - goto err; - } - - if (bpf_map_lookup_elem(results_fd, &key_gen, &value_gen) < 0) { - log_err("Can't lookup result"); - goto err; - } - - if (xdp && value_gen == 0) { - // SYN packets do not get passed through generic XDP, skip the - // rest of the test. - printf("Skipping XDP cookie check\n"); - goto out; - } - - if (bpf_map_lookup_elem(results_fd, &key_mss, &value_mss) < 0) { - log_err("Can't lookup result"); - goto err; - } - - if (value != value_gen) { - log_err("BPF generated cookie does not match kernel one"); - goto err; - } - - if (value_mss < 536 || value_mss > USHRT_MAX) { - log_err("Unexpected MSS retrieved"); - goto err; - } - - goto out; - -err: - ret = 1; -out: - close(client); - close(srv_client); - return ret; -} - -static int v6only_true(int fd, void *opts) -{ - int mode = true; - - return setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &mode, sizeof(mode)); -} - -static int v6only_false(int fd, void *opts) -{ - int mode = false; - - return setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &mode, sizeof(mode)); -} - -int main(int argc, char **argv) -{ - struct network_helper_opts opts = { 0 }; - int server = -1; - int server_v6 = -1; - int server_dual = -1; - int results = -1; - int err = 0; - bool xdp; - - if (argc < 2) { - fprintf(stderr, "Usage: %s prog_id\n", argv[0]); - exit(1); - } - - /* Use libbpf 1.0 API mode */ - libbpf_set_strict_mode(LIBBPF_STRICT_ALL); - - results = get_map_fd_by_prog_id(atoi(argv[1]), &xdp); - if (results < 0) { - log_err("Can't get map"); - goto err; - } - - server = start_server_str(AF_INET, SOCK_STREAM, "127.0.0.1", 0, NULL); - if (server == -1) - goto err; - - opts.post_socket_cb = v6only_true; - server_v6 = start_server_str(AF_INET6, SOCK_STREAM, "::1", 0, &opts); - if (server_v6 == -1) - goto err; - - opts.post_socket_cb = v6only_false; - server_dual = start_server_str(AF_INET6, SOCK_STREAM, "::0", 0, &opts); - if (server_dual == -1) - goto err; - - if (run_test(server, results, xdp)) - goto err; - - if (run_test(server_v6, results, xdp)) - goto err; - - if (run_test(server_dual, results, xdp)) - goto err; - - printf("ok\n"); - goto out; -err: - err = 1; -out: - close(server); - close(server_v6); - close(server_dual); - close(results); - return err; -} diff --git a/tools/testing/selftests/drivers/net/Makefile b/tools/testing/selftests/drivers/net/Makefile index 39fb97a8c1df..0fec8f9801ad 100644 --- a/tools/testing/selftests/drivers/net/Makefile +++ b/tools/testing/selftests/drivers/net/Makefile @@ -9,6 +9,7 @@ TEST_PROGS := \ ping.py \ queues.py \ stats.py \ + shaper.py \ # end of TEST_PROGS include ../../lib.mk diff --git a/tools/testing/selftests/drivers/net/hw/.gitignore b/tools/testing/selftests/drivers/net/hw/.gitignore new file mode 100644 index 000000000000..e9fe6ede681a --- /dev/null +++ b/tools/testing/selftests/drivers/net/hw/.gitignore @@ -0,0 +1 @@ +ncdevmem diff --git a/tools/testing/selftests/drivers/net/hw/Makefile b/tools/testing/selftests/drivers/net/hw/Makefile index c9f2f48fc30f..21ba64ce1e34 100644 --- a/tools/testing/selftests/drivers/net/hw/Makefile +++ b/tools/testing/selftests/drivers/net/hw/Makefile @@ -3,6 +3,7 @@ TEST_PROGS = \ csum.py \ devlink_port_split.py \ + devmem.py \ ethtool.sh \ ethtool_extended_state.sh \ ethtool_mm.sh \ @@ -10,6 +11,8 @@ TEST_PROGS = \ hw_stats_l3.sh \ hw_stats_l3_gre.sh \ loopback.sh \ + nic_link_layer.py \ + nic_performance.py \ pp_alloc_fail.py \ rss_ctx.py \ # @@ -26,4 +29,12 @@ TEST_INCLUDES := \ ../../../net/forwarding/tc_common.sh \ # +# YNL files, must be before "include ..lib.mk" +YNL_GEN_FILES := ncdevmem +TEST_GEN_FILES += $(YNL_GEN_FILES) + include ../../../lib.mk + +# YNL build +YNL_GENS := ethtool netdev +include ../../../net/ynl.mk diff --git a/tools/testing/selftests/drivers/net/hw/devmem.py b/tools/testing/selftests/drivers/net/hw/devmem.py new file mode 100755 index 000000000000..1223f0f5c10c --- /dev/null +++ b/tools/testing/selftests/drivers/net/hw/devmem.py @@ -0,0 +1,45 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: GPL-2.0 + +from lib.py import ksft_run, ksft_exit +from lib.py import ksft_eq, KsftSkipEx +from lib.py import NetDrvEpEnv +from lib.py import bkg, cmd, rand_port, wait_port_listen +from lib.py import ksft_disruptive + + +def require_devmem(cfg): + if not hasattr(cfg, "_devmem_probed"): + port = rand_port() + probe_command = f"./ncdevmem -f {cfg.ifname}" + cfg._devmem_supported = cmd(probe_command, fail=False, shell=True).ret == 0 + cfg._devmem_probed = True + + if not cfg._devmem_supported: + raise KsftSkipEx("Test requires devmem support") + + +@ksft_disruptive +def check_rx(cfg) -> None: + cfg.require_v6() + require_devmem(cfg) + + port = rand_port() + listen_cmd = f"./ncdevmem -l -f {cfg.ifname} -s {cfg.v6} -p {port}" + + with bkg(listen_cmd) as socat: + wait_port_listen(port) + cmd(f"echo -e \"hello\\nworld\"| socat -u - TCP6:[{cfg.v6}]:{port}", host=cfg.remote, shell=True) + + ksft_eq(socat.stdout.strip(), "hello\nworld") + + +def main() -> None: + with NetDrvEpEnv(__file__) as cfg: + ksft_run([check_rx], + args=(cfg, )) + ksft_exit() + + +if __name__ == "__main__": + main() diff --git a/tools/testing/selftests/drivers/net/hw/lib/py/__init__.py b/tools/testing/selftests/drivers/net/hw/lib/py/__init__.py index b582885786f5..399789a9676a 100644 --- a/tools/testing/selftests/drivers/net/hw/lib/py/__init__.py +++ b/tools/testing/selftests/drivers/net/hw/lib/py/__init__.py @@ -9,6 +9,7 @@ try: sys.path.append(KSFT_DIR.as_posix()) from net.lib.py import * from drivers.net.lib.py import * + from .linkconfig import LinkConfig except ModuleNotFoundError as e: ksft_pr("Failed importing `net` library from kernel sources") ksft_pr(str(e)) diff --git a/tools/testing/selftests/drivers/net/hw/lib/py/linkconfig.py b/tools/testing/selftests/drivers/net/hw/lib/py/linkconfig.py new file mode 100644 index 000000000000..db84000fc75b --- /dev/null +++ b/tools/testing/selftests/drivers/net/hw/lib/py/linkconfig.py @@ -0,0 +1,222 @@ +# SPDX-License-Identifier: GPL-2.0 + +from lib.py import cmd, ethtool, ip +from lib.py import ksft_pr, ksft_eq, KsftSkipEx +from typing import Optional +import re +import time +import json + +#The LinkConfig class is implemented to handle the link layer configurations. +#Required minimum ethtool version is 6.10 + +class LinkConfig: + """Class for handling the link layer configurations""" + def __init__(self, cfg: object) -> None: + self.cfg = cfg + self.partner_netif = self.get_partner_netif_name() + + """Get the initial link configuration of local interface""" + self.common_link_modes = self.get_common_link_modes() + + def get_partner_netif_name(self) -> Optional[str]: + partner_netif = None + try: + if not self.verify_link_up(): + return None + """Get partner interface name""" + partner_json_output = ip("addr show", json=True, host=self.cfg.remote) + for interface in partner_json_output: + for addr in interface.get('addr_info', []): + if addr.get('local') == self.cfg.remote_addr: + partner_netif = interface['ifname'] + ksft_pr(f"Partner Interface name: {partner_netif}") + if partner_netif is None: + ksft_pr("Unable to get the partner interface name") + except Exception as e: + print(f"Unexpected error occurred while getting partner interface name: {e}") + self.partner_netif = partner_netif + return partner_netif + + def verify_link_up(self) -> bool: + """Verify whether the local interface link is up""" + with open(f"/sys/class/net/{self.cfg.ifname}/operstate", "r") as fp: + link_state = fp.read().strip() + + if link_state == "down": + ksft_pr(f"Link state of interface {self.cfg.ifname} is DOWN") + return False + else: + return True + + def reset_interface(self, local: bool = True, remote: bool = True) -> bool: + ksft_pr("Resetting interfaces in local and remote") + if remote: + if self.verify_link_up(): + if self.partner_netif is not None: + ifname = self.partner_netif + link_up_cmd = f"ip link set up {ifname}" + link_down_cmd = f"ip link set down {ifname}" + reset_cmd = f"{link_down_cmd} && sleep 5 && {link_up_cmd}" + try: + cmd(reset_cmd, host=self.cfg.remote) + except Exception as e: + ksft_pr(f"Unexpected error occurred while resetting remote: {e}") + else: + ksft_pr("Partner interface not available") + if local: + ifname = self.cfg.ifname + link_up_cmd = f"ip link set up {ifname}" + link_down_cmd = f"ip link set down {ifname}" + reset_cmd = f"{link_down_cmd} && sleep 5 && {link_up_cmd}" + try: + cmd(reset_cmd) + except Exception as e: + ksft_pr(f"Unexpected error occurred while resetting local: {e}") + time.sleep(10) + if self.verify_link_up() and self.get_ethtool_field("link-detected"): + ksft_pr("Local and remote interfaces reset to original state") + return True + else: + ksft_pr("Error occurred after resetting interfaces. Link is DOWN.") + return False + + def set_speed_and_duplex(self, speed: str, duplex: str, autoneg: bool = True) -> bool: + """Set the speed and duplex state for the interface""" + autoneg_state = "on" if autoneg is True else "off" + process = None + try: + process = ethtool(f"--change {self.cfg.ifname} speed {speed} duplex {duplex} autoneg {autoneg_state}") + except Exception as e: + ksft_pr(f"Unexpected error occurred while setting speed/duplex: {e}") + if process is None or process.ret != 0: + return False + else: + ksft_pr(f"Speed: {speed} Mbps, Duplex: {duplex} set for Interface: {self.cfg.ifname}") + return True + + def verify_speed_and_duplex(self, expected_speed: str, expected_duplex: str) -> bool: + if not self.verify_link_up(): + return False + """Verifying the speed and duplex state for the interface""" + with open(f"/sys/class/net/{self.cfg.ifname}/speed", "r") as fp: + actual_speed = fp.read().strip() + with open(f"/sys/class/net/{self.cfg.ifname}/duplex", "r") as fp: + actual_duplex = fp.read().strip() + + ksft_eq(actual_speed, expected_speed) + ksft_eq(actual_duplex, expected_duplex) + return True + + def set_autonegotiation_state(self, state: str, remote: bool = False) -> bool: + common_link_modes = self.common_link_modes + speeds, duplex_modes = self.get_speed_duplex_values(self.common_link_modes) + speed = speeds[0] + duplex = duplex_modes[0] + if not speed or not duplex: + ksft_pr("No speed or duplex modes found") + return False + + speed_duplex_cmd = f"speed {speed} duplex {duplex}" if state == "off" else "" + if remote: + if not self.verify_link_up(): + return False + """Set the autonegotiation state for the partner""" + command = f"-s {self.partner_netif} {speed_duplex_cmd} autoneg {state}" + partner_autoneg_change = None + """Set autonegotiation state for interface in remote pc""" + try: + partner_autoneg_change = ethtool(command, host=self.cfg.remote) + except Exception as e: + ksft_pr(f"Unexpected error occurred while changing auto-neg in remote: {e}") + if partner_autoneg_change is None or partner_autoneg_change.ret != 0: + ksft_pr(f"Not able to set autoneg parameter for interface {self.partner_netif}.") + return False + ksft_pr(f"Autoneg set as {state} for {self.partner_netif}") + else: + """Set the autonegotiation state for the interface""" + try: + process = ethtool(f"-s {self.cfg.ifname} {speed_duplex_cmd} autoneg {state}") + if process.ret != 0: + ksft_pr(f"Not able to set autoneg parameter for interface {self.cfg.ifname}") + return False + except Exception as e: + ksft_pr(f"Unexpected error occurred while changing auto-neg in local: {e}") + return False + ksft_pr(f"Autoneg set as {state} for {self.cfg.ifname}") + return True + + def check_autoneg_supported(self, remote: bool = False) -> bool: + if not remote: + local_autoneg = self.get_ethtool_field("supports-auto-negotiation") + if local_autoneg is None: + ksft_pr(f"Unable to fetch auto-negotiation status for interface {self.cfg.ifname}") + """Return autoneg status of the local interface""" + return local_autoneg + else: + if not self.verify_link_up(): + raise KsftSkipEx("Link is DOWN") + """Check remote auto-negotiation support status""" + partner_autoneg = False + if self.partner_netif is not None: + partner_autoneg = self.get_ethtool_field("supports-auto-negotiation", remote=True) + if partner_autoneg is None: + ksft_pr(f"Unable to fetch auto-negotiation status for interface {self.partner_netif}") + return partner_autoneg + + def get_common_link_modes(self) -> set[str]: + common_link_modes = [] + """Populate common link modes""" + link_modes = self.get_ethtool_field("supported-link-modes") + partner_link_modes = self.get_ethtool_field("link-partner-advertised-link-modes") + if link_modes is None: + raise KsftSkipEx(f"Link modes not available for {self.cfg.ifname}") + if partner_link_modes is None: + raise KsftSkipEx(f"Partner link modes not available for {self.cfg.ifname}") + common_link_modes = set(link_modes) and set(partner_link_modes) + return common_link_modes + + def get_speed_duplex_values(self, link_modes: list[str]) -> tuple[list[str], list[str]]: + speed = [] + duplex = [] + """Check the link modes""" + for data in link_modes: + parts = data.split('/') + speed_value = re.match(r'\d+', parts[0]) + if speed_value: + speed.append(speed_value.group()) + else: + ksft_pr(f"No speed value found for interface {self.ifname}") + return None, None + duplex.append(parts[1].lower()) + return speed, duplex + + def get_ethtool_field(self, field: str, remote: bool = False) -> Optional[str]: + process = None + if not remote: + """Get the ethtool field value for the local interface""" + try: + process = ethtool(self.cfg.ifname, json=True) + except Exception as e: + ksft_pr("Required minimum ethtool version is 6.10") + ksft_pr(f"Unexpected error occurred while getting ethtool field in local: {e}") + return None + else: + if not self.verify_link_up(): + return None + """Get the ethtool field value for the remote interface""" + self.cfg.require_cmd("ethtool", remote=True) + if self.partner_netif is None: + ksft_pr(f"Partner interface name is unavailable.") + return None + try: + process = ethtool(self.partner_netif, json=True, host=self.cfg.remote) + except Exception as e: + ksft_pr("Required minimum ethtool version is 6.10") + ksft_pr(f"Unexpected error occurred while getting ethtool field in remote: {e}") + return None + json_data = process[0] + """Check if the field exist in the json data""" + if field not in json_data: + raise KsftSkipEx(f"Field {field} does not exist in the output of interface {json_data["ifname"]}") + return json_data[field] diff --git a/tools/testing/selftests/drivers/net/hw/ncdevmem.c b/tools/testing/selftests/drivers/net/hw/ncdevmem.c new file mode 100644 index 000000000000..8e502a1f8f9b --- /dev/null +++ b/tools/testing/selftests/drivers/net/hw/ncdevmem.c @@ -0,0 +1,789 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * tcpdevmem netcat. Works similarly to netcat but does device memory TCP + * instead of regular TCP. Uses udmabuf to mock a dmabuf provider. + * + * Usage: + * + * On server: + * ncdevmem -s <server IP> [-c <client IP>] -f eth1 -l -p 5201 + * + * On client: + * echo -n "hello\nworld" | nc -s <server IP> 5201 -p 5201 + * + * Test data validation: + * + * On server: + * ncdevmem -s <server IP> [-c <client IP>] -f eth1 -l -p 5201 -v 7 + * + * On client: + * yes $(echo -e \\x01\\x02\\x03\\x04\\x05\\x06) | \ + * tr \\n \\0 | \ + * head -c 5G | \ + * nc <server IP> 5201 -p 5201 + * + * + * Note this is compatible with regular netcat. i.e. the sender or receiver can + * be replaced with regular netcat to test the RX or TX path in isolation. + */ +#define _GNU_SOURCE +#define __EXPORTED_HEADERS__ + +#include <linux/uio.h> +#include <stdio.h> +#include <stdlib.h> +#include <unistd.h> +#include <stdbool.h> +#include <string.h> +#include <errno.h> +#define __iovec_defined +#include <fcntl.h> +#include <malloc.h> +#include <error.h> + +#include <arpa/inet.h> +#include <sys/socket.h> +#include <sys/mman.h> +#include <sys/ioctl.h> +#include <sys/syscall.h> + +#include <linux/memfd.h> +#include <linux/dma-buf.h> +#include <linux/udmabuf.h> +#include <libmnl/libmnl.h> +#include <linux/types.h> +#include <linux/netlink.h> +#include <linux/genetlink.h> +#include <linux/netdev.h> +#include <linux/ethtool_netlink.h> +#include <time.h> +#include <net/if.h> + +#include "netdev-user.h" +#include "ethtool-user.h" +#include <ynl.h> + +#define PAGE_SHIFT 12 +#define TEST_PREFIX "ncdevmem" +#define NUM_PAGES 16000 + +#ifndef MSG_SOCK_DEVMEM +#define MSG_SOCK_DEVMEM 0x2000000 +#endif + +static char *server_ip; +static char *client_ip; +static char *port; +static size_t do_validation; +static int start_queue = -1; +static int num_queues = -1; +static char *ifname; +static unsigned int ifindex; +static unsigned int dmabuf_id; + +struct memory_buffer { + int fd; + size_t size; + + int devfd; + int memfd; + char *buf_mem; +}; + +struct memory_provider { + struct memory_buffer *(*alloc)(size_t size); + void (*free)(struct memory_buffer *ctx); + void (*memcpy_from_device)(void *dst, struct memory_buffer *src, + size_t off, int n); +}; + +static struct memory_buffer *udmabuf_alloc(size_t size) +{ + struct udmabuf_create create; + struct memory_buffer *ctx; + int ret; + + ctx = malloc(sizeof(*ctx)); + if (!ctx) + error(1, ENOMEM, "malloc failed"); + + ctx->size = size; + + ctx->devfd = open("/dev/udmabuf", O_RDWR); + if (ctx->devfd < 0) + error(1, errno, + "%s: [skip,no-udmabuf: Unable to access DMA buffer device file]\n", + TEST_PREFIX); + + ctx->memfd = memfd_create("udmabuf-test", MFD_ALLOW_SEALING); + if (ctx->memfd < 0) + error(1, errno, "%s: [skip,no-memfd]\n", TEST_PREFIX); + + ret = fcntl(ctx->memfd, F_ADD_SEALS, F_SEAL_SHRINK); + if (ret < 0) + error(1, errno, "%s: [skip,fcntl-add-seals]\n", TEST_PREFIX); + + ret = ftruncate(ctx->memfd, size); + if (ret == -1) + error(1, errno, "%s: [FAIL,memfd-truncate]\n", TEST_PREFIX); + + memset(&create, 0, sizeof(create)); + + create.memfd = ctx->memfd; + create.offset = 0; + create.size = size; + ctx->fd = ioctl(ctx->devfd, UDMABUF_CREATE, &create); + if (ctx->fd < 0) + error(1, errno, "%s: [FAIL, create udmabuf]\n", TEST_PREFIX); + + ctx->buf_mem = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, + ctx->fd, 0); + if (ctx->buf_mem == MAP_FAILED) + error(1, errno, "%s: [FAIL, map udmabuf]\n", TEST_PREFIX); + + return ctx; +} + +static void udmabuf_free(struct memory_buffer *ctx) +{ + munmap(ctx->buf_mem, ctx->size); + close(ctx->fd); + close(ctx->memfd); + close(ctx->devfd); + free(ctx); +} + +static void udmabuf_memcpy_from_device(void *dst, struct memory_buffer *src, + size_t off, int n) +{ + struct dma_buf_sync sync = {}; + + sync.flags = DMA_BUF_SYNC_START; + ioctl(src->fd, DMA_BUF_IOCTL_SYNC, &sync); + + memcpy(dst, src->buf_mem + off, n); + + sync.flags = DMA_BUF_SYNC_END; + ioctl(src->fd, DMA_BUF_IOCTL_SYNC, &sync); +} + +static struct memory_provider udmabuf_memory_provider = { + .alloc = udmabuf_alloc, + .free = udmabuf_free, + .memcpy_from_device = udmabuf_memcpy_from_device, +}; + +static struct memory_provider *provider = &udmabuf_memory_provider; + +static void print_nonzero_bytes(void *ptr, size_t size) +{ + unsigned char *p = ptr; + unsigned int i; + + for (i = 0; i < size; i++) + putchar(p[i]); +} + +void validate_buffer(void *line, size_t size) +{ + static unsigned char seed = 1; + unsigned char *ptr = line; + int errors = 0; + size_t i; + + for (i = 0; i < size; i++) { + if (ptr[i] != seed) { + fprintf(stderr, + "Failed validation: expected=%u, actual=%u, index=%lu\n", + seed, ptr[i], i); + errors++; + if (errors > 20) + error(1, 0, "validation failed."); + } + seed++; + if (seed == do_validation) + seed = 0; + } + + fprintf(stdout, "Validated buffer\n"); +} + +static int rxq_num(int ifindex) +{ + struct ethtool_channels_get_req *req; + struct ethtool_channels_get_rsp *rsp; + struct ynl_error yerr; + struct ynl_sock *ys; + int num = -1; + + ys = ynl_sock_create(&ynl_ethtool_family, &yerr); + if (!ys) { + fprintf(stderr, "YNL: %s\n", yerr.msg); + return -1; + } + + req = ethtool_channels_get_req_alloc(); + ethtool_channels_get_req_set_header_dev_index(req, ifindex); + rsp = ethtool_channels_get(ys, req); + if (rsp) + num = rsp->rx_count + rsp->combined_count; + ethtool_channels_get_req_free(req); + ethtool_channels_get_rsp_free(rsp); + + ynl_sock_destroy(ys); + + return num; +} + +#define run_command(cmd, ...) \ + ({ \ + char command[256]; \ + memset(command, 0, sizeof(command)); \ + snprintf(command, sizeof(command), cmd, ##__VA_ARGS__); \ + fprintf(stderr, "Running: %s\n", command); \ + system(command); \ + }) + +static int reset_flow_steering(void) +{ + /* Depending on the NIC, toggling ntuple off and on might not + * be allowed. Additionally, attempting to delete existing filters + * will fail if no filters are present. Therefore, do not enforce + * the exit status. + */ + + run_command("sudo ethtool -K %s ntuple off >&2", ifname); + run_command("sudo ethtool -K %s ntuple on >&2", ifname); + run_command( + "sudo ethtool -n %s | grep 'Filter:' | awk '{print $2}' | xargs -n1 ethtool -N %s delete >&2", + ifname, ifname); + return 0; +} + +static const char *tcp_data_split_str(int val) +{ + switch (val) { + case 0: + return "off"; + case 1: + return "auto"; + case 2: + return "on"; + default: + return "?"; + } +} + +static int configure_headersplit(bool on) +{ + struct ethtool_rings_get_req *get_req; + struct ethtool_rings_get_rsp *get_rsp; + struct ethtool_rings_set_req *req; + struct ynl_error yerr; + struct ynl_sock *ys; + int ret; + + ys = ynl_sock_create(&ynl_ethtool_family, &yerr); + if (!ys) { + fprintf(stderr, "YNL: %s\n", yerr.msg); + return -1; + } + + req = ethtool_rings_set_req_alloc(); + ethtool_rings_set_req_set_header_dev_index(req, ifindex); + /* 0 - off, 1 - auto, 2 - on */ + ethtool_rings_set_req_set_tcp_data_split(req, on ? 2 : 0); + ret = ethtool_rings_set(ys, req); + if (ret < 0) + fprintf(stderr, "YNL failed: %s\n", ys->err.msg); + ethtool_rings_set_req_free(req); + + if (ret == 0) { + get_req = ethtool_rings_get_req_alloc(); + ethtool_rings_get_req_set_header_dev_index(get_req, ifindex); + get_rsp = ethtool_rings_get(ys, get_req); + ethtool_rings_get_req_free(get_req); + if (get_rsp) + fprintf(stderr, "TCP header split: %s\n", + tcp_data_split_str(get_rsp->tcp_data_split)); + ethtool_rings_get_rsp_free(get_rsp); + } + + ynl_sock_destroy(ys); + + return ret; +} + +static int configure_rss(void) +{ + return run_command("sudo ethtool -X %s equal %d >&2", ifname, start_queue); +} + +static int configure_channels(unsigned int rx, unsigned int tx) +{ + return run_command("sudo ethtool -L %s rx %u tx %u", ifname, rx, tx); +} + +static int configure_flow_steering(struct sockaddr_in6 *server_sin) +{ + const char *type = "tcp6"; + const char *server_addr; + char buf[40]; + + inet_ntop(AF_INET6, &server_sin->sin6_addr, buf, sizeof(buf)); + server_addr = buf; + + if (IN6_IS_ADDR_V4MAPPED(&server_sin->sin6_addr)) { + type = "tcp4"; + server_addr = strrchr(server_addr, ':') + 1; + } + + return run_command("sudo ethtool -N %s flow-type %s %s %s dst-ip %s %s %s dst-port %s queue %d >&2", + ifname, + type, + client_ip ? "src-ip" : "", + client_ip ?: "", + server_addr, + client_ip ? "src-port" : "", + client_ip ? port : "", + port, start_queue); +} + +static int bind_rx_queue(unsigned int ifindex, unsigned int dmabuf_fd, + struct netdev_queue_id *queues, + unsigned int n_queue_index, struct ynl_sock **ys) +{ + struct netdev_bind_rx_req *req = NULL; + struct netdev_bind_rx_rsp *rsp = NULL; + struct ynl_error yerr; + + *ys = ynl_sock_create(&ynl_netdev_family, &yerr); + if (!*ys) { + fprintf(stderr, "YNL: %s\n", yerr.msg); + return -1; + } + + req = netdev_bind_rx_req_alloc(); + netdev_bind_rx_req_set_ifindex(req, ifindex); + netdev_bind_rx_req_set_fd(req, dmabuf_fd); + __netdev_bind_rx_req_set_queues(req, queues, n_queue_index); + + rsp = netdev_bind_rx(*ys, req); + if (!rsp) { + perror("netdev_bind_rx"); + goto err_close; + } + + if (!rsp->_present.id) { + perror("id not present"); + goto err_close; + } + + fprintf(stderr, "got dmabuf id=%d\n", rsp->id); + dmabuf_id = rsp->id; + + netdev_bind_rx_req_free(req); + netdev_bind_rx_rsp_free(rsp); + + return 0; + +err_close: + fprintf(stderr, "YNL failed: %s\n", (*ys)->err.msg); + netdev_bind_rx_req_free(req); + ynl_sock_destroy(*ys); + return -1; +} + +static void enable_reuseaddr(int fd) +{ + int opt = 1; + int ret; + + ret = setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &opt, sizeof(opt)); + if (ret) + error(1, errno, "%s: [FAIL, SO_REUSEPORT]\n", TEST_PREFIX); + + ret = setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &opt, sizeof(opt)); + if (ret) + error(1, errno, "%s: [FAIL, SO_REUSEADDR]\n", TEST_PREFIX); +} + +static int parse_address(const char *str, int port, struct sockaddr_in6 *sin6) +{ + int ret; + + sin6->sin6_family = AF_INET6; + sin6->sin6_port = htons(port); + + ret = inet_pton(sin6->sin6_family, str, &sin6->sin6_addr); + if (ret != 1) { + /* fallback to plain IPv4 */ + ret = inet_pton(AF_INET, str, &sin6->sin6_addr.s6_addr32[3]); + if (ret != 1) + return -1; + + /* add ::ffff prefix */ + sin6->sin6_addr.s6_addr32[0] = 0; + sin6->sin6_addr.s6_addr32[1] = 0; + sin6->sin6_addr.s6_addr16[4] = 0; + sin6->sin6_addr.s6_addr16[5] = 0xffff; + } + + return 0; +} + +int do_server(struct memory_buffer *mem) +{ + char ctrl_data[sizeof(int) * 20000]; + struct netdev_queue_id *queues; + size_t non_page_aligned_frags = 0; + struct sockaddr_in6 client_addr; + struct sockaddr_in6 server_sin; + size_t page_aligned_frags = 0; + size_t total_received = 0; + socklen_t client_addr_len; + bool is_devmem = false; + char *tmp_mem = NULL; + struct ynl_sock *ys; + char iobuf[819200]; + char buffer[256]; + int socket_fd; + int client_fd; + size_t i = 0; + int ret; + + ret = parse_address(server_ip, atoi(port), &server_sin); + if (ret < 0) + error(1, 0, "parse server address"); + + if (reset_flow_steering()) + error(1, 0, "Failed to reset flow steering\n"); + + if (configure_headersplit(1)) + error(1, 0, "Failed to enable TCP header split\n"); + + /* Configure RSS to divert all traffic from our devmem queues */ + if (configure_rss()) + error(1, 0, "Failed to configure rss\n"); + + /* Flow steer our devmem flows to start_queue */ + if (configure_flow_steering(&server_sin)) + error(1, 0, "Failed to configure flow steering\n"); + + sleep(1); + + queues = malloc(sizeof(*queues) * num_queues); + + for (i = 0; i < num_queues; i++) { + queues[i]._present.type = 1; + queues[i]._present.id = 1; + queues[i].type = NETDEV_QUEUE_TYPE_RX; + queues[i].id = start_queue + i; + } + + if (bind_rx_queue(ifindex, mem->fd, queues, num_queues, &ys)) + error(1, 0, "Failed to bind\n"); + + tmp_mem = malloc(mem->size); + if (!tmp_mem) + error(1, ENOMEM, "malloc failed"); + + socket_fd = socket(AF_INET6, SOCK_STREAM, 0); + if (socket_fd < 0) + error(1, errno, "%s: [FAIL, create socket]\n", TEST_PREFIX); + + enable_reuseaddr(socket_fd); + + fprintf(stderr, "binding to address %s:%d\n", server_ip, + ntohs(server_sin.sin6_port)); + + ret = bind(socket_fd, &server_sin, sizeof(server_sin)); + if (ret) + error(1, errno, "%s: [FAIL, bind]\n", TEST_PREFIX); + + ret = listen(socket_fd, 1); + if (ret) + error(1, errno, "%s: [FAIL, listen]\n", TEST_PREFIX); + + client_addr_len = sizeof(client_addr); + + inet_ntop(AF_INET6, &server_sin.sin6_addr, buffer, + sizeof(buffer)); + fprintf(stderr, "Waiting or connection on %s:%d\n", buffer, + ntohs(server_sin.sin6_port)); + client_fd = accept(socket_fd, &client_addr, &client_addr_len); + + inet_ntop(AF_INET6, &client_addr.sin6_addr, buffer, + sizeof(buffer)); + fprintf(stderr, "Got connection from %s:%d\n", buffer, + ntohs(client_addr.sin6_port)); + + while (1) { + struct iovec iov = { .iov_base = iobuf, + .iov_len = sizeof(iobuf) }; + struct dmabuf_cmsg *dmabuf_cmsg = NULL; + struct cmsghdr *cm = NULL; + struct msghdr msg = { 0 }; + struct dmabuf_token token; + ssize_t ret; + + is_devmem = false; + + msg.msg_iov = &iov; + msg.msg_iovlen = 1; + msg.msg_control = ctrl_data; + msg.msg_controllen = sizeof(ctrl_data); + ret = recvmsg(client_fd, &msg, MSG_SOCK_DEVMEM); + fprintf(stderr, "recvmsg ret=%ld\n", ret); + if (ret < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) + continue; + if (ret < 0) { + perror("recvmsg"); + continue; + } + if (ret == 0) { + fprintf(stderr, "client exited\n"); + goto cleanup; + } + + i++; + for (cm = CMSG_FIRSTHDR(&msg); cm; cm = CMSG_NXTHDR(&msg, cm)) { + if (cm->cmsg_level != SOL_SOCKET || + (cm->cmsg_type != SCM_DEVMEM_DMABUF && + cm->cmsg_type != SCM_DEVMEM_LINEAR)) { + fprintf(stderr, "skipping non-devmem cmsg\n"); + continue; + } + + dmabuf_cmsg = (struct dmabuf_cmsg *)CMSG_DATA(cm); + is_devmem = true; + + if (cm->cmsg_type == SCM_DEVMEM_LINEAR) { + /* TODO: process data copied from skb's linear + * buffer. + */ + fprintf(stderr, + "SCM_DEVMEM_LINEAR. dmabuf_cmsg->frag_size=%u\n", + dmabuf_cmsg->frag_size); + + continue; + } + + token.token_start = dmabuf_cmsg->frag_token; + token.token_count = 1; + + total_received += dmabuf_cmsg->frag_size; + fprintf(stderr, + "received frag_page=%llu, in_page_offset=%llu, frag_offset=%llu, frag_size=%u, token=%u, total_received=%lu, dmabuf_id=%u\n", + dmabuf_cmsg->frag_offset >> PAGE_SHIFT, + dmabuf_cmsg->frag_offset % getpagesize(), + dmabuf_cmsg->frag_offset, + dmabuf_cmsg->frag_size, dmabuf_cmsg->frag_token, + total_received, dmabuf_cmsg->dmabuf_id); + + if (dmabuf_cmsg->dmabuf_id != dmabuf_id) + error(1, 0, + "received on wrong dmabuf_id: flow steering error\n"); + + if (dmabuf_cmsg->frag_size % getpagesize()) + non_page_aligned_frags++; + else + page_aligned_frags++; + + provider->memcpy_from_device(tmp_mem, mem, + dmabuf_cmsg->frag_offset, + dmabuf_cmsg->frag_size); + + if (do_validation) + validate_buffer(tmp_mem, + dmabuf_cmsg->frag_size); + else + print_nonzero_bytes(tmp_mem, + dmabuf_cmsg->frag_size); + + ret = setsockopt(client_fd, SOL_SOCKET, + SO_DEVMEM_DONTNEED, &token, + sizeof(token)); + if (ret != 1) + error(1, 0, + "SO_DEVMEM_DONTNEED not enough tokens"); + } + if (!is_devmem) + error(1, 0, "flow steering error\n"); + + fprintf(stderr, "total_received=%lu\n", total_received); + } + + fprintf(stderr, "%s: ok\n", TEST_PREFIX); + + fprintf(stderr, "page_aligned_frags=%lu, non_page_aligned_frags=%lu\n", + page_aligned_frags, non_page_aligned_frags); + + fprintf(stderr, "page_aligned_frags=%lu, non_page_aligned_frags=%lu\n", + page_aligned_frags, non_page_aligned_frags); + +cleanup: + + free(tmp_mem); + close(client_fd); + close(socket_fd); + ynl_sock_destroy(ys); + + return 0; +} + +void run_devmem_tests(void) +{ + struct netdev_queue_id *queues; + struct memory_buffer *mem; + struct ynl_sock *ys; + size_t i = 0; + + mem = provider->alloc(getpagesize() * NUM_PAGES); + + /* Configure RSS to divert all traffic from our devmem queues */ + if (configure_rss()) + error(1, 0, "rss error\n"); + + queues = calloc(num_queues, sizeof(*queues)); + + if (configure_headersplit(1)) + error(1, 0, "Failed to configure header split\n"); + + if (!bind_rx_queue(ifindex, mem->fd, queues, num_queues, &ys)) + error(1, 0, "Binding empty queues array should have failed\n"); + + for (i = 0; i < num_queues; i++) { + queues[i]._present.type = 1; + queues[i]._present.id = 1; + queues[i].type = NETDEV_QUEUE_TYPE_RX; + queues[i].id = start_queue + i; + } + + if (configure_headersplit(0)) + error(1, 0, "Failed to configure header split\n"); + + if (!bind_rx_queue(ifindex, mem->fd, queues, num_queues, &ys)) + error(1, 0, "Configure dmabuf with header split off should have failed\n"); + + if (configure_headersplit(1)) + error(1, 0, "Failed to configure header split\n"); + + for (i = 0; i < num_queues; i++) { + queues[i]._present.type = 1; + queues[i]._present.id = 1; + queues[i].type = NETDEV_QUEUE_TYPE_RX; + queues[i].id = start_queue + i; + } + + if (bind_rx_queue(ifindex, mem->fd, queues, num_queues, &ys)) + error(1, 0, "Failed to bind\n"); + + /* Deactivating a bound queue should not be legal */ + if (!configure_channels(num_queues, num_queues - 1)) + error(1, 0, "Deactivating a bound queue should be illegal.\n"); + + /* Closing the netlink socket does an implicit unbind */ + ynl_sock_destroy(ys); + + provider->free(mem); +} + +int main(int argc, char *argv[]) +{ + struct memory_buffer *mem; + int is_server = 0, opt; + int ret; + + while ((opt = getopt(argc, argv, "ls:c:p:v:q:t:f:")) != -1) { + switch (opt) { + case 'l': + is_server = 1; + break; + case 's': + server_ip = optarg; + break; + case 'c': + client_ip = optarg; + break; + case 'p': + port = optarg; + break; + case 'v': + do_validation = atoll(optarg); + break; + case 'q': + num_queues = atoi(optarg); + break; + case 't': + start_queue = atoi(optarg); + break; + case 'f': + ifname = optarg; + break; + case '?': + fprintf(stderr, "unknown option: %c\n", optopt); + break; + } + } + + if (!ifname) + error(1, 0, "Missing -f argument\n"); + + ifindex = if_nametoindex(ifname); + + if (!server_ip && !client_ip) { + if (start_queue < 0 && num_queues < 0) { + num_queues = rxq_num(ifindex); + if (num_queues < 0) + error(1, 0, "couldn't detect number of queues\n"); + if (num_queues < 2) + error(1, 0, + "number of device queues is too low\n"); + /* make sure can bind to multiple queues */ + start_queue = num_queues / 2; + num_queues /= 2; + } + + if (start_queue < 0 || num_queues < 0) + error(1, 0, "Both -t and -q are required\n"); + + run_devmem_tests(); + return 0; + } + + if (start_queue < 0 && num_queues < 0) { + num_queues = rxq_num(ifindex); + if (num_queues < 2) + error(1, 0, "number of device queues is too low\n"); + + num_queues = 1; + start_queue = rxq_num(ifindex) - num_queues; + + if (start_queue < 0) + error(1, 0, "couldn't detect number of queues\n"); + + fprintf(stderr, "using queues %d..%d\n", start_queue, start_queue + num_queues); + } + + for (; optind < argc; optind++) + fprintf(stderr, "extra arguments: %s\n", argv[optind]); + + if (start_queue < 0) + error(1, 0, "Missing -t argument\n"); + + if (num_queues < 0) + error(1, 0, "Missing -q argument\n"); + + if (!server_ip) + error(1, 0, "Missing -s argument\n"); + + if (!port) + error(1, 0, "Missing -p argument\n"); + + mem = provider->alloc(getpagesize() * NUM_PAGES); + ret = is_server ? do_server(mem) : 1; + provider->free(mem); + + return ret; +} diff --git a/tools/testing/selftests/drivers/net/hw/nic_link_layer.py b/tools/testing/selftests/drivers/net/hw/nic_link_layer.py new file mode 100644 index 000000000000..efd921180532 --- /dev/null +++ b/tools/testing/selftests/drivers/net/hw/nic_link_layer.py @@ -0,0 +1,113 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: GPL-2.0 + +#Introduction: +#This file has basic link layer tests for generic NIC drivers. +#The test comprises of auto-negotiation, speed and duplex checks. +# +#Setup: +#Connect the DUT PC with NIC card to partner pc back via ethernet medium of your choice(RJ45, T1) +# +# DUT PC Partner PC +#┌───────────────────────┐ ┌──────────────────────────┐ +#│ │ │ │ +#│ │ │ │ +#│ ┌───────────┐ │ │ +#│ │DUT NIC │ Eth │ │ +#│ │Interface ─┼─────────────────────────┼─ any eth Interface │ +#│ └───────────┘ │ │ +#│ │ │ │ +#│ │ │ │ +#└───────────────────────┘ └──────────────────────────┘ +# +#Configurations: +#Required minimum ethtool version is 6.10 (supports json) +#Default values: +#time_delay = 8 #time taken to wait for transitions to happen, in seconds. + +import time +import argparse +from lib.py import ksft_run, ksft_exit, ksft_pr, ksft_eq +from lib.py import KsftFailEx, KsftSkipEx +from lib.py import NetDrvEpEnv +from lib.py import LinkConfig + +def _pre_test_checks(cfg: object, link_config: LinkConfig) -> None: + if link_config.partner_netif is None: + KsftSkipEx("Partner interface is not available") + if not link_config.check_autoneg_supported() or not link_config.check_autoneg_supported(remote=True): + KsftSkipEx(f"Auto-negotiation not supported for interface {cfg.ifname} or {link_config.partner_netif}") + if not link_config.verify_link_up(): + raise KsftSkipEx(f"Link state of interface {cfg.ifname} is DOWN") + +def verify_autonegotiation(cfg: object, expected_state: str, link_config: LinkConfig) -> None: + if not link_config.verify_link_up(): + raise KsftSkipEx(f"Link state of interface {cfg.ifname} is DOWN") + """Verifying the autonegotiation state in partner""" + partner_autoneg_output = link_config.get_ethtool_field("auto-negotiation", remote=True) + if partner_autoneg_output is None: + KsftSkipEx(f"Auto-negotiation state not available for interface {link_config.partner_netif}") + partner_autoneg_state = "on" if partner_autoneg_output is True else "off" + + ksft_eq(partner_autoneg_state, expected_state) + + """Verifying the autonegotiation state of local""" + autoneg_output = link_config.get_ethtool_field("auto-negotiation") + if autoneg_output is None: + KsftSkipEx(f"Auto-negotiation state not available for interface {cfg.ifname}") + actual_state = "on" if autoneg_output is True else "off" + + ksft_eq(actual_state, expected_state) + + """Verifying the link establishment""" + link_available = link_config.get_ethtool_field("link-detected") + if link_available is None: + KsftSkipEx(f"Link status not available for interface {cfg.ifname}") + if link_available != True: + raise KsftSkipEx("Link not established at interface {cfg.ifname} after changing auto-negotiation") + +def test_autonegotiation(cfg: object, link_config: LinkConfig, time_delay: int) -> None: + _pre_test_checks(cfg, link_config) + for state in ["off", "on"]: + if not link_config.set_autonegotiation_state(state, remote=True): + raise KsftSkipEx(f"Unable to set auto-negotiation state for interface {link_config.partner_netif}") + if not link_config.set_autonegotiation_state(state): + raise KsftSkipEx(f"Unable to set auto-negotiation state for interface {cfg.ifname}") + time.sleep(time_delay) + verify_autonegotiation(cfg, state, link_config) + +def test_network_speed(cfg: object, link_config: LinkConfig, time_delay: int) -> None: + _pre_test_checks(cfg, link_config) + common_link_modes = link_config.common_link_modes + if not common_link_modes: + KsftSkipEx("No common link modes exist") + speeds, duplex_modes = link_config.get_speed_duplex_values(common_link_modes) + + if speeds and duplex_modes and len(speeds) == len(duplex_modes): + for idx in range(len(speeds)): + speed = speeds[idx] + duplex = duplex_modes[idx] + if not link_config.set_speed_and_duplex(speed, duplex): + raise KsftFailEx(f"Unable to set speed and duplex parameters for {cfg.ifname}") + time.sleep(time_delay) + if not link_config.verify_speed_and_duplex(speed, duplex): + raise KsftSkipEx(f"Error occurred while verifying speed and duplex states for interface {cfg.ifname}") + else: + if not speeds or not duplex_modes: + KsftSkipEx(f"No supported speeds or duplex modes found for interface {cfg.ifname}") + else: + KsftSkipEx("Mismatch in the number of speeds and duplex modes") + +def main() -> None: + parser = argparse.ArgumentParser(description="Run basic link layer tests for NIC driver") + parser.add_argument('--time-delay', type=int, default=8, help='Time taken to wait for transitions to happen(in seconds). Default is 8 seconds.') + args = parser.parse_args() + time_delay = args.time_delay + with NetDrvEpEnv(__file__, nsim_test=False) as cfg: + link_config = LinkConfig(cfg) + ksft_run(globs=globals(), case_pfx={"test_"}, args=(cfg, link_config, time_delay,)) + link_config.reset_interface() + ksft_exit() + +if __name__ == "__main__": + main() diff --git a/tools/testing/selftests/drivers/net/hw/nic_performance.py b/tools/testing/selftests/drivers/net/hw/nic_performance.py new file mode 100644 index 000000000000..201403b76ea3 --- /dev/null +++ b/tools/testing/selftests/drivers/net/hw/nic_performance.py @@ -0,0 +1,137 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: GPL-2.0 + +#Introduction: +#This file has basic performance test for generic NIC drivers. +#The test comprises of throughput check for TCP and UDP streams. +# +#Setup: +#Connect the DUT PC with NIC card to partner pc back via ethernet medium of your choice(RJ45, T1) +# +# DUT PC Partner PC +#┌───────────────────────┐ ┌──────────────────────────┐ +#│ │ │ │ +#│ │ │ │ +#│ ┌───────────┐ │ │ +#│ │DUT NIC │ Eth │ │ +#│ │Interface ─┼─────────────────────────┼─ any eth Interface │ +#│ └───────────┘ │ │ +#│ │ │ │ +#│ │ │ │ +#└───────────────────────┘ └──────────────────────────┘ +# +#Configurations: +#To prevent interruptions, Add ethtool, ip to the sudoers list in remote PC and get the ssh key from remote. +#Required minimum ethtool version is 6.10 +#Change the below configuration based on your hw needs. +# """Default values""" +#time_delay = 8 #time taken to wait for transitions to happen, in seconds. +#test_duration = 10 #performance test duration for the throughput check, in seconds. +#send_throughput_threshold = 80 #percentage of send throughput required to pass the check +#receive_throughput_threshold = 50 #percentage of receive throughput required to pass the check + +import time +import json +import argparse +from lib.py import ksft_run, ksft_exit, ksft_pr, ksft_true +from lib.py import KsftFailEx, KsftSkipEx, GenerateTraffic +from lib.py import NetDrvEpEnv, bkg, wait_port_listen +from lib.py import cmd +from lib.py import LinkConfig + +class TestConfig: + def __init__(self, time_delay: int, test_duration: int, send_throughput_threshold: int, receive_throughput_threshold: int) -> None: + self.time_delay = time_delay + self.test_duration = test_duration + self.send_throughput_threshold = send_throughput_threshold + self.receive_throughput_threshold = receive_throughput_threshold + +def _pre_test_checks(cfg: object, link_config: LinkConfig) -> None: + if not link_config.verify_link_up(): + KsftSkipEx(f"Link state of interface {cfg.ifname} is DOWN") + common_link_modes = link_config.common_link_modes + if common_link_modes is None: + KsftSkipEx("No common link modes found") + if link_config.partner_netif == None: + KsftSkipEx("Partner interface is not available") + if link_config.check_autoneg_supported(): + KsftSkipEx("Auto-negotiation not supported by local") + if link_config.check_autoneg_supported(remote=True): + KsftSkipEx("Auto-negotiation not supported by remote") + cfg.require_cmd("iperf3", remote=True) + +def check_throughput(cfg: object, link_config: LinkConfig, test_config: TestConfig, protocol: str, traffic: GenerateTraffic) -> None: + common_link_modes = link_config.common_link_modes + speeds, duplex_modes = link_config.get_speed_duplex_values(common_link_modes) + """Test duration in seconds""" + duration = test_config.test_duration + + ksft_pr(f"{protocol} test") + test_type = "-u" if protocol == "UDP" else "" + + send_throughput = [] + receive_throughput = [] + for idx in range(0, len(speeds)): + if link_config.set_speed_and_duplex(speeds[idx], duplex_modes[idx]) == False: + raise KsftFailEx(f"Not able to set speed and duplex parameters for {cfg.ifname}") + time.sleep(test_config.time_delay) + if not link_config.verify_link_up(): + raise KsftSkipEx(f"Link state of interface {cfg.ifname} is DOWN") + + send_command=f"{test_type} -b 0 -t {duration} --json" + receive_command=f"{test_type} -b 0 -t {duration} --reverse --json" + + send_result = traffic.run_remote_test(cfg, command=send_command) + if send_result.ret != 0: + raise KsftSkipEx("Error occurred during data transmit: {send_result.stdout}") + + send_output = send_result.stdout + send_data = json.loads(send_output) + + """Convert throughput to Mbps""" + send_throughput.append(round(send_data['end']['sum_sent']['bits_per_second'] / 1e6, 2)) + ksft_pr(f"{protocol}: Send throughput: {send_throughput[idx]} Mbps") + + receive_result = traffic.run_remote_test(cfg, command=receive_command) + if receive_result.ret != 0: + raise KsftSkipEx("Error occurred during data receive: {receive_result.stdout}") + + receive_output = receive_result.stdout + receive_data = json.loads(receive_output) + + """Convert throughput to Mbps""" + receive_throughput.append(round(receive_data['end']['sum_received']['bits_per_second'] / 1e6, 2)) + ksft_pr(f"{protocol}: Receive throughput: {receive_throughput[idx]} Mbps") + + """Check whether throughput is not below the threshold (default values set at start)""" + for idx in range(0, len(speeds)): + send_threshold = float(speeds[idx]) * float(test_config.send_throughput_threshold / 100) + receive_threshold = float(speeds[idx]) * float(test_config.receive_throughput_threshold / 100) + ksft_true(send_throughput[idx] >= send_threshold, f"{protocol}: Send throughput is below threshold for {speeds[idx]} Mbps in {duplex_modes[idx]} duplex") + ksft_true(receive_throughput[idx] >= receive_threshold, f"{protocol}: Receive throughput is below threshold for {speeds[idx]} Mbps in {duplex_modes[idx]} duplex") + +def test_tcp_throughput(cfg: object, link_config: LinkConfig, test_config: TestConfig, traffic: GenerateTraffic) -> None: + _pre_test_checks(cfg, link_config) + check_throughput(cfg, link_config, test_config, 'TCP', traffic) + +def test_udp_throughput(cfg: object, link_config: LinkConfig, test_config: TestConfig, traffic: GenerateTraffic) -> None: + _pre_test_checks(cfg, link_config) + check_throughput(cfg, link_config, test_config, 'UDP', traffic) + +def main() -> None: + parser = argparse.ArgumentParser(description="Run basic performance test for NIC driver") + parser.add_argument('--time-delay', type=int, default=8, help='Time taken to wait for transitions to happen(in seconds). Default is 8 seconds.') + parser.add_argument('--test-duration', type=int, default=10, help='Performance test duration for the throughput check, in seconds. Default is 10 seconds.') + parser.add_argument('--stt', type=int, default=80, help='Send throughput Threshold: Percentage of send throughput upon actual throughput required to pass the throughput check (in percentage). Default is 80.') + parser.add_argument('--rtt', type=int, default=50, help='Receive throughput Threshold: Percentage of receive throughput upon actual throughput required to pass the throughput check (in percentage). Default is 50.') + args=parser.parse_args() + test_config = TestConfig(args.time_delay, args.test_duration, args.stt, args.rtt) + with NetDrvEpEnv(__file__, nsim_test=False) as cfg: + traffic = GenerateTraffic(cfg) + link_config = LinkConfig(cfg) + ksft_run(globs=globals(), case_pfx={"test_"}, args=(cfg, link_config, test_config, traffic, )) + link_config.reset_interface() + ksft_exit() + +if __name__ == "__main__": + main() diff --git a/tools/testing/selftests/drivers/net/hw/rss_ctx.py b/tools/testing/selftests/drivers/net/hw/rss_ctx.py index 9d7adb3cf33b..0b49ce7ae678 100755 --- a/tools/testing/selftests/drivers/net/hw/rss_ctx.py +++ b/tools/testing/selftests/drivers/net/hw/rss_ctx.py @@ -6,7 +6,7 @@ import random from lib.py import ksft_run, ksft_pr, ksft_exit, ksft_eq, ksft_ne, ksft_ge, ksft_lt from lib.py import NetDrvEpEnv from lib.py import EthtoolFamily, NetdevFamily -from lib.py import KsftSkipEx +from lib.py import KsftSkipEx, KsftFailEx from lib.py import rand_port from lib.py import ethtool, ip, defer, GenerateTraffic, CmdExitFailure @@ -215,7 +215,7 @@ def test_rss_queue_reconfigure(cfg, main_ctx=True): defer(ethtool, f"-X {cfg.ifname} default") else: other_key = 'noise' - flow = f"flow-type tcp{cfg.addr_ipver} dst-port {port} context {ctx_id}" + flow = f"flow-type tcp{cfg.addr_ipver} dst-ip {cfg.addr} dst-port {port} context {ctx_id}" ntuple = ethtool_create(cfg, "-N", flow) defer(ethtool, f"-N {cfg.ifname} delete {ntuple}") @@ -238,6 +238,32 @@ def test_rss_queue_reconfigure(cfg, main_ctx=True): else: raise Exception(f"Driver didn't prevent us from deactivating a used queue (context {ctx_id})") + if not main_ctx: + ethtool(f"-L {cfg.ifname} combined 4") + flow = f"flow-type tcp{cfg.addr_ipver} dst-ip {cfg.addr} dst-port {port} context {ctx_id} action 1" + try: + # this targets queue 4, which doesn't exist + ntuple2 = ethtool_create(cfg, "-N", flow) + except CmdExitFailure: + pass + else: + raise Exception(f"Driver didn't prevent us from targeting a nonexistent queue (context {ctx_id})") + # change the table to target queues 0 and 2 + ethtool(f"-X {cfg.ifname} {ctx_ref} weight 1 0 1 0") + # ntuple rule therefore targets queues 1 and 3 + ntuple2 = ethtool_create(cfg, "-N", flow) + # should replace existing filter + ksft_eq(ntuple, ntuple2) + _send_traffic_check(cfg, port, ctx_ref, { 'target': (1, 3), + 'noise' : (0, 2) }) + # Setting queue count to 3 should fail, queue 3 is used + try: + ethtool(f"-L {cfg.ifname} combined 3") + except CmdExitFailure: + pass + else: + raise Exception(f"Driver didn't prevent us from deactivating a used queue (context {ctx_id})") + def test_rss_resize(cfg): """Test resizing of the RSS table. @@ -429,7 +455,7 @@ def test_rss_context(cfg, ctx_cnt=1, create_with_cfg=None): ksft_eq(max(data['rss-indirection-table']), 2 + i * 2 + 1, "Unexpected context cfg: " + str(data)) ports.append(rand_port()) - flow = f"flow-type tcp{cfg.addr_ipver} dst-port {ports[i]} context {ctx_id}" + flow = f"flow-type tcp{cfg.addr_ipver} dst-ip {cfg.addr} dst-port {ports[i]} context {ctx_id}" ntuple = ethtool_create(cfg, "-N", flow) defer(ethtool, f"-N {cfg.ifname} delete {ntuple}") @@ -516,7 +542,7 @@ def test_rss_context_out_of_order(cfg, ctx_cnt=4): ctx.append(defer(ethtool, f"-X {cfg.ifname} context {ctx_id} delete")) ports.append(rand_port()) - flow = f"flow-type tcp{cfg.addr_ipver} dst-port {ports[i]} context {ctx_id}" + flow = f"flow-type tcp{cfg.addr_ipver} dst-ip {cfg.addr} dst-port {ports[i]} context {ctx_id}" ntuple_id = ethtool_create(cfg, "-N", flow) ntuple.append(defer(ethtool, f"-N {cfg.ifname} delete {ntuple_id}")) @@ -569,7 +595,7 @@ def test_rss_context_overlap(cfg, other_ctx=0): port = rand_port() if other_ctx: - flow = f"flow-type tcp{cfg.addr_ipver} dst-port {port} context {other_ctx}" + flow = f"flow-type tcp{cfg.addr_ipver} dst-ip {cfg.addr} dst-port {port} context {other_ctx}" ntuple_id = ethtool_create(cfg, "-N", flow) ntuple = defer(ethtool, f"-N {cfg.ifname} delete {ntuple_id}") @@ -587,7 +613,7 @@ def test_rss_context_overlap(cfg, other_ctx=0): # Now create a rule for context 1 and make sure traffic goes to a subset if other_ctx: ntuple.exec() - flow = f"flow-type tcp{cfg.addr_ipver} dst-port {port} context {ctx_id}" + flow = f"flow-type tcp{cfg.addr_ipver} dst-ip {cfg.addr} dst-port {port} context {ctx_id}" ntuple_id = ethtool_create(cfg, "-N", flow) defer(ethtool, f"-N {cfg.ifname} delete {ntuple_id}") @@ -606,6 +632,72 @@ def test_rss_context_overlap2(cfg): test_rss_context_overlap(cfg, True) +def test_delete_rss_context_busy(cfg): + """ + Test that deletion returns -EBUSY when an rss context is being used + by an ntuple filter. + """ + + require_ntuple(cfg) + + # create additional rss context + ctx_id = ethtool_create(cfg, "-X", "context new") + ctx_deleter = defer(ethtool, f"-X {cfg.ifname} context {ctx_id} delete") + + # utilize context from ntuple filter + port = rand_port() + flow = f"flow-type tcp{cfg.addr_ipver} dst-ip {cfg.addr} dst-port {port} context {ctx_id}" + ntuple_id = ethtool_create(cfg, "-N", flow) + defer(ethtool, f"-N {cfg.ifname} delete {ntuple_id}") + + # attempt to delete in-use context + try: + ctx_deleter.exec_only() + ctx_deleter.cancel() + raise KsftFailEx(f"deleted context {ctx_id} used by rule {ntuple_id}") + except CmdExitFailure: + pass + + +def test_rss_ntuple_addition(cfg): + """ + Test that the queue offset (ring_cookie) of an ntuple rule is added + to the queue number read from the indirection table. + """ + + require_ntuple(cfg) + + queue_cnt = len(_get_rx_cnts(cfg)) + if queue_cnt < 4: + try: + ksft_pr(f"Increasing queue count {queue_cnt} -> 4") + ethtool(f"-L {cfg.ifname} combined 4") + defer(ethtool, f"-L {cfg.ifname} combined {queue_cnt}") + except: + raise KsftSkipEx("Not enough queues for the test") + + # Use queue 0 for normal traffic + ethtool(f"-X {cfg.ifname} equal 1") + defer(ethtool, f"-X {cfg.ifname} default") + + # create additional rss context + ctx_id = ethtool_create(cfg, "-X", "context new equal 2") + defer(ethtool, f"-X {cfg.ifname} context {ctx_id} delete") + + # utilize context from ntuple filter + port = rand_port() + flow = f"flow-type tcp{cfg.addr_ipver} dst-ip {cfg.addr} dst-port {port} context {ctx_id} action 2" + try: + ntuple_id = ethtool_create(cfg, "-N", flow) + except CmdExitFailure: + raise KsftSkipEx("Ntuple filter with RSS and nonzero action not supported") + defer(ethtool, f"-N {cfg.ifname} delete {ntuple_id}") + + _send_traffic_check(cfg, port, f"context {ctx_id}", { 'target': (2, 3), + 'empty' : (1,), + 'noise' : (0,) }) + + def main() -> None: with NetDrvEpEnv(__file__, nsim_test=False) as cfg: cfg.ethnl = EthtoolFamily() @@ -616,7 +708,8 @@ def main() -> None: test_rss_context, test_rss_context4, test_rss_context32, test_rss_context_dump, test_rss_context_queue_reconfigure, test_rss_context_overlap, test_rss_context_overlap2, - test_rss_context_out_of_order, test_rss_context4_create_with_cfg], + test_rss_context_out_of_order, test_rss_context4_create_with_cfg, + test_delete_rss_context_busy, test_rss_ntuple_addition], args=(cfg, )) ksft_exit() diff --git a/tools/testing/selftests/drivers/net/lib/py/load.py b/tools/testing/selftests/drivers/net/lib/py/load.py index d9c10613ae67..da5af2c680fa 100644 --- a/tools/testing/selftests/drivers/net/lib/py/load.py +++ b/tools/testing/selftests/drivers/net/lib/py/load.py @@ -2,7 +2,7 @@ import time -from lib.py import ksft_pr, cmd, ip, rand_port, wait_port_listen +from lib.py import ksft_pr, cmd, ip, rand_port, wait_port_listen, bkg class GenerateTraffic: def __init__(self, env, port=None): @@ -23,6 +23,24 @@ class GenerateTraffic: self.stop(verbose=True) raise Exception("iperf3 traffic did not ramp up") + def run_remote_test(self, env: object, port=None, command=None): + if port is None: + port = rand_port() + try: + server_cmd = f"iperf3 -s 1 -p {port} --one-off" + with bkg(server_cmd, host=env.remote): + #iperf3 opens TCP connection as default in server + #-u to be specified in client command for UDP + wait_port_listen(port, host=env.remote) + except Exception as e: + raise Exception(f"Unexpected error occurred while running server command: {e}") + try: + client_cmd = f"iperf3 -c {env.remote_addr} -p {port} {command}" + proc = cmd(client_cmd) + return proc + except Exception as e: + raise Exception(f"Unexpected error occurred while running client command: {e}") + def _wait_pkts(self, pkt_cnt=None, pps=None): """ Wait until we've seen pkt_cnt or until traffic ramps up to pps. diff --git a/tools/testing/selftests/drivers/net/mlxsw/devlink_trap.sh b/tools/testing/selftests/drivers/net/mlxsw/devlink_trap.sh index 89b55e946eed..36055279ba92 100755 --- a/tools/testing/selftests/drivers/net/mlxsw/devlink_trap.sh +++ b/tools/testing/selftests/drivers/net/mlxsw/devlink_trap.sh @@ -116,7 +116,7 @@ dev_del_test() log_test "Device delete" - kill $mz_pid && wait $mz_pid &> /dev/null + kill_process $mz_pid } trap cleanup EXIT diff --git a/tools/testing/selftests/drivers/net/mlxsw/devlink_trap_l3_drops.sh b/tools/testing/selftests/drivers/net/mlxsw/devlink_trap_l3_drops.sh index 160891dcb4bc..db5806d189bb 100755 --- a/tools/testing/selftests/drivers/net/mlxsw/devlink_trap_l3_drops.sh +++ b/tools/testing/selftests/drivers/net/mlxsw/devlink_trap_l3_drops.sh @@ -595,7 +595,7 @@ irif_disabled_test() log_test "Ingress RIF disabled" - kill $mz_pid && wait $mz_pid &> /dev/null + kill_process $mz_pid ip link set dev $rp1 nomaster __addr_add_del $rp1 add 192.0.2.2/24 2001:db8:1::2/64 ip link del dev br0 type bridge @@ -645,7 +645,7 @@ erif_disabled_test() log_test "Egress RIF disabled" - kill $mz_pid && wait $mz_pid &> /dev/null + kill_process $mz_pid __addr_add_del $rp1 add 192.0.2.2/24 2001:db8:1::2/64 ip link del dev br0 type bridge devlink_trap_action_set $trap_name "drop" diff --git a/tools/testing/selftests/drivers/net/mlxsw/devlink_trap_l3_exceptions.sh b/tools/testing/selftests/drivers/net/mlxsw/devlink_trap_l3_exceptions.sh index 190c1b6b5365..5d6d88b600f0 100755 --- a/tools/testing/selftests/drivers/net/mlxsw/devlink_trap_l3_exceptions.sh +++ b/tools/testing/selftests/drivers/net/mlxsw/devlink_trap_l3_exceptions.sh @@ -202,7 +202,7 @@ mtu_value_is_too_small_test() mtu_restore $rp2 - kill $mz_pid && wait $mz_pid &> /dev/null + kill_process $mz_pid tc filter del dev $h1 ingress protocol ip pref 1 handle 101 flower } @@ -235,7 +235,7 @@ __ttl_value_is_too_small_test() log_test "TTL value is too small: TTL=$ttl_val" - kill $mz_pid && wait $mz_pid &> /dev/null + kill_process $mz_pid tc filter del dev $h1 ingress protocol ip pref 1 handle 101 flower } @@ -299,7 +299,7 @@ __mc_reverse_path_forwarding_test() log_test "Multicast reverse path forwarding: $desc" - kill $mz_pid && wait $mz_pid &> /dev/null + kill_process $mz_pid tc filter del dev $rp2 egress protocol $proto pref 1 handle 101 flower } @@ -347,7 +347,7 @@ __reject_route_test() log_test "Reject route: $desc" - kill $mz_pid && wait $mz_pid &> /dev/null + kill_process $mz_pid ip route del unreachable $unreachable tc filter del dev $h1 ingress protocol $proto pref 1 handle 101 flower } @@ -542,7 +542,7 @@ ipv4_lpm_miss_test() log_test "LPM miss: IPv4" - kill $mz_pid && wait $mz_pid &> /dev/null + kill_process $mz_pid vrf_without_routes_destroy } @@ -569,7 +569,7 @@ ipv6_lpm_miss_test() log_test "LPM miss: IPv6" - kill $mz_pid && wait $mz_pid &> /dev/null + kill_process $mz_pid vrf_without_routes_destroy } diff --git a/tools/testing/selftests/drivers/net/mlxsw/devlink_trap_policer.sh b/tools/testing/selftests/drivers/net/mlxsw/devlink_trap_policer.sh index 0bd5ffc218ac..29a672c2270f 100755 --- a/tools/testing/selftests/drivers/net/mlxsw/devlink_trap_policer.sh +++ b/tools/testing/selftests/drivers/net/mlxsw/devlink_trap_policer.sh @@ -45,63 +45,52 @@ source $lib_dir/devlink_lib.sh h1_create() { simple_if_init $h1 192.0.2.1/24 + defer simple_if_fini $h1 192.0.2.1/24 + mtu_set $h1 10000 + defer mtu_restore $h1 ip -4 route add default vrf v$h1 nexthop via 192.0.2.2 -} - -h1_destroy() -{ - ip -4 route del default vrf v$h1 nexthop via 192.0.2.2 - - mtu_restore $h1 - simple_if_fini $h1 192.0.2.1/24 + defer ip -4 route del default vrf v$h1 nexthop via 192.0.2.2 } h2_create() { simple_if_init $h2 198.51.100.1/24 + defer simple_if_fini $h2 198.51.100.1/24 + mtu_set $h2 10000 + defer mtu_restore $h2 ip -4 route add default vrf v$h2 nexthop via 198.51.100.2 -} - -h2_destroy() -{ - ip -4 route del default vrf v$h2 nexthop via 198.51.100.2 - - mtu_restore $h2 - simple_if_fini $h2 198.51.100.1/24 + defer ip -4 route del default vrf v$h2 nexthop via 198.51.100.2 } router_create() { ip link set dev $rp1 up + defer ip link set dev $rp1 down + ip link set dev $rp2 up + defer ip link set dev $rp2 down __addr_add_del $rp1 add 192.0.2.2/24 + defer __addr_add_del $rp1 del 192.0.2.2/24 + __addr_add_del $rp2 add 198.51.100.2/24 + defer __addr_add_del $rp2 del 198.51.100.2/24 + mtu_set $rp1 10000 + defer mtu_restore $rp1 + mtu_set $rp2 10000 + defer mtu_restore $rp2 ip -4 route add blackhole 198.51.100.100 + defer ip -4 route del blackhole 198.51.100.100 devlink trap set $DEVLINK_DEV trap blackhole_route action trap -} - -router_destroy() -{ - devlink trap set $DEVLINK_DEV trap blackhole_route action drop - - ip -4 route del blackhole 198.51.100.100 - - mtu_restore $rp2 - mtu_restore $rp1 - __addr_add_del $rp2 del 198.51.100.2/24 - __addr_add_del $rp1 del 192.0.2.2/24 - - ip link set dev $rp2 down - ip link set dev $rp1 down + defer devlink trap set $DEVLINK_DEV trap blackhole_route action drop } setup_prepare() @@ -114,7 +103,11 @@ setup_prepare() rp1_mac=$(mac_get $rp1) + # Reload to ensure devlink-trap settings are back to default. + defer devlink_reload + vrf_prepare + defer vrf_cleanup h1_create h2_create @@ -122,21 +115,6 @@ setup_prepare() router_create } -cleanup() -{ - pre_cleanup - - router_destroy - - h2_destroy - h1_destroy - - vrf_cleanup - - # Reload to ensure devlink-trap settings are back to default. - devlink_reload -} - rate_limits_test() { RET=0 @@ -214,7 +192,10 @@ __rate_test() # by the policer. Make sure measured received rate is about 1000 pps log_info "=== Tx rate: Highest, Policer rate: 1000 pps ===" + defer_scope_push + start_traffic $h1 192.0.2.1 198.51.100.100 $rp1_mac + defer stop_traffic $! sleep 5 # Take measurements when rate is stable @@ -229,13 +210,16 @@ __rate_test() check_err $? "Expected non-zero policer drop rate, got 0" log_info "Measured policer drop rate of $drop_rate pps" - stop_traffic + defer_scope_pop # Send packets at a rate of 1000 pps and make sure they are not dropped # by the policer log_info "=== Tx rate: 1000 pps, Policer rate: 1000 pps ===" + defer_scope_push + start_traffic $h1 192.0.2.1 198.51.100.100 $rp1_mac -d 1msec + defer stop_traffic $! sleep 5 # Take measurements when rate is stable @@ -244,7 +228,7 @@ __rate_test() check_err $? "Expected zero policer drop rate, got a drop rate of $drop_rate pps" log_info "Measured policer drop rate of $drop_rate pps" - stop_traffic + defer_scope_pop # Unbind the policer and send packets at highest possible rate. Make # sure they are not dropped by the policer and that the measured @@ -253,7 +237,10 @@ __rate_test() devlink trap group set $DEVLINK_DEV group l3_drops nopolicer + defer_scope_push + start_traffic $h1 192.0.2.1 198.51.100.100 $rp1_mac + defer stop_traffic $! rate=$(trap_rate_get) (( rate > 1000 )) @@ -265,7 +252,7 @@ __rate_test() check_err $? "Expected zero policer drop rate, got a drop rate of $drop_rate pps" log_info "Measured policer drop rate of $drop_rate pps" - stop_traffic + defer_scope_pop log_test "Trap policer rate" } diff --git a/tools/testing/selftests/drivers/net/mlxsw/devlink_trap_tunnel_ipip.sh b/tools/testing/selftests/drivers/net/mlxsw/devlink_trap_tunnel_ipip.sh index e9a82cae8c9a..4ac1dae92d0f 100755 --- a/tools/testing/selftests/drivers/net/mlxsw/devlink_trap_tunnel_ipip.sh +++ b/tools/testing/selftests/drivers/net/mlxsw/devlink_trap_tunnel_ipip.sh @@ -176,7 +176,7 @@ ecn_decap_test() log_test "$desc: Inner ECN is not ECT and outer is $ecn_desc" - kill $mz_pid && wait $mz_pid &> /dev/null + kill_process $mz_pid tc filter del dev $swp1 egress protocol ip pref 1 handle 101 flower } @@ -207,7 +207,7 @@ no_matching_tunnel_test() log_test "$desc" - kill $mz_pid && wait $mz_pid &> /dev/null + kill_process $mz_pid tc filter del dev $swp1 egress protocol ip pref 1 handle 101 flower } diff --git a/tools/testing/selftests/drivers/net/mlxsw/devlink_trap_tunnel_ipip6.sh b/tools/testing/selftests/drivers/net/mlxsw/devlink_trap_tunnel_ipip6.sh index 878125041fc3..fce885184404 100755 --- a/tools/testing/selftests/drivers/net/mlxsw/devlink_trap_tunnel_ipip6.sh +++ b/tools/testing/selftests/drivers/net/mlxsw/devlink_trap_tunnel_ipip6.sh @@ -176,7 +176,7 @@ ecn_decap_test() log_test "$desc: Inner ECN is not ECT and outer is $ecn_desc" - kill $mz_pid && wait $mz_pid &> /dev/null + kill_process $mz_pid tc filter del dev $swp1 egress protocol ipv6 pref 1 handle 101 flower } @@ -207,7 +207,7 @@ no_matching_tunnel_test() log_test "$desc" - kill $mz_pid && wait $mz_pid &> /dev/null + kill_process $mz_pid tc filter del dev $swp1 egress protocol ipv6 pref 1 handle 101 flower } diff --git a/tools/testing/selftests/drivers/net/mlxsw/devlink_trap_tunnel_vxlan.sh b/tools/testing/selftests/drivers/net/mlxsw/devlink_trap_tunnel_vxlan.sh index 5f6eb965cfd1..7aca8e5922cf 100755 --- a/tools/testing/selftests/drivers/net/mlxsw/devlink_trap_tunnel_vxlan.sh +++ b/tools/testing/selftests/drivers/net/mlxsw/devlink_trap_tunnel_vxlan.sh @@ -183,7 +183,7 @@ ecn_decap_test() log_test "$desc: Inner ECN is not ECT and outer is $ecn_desc" - kill $mz_pid && wait $mz_pid &> /dev/null + kill_process $mz_pid tc filter del dev $swp1 egress protocol ip pref 1 handle 101 flower } @@ -253,7 +253,7 @@ corrupted_packet_test() log_test "$desc" - kill $mz_pid && wait $mz_pid &> /dev/null + kill_process $mz_pid tc filter del dev $swp1 egress protocol ip pref 1 handle 101 flower } diff --git a/tools/testing/selftests/drivers/net/mlxsw/devlink_trap_tunnel_vxlan_ipv6.sh b/tools/testing/selftests/drivers/net/mlxsw/devlink_trap_tunnel_vxlan_ipv6.sh index f6c16cbb6cf7..4599c331240b 100755 --- a/tools/testing/selftests/drivers/net/mlxsw/devlink_trap_tunnel_vxlan_ipv6.sh +++ b/tools/testing/selftests/drivers/net/mlxsw/devlink_trap_tunnel_vxlan_ipv6.sh @@ -188,7 +188,7 @@ ecn_decap_test() log_test "$desc: Inner ECN is not ECT and outer is $ecn_desc" - kill $mz_pid && wait $mz_pid &> /dev/null + kill_process $mz_pid tc filter del dev $swp1 egress protocol ipv6 pref 1 handle 101 flower } @@ -262,7 +262,7 @@ corrupted_packet_test() log_test "$desc" - kill $mz_pid && wait $mz_pid &> /dev/null + kill_process $mz_pid tc filter del dev $swp1 egress protocol ipv6 pref 1 handle 101 flower } diff --git a/tools/testing/selftests/drivers/net/mlxsw/qos_ets_strict.sh b/tools/testing/selftests/drivers/net/mlxsw/qos_ets_strict.sh index fee74f215cec..d5b6f2cc9a29 100755 --- a/tools/testing/selftests/drivers/net/mlxsw/qos_ets_strict.sh +++ b/tools/testing/selftests/drivers/net/mlxsw/qos_ets_strict.sh @@ -58,65 +58,62 @@ source qos_lib.sh h1_create() { simple_if_init $h1 + defer simple_if_fini $h1 + mtu_set $h1 10000 + defer mtu_restore $h1 vlan_create $h1 111 v$h1 192.0.2.33/28 + defer vlan_destroy $h1 111 ip link set dev $h1.111 type vlan egress-qos-map 0:1 } -h1_destroy() -{ - vlan_destroy $h1 111 - - mtu_restore $h1 - simple_if_fini $h1 -} - h2_create() { simple_if_init $h2 + defer simple_if_fini $h2 + mtu_set $h2 10000 + defer mtu_restore $h2 vlan_create $h2 222 v$h2 192.0.2.65/28 + defer vlan_destroy $h2 222 ip link set dev $h2.222 type vlan egress-qos-map 0:2 } -h2_destroy() -{ - vlan_destroy $h2 222 - - mtu_restore $h2 - simple_if_fini $h2 -} - h3_create() { simple_if_init $h3 + defer simple_if_fini $h3 + mtu_set $h3 10000 + defer mtu_restore $h3 vlan_create $h3 111 v$h3 192.0.2.34/28 - vlan_create $h3 222 v$h3 192.0.2.66/28 -} - -h3_destroy() -{ - vlan_destroy $h3 222 - vlan_destroy $h3 111 + defer vlan_destroy $h3 111 - mtu_restore $h3 - simple_if_fini $h3 + vlan_create $h3 222 v$h3 192.0.2.66/28 + defer vlan_destroy $h3 222 } switch_create() { ip link set dev $swp1 up + defer ip link set dev $swp1 down + mtu_set $swp1 10000 + defer mtu_restore $swp1 ip link set dev $swp2 up + defer ip link set dev $swp2 down + mtu_set $swp2 10000 + defer mtu_restore $swp2 # prio n -> TC n, strict scheduling lldptool -T -i $swp3 -V ETS-CFG up2tc=0:0,1:1,2:2,3:3,4:4,5:5,6:6,7:7 + defer lldptool -T -i $swp3 -V ETS-CFG up2tc=0:0,1:0,2:0,3:0,4:0,5:0,6:0,7:0 + lldptool -T -i $swp3 -V ETS-CFG tsa=$( )"0:strict,"$( )"1:strict,"$( @@ -129,85 +126,90 @@ switch_create() sleep 1 ip link set dev $swp3 up + defer ip link set dev $swp3 down + mtu_set $swp3 10000 + defer mtu_restore $swp3 + tc qdisc replace dev $swp3 root handle 101: tbf rate 1gbit \ burst 128K limit 1G + defer tc qdisc del dev $swp3 root handle 101: vlan_create $swp1 111 + defer vlan_destroy $swp1 111 + vlan_create $swp2 222 + defer vlan_destroy $swp2 222 + vlan_create $swp3 111 + defer vlan_destroy $swp3 111 + vlan_create $swp3 222 + defer vlan_destroy $swp3 222 ip link add name br111 type bridge vlan_filtering 0 + defer ip link del dev br111 ip link set dev br111 addrgenmode none + ip link set dev br111 up + defer ip link set dev br111 down + ip link set dev $swp1.111 master br111 + defer ip link set dev $swp1.111 nomaster + ip link set dev $swp3.111 master br111 + defer ip link set dev $swp3.111 nomaster ip link add name br222 type bridge vlan_filtering 0 + defer ip link del dev br222 ip link set dev br222 addrgenmode none + ip link set dev br222 up + defer ip link set dev br222 down + ip link set dev $swp2.222 master br222 + defer ip link set dev $swp2.222 nomaster + ip link set dev $swp3.222 master br222 + defer ip link set dev $swp3.222 nomaster # Make sure that ingress quotas are smaller than egress so that there is # room for both streams of traffic to be admitted to shared buffer. devlink_pool_size_thtype_save 0 devlink_pool_size_thtype_set 0 dynamic 10000000 + defer devlink_pool_size_thtype_restore 0 + devlink_pool_size_thtype_save 4 devlink_pool_size_thtype_set 4 dynamic 10000000 + defer devlink_pool_size_thtype_restore 4 devlink_port_pool_th_save $swp1 0 devlink_port_pool_th_set $swp1 0 6 + defer devlink_port_pool_th_restore $swp1 0 + devlink_tc_bind_pool_th_save $swp1 1 ingress devlink_tc_bind_pool_th_set $swp1 1 ingress 0 6 + defer devlink_tc_bind_pool_th_restore $swp1 1 ingress devlink_port_pool_th_save $swp2 0 devlink_port_pool_th_set $swp2 0 6 + defer devlink_port_pool_th_restore $swp2 0 + devlink_tc_bind_pool_th_save $swp2 2 ingress devlink_tc_bind_pool_th_set $swp2 2 ingress 0 6 + defer devlink_tc_bind_pool_th_restore $swp2 2 ingress devlink_tc_bind_pool_th_save $swp3 1 egress devlink_tc_bind_pool_th_set $swp3 1 egress 4 7 + defer devlink_tc_bind_pool_th_restore $swp3 1 egress + devlink_tc_bind_pool_th_save $swp3 2 egress devlink_tc_bind_pool_th_set $swp3 2 egress 4 7 + defer devlink_tc_bind_pool_th_restore $swp3 2 egress + devlink_port_pool_th_save $swp3 4 devlink_port_pool_th_set $swp3 4 7 -} - -switch_destroy() -{ - devlink_port_pool_th_restore $swp3 4 - devlink_tc_bind_pool_th_restore $swp3 2 egress - devlink_tc_bind_pool_th_restore $swp3 1 egress - - devlink_tc_bind_pool_th_restore $swp2 2 ingress - devlink_port_pool_th_restore $swp2 0 - - devlink_tc_bind_pool_th_restore $swp1 1 ingress - devlink_port_pool_th_restore $swp1 0 - - devlink_pool_size_thtype_restore 4 - devlink_pool_size_thtype_restore 0 - - ip link del dev br222 - ip link del dev br111 - - vlan_destroy $swp3 222 - vlan_destroy $swp3 111 - vlan_destroy $swp2 222 - vlan_destroy $swp1 111 - - tc qdisc del dev $swp3 root handle 101: - mtu_restore $swp3 - ip link set dev $swp3 down - lldptool -T -i $swp3 -V ETS-CFG up2tc=0:0,1:0,2:0,3:0,4:0,5:0,6:0,7:0 - - mtu_restore $swp2 - ip link set dev $swp2 down - - mtu_restore $swp1 - ip link set dev $swp1 down + defer devlink_port_pool_th_restore $swp3 4 } setup_prepare() @@ -224,6 +226,7 @@ setup_prepare() h3mac=$(mac_get $h3) vrf_prepare + defer vrf_cleanup h1_create h2_create @@ -231,18 +234,6 @@ setup_prepare() switch_create } -cleanup() -{ - pre_cleanup - - switch_destroy - h3_destroy - h2_destroy - h1_destroy - - vrf_cleanup -} - ping_ipv4() { ping_test $h1 192.0.2.34 " from H1" @@ -261,21 +252,38 @@ rel() " } +__run_hi_measure_rate() +{ + local what=$1; shift + local -a uc_rate + + start_traffic $h2.222 192.0.2.65 192.0.2.66 $h3mac + defer stop_traffic $! + + uc_rate=($(measure_rate $swp2 $h3 rx_octets_prio_2 "$what")) + check_err $? "Could not get high enough $what ingress rate" + + echo ${uc_rate[@]} +} + +run_hi_measure_rate() +{ + in_defer_scope __run_hi_measure_rate "$@" +} + test_ets_strict() { RET=0 # Run high-prio traffic on its own. - start_traffic $h2.222 192.0.2.65 192.0.2.66 $h3mac local -a rate_2 - rate_2=($(measure_rate $swp2 $h3 rx_octets_prio_2 "prio 2")) - check_err $? "Could not get high enough prio-2 ingress rate" + rate_2=($(run_hi_measure_rate "prio 2")) local rate_2_in=${rate_2[0]} local rate_2_eg=${rate_2[1]} - stop_traffic # $h2.222 # Start low-prio stream. start_traffic $h1.111 192.0.2.33 192.0.2.34 $h3mac + defer stop_traffic $! local -a rate_1 rate_1=($(measure_rate $swp1 $h3 rx_octets_prio_1 "prio 1")) @@ -290,14 +298,9 @@ test_ets_strict() check_err $(bc <<< "$rel21 > 105") # Start the high-prio stream--now both streams run. - start_traffic $h2.222 192.0.2.65 192.0.2.66 $h3mac - rate_3=($(measure_rate $swp2 $h3 rx_octets_prio_2 "prio 2 w/ 1")) - check_err $? "Could not get high enough prio-2 ingress rate with prio-1" + rate_3=($(run_hi_measure_rate "prio 2+1")) local rate_3_in=${rate_3[0]} local rate_3_eg=${rate_3[1]} - stop_traffic # $h2.222 - - stop_traffic # $h1.111 # High-prio should have about the same throughput whether or not # low-prio is in the system. diff --git a/tools/testing/selftests/drivers/net/mlxsw/qos_max_descriptors.sh b/tools/testing/selftests/drivers/net/mlxsw/qos_max_descriptors.sh index 5ac4f795e333..2b5d2c2751d5 100755 --- a/tools/testing/selftests/drivers/net/mlxsw/qos_max_descriptors.sh +++ b/tools/testing/selftests/drivers/net/mlxsw/qos_max_descriptors.sh @@ -69,127 +69,103 @@ mlxsw_only_on_spectrum 2+ || exit h1_create() { simple_if_init $h1 + defer simple_if_fini $h1 vlan_create $h1 111 v$h1 192.0.2.33/28 + defer vlan_destroy $h1 111 ip link set dev $h1.111 type vlan egress-qos-map 0:1 } -h1_destroy() -{ - vlan_destroy $h1 111 - - simple_if_fini $h1 -} - h2_create() { simple_if_init $h2 + defer simple_if_fini $h2 vlan_create $h2 111 v$h2 192.0.2.34/28 -} - -h2_destroy() -{ - vlan_destroy $h2 111 - - simple_if_fini $h2 + defer vlan_destroy $h2 111 } switch_create() { # pools # ----- + # devlink_pool_size_thtype_restore needs to be done first so that we can + # reset the various limits to values that are only valid for the + # original static / dynamic setting. devlink_pool_size_thtype_save 1 - devlink_pool_size_thtype_save 6 - - devlink_port_pool_th_save $swp1 1 - devlink_port_pool_th_save $swp2 6 - - devlink_tc_bind_pool_th_save $swp1 1 ingress - devlink_tc_bind_pool_th_save $swp2 1 egress - devlink_pool_size_thtype_set 1 dynamic $MAX_POOL_SIZE + defer_prio devlink_pool_size_thtype_restore 1 + + devlink_pool_size_thtype_save 6 devlink_pool_size_thtype_set 6 static $MAX_POOL_SIZE + defer_prio devlink_pool_size_thtype_restore 6 # $swp1 # ----- ip link set dev $swp1 up + defer ip link set dev $swp1 down + vlan_create $swp1 111 + defer vlan_destroy $swp1 111 ip link set dev $swp1.111 type vlan ingress-qos-map 0:0 1:1 + devlink_port_pool_th_save $swp1 1 devlink_port_pool_th_set $swp1 1 16 + defer devlink_tc_bind_pool_th_restore $swp1 1 ingress + + devlink_tc_bind_pool_th_save $swp1 1 ingress devlink_tc_bind_pool_th_set $swp1 1 ingress 1 16 + defer devlink_port_pool_th_restore $swp1 1 tc qdisc replace dev $swp1 root handle 1: \ ets bands 8 strict 8 priomap 7 6 + defer tc qdisc del dev $swp1 root + dcb buffer set dev $swp1 prio-buffer all:0 1:1 + defer dcb buffer set dev $swp1 prio-buffer all:0 # $swp2 # ----- ip link set dev $swp2 up + defer ip link set dev $swp2 down + vlan_create $swp2 111 + defer vlan_destroy $swp2 111 ip link set dev $swp2.111 type vlan egress-qos-map 0:0 1:1 + devlink_port_pool_th_save $swp2 6 devlink_port_pool_th_set $swp2 6 $MAX_POOL_SIZE + defer devlink_tc_bind_pool_th_restore $swp2 1 egress + + devlink_tc_bind_pool_th_save $swp2 1 egress devlink_tc_bind_pool_th_set $swp2 1 egress 6 $MAX_POOL_SIZE + defer devlink_port_pool_th_restore $swp2 6 tc qdisc replace dev $swp2 root handle 1: tbf rate $SHAPER_RATE \ burst 128K limit 500M + defer tc qdisc del dev $swp2 root + tc qdisc replace dev $swp2 parent 1:1 handle 11: \ ets bands 8 strict 8 priomap 7 6 + defer tc qdisc del dev $swp2 parent 1:1 handle 11: # bridge # ------ ip link add name br1 type bridge vlan_filtering 0 + defer ip link del dev br1 + ip link set dev $swp1.111 master br1 + defer ip link set dev $swp1.111 nomaster + ip link set dev br1 up + defer ip link set dev br1 down ip link set dev $swp2.111 master br1 -} - -switch_destroy() -{ - # Do this first so that we can reset the limits to values that are only - # valid for the original static / dynamic setting. - devlink_pool_size_thtype_restore 6 - devlink_pool_size_thtype_restore 1 - - # bridge - # ------ - - ip link set dev $swp2.111 nomaster - - ip link set dev br1 down - ip link set dev $swp1.111 nomaster - ip link del dev br1 - - # $swp2 - # ----- - - tc qdisc del dev $swp2 parent 1:1 handle 11: - tc qdisc del dev $swp2 root - - devlink_tc_bind_pool_th_restore $swp2 1 egress - devlink_port_pool_th_restore $swp2 6 - - vlan_destroy $swp2 111 - ip link set dev $swp2 down - - # $swp1 - # ----- - - dcb buffer set dev $swp1 prio-buffer all:0 - tc qdisc del dev $swp1 root - - devlink_tc_bind_pool_th_restore $swp1 1 ingress - devlink_port_pool_th_restore $swp1 1 - - vlan_destroy $swp1 111 - ip link set dev $swp1 down + defer ip link set dev $swp2.111 nomaster } setup_prepare() @@ -203,23 +179,13 @@ setup_prepare() h2mac=$(mac_get $h2) vrf_prepare + defer vrf_cleanup h1_create h2_create switch_create } -cleanup() -{ - pre_cleanup - - switch_destroy - h2_destroy - h1_destroy - - vrf_cleanup -} - ping_ipv4() { ping_test $h1 192.0.2.34 " h1->h2" @@ -251,6 +217,7 @@ max_descriptors() log_info "Send many small packets, packet size = $pktsize bytes" start_traffic_pktsize $pktsize $h1.111 192.0.2.33 192.0.2.34 $h2mac + defer stop_traffic $! # Sleep to wait for congestion. sleep 5 @@ -268,9 +235,6 @@ max_descriptors() check_err $(bc <<< "$perc_used < $exp_perc_used") \ "Expected > $exp_perc_used% of descriptors, handle $perc_used%" - stop_traffic - sleep 1 - log_test "Maximum descriptors usage. The percentage used is $perc_used%" } diff --git a/tools/testing/selftests/drivers/net/mlxsw/qos_mc_aware.sh b/tools/testing/selftests/drivers/net/mlxsw/qos_mc_aware.sh index 6d892de43fa8..cd4a5c21360c 100755 --- a/tools/testing/selftests/drivers/net/mlxsw/qos_mc_aware.sh +++ b/tools/testing/selftests/drivers/net/mlxsw/qos_mc_aware.sh @@ -73,122 +73,114 @@ source qos_lib.sh h1_create() { simple_if_init $h1 192.0.2.65/28 - mtu_set $h1 10000 -} + defer simple_if_fini $h1 192.0.2.65/28 -h1_destroy() -{ - mtu_restore $h1 - simple_if_fini $h1 192.0.2.65/28 + mtu_set $h1 10000 + defer mtu_restore $h1 } h2_create() { simple_if_init $h2 + defer simple_if_fini $h2 + mtu_set $h2 10000 + defer mtu_restore $h2 vlan_create $h2 111 v$h2 192.0.2.129/28 + defer vlan_destroy $h2 111 ip link set dev $h2.111 type vlan egress-qos-map 0:1 } -h2_destroy() -{ - vlan_destroy $h2 111 - - mtu_restore $h2 - simple_if_fini $h2 -} - h3_create() { simple_if_init $h3 192.0.2.66/28 + defer simple_if_fini $h3 192.0.2.66/28 + mtu_set $h3 10000 + defer mtu_restore $h3 vlan_create $h3 111 v$h3 192.0.2.130/28 -} - -h3_destroy() -{ - vlan_destroy $h3 111 - - mtu_restore $h3 - simple_if_fini $h3 192.0.2.66/28 + defer vlan_destroy $h3 111 } switch_create() { ip link set dev $swp1 up + defer ip link set dev $swp1 down + mtu_set $swp1 10000 + defer mtu_restore $swp1 ip link set dev $swp2 up + defer ip link set dev $swp2 down + mtu_set $swp2 10000 + defer mtu_restore $swp2 ip link set dev $swp3 up + defer ip link set dev $swp3 down + mtu_set $swp3 10000 + defer mtu_restore $swp3 vlan_create $swp2 111 + defer vlan_destroy $swp2 111 + vlan_create $swp3 111 + defer vlan_destroy $swp3 111 tc qdisc replace dev $swp3 root handle 3: tbf rate 1gbit \ burst 128K limit 1G + defer tc qdisc del dev $swp3 root handle 3: + tc qdisc replace dev $swp3 parent 3:3 handle 33: \ prio bands 8 priomap 7 7 7 7 7 7 7 7 + defer tc qdisc del dev $swp3 parent 3:3 handle 33: ip link add name br1 type bridge vlan_filtering 0 + defer ip link del dev br1 ip link set dev br1 addrgenmode none ip link set dev br1 up + ip link set dev $swp1 master br1 + defer ip link set dev $swp1 nomaster + ip link set dev $swp3 master br1 + defer ip link set dev $swp3 nomaster ip link add name br111 type bridge vlan_filtering 0 + defer ip link del dev br111 ip link set dev br111 addrgenmode none ip link set dev br111 up + ip link set dev $swp2.111 master br111 + defer ip link set dev $swp2.111 nomaster + ip link set dev $swp3.111 master br111 + defer ip link set dev $swp3.111 nomaster # Make sure that ingress quotas are smaller than egress so that there is # room for both streams of traffic to be admitted to shared buffer. devlink_port_pool_th_save $swp1 0 devlink_port_pool_th_set $swp1 0 5 + defer devlink_port_pool_th_restore $swp1 0 + devlink_tc_bind_pool_th_save $swp1 0 ingress devlink_tc_bind_pool_th_set $swp1 0 ingress 0 5 + defer devlink_tc_bind_pool_th_restore $swp1 0 ingress devlink_port_pool_th_save $swp2 0 devlink_port_pool_th_set $swp2 0 5 + defer devlink_port_pool_th_restore $swp2 0 + devlink_tc_bind_pool_th_save $swp2 1 ingress devlink_tc_bind_pool_th_set $swp2 1 ingress 0 5 + defer devlink_tc_bind_pool_th_restore $swp2 1 ingress devlink_port_pool_th_save $swp3 4 devlink_port_pool_th_set $swp3 4 12 -} - -switch_destroy() -{ - devlink_port_pool_th_restore $swp3 4 - - devlink_tc_bind_pool_th_restore $swp2 1 ingress - devlink_port_pool_th_restore $swp2 0 - - devlink_tc_bind_pool_th_restore $swp1 0 ingress - devlink_port_pool_th_restore $swp1 0 - - ip link del dev br111 - ip link del dev br1 - - tc qdisc del dev $swp3 parent 3:3 handle 33: - tc qdisc del dev $swp3 root handle 3: - - vlan_destroy $swp3 111 - vlan_destroy $swp2 111 - - mtu_restore $swp3 - ip link set dev $swp3 down - - mtu_restore $swp2 - ip link set dev $swp2 down - - mtu_restore $swp1 - ip link set dev $swp1 down + defer devlink_port_pool_th_restore $swp3 4 } setup_prepare() @@ -205,6 +197,7 @@ setup_prepare() h3mac=$(mac_get $h3) vrf_prepare + defer vrf_cleanup h1_create h2_create @@ -212,45 +205,45 @@ setup_prepare() switch_create } -cleanup() +ping_ipv4() { - pre_cleanup + ping_test $h2 192.0.2.130 +} - switch_destroy - h3_destroy - h2_destroy - h1_destroy +__run_uc_measure_rate() +{ + local what=$1; shift + local -a uc_rate + + start_traffic $h2.111 192.0.2.129 192.0.2.130 $h3mac + defer stop_traffic $! + + uc_rate=($(measure_rate $swp2 $h3 rx_octets_prio_1 "$what")) + check_err $? "Could not get high enough $what ingress rate" - vrf_cleanup + echo ${uc_rate[@]} } -ping_ipv4() +run_uc_measure_rate() { - ping_test $h2 192.0.2.130 + in_defer_scope __run_uc_measure_rate "$@" } test_mc_aware() { RET=0 - local -a uc_rate - start_traffic $h2.111 192.0.2.129 192.0.2.130 $h3mac - uc_rate=($(measure_rate $swp2 $h3 rx_octets_prio_1 "UC-only")) - check_err $? "Could not get high enough UC-only ingress rate" - stop_traffic + local -a uc_rate=($(run_uc_measure_rate "UC-only")) local ucth1=${uc_rate[1]} start_traffic $h1 192.0.2.65 bc bc + defer stop_traffic $! local d0=$(date +%s) local t0=$(ethtool_stats_get $h3 rx_octets_prio_0) local u0=$(ethtool_stats_get $swp1 rx_octets_prio_0) - local -a uc_rate_2 - start_traffic $h2.111 192.0.2.129 192.0.2.130 $h3mac - uc_rate_2=($(measure_rate $swp2 $h3 rx_octets_prio_1 "UC+MC")) - check_err $? "Could not get high enough UC+MC ingress rate" - stop_traffic + local -a uc_rate_2=($(run_uc_measure_rate "UC+MC")) local ucth2=${uc_rate_2[1]} local d1=$(date +%s) @@ -272,8 +265,6 @@ test_mc_aware() local mc_ir=$(rate $u0 $u1 $interval) local mc_er=$(rate $t0 $t1 $interval) - stop_traffic - log_test "UC performance under MC overload" echo "UC-only throughput $(humanize $ucth1)" @@ -297,6 +288,7 @@ test_uc_aware() RET=0 start_traffic $h2.111 192.0.2.129 192.0.2.130 $h3mac + defer stop_traffic $! local d0=$(date +%s) local t0=$(ethtool_stats_get $h3 rx_octets_prio_1) @@ -326,8 +318,6 @@ test_uc_aware() ((attempts == passes)) check_err $? - stop_traffic - log_test "MC performance under UC overload" echo " ingress UC throughput $(humanize ${uc_ir})" echo " egress UC throughput $(humanize ${uc_er})" diff --git a/tools/testing/selftests/drivers/net/mlxsw/rtnetlink.sh b/tools/testing/selftests/drivers/net/mlxsw/rtnetlink.sh index 893a693ad805..45a569618424 100755 --- a/tools/testing/selftests/drivers/net/mlxsw/rtnetlink.sh +++ b/tools/testing/selftests/drivers/net/mlxsw/rtnetlink.sh @@ -186,10 +186,7 @@ bridge_vlan_flags_test() # If we did not handle references correctly, then this should produce a # trace - devlink dev reload "$DEVLINK_DEV" - - # Allow netdevices to be re-created following the reload - sleep 20 + devlink_reload log_test "bridge vlan flags" } @@ -923,12 +920,9 @@ devlink_reload_test() # devlink reload can be performed without errors RET=0 - devlink dev reload "$DEVLINK_DEV" - check_err $? "devlink reload failed" + devlink_reload log_test "devlink reload - last test" - - sleep 20 } trap cleanup EXIT diff --git a/tools/testing/selftests/drivers/net/mlxsw/sch_ets.sh b/tools/testing/selftests/drivers/net/mlxsw/sch_ets.sh index 139175fd03e7..4aaceb6b2b60 100755 --- a/tools/testing/selftests/drivers/net/mlxsw/sch_ets.sh +++ b/tools/testing/selftests/drivers/net/mlxsw/sch_ets.sh @@ -21,6 +21,7 @@ switch_create() # Create a bottleneck so that the DWRR process can kick in. tc qdisc replace dev $swp2 root handle 3: tbf rate 1gbit \ burst 128K limit 1G + defer tc qdisc del dev $swp2 root handle 3: ets_switch_create @@ -30,16 +31,27 @@ switch_create() # for the DWRR process. devlink_port_pool_th_save $swp1 0 devlink_port_pool_th_set $swp1 0 12 + defer devlink_port_pool_th_restore $swp1 0 + devlink_tc_bind_pool_th_save $swp1 0 ingress devlink_tc_bind_pool_th_set $swp1 0 ingress 0 12 + defer devlink_tc_bind_pool_th_restore $swp1 0 ingress + devlink_port_pool_th_save $swp2 4 devlink_port_pool_th_set $swp2 4 12 + defer devlink_port_pool_th_restore $swp2 4 + devlink_tc_bind_pool_th_save $swp2 7 egress devlink_tc_bind_pool_th_set $swp2 7 egress 4 5 + defer devlink_tc_bind_pool_th_restore $swp2 7 egress + devlink_tc_bind_pool_th_save $swp2 6 egress devlink_tc_bind_pool_th_set $swp2 6 egress 4 5 + defer devlink_tc_bind_pool_th_restore $swp2 6 egress + devlink_tc_bind_pool_th_save $swp2 5 egress devlink_tc_bind_pool_th_set $swp2 5 egress 4 5 + defer devlink_tc_bind_pool_th_restore $swp2 5 egress # Note: sch_ets_core.sh uses VLAN ingress-qos-map to assign packet # priorities at $swp1 based on their 802.1p headers. ingress-qos-map is @@ -47,20 +59,6 @@ switch_create() # 1:1, which is the mapping currently hard-coded by the driver. } -switch_destroy() -{ - devlink_tc_bind_pool_th_restore $swp2 5 egress - devlink_tc_bind_pool_th_restore $swp2 6 egress - devlink_tc_bind_pool_th_restore $swp2 7 egress - devlink_port_pool_th_restore $swp2 4 - devlink_tc_bind_pool_th_restore $swp1 0 ingress - devlink_port_pool_th_restore $swp1 0 - - ets_switch_destroy - - tc qdisc del dev $swp2 root handle 3: -} - # Callback from sch_ets_tests.sh collect_stats() { diff --git a/tools/testing/selftests/drivers/net/mlxsw/sch_red_core.sh b/tools/testing/selftests/drivers/net/mlxsw/sch_red_core.sh index 299e06a5808c..537d6baa77b7 100644 --- a/tools/testing/selftests/drivers/net/mlxsw/sch_red_core.sh +++ b/tools/testing/selftests/drivers/net/mlxsw/sch_red_core.sh @@ -75,6 +75,18 @@ source $lib_dir/lib.sh source $lib_dir/devlink_lib.sh source mlxsw_lib.sh +stop_traffic_sleep() +{ + local pid=$1; shift + + # Issuing a kill still leaves a bunch of packets lingering in the + # buffers. This traffic then arrives at the point where a follow-up test + # is already running, and can confuse the test. Therefore sleep after + # stopping traffic to flush any leftover packets. + stop_traffic "$pid" + sleep 1 +} + ipaddr() { local host=$1; shift @@ -89,39 +101,31 @@ host_create() local host=$1; shift simple_if_init $dev + defer simple_if_fini $dev + mtu_set $dev 10000 + defer mtu_restore $dev vlan_create $dev 10 v$dev $(ipaddr $host 10)/28 + defer vlan_destroy $dev 10 ip link set dev $dev.10 type vlan egress 0:0 vlan_create $dev 11 v$dev $(ipaddr $host 11)/28 + defer vlan_destroy $dev 11 ip link set dev $dev.11 type vlan egress 0:1 } -host_destroy() -{ - local dev=$1; shift - - vlan_destroy $dev 11 - vlan_destroy $dev 10 - mtu_restore $dev - simple_if_fini $dev -} - h1_create() { host_create $h1 1 } -h1_destroy() -{ - host_destroy $h1 -} - h2_create() { host_create $h2 2 + tc qdisc add dev $h2 clsact + defer tc qdisc del dev $h2 clsact # Some of the tests in this suite use multicast traffic. As this traffic # enters BR2_10 resp. BR2_11, it is flooded to all other ports. Thus @@ -137,15 +141,9 @@ h2_create() # Prevent this by adding a shaper which limits the traffic in $h2 to # 1Gbps. - tc qdisc replace dev $h2 root handle 10: tbf rate 1gbit \ + tc qdisc replace dev $h2 root handle 10: tbf rate 200mbit \ burst 128K limit 1G -} - -h2_destroy() -{ - tc qdisc del dev $h2 root handle 10: - tc qdisc del dev $h2 clsact - host_destroy $h2 + defer tc qdisc del dev $h2 root handle 10: } h3_create() @@ -153,40 +151,54 @@ h3_create() host_create $h3 3 } -h3_destroy() -{ - host_destroy $h3 -} - switch_create() { local intf local vlan ip link add dev br1_10 type bridge + defer ip link del dev br1_10 + ip link add dev br1_11 type bridge + defer ip link del dev br1_11 ip link add dev br2_10 type bridge + defer ip link del dev br2_10 + ip link add dev br2_11 type bridge + defer ip link del dev br2_11 for intf in $swp1 $swp2 $swp3 $swp4 $swp5; do ip link set dev $intf up + defer ip link set dev $intf down + mtu_set $intf 10000 + defer mtu_restore $intf done for intf in $swp1 $swp4; do for vlan in 10 11; do vlan_create $intf $vlan + defer vlan_destroy $intf $vlan + ip link set dev $intf.$vlan master br1_$vlan + defer ip link set dev $intf.$vlan nomaster + ip link set dev $intf.$vlan up + defer ip link set dev $intf.$vlan up done done for intf in $swp2 $swp3 $swp5; do for vlan in 10 11; do vlan_create $intf $vlan + defer vlan_destroy $intf $vlan + ip link set dev $intf.$vlan master br2_$vlan + defer ip link set dev $intf.$vlan nomaster + ip link set dev $intf.$vlan up + defer ip link set dev $intf.$vlan up done done @@ -199,51 +211,27 @@ switch_create() done for intf in $swp3 $swp4; do - tc qdisc replace dev $intf root handle 1: tbf rate 1gbit \ + tc qdisc replace dev $intf root handle 1: tbf rate 200mbit \ burst 128K limit 1G + defer tc qdisc del dev $intf root handle 1: done ip link set dev br1_10 up + defer ip link set dev br1_10 down + ip link set dev br1_11 up + defer ip link set dev br1_11 down + ip link set dev br2_10 up + defer ip link set dev br2_10 down + ip link set dev br2_11 up + defer ip link set dev br2_11 down local size=$(devlink_pool_size_thtype 0 | cut -d' ' -f 1) devlink_port_pool_th_save $swp3 8 devlink_port_pool_th_set $swp3 8 $size -} - -switch_destroy() -{ - local intf - local vlan - - devlink_port_pool_th_restore $swp3 8 - - ip link set dev br2_11 down - ip link set dev br2_10 down - ip link set dev br1_11 down - ip link set dev br1_10 down - - for intf in $swp4 $swp3; do - tc qdisc del dev $intf root handle 1: - done - - for intf in $swp5 $swp3 $swp2 $swp4 $swp1; do - for vlan in 11 10; do - ip link set dev $intf.$vlan down - ip link set dev $intf.$vlan nomaster - vlan_destroy $intf $vlan - done - - mtu_restore $intf - ip link set dev $intf down - done - - ip link del dev br2_11 - ip link del dev br2_10 - ip link del dev br1_11 - ip link del dev br1_10 + defer devlink_port_pool_th_restore $swp3 8 } setup_prepare() @@ -263,6 +251,7 @@ setup_prepare() h3_mac=$(mac_get $h3) vrf_prepare + defer vrf_cleanup h1_create h2_create @@ -270,18 +259,6 @@ setup_prepare() switch_create } -cleanup() -{ - pre_cleanup - - switch_destroy - h3_destroy - h2_destroy - h1_destroy - - vrf_cleanup -} - ping_ipv4() { ping_test $h1.10 $(ipaddr 3 10) " from host 1, vlan 10" @@ -372,6 +349,7 @@ build_backlog() local i=0 while :; do + sleep 1 local cur=$(busywait 1100 until_counter_is "> $cur" \ get_qdisc_backlog $vlan) local diff=$((size - cur)) @@ -449,6 +427,7 @@ __do_ecn_test() start_tcp_traffic $h1.$vlan $(ipaddr 1 $vlan) $(ipaddr 3 $vlan) \ $h3_mac tos=0x01 + defer stop_traffic_sleep $! sleep 1 ecn_test_common "$name" "$get_nmarked" $vlan $limit @@ -460,9 +439,6 @@ __do_ecn_test() build_backlog $vlan $((2 * limit)) udp >/dev/null check_fail $? "UDP traffic went into backlog instead of being early-dropped" log_test "TC $((vlan - 10)): $name backlog > limit: UDP early-dropped" - - stop_traffic - sleep 1 } do_ecn_test() @@ -470,7 +446,8 @@ do_ecn_test() local vlan=$1; shift local limit=$1; shift - __do_ecn_test get_nmarked "$vlan" "$limit" + in_defer_scope \ + __do_ecn_test get_nmarked "$vlan" "$limit" } do_ecn_test_perband() @@ -479,10 +456,11 @@ do_ecn_test_perband() local limit=$1; shift mlxsw_only_on_spectrum 3+ || return - __do_ecn_test get_qdisc_nmarked "$vlan" "$limit" "per-band ECN" + in_defer_scope \ + __do_ecn_test get_qdisc_nmarked "$vlan" "$limit" "per-band ECN" } -do_ecn_nodrop_test() +__do_ecn_nodrop_test() { local vlan=$1; shift local limit=$1; shift @@ -490,6 +468,7 @@ do_ecn_nodrop_test() start_tcp_traffic $h1.$vlan $(ipaddr 1 $vlan) $(ipaddr 3 $vlan) \ $h3_mac tos=0x01 + defer stop_traffic_sleep $! sleep 1 ecn_test_common "$name" get_nmarked $vlan $limit @@ -501,12 +480,15 @@ do_ecn_nodrop_test() build_backlog $vlan $((2 * limit)) udp >/dev/null check_err $? "UDP traffic was early-dropped instead of getting into backlog" log_test "TC $((vlan - 10)): $name backlog > limit: UDP not dropped" +} - stop_traffic - sleep 1 +do_ecn_nodrop_test() +{ + in_defer_scope \ + __do_ecn_nodrop_test "$@" } -do_red_test() +__do_red_test() { local vlan=$1; shift local limit=$1; shift @@ -517,6 +499,7 @@ do_red_test() # is above limit. start_tcp_traffic $h1.$vlan $(ipaddr 1 $vlan) $(ipaddr 3 $vlan) \ $h3_mac tos=0x01 + defer stop_traffic_sleep $! # Pushing below the queue limit should work. RET=0 @@ -532,17 +515,21 @@ do_red_test() check_fail $? "Traffic went into backlog instead of being early-dropped" pct=$(check_marking get_nmarked $vlan "== 0") check_err $? "backlog $backlog / $limit Got $pct% marked packets, expected == 0." + backlog=$(get_qdisc_backlog $vlan) local diff=$((limit - backlog)) pct=$((100 * diff / limit)) - ((-10 <= pct && pct <= 10)) - check_err $? "backlog $backlog / $limit expected <= 10% distance" + ((-15 <= pct && pct <= 15)) + check_err $? "backlog $backlog / $limit expected <= 15% distance" log_test "TC $((vlan - 10)): RED backlog > limit" +} - stop_traffic - sleep 1 +do_red_test() +{ + in_defer_scope \ + __do_red_test "$@" } -do_mc_backlog_test() +__do_mc_backlog_test() { local vlan=$1; shift local limit=$1; shift @@ -552,7 +539,10 @@ do_mc_backlog_test() RET=0 start_tcp_traffic $h1.$vlan $(ipaddr 1 $vlan) $(ipaddr 3 $vlan) bc + defer stop_traffic_sleep $! + start_tcp_traffic $h2.$vlan $(ipaddr 2 $vlan) $(ipaddr 3 $vlan) bc + defer stop_traffic_sleep $! qbl=$(busywait 5000 until_counter_is ">= 500000" \ get_qdisc_backlog $vlan) @@ -565,13 +555,16 @@ do_mc_backlog_test() get_mc_transmit_queue $vlan) check_err $? "MC backlog reported by qdisc not visible in ethtool" - stop_traffic - stop_traffic - log_test "TC $((vlan - 10)): Qdisc reports MC backlog" } -do_mark_test() +do_mc_backlog_test() +{ + in_defer_scope \ + __do_mc_backlog_test "$@" +} + +__do_mark_test() { local vlan=$1; shift local limit=$1; shift @@ -586,6 +579,7 @@ do_mark_test() start_tcp_traffic $h1.$vlan $(ipaddr 1 $vlan) $(ipaddr 3 $vlan) \ $h3_mac tos=0x01 + defer stop_traffic_sleep $! # Create a bit of a backlog and observe no mirroring due to marks. qevent_rule_install_$subtest @@ -600,7 +594,7 @@ do_mark_test() # Above limit, everything should be mirrored, we should see lots of # packets. build_backlog $vlan $((3 * limit / 2)) tcp tos=0x01 >/dev/null - busywait_for_counter 1100 +10000 \ + busywait_for_counter 1100 +2500 \ $fetch_counter > /dev/null check_err_fail "$should_fail" $? "ECN-marked packets $subtest'd" @@ -615,12 +609,15 @@ do_mark_test() else log_test "TC $((vlan - 10)): marked packets $subtest'd" fi +} - stop_traffic - sleep 1 +do_mark_test() +{ + in_defer_scope \ + __do_mark_test "$@" } -do_drop_test() +__do_drop_test() { local vlan=$1; shift local limit=$1; shift @@ -635,6 +632,7 @@ do_drop_test() RET=0 start_traffic $h1.$vlan $(ipaddr 1 $vlan) $(ipaddr 3 $vlan) $h3_mac + defer stop_traffic_sleep $! # Create a bit of a backlog and observe no mirroring due to drops. qevent_rule_install_$subtest @@ -651,25 +649,30 @@ do_drop_test() build_backlog $vlan $((3 * limit / 2)) udp >/dev/null base=$($fetch_counter) - send_packets $vlan udp 11 + send_packets $vlan udp 100 - now=$(busywait 1100 until_counter_is ">= $((base + 10))" $fetch_counter) - check_err $? "Dropped packets not observed: 11 expected, $((now - base)) seen" + now=$(busywait 1100 until_counter_is ">= $((base + 95))" $fetch_counter) + check_err $? "${trigger}ped packets not observed: 100 expected, $((now - base)) seen" # When no extra traffic is injected, there should be no mirroring. - busywait 1100 until_counter_is ">= $((base + 20))" $fetch_counter >/dev/null + busywait 1100 until_counter_is ">= $((base + 110))" \ + $fetch_counter >/dev/null check_fail $? "Spurious packets observed" # When the rule is uninstalled, there should be no mirroring. qevent_rule_uninstall_$subtest - send_packets $vlan udp 11 - busywait 1100 until_counter_is ">= $((base + 20))" $fetch_counter >/dev/null - check_fail $? "Spurious packets observed after uninstall" + send_packets $vlan udp 100 + now=$(busywait 1100 until_counter_is ">= $((base + 110))" \ + $fetch_counter) + check_fail $? "$((now - base)) spurious packets observed after uninstall" log_test "TC $((vlan - 10)): ${trigger}ped packets $subtest'd" +} - stop_traffic - sleep 1 +do_drop_test() +{ + in_defer_scope \ + __do_drop_test "$@" } qevent_rule_install_mirror() diff --git a/tools/testing/selftests/drivers/net/mlxsw/sch_red_ets.sh b/tools/testing/selftests/drivers/net/mlxsw/sch_red_ets.sh index 8ecddafa79b3..8902a115d9cd 100755 --- a/tools/testing/selftests/drivers/net/mlxsw/sch_red_ets.sh +++ b/tools/testing/selftests/drivers/net/mlxsw/sch_red_ets.sh @@ -20,8 +20,8 @@ source sch_red_core.sh # $BACKLOG2 are far enough not to overlap, so that we can assume that if we do # see (do not see) marking, it is actually due to the configuration of that one # TC, and not due to configuration of the other TC leaking over. -BACKLOG1=200000 -BACKLOG2=500000 +BACKLOG1=400000 +BACKLOG2=1000000 install_root_qdisc() { @@ -35,7 +35,7 @@ install_qdisc_tc0() tc qdisc add dev $swp3 parent 10:8 handle 108: red \ limit 1000000 min $BACKLOG1 max $((BACKLOG1 + 1)) \ - probability 1.0 avpkt 8000 burst 38 "${args[@]}" + probability 1.0 avpkt 8000 burst 51 "${args[@]}" } install_qdisc_tc1() @@ -44,7 +44,7 @@ install_qdisc_tc1() tc qdisc add dev $swp3 parent 10:7 handle 107: red \ limit 1000000 min $BACKLOG2 max $((BACKLOG2 + 1)) \ - probability 1.0 avpkt 8000 burst 63 "${args[@]}" + probability 1.0 avpkt 8000 burst 126 "${args[@]}" } install_qdisc() @@ -80,36 +80,34 @@ uninstall_qdisc() ecn_test() { install_qdisc ecn + defer uninstall_qdisc do_ecn_test 10 $BACKLOG1 do_ecn_test 11 $BACKLOG2 - - uninstall_qdisc } ecn_test_perband() { install_qdisc ecn + defer uninstall_qdisc do_ecn_test_perband 10 $BACKLOG1 do_ecn_test_perband 11 $BACKLOG2 - - uninstall_qdisc } ecn_nodrop_test() { install_qdisc ecn nodrop + defer uninstall_qdisc do_ecn_nodrop_test 10 $BACKLOG1 do_ecn_nodrop_test 11 $BACKLOG2 - - uninstall_qdisc } red_test() { install_qdisc + defer uninstall_qdisc # Make sure that we get the non-zero value if there is any. local cur=$(busywait 1100 until_counter_is "> 0" \ @@ -120,50 +118,44 @@ red_test() do_red_test 10 $BACKLOG1 do_red_test 11 $BACKLOG2 - - uninstall_qdisc } mc_backlog_test() { install_qdisc + defer uninstall_qdisc # Note that the backlog numbers here do not correspond to RED # configuration, but are arbitrary. do_mc_backlog_test 10 $BACKLOG1 do_mc_backlog_test 11 $BACKLOG2 - - uninstall_qdisc } red_mirror_test() { install_qdisc qevent early_drop block 10 + defer uninstall_qdisc do_drop_mirror_test 10 $BACKLOG1 early_drop do_drop_mirror_test 11 $BACKLOG2 early_drop - - uninstall_qdisc } red_trap_test() { install_qdisc qevent early_drop block 10 + defer uninstall_qdisc do_drop_trap_test 10 $BACKLOG1 early_drop do_drop_trap_test 11 $BACKLOG2 early_drop - - uninstall_qdisc } ecn_mirror_test() { install_qdisc ecn qevent mark block 10 + defer uninstall_qdisc do_mark_mirror_test 10 $BACKLOG1 do_mark_mirror_test 11 $BACKLOG2 - - uninstall_qdisc } bail_on_lldpad "configure DCB" "configure Qdiscs" diff --git a/tools/testing/selftests/drivers/net/mlxsw/sch_red_root.sh b/tools/testing/selftests/drivers/net/mlxsw/sch_red_root.sh index 159108d02895..e9043771787b 100755 --- a/tools/testing/selftests/drivers/net/mlxsw/sch_red_root.sh +++ b/tools/testing/selftests/drivers/net/mlxsw/sch_red_root.sh @@ -32,45 +32,51 @@ uninstall_qdisc() ecn_test() { install_qdisc ecn + defer uninstall_qdisc + do_ecn_test 10 $BACKLOG - uninstall_qdisc } ecn_test_perband() { install_qdisc ecn + defer uninstall_qdisc + do_ecn_test_perband 10 $BACKLOG - uninstall_qdisc } ecn_nodrop_test() { install_qdisc ecn nodrop + defer uninstall_qdisc + do_ecn_nodrop_test 10 $BACKLOG - uninstall_qdisc } red_test() { install_qdisc + defer uninstall_qdisc + do_red_test 10 $BACKLOG - uninstall_qdisc } mc_backlog_test() { install_qdisc + defer uninstall_qdisc + # Note that the backlog value here does not correspond to RED # configuration, but is arbitrary. do_mc_backlog_test 10 $BACKLOG - uninstall_qdisc } red_mirror_test() { install_qdisc qevent early_drop block 10 + defer uninstall_qdisc + do_drop_mirror_test 10 $BACKLOG - uninstall_qdisc } bail_on_lldpad "configure DCB" "configure Qdiscs" diff --git a/tools/testing/selftests/drivers/net/mlxsw/tc_sample.sh b/tools/testing/selftests/drivers/net/mlxsw/tc_sample.sh index 83a0210e7544..bc7ea2df49fb 100755 --- a/tools/testing/selftests/drivers/net/mlxsw/tc_sample.sh +++ b/tools/testing/selftests/drivers/net/mlxsw/tc_sample.sh @@ -218,7 +218,7 @@ psample_capture_start() psample_capture_stop() { - { kill %% && wait %%; } 2>/dev/null + kill_process %% } __tc_sample_rate_test() @@ -499,7 +499,7 @@ tc_sample_md_out_tc_occ_test() backlog=$(tc -j -p -s qdisc show dev $rp2 | jq '.[0]["backlog"]') # Kill mausezahn. - { kill %% && wait %%; } 2>/dev/null + kill_process %% psample_capture_stop diff --git a/tools/testing/selftests/drivers/net/netcons_basic.sh b/tools/testing/selftests/drivers/net/netcons_basic.sh index 06021b2059b7..b175f4d966e5 100755 --- a/tools/testing/selftests/drivers/net/netcons_basic.sh +++ b/tools/testing/selftests/drivers/net/netcons_basic.sh @@ -20,22 +20,26 @@ SCRIPTDIR=$(dirname "$(readlink -e "${BASH_SOURCE[0]}")") # Simple script to test dynamic targets in netconsole SRCIF="" # to be populated later -SRCIP=192.168.1.1 +SRCIP=192.0.2.1 DSTIF="" # to be populated later -DSTIP=192.168.1.2 +DSTIP=192.0.2.2 PORT="6666" MSG="netconsole selftest" +USERDATA_KEY="key" +USERDATA_VALUE="value" TARGET=$(mktemp -u netcons_XXXXX) DEFAULT_PRINTK_VALUES=$(cat /proc/sys/kernel/printk) NETCONS_CONFIGFS="/sys/kernel/config/netconsole" NETCONS_PATH="${NETCONS_CONFIGFS}"/"${TARGET}" +KEY_PATH="${NETCONS_PATH}/userdata/${USERDATA_KEY}" # NAMESPACE will be populated by setup_ns with a random value NAMESPACE="" # IDs for netdevsim NSIM_DEV_1_ID=$((256 + RANDOM % 256)) NSIM_DEV_2_ID=$((512 + RANDOM % 256)) +NSIM_DEV_SYS_NEW="/sys/bus/netdevsim/new_device" # Used to create and delete namespaces source "${SCRIPTDIR}"/../../net/lib.sh @@ -43,7 +47,6 @@ source "${SCRIPTDIR}"/../../net/net_helper.sh # Create netdevsim interfaces create_ifaces() { - local NSIM_DEV_SYS_NEW=/sys/bus/netdevsim/new_device echo "$NSIM_DEV_2_ID" > "$NSIM_DEV_SYS_NEW" echo "$NSIM_DEV_1_ID" > "$NSIM_DEV_SYS_NEW" @@ -122,6 +125,8 @@ function cleanup() { # delete netconsole dynamic reconfiguration echo 0 > "${NETCONS_PATH}"/enabled + # Remove key + rmdir "${KEY_PATH}" # Remove the configfs entry rmdir "${NETCONS_PATH}" @@ -136,6 +141,18 @@ function cleanup() { echo "${DEFAULT_PRINTK_VALUES}" > /proc/sys/kernel/printk } +function set_user_data() { + if [[ ! -d "${NETCONS_PATH}""/userdata" ]] + then + echo "Userdata path not available in ${NETCONS_PATH}/userdata" + exit "${ksft_skip}" + fi + + mkdir -p "${KEY_PATH}" + VALUE_PATH="${KEY_PATH}""/value" + echo "${USERDATA_VALUE}" > "${VALUE_PATH}" +} + function listen_port_and_save_to() { local OUTPUT=${1} # Just wait for 2 seconds @@ -146,6 +163,10 @@ function listen_port_and_save_to() { function validate_result() { local TMPFILENAME="$1" + # TMPFILENAME will contain something like: + # 6.11.1-0_fbk0_rc13_509_g30d75cea12f7,13,1822,115075213798,-;netconsole selftest: netcons_gtJHM + # key=value + # Check if the file exists if [ ! -f "$TMPFILENAME" ]; then echo "FAIL: File was not generated." >&2 @@ -158,6 +179,12 @@ function validate_result() { exit "${ksft_fail}" fi + if ! grep -q "${USERDATA_KEY}=${USERDATA_VALUE}" "${TMPFILENAME}"; then + echo "FAIL: ${USERDATA_KEY}=${USERDATA_VALUE} not found in ${TMPFILENAME}" >&2 + cat "${TMPFILENAME}" >&2 + exit "${ksft_fail}" + fi + # Delete the file once it is validated, otherwise keep it # for debugging purposes rm "${TMPFILENAME}" @@ -185,6 +212,11 @@ function check_for_dependencies() { exit "${ksft_skip}" fi + if [ ! -f "${NSIM_DEV_SYS_NEW}" ]; then + echo "SKIP: file ${NSIM_DEV_SYS_NEW} does not exist. Check if CONFIG_NETDEVSIM is enabled" >&2 + exit "${ksft_skip}" + fi + if [ ! -d "${NETCONS_CONFIGFS}" ]; then echo "SKIP: directory ${NETCONS_CONFIGFS} does not exist. Check if NETCONSOLE_DYNAMIC is enabled" >&2 exit "${ksft_skip}" @@ -220,6 +252,8 @@ trap cleanup EXIT set_network # Create a dynamic target for netconsole create_dynamic_target +# Set userdata "key" with the "value" value +set_user_data # Listed for netconsole port inside the namespace and destination interface listen_port_and_save_to "${OUTPUT_FILE}" & # Wait for socat to start and listen to the port. diff --git a/tools/testing/selftests/drivers/net/netdevsim/Makefile b/tools/testing/selftests/drivers/net/netdevsim/Makefile index 5bace0b7fb57..07b7c46d3311 100644 --- a/tools/testing/selftests/drivers/net/netdevsim/Makefile +++ b/tools/testing/selftests/drivers/net/netdevsim/Makefile @@ -4,11 +4,14 @@ TEST_PROGS = devlink.sh \ devlink_in_netns.sh \ devlink_trap.sh \ ethtool-coalesce.sh \ + ethtool-features.sh \ ethtool-fec.sh \ ethtool-pause.sh \ ethtool-ring.sh \ fib.sh \ + fib_notifications.sh \ hw_stats_l3.sh \ + macsec-offload.sh \ nexthop.sh \ peer.sh \ psample.sh \ diff --git a/tools/testing/selftests/drivers/net/netdevsim/config b/tools/testing/selftests/drivers/net/netdevsim/config index adf45a3a78b4..5117c78ddf0a 100644 --- a/tools/testing/selftests/drivers/net/netdevsim/config +++ b/tools/testing/selftests/drivers/net/netdevsim/config @@ -1,6 +1,7 @@ CONFIG_DUMMY=y CONFIG_GENEVE=m CONFIG_IPV6=y +CONFIG_MACSEC=m CONFIG_NETDEVSIM=m CONFIG_NET_SCH_MQPRIO=y CONFIG_NET_SCH_MULTIQ=y diff --git a/tools/testing/selftests/drivers/net/netdevsim/ethtool-features.sh b/tools/testing/selftests/drivers/net/netdevsim/ethtool-features.sh new file mode 100644 index 000000000000..bc210dc6ad2d --- /dev/null +++ b/tools/testing/selftests/drivers/net/netdevsim/ethtool-features.sh @@ -0,0 +1,31 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0-only + +source ethtool-common.sh + +NSIM_NETDEV=$(make_netdev) + +set -o pipefail + +FEATS=" + tx-checksum-ip-generic + tx-scatter-gather + tx-tcp-segmentation + generic-segmentation-offload + generic-receive-offload" + +for feat in $FEATS ; do + s=$(ethtool --json -k $NSIM_NETDEV | jq ".[].\"$feat\".active" 2>/dev/null) + check $? "$s" true + + s=$(ethtool --json -k $NSIM_NETDEV | jq ".[].\"$feat\".fixed" 2>/dev/null) + check $? "$s" false +done + +if [ $num_errors -eq 0 ]; then + echo "PASSED all $((num_passes)) checks" + exit 0 +else + echo "FAILED $num_errors/$((num_errors+num_passes)) checks" + exit 1 +fi diff --git a/tools/testing/selftests/drivers/net/netdevsim/fib_notifications.sh b/tools/testing/selftests/drivers/net/netdevsim/fib_notifications.sh index 8d91191a098c..9896580c3d85 100755 --- a/tools/testing/selftests/drivers/net/netdevsim/fib_notifications.sh +++ b/tools/testing/selftests/drivers/net/netdevsim/fib_notifications.sh @@ -94,7 +94,7 @@ route_addition_check() sleep 1 $IP route add $route dev dummy1 sleep 1 - kill %% && wait %% &> /dev/null + kill_process %% route_notify_check $outfile $expected_num_notifications $offload_failed rm -f $outfile @@ -148,7 +148,7 @@ route_deletion_check() sleep 1 $IP route del $route dev dummy1 sleep 1 - kill %% && wait %% &> /dev/null + kill_process %% route_notify_check $outfile $expected_num_notifications rm -f $outfile @@ -191,7 +191,7 @@ route_replacement_check() sleep 1 $IP route replace $route dev dummy2 sleep 1 - kill %% && wait %% &> /dev/null + kill_process %% route_notify_check $outfile $expected_num_notifications rm -f $outfile diff --git a/tools/testing/selftests/drivers/net/netdevsim/macsec-offload.sh b/tools/testing/selftests/drivers/net/netdevsim/macsec-offload.sh new file mode 100755 index 000000000000..98033e6667d2 --- /dev/null +++ b/tools/testing/selftests/drivers/net/netdevsim/macsec-offload.sh @@ -0,0 +1,117 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0-only + +source ethtool-common.sh + +NSIM_NETDEV=$(make_netdev) +MACSEC_NETDEV=macsec_nsim + +set -o pipefail + +if ! ethtool -k $NSIM_NETDEV | grep -q 'macsec-hw-offload: on'; then + echo "SKIP: netdevsim doesn't support MACsec offload" + exit 4 +fi + +if ! ip link add link $NSIM_NETDEV $MACSEC_NETDEV type macsec offload mac 2>/dev/null; then + echo "SKIP: couldn't create macsec device" + exit 4 +fi +ip link del $MACSEC_NETDEV + +# +# test macsec offload API +# + +ip link add link $NSIM_NETDEV "${MACSEC_NETDEV}" type macsec port 4 offload mac +check $? + +ip link add link $NSIM_NETDEV "${MACSEC_NETDEV}2" type macsec address "aa:bb:cc:dd:ee:ff" port 5 offload mac +check $? + +ip link add link $NSIM_NETDEV "${MACSEC_NETDEV}3" type macsec sci abbacdde01020304 offload mac +check $? + +ip link add link $NSIM_NETDEV "${MACSEC_NETDEV}4" type macsec port 8 offload mac 2> /dev/null +check $? '' '' 1 + +ip macsec add "${MACSEC_NETDEV}" tx sa 0 pn 1024 on key 01 12345678901234567890123456789012 +check $? + +ip macsec add "${MACSEC_NETDEV}" rx port 1234 address "1c:ed:de:ad:be:ef" +check $? + +ip macsec add "${MACSEC_NETDEV}" rx port 1234 address "1c:ed:de:ad:be:ef" sa 0 pn 1 on \ + key 00 0123456789abcdef0123456789abcdef +check $? + +ip macsec add "${MACSEC_NETDEV}" rx port 1235 address "1c:ed:de:ad:be:ef" 2> /dev/null +check $? '' '' 1 + +# can't disable macsec offload when SAs are configured +ip link set "${MACSEC_NETDEV}" type macsec offload off 2> /dev/null +check $? '' '' 1 + +ip macsec offload "${MACSEC_NETDEV}" off 2> /dev/null +check $? '' '' 1 + +# toggle macsec offload via rtnetlink +ip link set "${MACSEC_NETDEV}2" type macsec offload off +check $? + +ip link set "${MACSEC_NETDEV}2" type macsec offload mac +check $? + +# toggle macsec offload via genetlink +ip macsec offload "${MACSEC_NETDEV}2" off +check $? + +ip macsec offload "${MACSEC_NETDEV}2" mac +check $? + +for dev in ${MACSEC_NETDEV}{,2,3} ; do + ip link del $dev + check $? +done + + +# +# test ethtool features when toggling offload +# + +ip link add link $NSIM_NETDEV $MACSEC_NETDEV type macsec offload mac +TMP_FEATS_ON_1="$(ethtool -k $MACSEC_NETDEV)" + +ip link set $MACSEC_NETDEV type macsec offload off +TMP_FEATS_OFF_1="$(ethtool -k $MACSEC_NETDEV)" + +ip link set $MACSEC_NETDEV type macsec offload mac +TMP_FEATS_ON_2="$(ethtool -k $MACSEC_NETDEV)" + +[ "$TMP_FEATS_ON_1" = "$TMP_FEATS_ON_2" ] +check $? + +ip link del $MACSEC_NETDEV + +ip link add link $NSIM_NETDEV $MACSEC_NETDEV type macsec +check $? + +TMP_FEATS_OFF_2="$(ethtool -k $MACSEC_NETDEV)" +[ "$TMP_FEATS_OFF_1" = "$TMP_FEATS_OFF_2" ] +check $? + +ip link set $MACSEC_NETDEV type macsec offload mac +check $? + +TMP_FEATS_ON_3="$(ethtool -k $MACSEC_NETDEV)" +[ "$TMP_FEATS_ON_1" = "$TMP_FEATS_ON_3" ] +check $? + + +if [ $num_errors -eq 0 ]; then + echo "PASSED all $((num_passes)) checks" + exit 0 +else + echo "FAILED $num_errors/$((num_errors+num_passes)) checks" + exit 1 +fi diff --git a/tools/testing/selftests/drivers/net/shaper.py b/tools/testing/selftests/drivers/net/shaper.py new file mode 100755 index 000000000000..11310f19bfa0 --- /dev/null +++ b/tools/testing/selftests/drivers/net/shaper.py @@ -0,0 +1,461 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: GPL-2.0 + +from lib.py import ksft_run, ksft_exit, ksft_eq, ksft_true, KsftSkipEx +from lib.py import EthtoolFamily, NetshaperFamily +from lib.py import NetDrvEnv +from lib.py import NlError +from lib.py import cmd + +def get_shapers(cfg, nl_shaper) -> None: + try: + shapers = nl_shaper.get({'ifindex': cfg.ifindex}, dump=True) + except NlError as e: + if e.error == 95: + raise KsftSkipEx("shapers not supported by the device") + raise + + # Default configuration: no shapers configured. + ksft_eq(len(shapers), 0) + +def get_caps(cfg, nl_shaper) -> None: + try: + caps = nl_shaper.cap_get({'ifindex': cfg.ifindex}, dump=True) + except NlError as e: + if e.error == 95: + raise KsftSkipEx("shapers not supported by the device") + raise + + # Each device implementing shaper support must support some + # features in at least a scope. + ksft_true(len(caps)> 0) + +def set_qshapers(cfg, nl_shaper) -> None: + try: + caps = nl_shaper.cap_get({'ifindex': cfg.ifindex, + 'scope':'queue'}) + except NlError as e: + if e.error == 95: + raise KsftSkipEx("shapers not supported by the device") + raise + if not 'support-bw-max' in caps or not 'support-metric-bps' in caps: + raise KsftSkipEx("device does not support queue scope shapers with bw_max and metric bps") + + cfg.queues = True; + netnl = EthtoolFamily() + channels = netnl.channels_get({'header': {'dev-index': cfg.ifindex}}) + if channels['combined-count'] == 0: + cfg.rx_type = 'rx' + cfg.nr_queues = channels['rx-count'] + else: + cfg.rx_type = 'combined' + cfg.nr_queues = channels['combined-count'] + if cfg.nr_queues < 3: + raise KsftSkipEx(f"device does not support enough queues min 3 found {cfg.nr_queues}") + + nl_shaper.set({'ifindex': cfg.ifindex, + 'handle': {'scope': 'queue', 'id': 1}, + 'metric': 'bps', + 'bw-max': 10000}) + nl_shaper.set({'ifindex': cfg.ifindex, + 'handle': {'scope': 'queue', 'id': 2}, + 'metric': 'bps', + 'bw-max': 20000}) + + # Querying a specific shaper not yet configured must fail. + raised = False + try: + shaper_q0 = nl_shaper.get({'ifindex': cfg.ifindex, + 'handle': {'scope': 'queue', 'id': 0}}) + except (NlError): + raised = True + ksft_eq(raised, True) + + shaper_q1 = nl_shaper.get({'ifindex': cfg.ifindex, + 'handle': {'scope': 'queue', 'id': 1}}) + ksft_eq(shaper_q1, {'ifindex': cfg.ifindex, + 'parent': {'scope': 'netdev'}, + 'handle': {'scope': 'queue', 'id': 1}, + 'metric': 'bps', + 'bw-max': 10000}) + + shapers = nl_shaper.get({'ifindex': cfg.ifindex}, dump=True) + ksft_eq(shapers, [{'ifindex': cfg.ifindex, + 'parent': {'scope': 'netdev'}, + 'handle': {'scope': 'queue', 'id': 1}, + 'metric': 'bps', + 'bw-max': 10000}, + {'ifindex': cfg.ifindex, + 'parent': {'scope': 'netdev'}, + 'handle': {'scope': 'queue', 'id': 2}, + 'metric': 'bps', + 'bw-max': 20000}]) + +def del_qshapers(cfg, nl_shaper) -> None: + if not cfg.queues: + raise KsftSkipEx("queue shapers not supported by device, skipping delete") + + nl_shaper.delete({'ifindex': cfg.ifindex, + 'handle': {'scope': 'queue', 'id': 2}}) + nl_shaper.delete({'ifindex': cfg.ifindex, + 'handle': {'scope': 'queue', 'id': 1}}) + shapers = nl_shaper.get({'ifindex': cfg.ifindex}, dump=True) + ksft_eq(len(shapers), 0) + +def set_nshapers(cfg, nl_shaper) -> None: + # Check required features. + try: + caps = nl_shaper.cap_get({'ifindex': cfg.ifindex, + 'scope':'netdev'}) + except NlError as e: + if e.error == 95: + raise KsftSkipEx("shapers not supported by the device") + raise + if not 'support-bw-max' in caps or not 'support-metric-bps' in caps: + raise KsftSkipEx("device does not support nested netdev scope shapers with weight") + + cfg.netdev = True; + nl_shaper.set({'ifindex': cfg.ifindex, + 'handle': {'scope': 'netdev', 'id': 0}, + 'bw-max': 100000}) + + shapers = nl_shaper.get({'ifindex': cfg.ifindex}, dump=True) + ksft_eq(shapers, [{'ifindex': cfg.ifindex, + 'handle': {'scope': 'netdev'}, + 'metric': 'bps', + 'bw-max': 100000}]) + +def del_nshapers(cfg, nl_shaper) -> None: + if not cfg.netdev: + raise KsftSkipEx("netdev shaper not supported by device, skipping delete") + + nl_shaper.delete({'ifindex': cfg.ifindex, + 'handle': {'scope': 'netdev'}}) + shapers = nl_shaper.get({'ifindex': cfg.ifindex}, dump=True) + ksft_eq(len(shapers), 0) + +def basic_groups(cfg, nl_shaper) -> None: + if not cfg.netdev: + raise KsftSkipEx("netdev shaper not supported by the device") + if cfg.nr_queues < 3: + raise KsftSkipEx(f"netdev does not have enough queues min 3 reported {cfg.nr_queues}") + + try: + caps = nl_shaper.cap_get({'ifindex': cfg.ifindex, + 'scope':'queue'}) + except NlError as e: + if e.error == 95: + raise KsftSkipEx("shapers not supported by the device") + raise + if not 'support-weight' in caps: + raise KsftSkipEx("device does not support queue scope shapers with weight") + + node_handle = nl_shaper.group({ + 'ifindex': cfg.ifindex, + 'leaves':[{'handle': {'scope': 'queue', 'id': 1}, + 'weight': 1}, + {'handle': {'scope': 'queue', 'id': 2}, + 'weight': 2}], + 'handle': {'scope':'netdev'}, + 'metric': 'bps', + 'bw-max': 10000}) + ksft_eq(node_handle, {'ifindex': cfg.ifindex, + 'handle': {'scope': 'netdev'}}) + + shaper = nl_shaper.get({'ifindex': cfg.ifindex, + 'handle': {'scope': 'queue', 'id': 1}}) + ksft_eq(shaper, {'ifindex': cfg.ifindex, + 'parent': {'scope': 'netdev'}, + 'handle': {'scope': 'queue', 'id': 1}, + 'weight': 1 }) + + nl_shaper.delete({'ifindex': cfg.ifindex, + 'handle': {'scope': 'queue', 'id': 2}}) + nl_shaper.delete({'ifindex': cfg.ifindex, + 'handle': {'scope': 'queue', 'id': 1}}) + + # Deleting all the leaves shaper does not affect the node one + # when the latter has 'netdev' scope. + shapers = nl_shaper.get({'ifindex': cfg.ifindex}, dump=True) + ksft_eq(len(shapers), 1) + + nl_shaper.delete({'ifindex': cfg.ifindex, + 'handle': {'scope': 'netdev'}}) + +def qgroups(cfg, nl_shaper) -> None: + if cfg.nr_queues < 4: + raise KsftSkipEx(f"netdev does not have enough queues min 4 reported {cfg.nr_queues}") + try: + caps = nl_shaper.cap_get({'ifindex': cfg.ifindex, + 'scope':'node'}) + except NlError as e: + if e.error == 95: + raise KsftSkipEx("shapers not supported by the device") + raise + if not 'support-bw-max' in caps or not 'support-metric-bps' in caps: + raise KsftSkipEx("device does not support node scope shapers with bw_max and metric bps") + try: + caps = nl_shaper.cap_get({'ifindex': cfg.ifindex, + 'scope':'queue'}) + except NlError as e: + if e.error == 95: + raise KsftSkipEx("shapers not supported by the device") + raise + if not 'support-nesting' in caps or not 'support-weight' in caps or not 'support-metric-bps' in caps: + raise KsftSkipEx("device does not support nested queue scope shapers with weight") + + cfg.groups = True; + node_handle = nl_shaper.group({ + 'ifindex': cfg.ifindex, + 'leaves':[{'handle': {'scope': 'queue', 'id': 1}, + 'weight': 3}, + {'handle': {'scope': 'queue', 'id': 2}, + 'weight': 2}], + 'handle': {'scope':'node'}, + 'metric': 'bps', + 'bw-max': 10000}) + node_id = node_handle['handle']['id'] + + shaper = nl_shaper.get({'ifindex': cfg.ifindex, + 'handle': {'scope': 'queue', 'id': 1}}) + ksft_eq(shaper, {'ifindex': cfg.ifindex, + 'parent': {'scope': 'node', 'id': node_id}, + 'handle': {'scope': 'queue', 'id': 1}, + 'weight': 3}) + shaper = nl_shaper.get({'ifindex': cfg.ifindex, + 'handle': {'scope': 'node', 'id': node_id}}) + ksft_eq(shaper, {'ifindex': cfg.ifindex, + 'handle': {'scope': 'node', 'id': node_id}, + 'parent': {'scope': 'netdev'}, + 'metric': 'bps', + 'bw-max': 10000}) + + # Grouping to a specified, not existing node scope shaper must fail + raised = False + try: + nl_shaper.group({ + 'ifindex': cfg.ifindex, + 'leaves':[{'handle': {'scope': 'queue', 'id': 3}, + 'weight': 3}], + 'handle': {'scope':'node', 'id': node_id + 1}, + 'metric': 'bps', + 'bw-max': 10000}) + + except (NlError): + raised = True + ksft_eq(raised, True) + + # Add to an existing node + node_handle = nl_shaper.group({ + 'ifindex': cfg.ifindex, + 'leaves':[{'handle': {'scope': 'queue', 'id': 3}, + 'weight': 4}], + 'handle': {'scope':'node', 'id': node_id}}) + ksft_eq(node_handle, {'ifindex': cfg.ifindex, + 'handle': {'scope': 'node', 'id': node_id}}) + + shaper = nl_shaper.get({'ifindex': cfg.ifindex, + 'handle': {'scope': 'queue', 'id': 3}}) + ksft_eq(shaper, {'ifindex': cfg.ifindex, + 'parent': {'scope': 'node', 'id': node_id}, + 'handle': {'scope': 'queue', 'id': 3}, + 'weight': 4}) + + nl_shaper.delete({'ifindex': cfg.ifindex, + 'handle': {'scope': 'queue', 'id': 2}}) + nl_shaper.delete({'ifindex': cfg.ifindex, + 'handle': {'scope': 'queue', 'id': 1}}) + + # Deleting a non empty node will move the leaves downstream. + nl_shaper.delete({'ifindex': cfg.ifindex, + 'handle': {'scope': 'node', 'id': node_id}}) + shapers = nl_shaper.get({'ifindex': cfg.ifindex}, dump=True) + ksft_eq(shapers, [{'ifindex': cfg.ifindex, + 'parent': {'scope': 'netdev'}, + 'handle': {'scope': 'queue', 'id': 3}, + 'weight': 4}]) + + # Finish and verify the complete cleanup. + nl_shaper.delete({'ifindex': cfg.ifindex, + 'handle': {'scope': 'queue', 'id': 3}}) + shapers = nl_shaper.get({'ifindex': cfg.ifindex}, dump=True) + ksft_eq(len(shapers), 0) + +def delegation(cfg, nl_shaper) -> None: + if not cfg.groups: + raise KsftSkipEx("device does not support node scope") + try: + caps = nl_shaper.cap_get({'ifindex': cfg.ifindex, + 'scope':'node'}) + except NlError as e: + if e.error == 95: + raise KsftSkipEx("node scope shapers not supported by the device") + raise + if not 'support-nesting' in caps: + raise KsftSkipEx("device does not support node scope shapers nesting") + + node_handle = nl_shaper.group({ + 'ifindex': cfg.ifindex, + 'leaves':[{'handle': {'scope': 'queue', 'id': 1}, + 'weight': 3}, + {'handle': {'scope': 'queue', 'id': 2}, + 'weight': 2}, + {'handle': {'scope': 'queue', 'id': 3}, + 'weight': 1}], + 'handle': {'scope':'node'}, + 'metric': 'bps', + 'bw-max': 10000}) + node_id = node_handle['handle']['id'] + + # Create the nested node and validate the hierarchy + nested_node_handle = nl_shaper.group({ + 'ifindex': cfg.ifindex, + 'leaves':[{'handle': {'scope': 'queue', 'id': 1}, + 'weight': 3}, + {'handle': {'scope': 'queue', 'id': 2}, + 'weight': 2}], + 'handle': {'scope':'node'}, + 'metric': 'bps', + 'bw-max': 5000}) + nested_node_id = nested_node_handle['handle']['id'] + ksft_true(nested_node_id != node_id) + shapers = nl_shaper.get({'ifindex': cfg.ifindex}, dump=True) + ksft_eq(shapers, [{'ifindex': cfg.ifindex, + 'parent': {'scope': 'node', 'id': nested_node_id}, + 'handle': {'scope': 'queue', 'id': 1}, + 'weight': 3}, + {'ifindex': cfg.ifindex, + 'parent': {'scope': 'node', 'id': nested_node_id}, + 'handle': {'scope': 'queue', 'id': 2}, + 'weight': 2}, + {'ifindex': cfg.ifindex, + 'parent': {'scope': 'node', 'id': node_id}, + 'handle': {'scope': 'queue', 'id': 3}, + 'weight': 1}, + {'ifindex': cfg.ifindex, + 'parent': {'scope': 'netdev'}, + 'handle': {'scope': 'node', 'id': node_id}, + 'metric': 'bps', + 'bw-max': 10000}, + {'ifindex': cfg.ifindex, + 'parent': {'scope': 'node', 'id': node_id}, + 'handle': {'scope': 'node', 'id': nested_node_id}, + 'metric': 'bps', + 'bw-max': 5000}]) + + # Deleting a non empty node will move the leaves downstream. + nl_shaper.delete({'ifindex': cfg.ifindex, + 'handle': {'scope': 'node', 'id': nested_node_id}}) + shapers = nl_shaper.get({'ifindex': cfg.ifindex}, dump=True) + ksft_eq(shapers, [{'ifindex': cfg.ifindex, + 'parent': {'scope': 'node', 'id': node_id}, + 'handle': {'scope': 'queue', 'id': 1}, + 'weight': 3}, + {'ifindex': cfg.ifindex, + 'parent': {'scope': 'node', 'id': node_id}, + 'handle': {'scope': 'queue', 'id': 2}, + 'weight': 2}, + {'ifindex': cfg.ifindex, + 'parent': {'scope': 'node', 'id': node_id}, + 'handle': {'scope': 'queue', 'id': 3}, + 'weight': 1}, + {'ifindex': cfg.ifindex, + 'parent': {'scope': 'netdev'}, + 'handle': {'scope': 'node', 'id': node_id}, + 'metric': 'bps', + 'bw-max': 10000}]) + + # Final cleanup. + for i in range(1, 4): + nl_shaper.delete({'ifindex': cfg.ifindex, + 'handle': {'scope': 'queue', 'id': i}}) + shapers = nl_shaper.get({'ifindex': cfg.ifindex}, dump=True) + ksft_eq(len(shapers), 0) + +def queue_update(cfg, nl_shaper) -> None: + if cfg.nr_queues < 4: + raise KsftSkipEx(f"netdev does not have enough queues min 4 reported {cfg.nr_queues}") + if not cfg.queues: + raise KsftSkipEx("device does not support queue scope") + + for i in range(3): + nl_shaper.set({'ifindex': cfg.ifindex, + 'handle': {'scope': 'queue', 'id': i}, + 'metric': 'bps', + 'bw-max': (i + 1) * 1000}) + # Delete a channel, with no shapers configured on top of the related + # queue: no changes expected + cmd(f"ethtool -L {cfg.dev['ifname']} {cfg.rx_type} 3", timeout=10) + shapers = nl_shaper.get({'ifindex': cfg.ifindex}, dump=True) + ksft_eq(shapers, [{'ifindex': cfg.ifindex, + 'parent': {'scope': 'netdev'}, + 'handle': {'scope': 'queue', 'id': 0}, + 'metric': 'bps', + 'bw-max': 1000}, + {'ifindex': cfg.ifindex, + 'parent': {'scope': 'netdev'}, + 'handle': {'scope': 'queue', 'id': 1}, + 'metric': 'bps', + 'bw-max': 2000}, + {'ifindex': cfg.ifindex, + 'parent': {'scope': 'netdev'}, + 'handle': {'scope': 'queue', 'id': 2}, + 'metric': 'bps', + 'bw-max': 3000}]) + + # Delete a channel, with a shaper configured on top of the related + # queue: the shaper must be deleted, too + cmd(f"ethtool -L {cfg.dev['ifname']} {cfg.rx_type} 2", timeout=10) + + shapers = nl_shaper.get({'ifindex': cfg.ifindex}, dump=True) + ksft_eq(shapers, [{'ifindex': cfg.ifindex, + 'parent': {'scope': 'netdev'}, + 'handle': {'scope': 'queue', 'id': 0}, + 'metric': 'bps', + 'bw-max': 1000}, + {'ifindex': cfg.ifindex, + 'parent': {'scope': 'netdev'}, + 'handle': {'scope': 'queue', 'id': 1}, + 'metric': 'bps', + 'bw-max': 2000}]) + + # Restore the original channels number, no expected changes + cmd(f"ethtool -L {cfg.dev['ifname']} {cfg.rx_type} {cfg.nr_queues}", timeout=10) + shapers = nl_shaper.get({'ifindex': cfg.ifindex}, dump=True) + ksft_eq(shapers, [{'ifindex': cfg.ifindex, + 'parent': {'scope': 'netdev'}, + 'handle': {'scope': 'queue', 'id': 0}, + 'metric': 'bps', + 'bw-max': 1000}, + {'ifindex': cfg.ifindex, + 'parent': {'scope': 'netdev'}, + 'handle': {'scope': 'queue', 'id': 1}, + 'metric': 'bps', + 'bw-max': 2000}]) + + # Final cleanup. + for i in range(0, 2): + nl_shaper.delete({'ifindex': cfg.ifindex, + 'handle': {'scope': 'queue', 'id': i}}) + +def main() -> None: + with NetDrvEnv(__file__, queue_count=4) as cfg: + cfg.queues = False + cfg.netdev = False + cfg.groups = False + cfg.nr_queues = 0 + ksft_run([get_shapers, + get_caps, + set_qshapers, + del_qshapers, + set_nshapers, + del_nshapers, + basic_groups, + qgroups, + delegation, + queue_update], args=(cfg, NetshaperFamily())) + ksft_exit() + + +if __name__ == "__main__": + main() diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/mm/Makefile index 0f8c110e0805..321e63955272 100644 --- a/tools/testing/selftests/mm/Makefile +++ b/tools/testing/selftests/mm/Makefile @@ -36,6 +36,17 @@ MAKEFLAGS += --no-builtin-rules CFLAGS = -Wall -I $(top_srcdir) $(EXTRA_CFLAGS) $(KHDR_INCLUDES) $(TOOLS_INCLUDES) LDLIBS = -lrt -lpthread -lm +KDIR ?= /lib/modules/$(shell uname -r)/build +ifneq (,$(wildcard $(KDIR)/Module.symvers)) +ifneq (,$(wildcard $(KDIR)/include/linux/page_frag_cache.h)) +TEST_GEN_MODS_DIR := page_frag +else +PAGE_FRAG_WARNING = "missing page_frag_cache.h, please use a newer kernel" +endif +else +PAGE_FRAG_WARNING = "missing Module.symvers, please have the kernel built first" +endif + TEST_GEN_FILES = cow TEST_GEN_FILES += compaction_test TEST_GEN_FILES += gup_longterm @@ -126,6 +137,7 @@ TEST_FILES += test_hmm.sh TEST_FILES += va_high_addr_switch.sh TEST_FILES += charge_reserved_hugetlb.sh TEST_FILES += hugetlb_reparenting_test.sh +TEST_FILES += test_page_frag.sh # required by charge_reserved_hugetlb.sh TEST_FILES += write_hugetlb_memory.sh @@ -211,3 +223,12 @@ warn_missing_liburing: echo "Warning: missing liburing support. Some tests will be skipped." ; \ echo endif + +ifneq ($(PAGE_FRAG_WARNING),) +all: warn_missing_page_frag + +warn_missing_page_frag: + @echo ; \ + echo "Warning: $(PAGE_FRAG_WARNING). page_frag test will be skipped." ; \ + echo +endif diff --git a/tools/testing/selftests/mm/page_frag/Makefile b/tools/testing/selftests/mm/page_frag/Makefile new file mode 100644 index 000000000000..8c8bb39ffa28 --- /dev/null +++ b/tools/testing/selftests/mm/page_frag/Makefile @@ -0,0 +1,18 @@ +PAGE_FRAG_TEST_DIR := $(realpath $(dir $(abspath $(lastword $(MAKEFILE_LIST))))) +KDIR ?= /lib/modules/$(shell uname -r)/build + +ifeq ($(V),1) +Q = +else +Q = @ +endif + +MODULES = page_frag_test.ko + +obj-m += page_frag_test.o + +all: + +$(Q)make -C $(KDIR) M=$(PAGE_FRAG_TEST_DIR) modules + +clean: + +$(Q)make -C $(KDIR) M=$(PAGE_FRAG_TEST_DIR) clean diff --git a/tools/testing/selftests/mm/page_frag/page_frag_test.c b/tools/testing/selftests/mm/page_frag/page_frag_test.c new file mode 100644 index 000000000000..e806c1866e36 --- /dev/null +++ b/tools/testing/selftests/mm/page_frag/page_frag_test.c @@ -0,0 +1,198 @@ +// SPDX-License-Identifier: GPL-2.0 + +/* + * Test module for page_frag cache + * + * Copyright (C) 2024 Yunsheng Lin <linyunsheng@huawei.com> + */ + +#include <linux/module.h> +#include <linux/cpumask.h> +#include <linux/completion.h> +#include <linux/ptr_ring.h> +#include <linux/kthread.h> +#include <linux/page_frag_cache.h> + +#define TEST_FAILED_PREFIX "page_frag_test failed: " + +static struct ptr_ring ptr_ring; +static int nr_objs = 512; +static atomic_t nthreads; +static struct completion wait; +static struct page_frag_cache test_nc; +static int test_popped; +static int test_pushed; +static bool force_exit; + +static int nr_test = 2000000; +module_param(nr_test, int, 0); +MODULE_PARM_DESC(nr_test, "number of iterations to test"); + +static bool test_align; +module_param(test_align, bool, 0); +MODULE_PARM_DESC(test_align, "use align API for testing"); + +static int test_alloc_len = 2048; +module_param(test_alloc_len, int, 0); +MODULE_PARM_DESC(test_alloc_len, "alloc len for testing"); + +static int test_push_cpu; +module_param(test_push_cpu, int, 0); +MODULE_PARM_DESC(test_push_cpu, "test cpu for pushing fragment"); + +static int test_pop_cpu; +module_param(test_pop_cpu, int, 0); +MODULE_PARM_DESC(test_pop_cpu, "test cpu for popping fragment"); + +static int page_frag_pop_thread(void *arg) +{ + struct ptr_ring *ring = arg; + + pr_info("page_frag pop test thread begins on cpu %d\n", + smp_processor_id()); + + while (test_popped < nr_test) { + void *obj = __ptr_ring_consume(ring); + + if (obj) { + test_popped++; + page_frag_free(obj); + } else { + if (force_exit) + break; + + cond_resched(); + } + } + + if (atomic_dec_and_test(&nthreads)) + complete(&wait); + + pr_info("page_frag pop test thread exits on cpu %d\n", + smp_processor_id()); + + return 0; +} + +static int page_frag_push_thread(void *arg) +{ + struct ptr_ring *ring = arg; + + pr_info("page_frag push test thread begins on cpu %d\n", + smp_processor_id()); + + while (test_pushed < nr_test && !force_exit) { + void *va; + int ret; + + if (test_align) { + va = page_frag_alloc_align(&test_nc, test_alloc_len, + GFP_KERNEL, SMP_CACHE_BYTES); + + if ((unsigned long)va & (SMP_CACHE_BYTES - 1)) { + force_exit = true; + WARN_ONCE(true, TEST_FAILED_PREFIX "unaligned va returned\n"); + } + } else { + va = page_frag_alloc(&test_nc, test_alloc_len, GFP_KERNEL); + } + + if (!va) + continue; + + ret = __ptr_ring_produce(ring, va); + if (ret) { + page_frag_free(va); + cond_resched(); + } else { + test_pushed++; + } + } + + pr_info("page_frag push test thread exits on cpu %d\n", + smp_processor_id()); + + if (atomic_dec_and_test(&nthreads)) + complete(&wait); + + return 0; +} + +static int __init page_frag_test_init(void) +{ + struct task_struct *tsk_push, *tsk_pop; + int last_pushed = 0, last_popped = 0; + ktime_t start; + u64 duration; + int ret; + + page_frag_cache_init(&test_nc); + atomic_set(&nthreads, 2); + init_completion(&wait); + + if (test_alloc_len > PAGE_SIZE || test_alloc_len <= 0 || + !cpu_active(test_push_cpu) || !cpu_active(test_pop_cpu)) + return -EINVAL; + + ret = ptr_ring_init(&ptr_ring, nr_objs, GFP_KERNEL); + if (ret) + return ret; + + tsk_push = kthread_create_on_cpu(page_frag_push_thread, &ptr_ring, + test_push_cpu, "page_frag_push"); + if (IS_ERR(tsk_push)) + return PTR_ERR(tsk_push); + + tsk_pop = kthread_create_on_cpu(page_frag_pop_thread, &ptr_ring, + test_pop_cpu, "page_frag_pop"); + if (IS_ERR(tsk_pop)) { + kthread_stop(tsk_push); + return PTR_ERR(tsk_pop); + } + + start = ktime_get(); + wake_up_process(tsk_push); + wake_up_process(tsk_pop); + + pr_info("waiting for test to complete\n"); + + while (!wait_for_completion_timeout(&wait, msecs_to_jiffies(10000))) { + /* exit if there is no progress for push or pop size */ + if (last_pushed == test_pushed || last_popped == test_popped) { + WARN_ONCE(true, TEST_FAILED_PREFIX "no progress\n"); + force_exit = true; + continue; + } + + last_pushed = test_pushed; + last_popped = test_popped; + pr_info("page_frag_test progress: pushed = %d, popped = %d\n", + test_pushed, test_popped); + } + + if (force_exit) { + pr_err(TEST_FAILED_PREFIX "exit with error\n"); + goto out; + } + + duration = (u64)ktime_us_delta(ktime_get(), start); + pr_info("%d of iterations for %s testing took: %lluus\n", nr_test, + test_align ? "aligned" : "non-aligned", duration); + +out: + ptr_ring_cleanup(&ptr_ring, NULL); + page_frag_cache_drain(&test_nc); + + return -EAGAIN; +} + +static void __exit page_frag_test_exit(void) +{ +} + +module_init(page_frag_test_init); +module_exit(page_frag_test_exit); + +MODULE_LICENSE("GPL"); +MODULE_AUTHOR("Yunsheng Lin <linyunsheng@huawei.com>"); +MODULE_DESCRIPTION("Test module for page_frag"); diff --git a/tools/testing/selftests/mm/run_vmtests.sh b/tools/testing/selftests/mm/run_vmtests.sh index c5797ad1d37b..2c5394584af4 100755 --- a/tools/testing/selftests/mm/run_vmtests.sh +++ b/tools/testing/selftests/mm/run_vmtests.sh @@ -75,6 +75,8 @@ separated by spaces: read-only VMAs - mdwe test prctl(PR_SET_MDWE, ...) +- page_frag + test handling of page fragment allocation and freeing example: ./run_vmtests.sh -t "hmm mmap ksm" EOF @@ -456,6 +458,12 @@ CATEGORY="mkdirty" run_test ./mkdirty CATEGORY="mdwe" run_test ./mdwe_test +CATEGORY="page_frag" run_test ./test_page_frag.sh smoke + +CATEGORY="page_frag" run_test ./test_page_frag.sh aligned + +CATEGORY="page_frag" run_test ./test_page_frag.sh nonaligned + echo "SUMMARY: PASS=${count_pass} SKIP=${count_skip} FAIL=${count_fail}" | tap_prefix echo "1..${count_total}" | tap_output diff --git a/tools/testing/selftests/mm/test_page_frag.sh b/tools/testing/selftests/mm/test_page_frag.sh new file mode 100755 index 000000000000..f55b105084cf --- /dev/null +++ b/tools/testing/selftests/mm/test_page_frag.sh @@ -0,0 +1,175 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 +# +# Copyright (C) 2024 Yunsheng Lin <linyunsheng@huawei.com> +# Copyright (C) 2018 Uladzislau Rezki (Sony) <urezki@gmail.com> +# +# This is a test script for the kernel test driver to test the +# correctness and performance of page_frag's implementation. +# Therefore it is just a kernel module loader. You can specify +# and pass different parameters in order to: +# a) analyse performance of page fragment allocations; +# b) stressing and stability check of page_frag subsystem. + +DRIVER="./page_frag/page_frag_test.ko" +CPU_LIST=$(grep -m 2 processor /proc/cpuinfo | cut -d ' ' -f 2) +TEST_CPU_0=$(echo $CPU_LIST | awk '{print $1}') + +if [ $(echo $CPU_LIST | wc -w) -gt 1 ]; then + TEST_CPU_1=$(echo $CPU_LIST | awk '{print $2}') + NR_TEST=100000000 +else + TEST_CPU_1=$TEST_CPU_0 + NR_TEST=1000000 +fi + +# 1 if fails +exitcode=1 + +# Kselftest framework requirement - SKIP code is 4. +ksft_skip=4 + +check_test_failed_prefix() { + if dmesg | grep -q 'page_frag_test failed:';then + echo "page_frag_test failed, please check dmesg" + exit $exitcode + fi +} + +# +# Static templates for testing of page_frag APIs. +# Also it is possible to pass any supported parameters manually. +# +SMOKE_PARAM="test_push_cpu=$TEST_CPU_0 test_pop_cpu=$TEST_CPU_1" +NONALIGNED_PARAM="$SMOKE_PARAM test_alloc_len=75 nr_test=$NR_TEST" +ALIGNED_PARAM="$NONALIGNED_PARAM test_align=1" + +check_test_requirements() +{ + uid=$(id -u) + if [ $uid -ne 0 ]; then + echo "$0: Must be run as root" + exit $ksft_skip + fi + + if ! which insmod > /dev/null 2>&1; then + echo "$0: You need insmod installed" + exit $ksft_skip + fi + + if [ ! -f $DRIVER ]; then + echo "$0: You need to compile page_frag_test module" + exit $ksft_skip + fi +} + +run_nonaligned_check() +{ + echo "Run performance tests to evaluate how fast nonaligned alloc API is." + + insmod $DRIVER $NONALIGNED_PARAM > /dev/null 2>&1 +} + +run_aligned_check() +{ + echo "Run performance tests to evaluate how fast aligned alloc API is." + + insmod $DRIVER $ALIGNED_PARAM > /dev/null 2>&1 +} + +run_smoke_check() +{ + echo "Run smoke test." + + insmod $DRIVER $SMOKE_PARAM > /dev/null 2>&1 +} + +usage() +{ + echo -n "Usage: $0 [ aligned ] | [ nonaligned ] | | [ smoke ] | " + echo "manual parameters" + echo + echo "Valid tests and parameters:" + echo + modinfo $DRIVER + echo + echo "Example usage:" + echo + echo "# Shows help message" + echo "$0" + echo + echo "# Smoke testing" + echo "$0 smoke" + echo + echo "# Performance testing for nonaligned alloc API" + echo "$0 nonaligned" + echo + echo "# Performance testing for aligned alloc API" + echo "$0 aligned" + echo + exit 0 +} + +function validate_passed_args() +{ + VALID_ARGS=`modinfo $DRIVER | awk '/parm:/ {print $2}' | sed 's/:.*//'` + + # + # Something has been passed, check it. + # + for passed_arg in $@; do + key=${passed_arg//=*/} + valid=0 + + for valid_arg in $VALID_ARGS; do + if [[ $key = $valid_arg ]]; then + valid=1 + break + fi + done + + if [[ $valid -ne 1 ]]; then + echo "Error: key is not correct: ${key}" + exit $exitcode + fi + done +} + +function run_manual_check() +{ + # + # Validate passed parameters. If there is wrong one, + # the script exists and does not execute further. + # + validate_passed_args $@ + + echo "Run the test with following parameters: $@" + insmod $DRIVER $@ > /dev/null 2>&1 +} + +function run_test() +{ + if [ $# -eq 0 ]; then + usage + else + if [[ "$1" = "smoke" ]]; then + run_smoke_check + elif [[ "$1" = "nonaligned" ]]; then + run_nonaligned_check + elif [[ "$1" = "aligned" ]]; then + run_aligned_check + else + run_manual_check $@ + fi + fi + + check_test_failed_prefix + + echo "Done." + echo "Check the kernel ring buffer to see the summary." +} + +check_test_requirements +run_test $@ + +exit 0 diff --git a/tools/testing/selftests/net/.gitignore b/tools/testing/selftests/net/.gitignore index 59fe07ee2df9..28a715a8ef2b 100644 --- a/tools/testing/selftests/net/.gitignore +++ b/tools/testing/selftests/net/.gitignore @@ -2,6 +2,7 @@ bind_bhash bind_timewait bind_wildcard +busy_poller cmsg_sender diag_uid epoll_busy_poll @@ -18,7 +19,6 @@ ipv6_flowlabel_mgr log.txt msg_oob msg_zerocopy -ncdevmem netlink-dumps nettest psock_fanout diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile index 5e86f7a51b43..3d487b03c4a0 100644 --- a/tools/testing/selftests/net/Makefile +++ b/tools/testing/selftests/net/Makefile @@ -93,14 +93,15 @@ TEST_PROGS += test_vxlan_mdb.sh TEST_PROGS += test_bridge_neigh_suppress.sh TEST_PROGS += test_vxlan_nolocalbypass.sh TEST_PROGS += test_bridge_backup_port.sh -TEST_PROGS += fdb_flush.sh +TEST_PROGS += fdb_flush.sh fdb_notify.sh TEST_PROGS += fq_band_pktlimit.sh TEST_PROGS += vlan_hw_filter.sh TEST_PROGS += bpf_offload.py +TEST_PROGS += ipv6_route_update_soft_lockup.sh +TEST_PROGS += busy_poll_test.sh # YNL files, must be before "include ..lib.mk" -EXTRA_CLEAN += $(OUTPUT)/libynl.a -YNL_GEN_FILES := ncdevmem +YNL_GEN_FILES := busy_poller TEST_GEN_FILES += $(YNL_GEN_FILES) TEST_FILES := settings diff --git a/tools/testing/selftests/net/bpf_offload.py b/tools/testing/selftests/net/bpf_offload.py index 3efe44f6e92a..d10f420e4ef6 100755 --- a/tools/testing/selftests/net/bpf_offload.py +++ b/tools/testing/selftests/net/bpf_offload.py @@ -594,8 +594,9 @@ def check_extack_nsim(output, reference, args): check_extack(output, "netdevsim: " + reference, args) def check_no_extack(res, needle): - fail((res[1] + res[2]).count(needle) or (res[1] + res[2]).count("Warning:"), - "Found '%s' in command output, leaky extack?" % (needle)) + haystack = (res[1] + res[2]).strip() + fail(haystack.count(needle) or haystack.count("Warning:"), + "Unexpected command output, leaky extack? ('%s', '%s')" % (needle, haystack)) def check_verifier_log(output, reference): lines = output.split("\n") diff --git a/tools/testing/selftests/net/busy_poll_test.sh b/tools/testing/selftests/net/busy_poll_test.sh new file mode 100755 index 000000000000..7db292ec4884 --- /dev/null +++ b/tools/testing/selftests/net/busy_poll_test.sh @@ -0,0 +1,165 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 +source net_helper.sh + +NSIM_SV_ID=$((256 + RANDOM % 256)) +NSIM_SV_SYS=/sys/bus/netdevsim/devices/netdevsim$NSIM_SV_ID +NSIM_CL_ID=$((512 + RANDOM % 256)) +NSIM_CL_SYS=/sys/bus/netdevsim/devices/netdevsim$NSIM_CL_ID + +NSIM_DEV_SYS_NEW=/sys/bus/netdevsim/new_device +NSIM_DEV_SYS_DEL=/sys/bus/netdevsim/del_device +NSIM_DEV_SYS_LINK=/sys/bus/netdevsim/link_device +NSIM_DEV_SYS_UNLINK=/sys/bus/netdevsim/unlink_device + +SERVER_IP=192.168.1.1 +CLIENT_IP=192.168.1.2 +SERVER_PORT=48675 + +# busy poll config +MAX_EVENTS=8 +BUSY_POLL_USECS=0 +BUSY_POLL_BUDGET=16 +PREFER_BUSY_POLL=1 + +# IRQ deferral config +NAPI_DEFER_HARD_IRQS=100 +GRO_FLUSH_TIMEOUT=50000 +SUSPEND_TIMEOUT=20000000 + +setup_ns() +{ + set -e + ip netns add nssv + ip netns add nscl + + NSIM_SV_NAME=$(find $NSIM_SV_SYS/net -maxdepth 1 -type d ! \ + -path $NSIM_SV_SYS/net -exec basename {} \;) + NSIM_CL_NAME=$(find $NSIM_CL_SYS/net -maxdepth 1 -type d ! \ + -path $NSIM_CL_SYS/net -exec basename {} \;) + + # ensure the server has 1 queue + ethtool -L $NSIM_SV_NAME combined 1 2>/dev/null + + ip link set $NSIM_SV_NAME netns nssv + ip link set $NSIM_CL_NAME netns nscl + + ip netns exec nssv ip addr add "${SERVER_IP}/24" dev $NSIM_SV_NAME + ip netns exec nscl ip addr add "${CLIENT_IP}/24" dev $NSIM_CL_NAME + + ip netns exec nssv ip link set dev $NSIM_SV_NAME up + ip netns exec nscl ip link set dev $NSIM_CL_NAME up + + set +e +} + +cleanup_ns() +{ + ip netns del nscl + ip netns del nssv +} + +test_busypoll() +{ + suspend_value=${1:-0} + tmp_file=$(mktemp) + out_file=$(mktemp) + + # fill a test file with random data + dd if=/dev/urandom of=${tmp_file} bs=1M count=1 2> /dev/null + + timeout -k 1s 30s ip netns exec nssv ./busy_poller \ + -p${SERVER_PORT} \ + -b${SERVER_IP} \ + -m${MAX_EVENTS} \ + -u${BUSY_POLL_USECS} \ + -P${PREFER_BUSY_POLL} \ + -g${BUSY_POLL_BUDGET} \ + -i${NSIM_SV_IFIDX} \ + -s${suspend_value} \ + -o${out_file}& + + wait_local_port_listen nssv ${SERVER_PORT} tcp + + ip netns exec nscl socat -u $tmp_file TCP:${SERVER_IP}:${SERVER_PORT} + + wait + + tmp_file_md5sum=$(md5sum $tmp_file | cut -f1 -d' ') + out_file_md5sum=$(md5sum $out_file | cut -f1 -d' ') + + if [ "$tmp_file_md5sum" = "$out_file_md5sum" ]; then + res=0 + else + echo "md5sum mismatch" + echo "input file md5sum: ${tmp_file_md5sum}"; + echo "output file md5sum: ${out_file_md5sum}"; + res=1 + fi + + rm $out_file $tmp_file + + return $res +} + +test_busypoll_with_suspend() +{ + test_busypoll ${SUSPEND_TIMEOUT} + + return $? +} + +### +### Code start +### + +modprobe netdevsim + +# linking + +echo $NSIM_SV_ID > $NSIM_DEV_SYS_NEW +echo $NSIM_CL_ID > $NSIM_DEV_SYS_NEW +udevadm settle + +setup_ns + +NSIM_SV_FD=$((256 + RANDOM % 256)) +exec {NSIM_SV_FD}</var/run/netns/nssv +NSIM_SV_IFIDX=$(ip netns exec nssv cat /sys/class/net/$NSIM_SV_NAME/ifindex) + +NSIM_CL_FD=$((256 + RANDOM % 256)) +exec {NSIM_CL_FD}</var/run/netns/nscl +NSIM_CL_IFIDX=$(ip netns exec nscl cat /sys/class/net/$NSIM_CL_NAME/ifindex) + +echo "$NSIM_SV_FD:$NSIM_SV_IFIDX $NSIM_CL_FD:$NSIM_CL_IFIDX" > \ + $NSIM_DEV_SYS_LINK + +if [ $? -ne 0 ]; then + echo "linking netdevsim1 with netdevsim2 should succeed" + cleanup_ns + exit 1 +fi + +test_busypoll +if [ $? -ne 0 ]; then + echo "test_busypoll failed" + cleanup_ns + exit 1 +fi + +test_busypoll_with_suspend +if [ $? -ne 0 ]; then + echo "test_busypoll_with_suspend failed" + cleanup_ns + exit 1 +fi + +echo "$NSIM_SV_FD:$NSIM_SV_IFIDX" > $NSIM_DEV_SYS_UNLINK + +echo $NSIM_CL_ID > $NSIM_DEV_SYS_DEL + +cleanup_ns + +modprobe -r netdevsim + +exit 0 diff --git a/tools/testing/selftests/net/busy_poller.c b/tools/testing/selftests/net/busy_poller.c new file mode 100644 index 000000000000..99b0e8c17fca --- /dev/null +++ b/tools/testing/selftests/net/busy_poller.c @@ -0,0 +1,346 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <assert.h> +#include <errno.h> +#include <error.h> +#include <fcntl.h> +#include <inttypes.h> +#include <limits.h> +#include <stdlib.h> +#include <stdio.h> +#include <string.h> +#include <unistd.h> +#include <ynl.h> + +#include <arpa/inet.h> +#include <netinet/in.h> + +#include <sys/epoll.h> +#include <sys/ioctl.h> +#include <sys/socket.h> +#include <sys/types.h> + +#include <linux/genetlink.h> +#include <linux/netlink.h> + +#include "netdev-user.h" + +/* The below ifdef blob is required because: + * + * - sys/epoll.h does not (yet) have the ioctl definitions included. So, + * systems with older glibcs will not have them available. However, + * sys/epoll.h does include the type definition for epoll_data, which is + * needed by the user program (e.g. epoll_event.data.fd) + * + * - linux/eventpoll.h does not define the epoll_data type, it is simply an + * opaque __u64. It does, however, include the ioctl definition. + * + * Including both headers is impossible (types would be redefined), so I've + * opted instead to take sys/epoll.h, and include the blob below. + * + * Someday, when glibc is globally up to date, the blob below can be removed. + */ +#if !defined(EPOLL_IOC_TYPE) +struct epoll_params { + uint32_t busy_poll_usecs; + uint16_t busy_poll_budget; + uint8_t prefer_busy_poll; + + /* pad the struct to a multiple of 64bits */ + uint8_t __pad; +}; + +#define EPOLL_IOC_TYPE 0x8A +#define EPIOCSPARAMS _IOW(EPOLL_IOC_TYPE, 0x01, struct epoll_params) +#define EPIOCGPARAMS _IOR(EPOLL_IOC_TYPE, 0x02, struct epoll_params) +#endif + +static uint32_t cfg_port = 8000; +static struct in_addr cfg_bind_addr = { .s_addr = INADDR_ANY }; +static char *cfg_outfile; +static int cfg_max_events = 8; +static int cfg_ifindex; + +/* busy poll params */ +static uint32_t cfg_busy_poll_usecs; +static uint32_t cfg_busy_poll_budget; +static uint32_t cfg_prefer_busy_poll; + +/* IRQ params */ +static uint32_t cfg_defer_hard_irqs; +static uint64_t cfg_gro_flush_timeout; +static uint64_t cfg_irq_suspend_timeout; + +static void usage(const char *filepath) +{ + error(1, 0, + "Usage: %s -p<port> -b<addr> -m<max_events> -u<busy_poll_usecs> -P<prefer_busy_poll> -g<busy_poll_budget> -o<outfile> -d<defer_hard_irqs> -r<gro_flush_timeout> -s<irq_suspend_timeout> -i<ifindex>", + filepath); +} + +static void parse_opts(int argc, char **argv) +{ + int ret; + int c; + + if (argc <= 1) + usage(argv[0]); + + while ((c = getopt(argc, argv, "p:m:b:u:P:g:o:d:r:s:i:")) != -1) { + switch (c) { + case 'u': + cfg_busy_poll_usecs = strtoul(optarg, NULL, 0); + if (cfg_busy_poll_usecs == ULONG_MAX || + cfg_busy_poll_usecs > UINT32_MAX) + error(1, ERANGE, "busy_poll_usecs too large"); + break; + case 'P': + cfg_prefer_busy_poll = strtoul(optarg, NULL, 0); + if (cfg_prefer_busy_poll == ULONG_MAX || + cfg_prefer_busy_poll > 1) + error(1, ERANGE, + "prefer busy poll should be 0 or 1"); + break; + case 'g': + cfg_busy_poll_budget = strtoul(optarg, NULL, 0); + if (cfg_busy_poll_budget == ULONG_MAX || + cfg_busy_poll_budget > UINT16_MAX) + error(1, ERANGE, + "busy poll budget must be [0, UINT16_MAX]"); + break; + case 'p': + cfg_port = strtoul(optarg, NULL, 0); + if (cfg_port > UINT16_MAX) + error(1, ERANGE, "port must be <= 65535"); + break; + case 'b': + ret = inet_aton(optarg, &cfg_bind_addr); + if (ret == 0) + error(1, errno, + "bind address %s invalid", optarg); + break; + case 'o': + cfg_outfile = strdup(optarg); + if (!cfg_outfile) + error(1, 0, "outfile invalid"); + break; + case 'm': + cfg_max_events = strtol(optarg, NULL, 0); + + if (cfg_max_events == LONG_MIN || + cfg_max_events == LONG_MAX || + cfg_max_events <= 0) + error(1, ERANGE, + "max events must be > 0 and < LONG_MAX"); + break; + case 'd': + cfg_defer_hard_irqs = strtoul(optarg, NULL, 0); + + if (cfg_defer_hard_irqs == ULONG_MAX || + cfg_defer_hard_irqs > INT32_MAX) + error(1, ERANGE, + "defer_hard_irqs must be <= INT32_MAX"); + break; + case 'r': + cfg_gro_flush_timeout = strtoull(optarg, NULL, 0); + + if (cfg_gro_flush_timeout == ULLONG_MAX) + error(1, ERANGE, + "gro_flush_timeout must be < ULLONG_MAX"); + break; + case 's': + cfg_irq_suspend_timeout = strtoull(optarg, NULL, 0); + + if (cfg_irq_suspend_timeout == ULLONG_MAX) + error(1, ERANGE, + "irq_suspend_timeout must be < ULLONG_MAX"); + break; + case 'i': + cfg_ifindex = strtoul(optarg, NULL, 0); + if (cfg_ifindex == ULONG_MAX) + error(1, ERANGE, + "ifindex must be < ULONG_MAX"); + break; + } + } + + if (!cfg_ifindex) + usage(argv[0]); + + if (optind != argc) + usage(argv[0]); +} + +static void epoll_ctl_add(int epfd, int fd, uint32_t events) +{ + struct epoll_event ev; + + ev.events = events; + ev.data.fd = fd; + if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == -1) + error(1, errno, "epoll_ctl add fd: %d", fd); +} + +static void setnonblock(int sockfd) +{ + int flags; + + flags = fcntl(sockfd, F_GETFL, 0); + + if (fcntl(sockfd, F_SETFL, flags | O_NONBLOCK) == -1) + error(1, errno, "unable to set socket to nonblocking mode"); +} + +static void write_chunk(int fd, char *buf, ssize_t buflen) +{ + ssize_t remaining = buflen; + char *buf_offset = buf; + ssize_t writelen = 0; + ssize_t write_result; + + while (writelen < buflen) { + write_result = write(fd, buf_offset, remaining); + if (write_result == -1) + error(1, errno, "unable to write data to outfile"); + + writelen += write_result; + remaining -= write_result; + buf_offset += write_result; + } +} + +static void setup_queue(void) +{ + struct netdev_napi_get_list *napi_list = NULL; + struct netdev_napi_get_req_dump *req = NULL; + struct netdev_napi_set_req *set_req = NULL; + struct ynl_sock *ys; + struct ynl_error yerr; + uint32_t napi_id; + + ys = ynl_sock_create(&ynl_netdev_family, &yerr); + if (!ys) + error(1, 0, "YNL: %s", yerr.msg); + + req = netdev_napi_get_req_dump_alloc(); + netdev_napi_get_req_dump_set_ifindex(req, cfg_ifindex); + napi_list = netdev_napi_get_dump(ys, req); + + /* assume there is 1 NAPI configured and take the first */ + if (napi_list->obj._present.id) + napi_id = napi_list->obj.id; + else + error(1, 0, "napi ID not present?"); + + set_req = netdev_napi_set_req_alloc(); + netdev_napi_set_req_set_id(set_req, napi_id); + netdev_napi_set_req_set_defer_hard_irqs(set_req, cfg_defer_hard_irqs); + netdev_napi_set_req_set_gro_flush_timeout(set_req, + cfg_gro_flush_timeout); + netdev_napi_set_req_set_irq_suspend_timeout(set_req, + cfg_irq_suspend_timeout); + + if (netdev_napi_set(ys, set_req)) + error(1, 0, "can't set NAPI params: %s\n", yerr.msg); + + netdev_napi_get_list_free(napi_list); + netdev_napi_get_req_dump_free(req); + netdev_napi_set_req_free(set_req); + ynl_sock_destroy(ys); +} + +static void run_poller(void) +{ + struct epoll_event events[cfg_max_events]; + struct epoll_params epoll_params = {0}; + struct sockaddr_in server_addr; + int i, epfd, nfds; + ssize_t readlen; + int outfile_fd; + char buf[1024]; + int sockfd; + int conn; + int val; + + outfile_fd = open(cfg_outfile, O_WRONLY | O_CREAT, 0644); + if (outfile_fd == -1) + error(1, errno, "unable to open outfile: %s", cfg_outfile); + + sockfd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP); + if (sockfd == -1) + error(1, errno, "unable to create listen socket"); + + server_addr.sin_family = AF_INET; + server_addr.sin_port = htons(cfg_port); + server_addr.sin_addr = cfg_bind_addr; + + /* these values are range checked during parse_opts, so casting is safe + * here + */ + epoll_params.busy_poll_usecs = cfg_busy_poll_usecs; + epoll_params.busy_poll_budget = (uint16_t)cfg_busy_poll_budget; + epoll_params.prefer_busy_poll = (uint8_t)cfg_prefer_busy_poll; + epoll_params.__pad = 0; + + val = 1; + if (setsockopt(sockfd, SOL_SOCKET, SO_REUSEADDR, &val, sizeof(val))) + error(1, errno, "poller setsockopt reuseaddr"); + + setnonblock(sockfd); + + if (bind(sockfd, (struct sockaddr *)&server_addr, + sizeof(struct sockaddr_in))) + error(0, errno, "poller bind to port: %d\n", cfg_port); + + if (listen(sockfd, 1)) + error(1, errno, "poller listen"); + + epfd = epoll_create1(0); + if (ioctl(epfd, EPIOCSPARAMS, &epoll_params) == -1) + error(1, errno, "unable to set busy poll params"); + + epoll_ctl_add(epfd, sockfd, EPOLLIN | EPOLLOUT | EPOLLET); + + for (;;) { + nfds = epoll_wait(epfd, events, cfg_max_events, -1); + for (i = 0; i < nfds; i++) { + if (events[i].data.fd == sockfd) { + conn = accept(sockfd, NULL, NULL); + if (conn == -1) + error(1, errno, + "accepting incoming connection failed"); + + setnonblock(conn); + epoll_ctl_add(epfd, conn, + EPOLLIN | EPOLLET | EPOLLRDHUP | + EPOLLHUP); + } else if (events[i].events & EPOLLIN) { + for (;;) { + readlen = read(events[i].data.fd, buf, + sizeof(buf)); + if (readlen > 0) + write_chunk(outfile_fd, buf, + readlen); + else + break; + } + } else { + /* spurious event ? */ + } + if (events[i].events & (EPOLLRDHUP | EPOLLHUP)) { + epoll_ctl(epfd, EPOLL_CTL_DEL, + events[i].data.fd, NULL); + close(events[i].data.fd); + close(outfile_fd); + return; + } + } + } +} + +int main(int argc, char *argv[]) +{ + parse_opts(argc, argv); + setup_queue(); + run_poller(); + return 0; +} diff --git a/tools/testing/selftests/net/drop_monitor_tests.sh b/tools/testing/selftests/net/drop_monitor_tests.sh index 7c4818c971fc..507d0a82f5f0 100755 --- a/tools/testing/selftests/net/drop_monitor_tests.sh +++ b/tools/testing/selftests/net/drop_monitor_tests.sh @@ -77,7 +77,7 @@ sw_drops_test() rm ${dir}/packets.pcap - { kill %% && wait %%; } 2>/dev/null + kill_process %% timeout 5 dwdump -o sw -w ${dir}/packets.pcap (( $(tshark -r ${dir}/packets.pcap \ -Y 'ip.dst == 192.0.2.10' 2> /dev/null | wc -l) == 0)) diff --git a/tools/testing/selftests/net/fdb_notify.sh b/tools/testing/selftests/net/fdb_notify.sh new file mode 100755 index 000000000000..c03151e7791c --- /dev/null +++ b/tools/testing/selftests/net/fdb_notify.sh @@ -0,0 +1,96 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 + +source lib.sh + +ALL_TESTS=" + test_dup_bridge + test_dup_vxlan_self + test_dup_vxlan_master + test_dup_macvlan_self + test_dup_macvlan_master +" + +do_test_dup() +{ + local op=$1; shift + local what=$1; shift + local tmpf + + RET=0 + + tmpf=$(mktemp) + defer rm "$tmpf" + + defer_scope_push + bridge monitor fdb &> "$tmpf" & + defer kill_process $! + + sleep 0.5 + bridge fdb "$op" 00:11:22:33:44:55 vlan 1 "$@" + sleep 0.5 + defer_scope_pop + + local count=$(grep -c -e 00:11:22:33:44:55 $tmpf) + ((count == 1)) + check_err $? "Got $count notifications, expected 1" + + log_test "$what $op: Duplicate notifications" +} + +test_dup_bridge() +{ + ip_link_add br up type bridge vlan_filtering 1 + do_test_dup add "bridge" dev br self + do_test_dup del "bridge" dev br self +} + +test_dup_vxlan_self() +{ + ip_link_add br up type bridge vlan_filtering 1 + ip_link_add vx up type vxlan id 2000 dstport 4789 + ip_link_master vx br + + do_test_dup add "vxlan" dev vx self dst 192.0.2.1 + do_test_dup del "vxlan" dev vx self dst 192.0.2.1 +} + +test_dup_vxlan_master() +{ + ip_link_add br up type bridge vlan_filtering 1 + ip_link_add vx up type vxlan id 2000 dstport 4789 + ip_link_master vx br + + do_test_dup add "vxlan master" dev vx master + do_test_dup del "vxlan master" dev vx master +} + +test_dup_macvlan_self() +{ + ip_link_add dd up type dummy + ip_link_add mv up link dd type macvlan mode passthru + + do_test_dup add "macvlan self" dev mv self + do_test_dup del "macvlan self" dev mv self +} + +test_dup_macvlan_master() +{ + ip_link_add br up type bridge vlan_filtering 1 + ip_link_add dd up type dummy + ip_link_add mv up link dd type macvlan mode passthru + ip_link_master mv br + + do_test_dup add "macvlan master" dev mv self + do_test_dup del "macvlan master" dev mv self +} + +cleanup() +{ + defer_scopes_cleanup +} + +trap cleanup EXIT +tests_run + +exit $EXIT_STATUS diff --git a/tools/testing/selftests/net/fib_tests.sh b/tools/testing/selftests/net/fib_tests.sh index 5f3c28fc8624..3ea6f886a210 100755 --- a/tools/testing/selftests/net/fib_tests.sh +++ b/tools/testing/selftests/net/fib_tests.sh @@ -689,7 +689,7 @@ fib6_notify_test() log_test $ret 0 "ipv6 route add notify" - { kill %% && wait %%; } 2>/dev/null + kill_process %% #rm errors.txt @@ -736,7 +736,7 @@ fib_notify_test() log_test $ret 0 "ipv4 route add notify" - { kill %% && wait %%; } 2>/dev/null + kill_process %% rm errors.txt @@ -2328,7 +2328,7 @@ ipv4_mangle_test() $IP route del table 123 172.16.101.0/24 dev veth1 $IP rule del pref 100 - { kill %% && wait %%; } 2>/dev/null + kill_process %% rm $tmp_file route_cleanup @@ -2386,7 +2386,7 @@ ipv6_mangle_test() $IP -6 route del table 123 2001:db8:101::/64 dev veth1 $IP -6 rule del pref 100 - { kill %% && wait %%; } 2>/dev/null + kill_process %% rm $tmp_file route_cleanup diff --git a/tools/testing/selftests/net/forwarding/Makefile b/tools/testing/selftests/net/forwarding/Makefile index 224346426ef2..7d885cff8d79 100644 --- a/tools/testing/selftests/net/forwarding/Makefile +++ b/tools/testing/selftests/net/forwarding/Makefile @@ -126,6 +126,7 @@ TEST_FILES := devlink_lib.sh \ tc_common.sh TEST_INCLUDES := \ - ../lib.sh + ../lib.sh \ + $(wildcard ../lib/sh/*.sh) include ../../lib.mk diff --git a/tools/testing/selftests/net/forwarding/devlink_lib.sh b/tools/testing/selftests/net/forwarding/devlink_lib.sh index 62a05bca1e82..18afa89ebbcc 100644 --- a/tools/testing/selftests/net/forwarding/devlink_lib.sh +++ b/tools/testing/selftests/net/forwarding/devlink_lib.sh @@ -501,7 +501,7 @@ devlink_trap_drop_cleanup() local pref=$1; shift local handle=$1; shift - kill $mz_pid && wait $mz_pid &> /dev/null + kill_process $mz_pid tc filter del dev $dev egress protocol $proto pref $pref handle $handle flower } diff --git a/tools/testing/selftests/net/forwarding/lib.sh b/tools/testing/selftests/net/forwarding/lib.sh index c992e385159c..7337f398f9cc 100644 --- a/tools/testing/selftests/net/forwarding/lib.sh +++ b/tools/testing/selftests/net/forwarding/lib.sh @@ -48,7 +48,6 @@ declare -A NETIFS=( : "${WAIT_TIME:=5}" # Whether to pause on, respectively, after a failure and before cleanup. -: "${PAUSE_ON_FAIL:=no}" : "${PAUSE_ON_CLEANUP:=no}" # Whether to create virtual interfaces, and what netdevice type they should be. @@ -446,191 +445,6 @@ done ############################################################################## # Helpers -# Exit status to return at the end. Set in case one of the tests fails. -EXIT_STATUS=0 -# Per-test return value. Clear at the beginning of each test. -RET=0 - -ret_set_ksft_status() -{ - local ksft_status=$1; shift - local msg=$1; shift - - RET=$(ksft_status_merge $RET $ksft_status) - if (( $? )); then - retmsg=$msg - fi -} - -# Whether FAILs should be interpreted as XFAILs. Internal. -FAIL_TO_XFAIL= - -check_err() -{ - local err=$1 - local msg=$2 - - if ((err)); then - if [[ $FAIL_TO_XFAIL = yes ]]; then - ret_set_ksft_status $ksft_xfail "$msg" - else - ret_set_ksft_status $ksft_fail "$msg" - fi - fi -} - -check_fail() -{ - local err=$1 - local msg=$2 - - check_err $((!err)) "$msg" -} - -check_err_fail() -{ - local should_fail=$1; shift - local err=$1; shift - local what=$1; shift - - if ((should_fail)); then - check_fail $err "$what succeeded, but should have failed" - else - check_err $err "$what failed" - fi -} - -xfail() -{ - FAIL_TO_XFAIL=yes "$@" -} - -xfail_on_slow() -{ - if [[ $KSFT_MACHINE_SLOW = yes ]]; then - FAIL_TO_XFAIL=yes "$@" - else - "$@" - fi -} - -omit_on_slow() -{ - if [[ $KSFT_MACHINE_SLOW != yes ]]; then - "$@" - fi -} - -xfail_on_veth() -{ - local dev=$1; shift - local kind - - kind=$(ip -j -d link show dev $dev | - jq -r '.[].linkinfo.info_kind') - if [[ $kind = veth ]]; then - FAIL_TO_XFAIL=yes "$@" - else - "$@" - fi -} - -log_test_result() -{ - local test_name=$1; shift - local opt_str=$1; shift - local result=$1; shift - local retmsg=$1; shift - - printf "TEST: %-60s [%s]\n" "$test_name $opt_str" "$result" - if [[ $retmsg ]]; then - printf "\t%s\n" "$retmsg" - fi -} - -pause_on_fail() -{ - if [[ $PAUSE_ON_FAIL == yes ]]; then - echo "Hit enter to continue, 'q' to quit" - read a - [[ $a == q ]] && exit 1 - fi -} - -handle_test_result_pass() -{ - local test_name=$1; shift - local opt_str=$1; shift - - log_test_result "$test_name" "$opt_str" " OK " -} - -handle_test_result_fail() -{ - local test_name=$1; shift - local opt_str=$1; shift - - log_test_result "$test_name" "$opt_str" FAIL "$retmsg" - pause_on_fail -} - -handle_test_result_xfail() -{ - local test_name=$1; shift - local opt_str=$1; shift - - log_test_result "$test_name" "$opt_str" XFAIL "$retmsg" - pause_on_fail -} - -handle_test_result_skip() -{ - local test_name=$1; shift - local opt_str=$1; shift - - log_test_result "$test_name" "$opt_str" SKIP "$retmsg" -} - -log_test() -{ - local test_name=$1 - local opt_str=$2 - - if [[ $# -eq 2 ]]; then - opt_str="($opt_str)" - fi - - if ((RET == ksft_pass)); then - handle_test_result_pass "$test_name" "$opt_str" - elif ((RET == ksft_xfail)); then - handle_test_result_xfail "$test_name" "$opt_str" - elif ((RET == ksft_skip)); then - handle_test_result_skip "$test_name" "$opt_str" - else - handle_test_result_fail "$test_name" "$opt_str" - fi - - EXIT_STATUS=$(ksft_exit_status_merge $EXIT_STATUS $RET) - return $RET -} - -log_test_skip() -{ - RET=$ksft_skip retmsg= log_test "$@" -} - -log_test_xfail() -{ - RET=$ksft_xfail retmsg= log_test "$@" -} - -log_info() -{ - local msg=$1 - - echo "INFO: $msg" -} - not() { "$@" @@ -1398,13 +1212,10 @@ matchall_sink_create() action drop } -tests_run() +cleanup() { - local current_test - - for current_test in ${TESTS:-$ALL_TESTS}; do - $current_test - done + pre_cleanup + defer_scopes_cleanup } multipath_eval() @@ -1761,8 +1572,9 @@ start_tcp_traffic() stop_traffic() { - # Suppress noise from killing mausezahn. - { kill %% && wait %%; } 2>/dev/null + local pid=${1-%%}; shift + + kill_process "$pid" } declare -A cappid diff --git a/tools/testing/selftests/net/forwarding/sch_ets.sh b/tools/testing/selftests/net/forwarding/sch_ets.sh index e60c8b4818cc..1f6f53e284b5 100755 --- a/tools/testing/selftests/net/forwarding/sch_ets.sh +++ b/tools/testing/selftests/net/forwarding/sch_ets.sh @@ -24,15 +24,10 @@ switch_create() # Create a bottleneck so that the DWRR process can kick in. tc qdisc add dev $swp2 root handle 1: tbf \ rate 1Gbit burst 1Mbit latency 100ms + defer tc qdisc del dev $swp2 root PARENT="parent 1:" } -switch_destroy() -{ - ets_switch_destroy - tc qdisc del dev $swp2 root -} - # Callback from sch_ets_tests.sh collect_stats() { diff --git a/tools/testing/selftests/net/forwarding/sch_ets_core.sh b/tools/testing/selftests/net/forwarding/sch_ets_core.sh index f906fcc66572..8f9922c695b0 100644 --- a/tools/testing/selftests/net/forwarding/sch_ets_core.sh +++ b/tools/testing/selftests/net/forwarding/sch_ets_core.sh @@ -166,44 +166,32 @@ h1_create() local i; simple_if_init $h1 + defer simple_if_fini $h1 + mtu_set $h1 9900 + defer mtu_restore $h1 + for i in {0..2}; do vlan_create $h1 1$i v$h1 $(sip $i)/28 + defer vlan_destroy $h1 1$i ip link set dev $h1.1$i type vlan egress 0:$i done } -h1_destroy() -{ - local i - - for i in {0..2}; do - vlan_destroy $h1 1$i - done - mtu_restore $h1 - simple_if_fini $h1 -} - h2_create() { local i simple_if_init $h2 - mtu_set $h2 9900 - for i in {0..2}; do - vlan_create $h2 1$i v$h2 $(dip $i)/28 - done -} + defer simple_if_fini $h2 -h2_destroy() -{ - local i + mtu_set $h2 9900 + defer mtu_restore $h2 for i in {0..2}; do - vlan_destroy $h2 1$i + vlan_create $h2 1$i v$h2 $(dip $i)/28 + defer vlan_destroy $h2 1$i done - mtu_restore $h2 - simple_if_fini $h2 } ets_switch_create() @@ -211,44 +199,45 @@ ets_switch_create() local i ip link set dev $swp1 up + defer ip link set dev $swp1 down + mtu_set $swp1 9900 + defer mtu_restore $swp1 ip link set dev $swp2 up + defer ip link set dev $swp2 down + mtu_set $swp2 9900 + defer mtu_restore $swp2 for i in {0..2}; do vlan_create $swp1 1$i + defer vlan_destroy $swp1 1$i ip link set dev $swp1.1$i type vlan ingress 0:0 1:1 2:2 vlan_create $swp2 1$i + defer vlan_destroy $swp2 1$i ip link add dev br1$i type bridge + defer ip link del dev br1$i + ip link set dev $swp1.1$i master br1$i + defer ip link set dev $swp1.1$i nomaster + ip link set dev $swp2.1$i master br1$i + defer ip link set dev $swp2.1$i nomaster ip link set dev br1$i up - ip link set dev $swp1.1$i up - ip link set dev $swp2.1$i up - done -} + defer ip link set dev br1$i down -ets_switch_destroy() -{ - local i - - ets_delete_qdisc + ip link set dev $swp1.1$i up + defer ip link set dev $swp1.1$i down - for i in {0..2}; do - ip link del dev br1$i - vlan_destroy $swp2 1$i - vlan_destroy $swp1 1$i + ip link set dev $swp2.1$i up + defer ip link set dev $swp2.1$i down done - mtu_restore $swp2 - ip link set dev $swp2 down - - mtu_restore $swp1 - ip link set dev $swp1 down + defer ets_delete_qdisc } setup_prepare() @@ -263,23 +252,13 @@ setup_prepare() hut=$h2 vrf_prepare + defer vrf_cleanup h1_create h2_create switch_create } -cleanup() -{ - pre_cleanup - - switch_destroy - h2_destroy - h1_destroy - - vrf_cleanup -} - ping_ipv4() { ping_test $h1.10 $(dip 0) " vlan 10" diff --git a/tools/testing/selftests/net/forwarding/sch_ets_tests.sh b/tools/testing/selftests/net/forwarding/sch_ets_tests.sh index f9d26a7911bb..08240d3e3c87 100644 --- a/tools/testing/selftests/net/forwarding/sch_ets_tests.sh +++ b/tools/testing/selftests/net/forwarding/sch_ets_tests.sh @@ -90,6 +90,7 @@ __ets_dwrr_test() for stream in ${streams[@]}; do ets_start_traffic $stream + defer stop_traffic $! done sleep 10 @@ -120,25 +121,24 @@ __ets_dwrr_test() ${d[0]} ${d[$i]} fi done - - for stream in ${streams[@]}; do - stop_traffic - done } ets_dwrr_test_012() { - __ets_dwrr_test 0 1 2 + in_defer_scope \ + __ets_dwrr_test 0 1 2 } ets_dwrr_test_01() { - __ets_dwrr_test 0 1 + in_defer_scope \ + __ets_dwrr_test 0 1 } ets_dwrr_test_12() { - __ets_dwrr_test 1 2 + in_defer_scope \ + __ets_dwrr_test 1 2 } ets_qdisc_setup() diff --git a/tools/testing/selftests/net/forwarding/sch_red.sh b/tools/testing/selftests/net/forwarding/sch_red.sh index 17f28644568e..af166662b78a 100755 --- a/tools/testing/selftests/net/forwarding/sch_red.sh +++ b/tools/testing/selftests/net/forwarding/sch_red.sh @@ -53,71 +53,63 @@ PKTSZ=1400 h1_create() { simple_if_init $h1 192.0.2.1/28 + defer simple_if_fini $h1 192.0.2.1/28 + mtu_set $h1 10000 + defer mtu_restore $h1 + tc qdisc replace dev $h1 root handle 1: tbf \ rate 10Mbit burst 10K limit 1M -} - -h1_destroy() -{ - tc qdisc del dev $h1 root - mtu_restore $h1 - simple_if_fini $h1 192.0.2.1/28 + defer tc qdisc del dev $h1 root } h2_create() { simple_if_init $h2 192.0.2.2/28 - mtu_set $h2 10000 -} + defer simple_if_fini $h2 192.0.2.2/28 -h2_destroy() -{ - mtu_restore $h2 - simple_if_fini $h2 192.0.2.2/28 + mtu_set $h2 10000 + defer mtu_restore $h2 } h3_create() { simple_if_init $h3 192.0.2.3/28 - mtu_set $h3 10000 -} + defer simple_if_fini $h3 192.0.2.3/28 -h3_destroy() -{ - mtu_restore $h3 - simple_if_fini $h3 192.0.2.3/28 + mtu_set $h3 10000 + defer mtu_restore $h3 } switch_create() { ip link add dev br up type bridge + defer ip link del dev br + ip link set dev $swp1 up master br + defer ip link set dev $swp1 down nomaster + ip link set dev $swp2 up master br + defer ip link set dev $swp2 down nomaster + ip link set dev $swp3 up master br + defer ip link set dev $swp3 down nomaster mtu_set $swp1 10000 + defer mtu_restore $h1 + mtu_set $swp2 10000 + defer mtu_restore $h2 + mtu_set $swp3 10000 + defer mtu_restore $h3 tc qdisc replace dev $swp3 root handle 1: tbf \ rate 10Mbit burst 10K limit 1M - ip link add name _drop_test up type dummy -} + defer tc qdisc del dev $swp3 root -switch_destroy() -{ - ip link del dev _drop_test - tc qdisc del dev $swp3 root - - mtu_restore $h3 - mtu_restore $h2 - mtu_restore $h1 - - ip link set dev $swp3 down nomaster - ip link set dev $swp2 down nomaster - ip link set dev $swp1 down nomaster - ip link del dev br + ip link add name _drop_test up type dummy + defer ip link del dev _drop_test } setup_prepare() @@ -134,6 +126,7 @@ setup_prepare() h3_mac=$(mac_get $h3) vrf_prepare + defer vrf_cleanup h1_create h2_create @@ -141,18 +134,6 @@ setup_prepare() switch_create } -cleanup() -{ - pre_cleanup - - switch_destroy - h3_destroy - h2_destroy - h1_destroy - - vrf_cleanup -} - ping_ipv4() { ping_test $h1 192.0.2.3 " from host 1" @@ -287,6 +268,7 @@ do_ecn_test() $MZ $h1 -p $PKTSZ -A 192.0.2.1 -B 192.0.2.3 -c 0 \ -a own -b $h3_mac -t tcp -q tos=0x01 & + defer stop_traffic $! sleep 1 ecn_test_common "$name" $limit @@ -298,9 +280,6 @@ do_ecn_test() build_backlog $((2 * limit)) udp >/dev/null check_fail $? "UDP traffic went into backlog instead of being early-dropped" log_test "$name backlog > limit: UDP early-dropped" - - stop_traffic - sleep 1 } do_ecn_nodrop_test() @@ -310,6 +289,7 @@ do_ecn_nodrop_test() $MZ $h1 -p $PKTSZ -A 192.0.2.1 -B 192.0.2.3 -c 0 \ -a own -b $h3_mac -t tcp -q tos=0x01 & + defer stop_traffic $! sleep 1 ecn_test_common "$name" $limit @@ -321,9 +301,6 @@ do_ecn_nodrop_test() build_backlog $((2 * limit)) udp >/dev/null check_err $? "UDP traffic was early-dropped instead of getting into backlog" log_test "$name backlog > limit: UDP not dropped" - - stop_traffic - sleep 1 } do_red_test() @@ -336,6 +313,7 @@ do_red_test() # is above limit. $MZ $h1 -p $PKTSZ -A 192.0.2.1 -B 192.0.2.3 -c 0 \ -a own -b $h3_mac -t tcp -q tos=0x01 & + defer stop_traffic $! # Pushing below the queue limit should work. RET=0 @@ -352,9 +330,6 @@ do_red_test() pct=$(check_marking "== 0") check_err $? "backlog $backlog / $limit Got $pct% marked packets, expected == 0." log_test "RED backlog > limit" - - stop_traffic - sleep 1 } do_red_qevent_test() @@ -369,6 +344,7 @@ do_red_qevent_test() $MZ $h1 -p $PKTSZ -A 192.0.2.1 -B 192.0.2.3 -c 0 \ -a own -b $h3_mac -t udp -q & + defer stop_traffic $! sleep 1 tc filter add block 10 pref 1234 handle 102 matchall skip_hw \ @@ -396,9 +372,6 @@ do_red_qevent_test() check_err $? "Dropped packets still observed: 0 expected, $((now - base)) seen" log_test "RED early_dropped packets mirrored" - - stop_traffic - sleep 1 } do_ecn_qevent_test() @@ -410,6 +383,7 @@ do_ecn_qevent_test() $MZ $h1 -p $PKTSZ -A 192.0.2.1 -B 192.0.2.3 -c 0 \ -a own -b $h3_mac -t tcp -q tos=0x01 & + defer stop_traffic $! sleep 1 tc filter add block 10 pref 1234 handle 102 matchall skip_hw \ @@ -428,9 +402,6 @@ do_ecn_qevent_test() tc filter del block 10 pref 1234 handle 102 matchall log_test "ECN marked packets mirrored" - - stop_traffic - sleep 1 } install_qdisc() @@ -451,36 +422,36 @@ uninstall_qdisc() ecn_test() { install_qdisc ecn + defer uninstall_qdisc xfail_on_slow do_ecn_test $BACKLOG - uninstall_qdisc } ecn_nodrop_test() { install_qdisc ecn nodrop + defer uninstall_qdisc xfail_on_slow do_ecn_nodrop_test $BACKLOG - uninstall_qdisc } red_test() { install_qdisc + defer uninstall_qdisc xfail_on_slow do_red_test $BACKLOG - uninstall_qdisc } red_qevent_test() { install_qdisc qevent early_drop block 10 + defer uninstall_qdisc xfail_on_slow do_red_qevent_test $BACKLOG - uninstall_qdisc } ecn_qevent_test() { install_qdisc ecn qevent mark block 10 + defer uninstall_qdisc xfail_on_slow do_ecn_qevent_test $BACKLOG - uninstall_qdisc } trap cleanup EXIT diff --git a/tools/testing/selftests/net/forwarding/sch_tbf_core.sh b/tools/testing/selftests/net/forwarding/sch_tbf_core.sh index 9cd884d4a5de..ec309a5086bc 100644 --- a/tools/testing/selftests/net/forwarding/sch_tbf_core.sh +++ b/tools/testing/selftests/net/forwarding/sch_tbf_core.sh @@ -60,68 +60,65 @@ host_create() local host=$1; shift simple_if_init $dev + defer simple_if_fini $dev + mtu_set $dev 10000 + defer mtu_restore $dev vlan_create $dev 10 v$dev $(ipaddr $host 10)/28 + defer vlan_destroy $dev 10 ip link set dev $dev.10 type vlan egress 0:0 vlan_create $dev 11 v$dev $(ipaddr $host 11)/28 + defer vlan_destroy $dev 11 ip link set dev $dev.11 type vlan egress 0:1 } -host_destroy() -{ - local dev=$1; shift - - vlan_destroy $dev 11 - vlan_destroy $dev 10 - mtu_restore $dev - simple_if_fini $dev -} - h1_create() { host_create $h1 1 } -h1_destroy() -{ - host_destroy $h1 -} - h2_create() { host_create $h2 2 tc qdisc add dev $h2 clsact + defer tc qdisc del dev $h2 clsact + tc filter add dev $h2 ingress pref 1010 prot 802.1q \ flower $TCFLAGS vlan_id 10 action pass tc filter add dev $h2 ingress pref 1011 prot 802.1q \ flower $TCFLAGS vlan_id 11 action pass } -h2_destroy() -{ - tc qdisc del dev $h2 clsact - host_destroy $h2 -} - switch_create() { local intf local vlan ip link add dev br10 type bridge + defer ip link del dev br10 + ip link add dev br11 type bridge + defer ip link del dev br11 for intf in $swp1 $swp2; do ip link set dev $intf up + defer ip link set dev $intf down + mtu_set $intf 10000 + defer mtu_restore $intf for vlan in 10 11; do vlan_create $intf $vlan + defer vlan_destroy $intf $vlan + ip link set dev $intf.$vlan master br$vlan + defer ip link set dev $intf.$vlan nomaster + ip link set dev $intf.$vlan up + defer ip link set dev $intf.$vlan down done done @@ -130,34 +127,10 @@ switch_create() done ip link set dev br10 up - ip link set dev br11 up -} - -switch_destroy() -{ - local intf - local vlan - - # A test may have been interrupted mid-run, with Qdisc installed. Delete - # it here. - tc qdisc del dev $swp2 root 2>/dev/null - - ip link set dev br11 down - ip link set dev br10 down + defer ip link set dev br10 down - for intf in $swp2 $swp1; do - for vlan in 11 10; do - ip link set dev $intf.$vlan down - ip link set dev $intf.$vlan nomaster - vlan_destroy $intf $vlan - done - - mtu_restore $intf - ip link set dev $intf down - done - - ip link del dev br11 - ip link del dev br10 + ip link set dev br11 up + defer ip link set dev br11 down } setup_prepare() @@ -177,23 +150,13 @@ setup_prepare() h2_mac=$(mac_get $h2) vrf_prepare + defer vrf_cleanup h1_create h2_create switch_create } -cleanup() -{ - pre_cleanup - - switch_destroy - h2_destroy - h1_destroy - - vrf_cleanup -} - ping_ipv4() { ping_test $h1.10 $(ipaddr 2 10) " vlan 10" @@ -207,18 +170,18 @@ tbf_get_counter() tc_rule_stats_get $h2 10$vlan ingress .bytes } -do_tbf_test() +__tbf_test() { local vlan=$1; shift local mbit=$1; shift start_traffic $h1.$vlan $(ipaddr 1 $vlan) $(ipaddr 2 $vlan) $h2_mac + defer stop_traffic $! sleep 5 # Wait for the burst to dwindle local t2=$(busywait_for_counter 1000 +1 tbf_get_counter $vlan) sleep 10 local t3=$(tbf_get_counter $vlan) - stop_traffic RET=0 @@ -231,3 +194,9 @@ do_tbf_test() log_test "TC $((vlan - 10)): TBF rate ${mbit}Mbit" } + +do_tbf_test() +{ + in_defer_scope \ + __tbf_test "$@" +} diff --git a/tools/testing/selftests/net/forwarding/sch_tbf_etsprio.sh b/tools/testing/selftests/net/forwarding/sch_tbf_etsprio.sh index df9bcd6a811a..c182a04282bc 100644 --- a/tools/testing/selftests/net/forwarding/sch_tbf_etsprio.sh +++ b/tools/testing/selftests/net/forwarding/sch_tbf_etsprio.sh @@ -30,8 +30,9 @@ tbf_test() # This test is used for both ETS and PRIO. Even though we only need two # bands, PRIO demands a minimum of three. tc qdisc add dev $swp2 root handle 10: $QDISC 3 priomap 2 1 0 + defer tc qdisc del dev $swp2 root + tbf_test_one 128K - tc qdisc del dev $swp2 root } tbf_root_test() @@ -42,6 +43,8 @@ tbf_root_test() tc qdisc replace dev $swp2 root handle 1: \ tbf rate 400Mbit burst $bs limit 1M + defer tc qdisc del dev $swp2 root + tc qdisc replace dev $swp2 parent 1:1 handle 10: \ $QDISC 3 priomap 2 1 0 tc qdisc replace dev $swp2 parent 10:3 handle 103: \ @@ -53,8 +56,6 @@ tbf_root_test() do_tbf_test 10 400 $bs do_tbf_test 11 400 $bs - - tc qdisc del dev $swp2 root } if type -t sch_tbf_pre_hook >/dev/null; then diff --git a/tools/testing/selftests/net/forwarding/sch_tbf_root.sh b/tools/testing/selftests/net/forwarding/sch_tbf_root.sh index 96c997be0d03..9f20320f8d84 100755 --- a/tools/testing/selftests/net/forwarding/sch_tbf_root.sh +++ b/tools/testing/selftests/net/forwarding/sch_tbf_root.sh @@ -14,13 +14,14 @@ tbf_test_one() tc qdisc replace dev $swp2 root handle 108: tbf \ rate 400Mbit burst $bs limit 1M + defer tc qdisc del dev $swp2 root + do_tbf_test 10 400 $bs } tbf_test() { tbf_test_one 128K - tc qdisc del dev $swp2 root } if type -t sch_tbf_pre_hook >/dev/null; then diff --git a/tools/testing/selftests/net/forwarding/tc_police.sh b/tools/testing/selftests/net/forwarding/tc_police.sh index 5103f64a71d6..509fdedfcfa1 100755 --- a/tools/testing/selftests/net/forwarding/tc_police.sh +++ b/tools/testing/selftests/net/forwarding/tc_police.sh @@ -148,7 +148,7 @@ police_common_test() log_test "$test_name" - { kill %% && wait %%; } 2>/dev/null + kill_process %% tc filter del dev $h2 ingress protocol ip pref 1 handle 101 flower } @@ -198,7 +198,7 @@ police_shared_common_test() log_test "$test_name" - { kill %% && wait %%; } 2>/dev/null + kill_process %% } police_shared_test() @@ -278,7 +278,7 @@ police_mirror_common_test() log_test "$test_name" - { kill %% && wait %%; } 2>/dev/null + kill_process %% tc filter del dev $pol_if $dir protocol ip pref 1 handle 101 flower tc filter del dev $h3 ingress protocol ip pref 1 handle 101 flower tc filter del dev $h2 ingress protocol ip pref 1 handle 101 flower @@ -320,7 +320,7 @@ police_pps_common_test() log_test "$test_name" - { kill %% && wait %%; } 2>/dev/null + kill_process %% tc filter del dev $h2 ingress protocol ip pref 1 handle 101 flower } diff --git a/tools/testing/selftests/net/hsr/config b/tools/testing/selftests/net/hsr/config index 241542441c51..555a868743f0 100644 --- a/tools/testing/selftests/net/hsr/config +++ b/tools/testing/selftests/net/hsr/config @@ -3,3 +3,4 @@ CONFIG_NET_SCH_NETEM=m CONFIG_HSR=y CONFIG_VETH=y CONFIG_BRIDGE=y +CONFIG_VLAN_8021Q=m diff --git a/tools/testing/selftests/net/hsr/hsr_common.sh b/tools/testing/selftests/net/hsr/hsr_common.sh index 8e97b1f2e7e5..1dc882ac1c74 100644 --- a/tools/testing/selftests/net/hsr/hsr_common.sh +++ b/tools/testing/selftests/net/hsr/hsr_common.sh @@ -15,7 +15,7 @@ do_ping() { local netns="$1" local connect_addr="$2" - local ping_args="-q -c 2" + local ping_args="-q -c 2 -i 0.1" if is_v6 "${connect_addr}"; then $ipv6 || return 0 @@ -36,7 +36,7 @@ do_ping_long() { local netns="$1" local connect_addr="$2" - local ping_args="-q -c 10" + local ping_args="-q -c 10 -i 0.1" if is_v6 "${connect_addr}"; then $ipv6 || return 0 diff --git a/tools/testing/selftests/net/hsr/hsr_ping.sh b/tools/testing/selftests/net/hsr/hsr_ping.sh index f5d207fc770a..5a65f4f836be 100755 --- a/tools/testing/selftests/net/hsr/hsr_ping.sh +++ b/tools/testing/selftests/net/hsr/hsr_ping.sh @@ -175,6 +175,100 @@ setup_hsr_interfaces() ip -net "$ns3" link set hsr3 up } +setup_vlan_interfaces() { + ip -net "$ns1" link add link hsr1 name hsr1.2 type vlan id 2 + ip -net "$ns1" link add link hsr1 name hsr1.3 type vlan id 3 + ip -net "$ns1" link add link hsr1 name hsr1.4 type vlan id 4 + ip -net "$ns1" link add link hsr1 name hsr1.5 type vlan id 5 + + ip -net "$ns2" link add link hsr2 name hsr2.2 type vlan id 2 + ip -net "$ns2" link add link hsr2 name hsr2.3 type vlan id 3 + ip -net "$ns2" link add link hsr2 name hsr2.4 type vlan id 4 + ip -net "$ns2" link add link hsr2 name hsr2.5 type vlan id 5 + + ip -net "$ns3" link add link hsr3 name hsr3.2 type vlan id 2 + ip -net "$ns3" link add link hsr3 name hsr3.3 type vlan id 3 + ip -net "$ns3" link add link hsr3 name hsr3.4 type vlan id 4 + ip -net "$ns3" link add link hsr3 name hsr3.5 type vlan id 5 + + ip -net "$ns1" addr add 100.64.2.1/24 dev hsr1.2 + ip -net "$ns1" addr add 100.64.3.1/24 dev hsr1.3 + ip -net "$ns1" addr add 100.64.4.1/24 dev hsr1.4 + ip -net "$ns1" addr add 100.64.5.1/24 dev hsr1.5 + + ip -net "$ns2" addr add 100.64.2.2/24 dev hsr2.2 + ip -net "$ns2" addr add 100.64.3.2/24 dev hsr2.3 + ip -net "$ns2" addr add 100.64.4.2/24 dev hsr2.4 + ip -net "$ns2" addr add 100.64.5.2/24 dev hsr2.5 + + ip -net "$ns3" addr add 100.64.2.3/24 dev hsr3.2 + ip -net "$ns3" addr add 100.64.3.3/24 dev hsr3.3 + ip -net "$ns3" addr add 100.64.4.3/24 dev hsr3.4 + ip -net "$ns3" addr add 100.64.5.3/24 dev hsr3.5 + + ip -net "$ns1" link set dev hsr1.2 up + ip -net "$ns1" link set dev hsr1.3 up + ip -net "$ns1" link set dev hsr1.4 up + ip -net "$ns1" link set dev hsr1.5 up + + ip -net "$ns2" link set dev hsr2.2 up + ip -net "$ns2" link set dev hsr2.3 up + ip -net "$ns2" link set dev hsr2.4 up + ip -net "$ns2" link set dev hsr2.5 up + + ip -net "$ns3" link set dev hsr3.2 up + ip -net "$ns3" link set dev hsr3.3 up + ip -net "$ns3" link set dev hsr3.4 up + ip -net "$ns3" link set dev hsr3.5 up + +} + +hsr_vlan_ping() { + do_ping "$ns1" 100.64.2.2 + do_ping "$ns1" 100.64.3.2 + do_ping "$ns1" 100.64.4.2 + do_ping "$ns1" 100.64.5.2 + + do_ping "$ns1" 100.64.2.3 + do_ping "$ns1" 100.64.3.3 + do_ping "$ns1" 100.64.4.3 + do_ping "$ns1" 100.64.5.3 + + do_ping "$ns2" 100.64.2.1 + do_ping "$ns2" 100.64.3.1 + do_ping "$ns2" 100.64.4.1 + do_ping "$ns2" 100.64.5.1 + + do_ping "$ns2" 100.64.2.3 + do_ping "$ns2" 100.64.3.3 + do_ping "$ns2" 100.64.4.3 + do_ping "$ns2" 100.64.5.3 + + do_ping "$ns3" 100.64.2.1 + do_ping "$ns3" 100.64.3.1 + do_ping "$ns3" 100.64.4.1 + do_ping "$ns3" 100.64.5.1 + + do_ping "$ns3" 100.64.2.2 + do_ping "$ns3" 100.64.3.2 + do_ping "$ns3" 100.64.4.2 + do_ping "$ns3" 100.64.5.2 +} + +run_vlan_tests() { + vlan_challenged_hsr1=$(ip net exec "$ns1" ethtool -k hsr1 | grep "vlan-challenged" | awk '{print $2}') + vlan_challenged_hsr2=$(ip net exec "$ns2" ethtool -k hsr2 | grep "vlan-challenged" | awk '{print $2}') + vlan_challenged_hsr3=$(ip net exec "$ns3" ethtool -k hsr3 | grep "vlan-challenged" | awk '{print $2}') + + if [[ "$vlan_challenged_hsr1" = "off" || "$vlan_challenged_hsr2" = "off" || "$vlan_challenged_hsr3" = "off" ]]; then + echo "INFO: Running VLAN tests" + setup_vlan_interfaces + hsr_vlan_ping + else + echo "INFO: Not Running VLAN tests as the device does not support VLAN" + fi +} + check_prerequisites setup_ns ns1 ns2 ns3 @@ -183,9 +277,13 @@ trap cleanup_all_ns EXIT setup_hsr_interfaces 0 do_complete_ping_test +run_vlan_tests + setup_ns ns1 ns2 ns3 setup_hsr_interfaces 1 do_complete_ping_test +run_vlan_tests + exit $ret diff --git a/tools/testing/selftests/net/hsr/settings b/tools/testing/selftests/net/hsr/settings new file mode 100644 index 000000000000..0fbc037f2aa8 --- /dev/null +++ b/tools/testing/selftests/net/hsr/settings @@ -0,0 +1 @@ +timeout=50 diff --git a/tools/testing/selftests/net/ioam6.sh b/tools/testing/selftests/net/ioam6.sh index 12491850ae98..845c26dd01a9 100755 --- a/tools/testing/selftests/net/ioam6.sh +++ b/tools/testing/selftests/net/ioam6.sh @@ -3,119 +3,106 @@ # # Author: Justin Iurman <justin.iurman@uliege.be> # -# This script evaluates the IOAM insertion for IPv6 by checking the IOAM data -# consistency directly inside packets on the receiver side. Tests are divided -# into three categories: OUTPUT (evaluates the IOAM processing by the sender), -# INPUT (evaluates the IOAM processing by a receiver) and GLOBAL (evaluates -# wider use cases that do not fall into the other two categories). Both OUTPUT -# and INPUT tests only use a two-node topology (alpha and beta), while GLOBAL -# tests use the entire three-node topology (alpha, beta, gamma). Each test is -# documented inside its own handler in the code below. +# This script evaluates IOAM for IPv6 by checking local IOAM configurations and +# IOAM data inside packets. There are three categories of tests: LOCAL, OUTPUT, +# and INPUT. The former (LOCAL) checks all IOAM related configurations locally +# without sending packets. OUTPUT tests verify the processing of an IOAM +# encapsulating node, while INPUT tests verify the processing of an IOAM transit +# node. Both OUTPUT and INPUT tests send packets. Each test is documented inside +# its own handler. # -# An IOAM domain is configured from Alpha to Gamma but not on the reverse path. -# When either Beta or Gamma is the destination (depending on the test category), -# Alpha adds an IOAM option (Pre-allocated Trace) inside a Hop-by-hop. +# The topology used for OUTPUT and INPUT tests is made of three nodes: +# - Alpha (the IOAM encapsulating node) +# - Beta (the IOAM transit node) +# - Gamma (the receiver) ** # +# An IOAM domain is configured from Alpha to Beta, but not on the reverse path. +# Alpha adds an IOAM option (Pre-allocated Trace) inside a Hop-by-hop. # -# +-------------------+ +-------------------+ -# | | | | -# | Alpha netns | | Gamma netns | -# | | | | -# | +-------------+ | | +-------------+ | -# | | veth0 | | | | veth0 | | -# | | db01::2/64 | | | | db02::2/64 | | -# | +-------------+ | | +-------------+ | -# | . | | . | -# +-------------------+ +-------------------+ -# . . -# . . -# . . -# +----------------------------------------------------+ -# | . . | -# | +-------------+ +-------------+ | -# | | veth0 | | veth1 | | -# | | db01::1/64 | ................ | db02::1/64 | | -# | +-------------+ +-------------+ | -# | | -# | Beta netns | -# | | -# +----------------------------------------------------+ +# ** Gamma is required because ioam6_parser.c uses a packet socket and we need +# to see IOAM data inserted by the very last node (Beta), which would happen +# _after_ we get a copy of the packet on Beta. Note that using an +# IPv6 raw socket with IPV6_RECVHOPOPTS on Beta would not be enough: we also +# need to access the IPv6 header to check some fields (e.g., source and +# destination addresses), which is not possible in that case. As a +# consequence, we need Gamma as a receiver to run ioam6_parser.c which uses a +# packet socket. # # +# +-----------------------+ +-----------------------+ +# | | | | +# | Alpha netns | | Gamma netns | +# | | | | +# | +-------------------+ | | +-------------------+ | +# | | veth0 | | | | veth0 | | +# | | 2001:db8:1::2/64 | | | | 2001:db8:2::2/64 | | +# | +-------------------+ | | +-------------------+ | +# | . | | . | +# +-----------.-----------+ +-----------.-----------+ +# . . +# . . +# . . +# +-----------.----------------------------------.-----------+ +# | . . | +# | +-------------------+ +-------------------+ | +# | | veth0 | | veth1 | | +# | | 2001:db8:1::1/64 | ............ | 2001:db8:2::1/64 | | +# | +-------------------+ +-------------------+ | +# | | +# | Beta netns | +# | | +# +----------------------------------------------------------+ # -# ============================================================= -# | Alpha - IOAM configuration | -# +===========================================================+ -# | Node ID | 1 | -# +-----------------------------------------------------------+ -# | Node Wide ID | 11111111 | -# +-----------------------------------------------------------+ -# | Ingress ID | 0xffff (default value) | -# +-----------------------------------------------------------+ -# | Ingress Wide ID | 0xffffffff (default value) | -# +-----------------------------------------------------------+ -# | Egress ID | 101 | -# +-----------------------------------------------------------+ -# | Egress Wide ID | 101101 | -# +-----------------------------------------------------------+ -# | Namespace Data | 0xdeadbee0 | -# +-----------------------------------------------------------+ -# | Namespace Wide Data | 0xcafec0caf00dc0de | -# +-----------------------------------------------------------+ -# | Schema ID | 777 | -# +-----------------------------------------------------------+ -# | Schema Data | something that will be 4n-aligned | -# +-----------------------------------------------------------+ # # -# ============================================================= -# | Beta - IOAM configuration | -# +===========================================================+ -# | Node ID | 2 | -# +-----------------------------------------------------------+ -# | Node Wide ID | 22222222 | -# +-----------------------------------------------------------+ -# | Ingress ID | 201 | -# +-----------------------------------------------------------+ -# | Ingress Wide ID | 201201 | -# +-----------------------------------------------------------+ -# | Egress ID | 202 | -# +-----------------------------------------------------------+ -# | Egress Wide ID | 202202 | -# +-----------------------------------------------------------+ -# | Namespace Data | 0xdeadbee1 | -# +-----------------------------------------------------------+ -# | Namespace Wide Data | 0xcafec0caf11dc0de | -# +-----------------------------------------------------------+ -# | Schema ID | 666 | -# +-----------------------------------------------------------+ -# | Schema Data | Hello there -Obi | -# +-----------------------------------------------------------+ +# +==========================================================+ +# | Alpha - IOAM configuration | +# +=====================+====================================+ +# | Node ID | 1 | +# +---------------------+------------------------------------+ +# | Node Wide ID | 11111111 | +# +---------------------+------------------------------------+ +# | Ingress ID | 0xffff (default value) | +# +---------------------+------------------------------------+ +# | Ingress Wide ID | 0xffffffff (default value) | +# +---------------------+------------------------------------+ +# | Egress ID | 101 | +# +---------------------+------------------------------------+ +# | Egress Wide ID | 101101 | +# +---------------------+------------------------------------+ +# | Namespace Data | 0xdeadbeef | +# +---------------------+------------------------------------+ +# | Namespace Wide Data | 0xcafec0caf00dc0de | +# +---------------------+------------------------------------+ +# | Schema ID | 777 | +# +---------------------+------------------------------------+ +# | Schema Data | something that will be 4n-aligned | +# +---------------------+------------------------------------+ # # -# ============================================================= -# | Gamma - IOAM configuration | -# +===========================================================+ -# | Node ID | 3 | -# +-----------------------------------------------------------+ -# | Node Wide ID | 33333333 | -# +-----------------------------------------------------------+ -# | Ingress ID | 301 | -# +-----------------------------------------------------------+ -# | Ingress Wide ID | 301301 | -# +-----------------------------------------------------------+ -# | Egress ID | 0xffff (default value) | -# +-----------------------------------------------------------+ -# | Egress Wide ID | 0xffffffff (default value) | -# +-----------------------------------------------------------+ -# | Namespace Data | 0xdeadbee2 | -# +-----------------------------------------------------------+ -# | Namespace Wide Data | 0xcafec0caf22dc0de | -# +-----------------------------------------------------------+ -# | Schema ID | 0xffffff (= None) | -# +-----------------------------------------------------------+ -# | Schema Data | | -# +-----------------------------------------------------------+ +# +==========================================================+ +# | Beta - IOAM configuration | +# +=====================+====================================+ +# | Node ID | 2 | +# +---------------------+------------------------------------+ +# | Node Wide ID | 22222222 | +# +---------------------+------------------------------------+ +# | Ingress ID | 201 | +# +---------------------+------------------------------------+ +# | Ingress Wide ID | 201201 | +# +---------------------+------------------------------------+ +# | Egress ID | 202 | +# +---------------------+------------------------------------+ +# | Egress Wide ID | 202202 | +# +---------------------+------------------------------------+ +# | Namespace Data | 0xffffffff (default value) | +# +---------------------+------------------------------------+ +# | Namespace Wide Data | 0xffffffffffffffff (default value) | +# +---------------------+------------------------------------+ +# | Schema ID | 0xffffff (= None) | +# +---------------------+------------------------------------+ +# | Schema Data | | +# +---------------------+------------------------------------+ source lib.sh @@ -128,64 +115,69 @@ source lib.sh ################################################################################ ALPHA=( - 1 # ID - 11111111 # Wide ID - 0xffff # Ingress ID - 0xffffffff # Ingress Wide ID - 101 # Egress ID - 101101 # Egress Wide ID - 0xdeadbee0 # Namespace Data - 0xcafec0caf00dc0de # Namespace Wide Data - 777 # Schema ID (0xffffff = None) - "something that will be 4n-aligned" # Schema Data + 1 # ID + 11111111 # Wide ID + 0xffff # Ingress ID (default value) + 0xffffffff # Ingress Wide ID (default value) + 101 # Egress ID + 101101 # Egress Wide ID + 0xdeadbeef # Namespace Data + 0xcafec0caf00dc0de # Namespace Wide Data + 777 # Schema ID + "something that will be 4n-aligned" # Schema Data ) BETA=( - 2 - 22222222 - 201 - 201201 - 202 - 202202 - 0xdeadbee1 - 0xcafec0caf11dc0de - 666 - "Hello there -Obi" + 2 # ID + 22222222 # Wide ID + 201 # Ingress ID + 201201 # Ingress Wide ID + 202 # Egress ID + 202202 # Egress Wide ID + 0xffffffff # Namespace Data (empty value) + 0xffffffffffffffff # Namespace Wide Data (empty value) + 0xffffff # Schema ID (empty value) + "" # Schema Data (empty value) ) -GAMMA=( - 3 - 33333333 - 301 - 301301 - 0xffff - 0xffffffff - 0xdeadbee2 - 0xcafec0caf22dc0de - 0xffffff - "" -) +TESTS_LOCAL=" + local_sysctl_ioam_id + local_sysctl_ioam_id_wide + local_sysctl_ioam_intf_id + local_sysctl_ioam_intf_id_wide + local_sysctl_ioam_intf_enabled + local_ioam_namespace + local_ioam_schema + local_ioam_schema_namespace + local_route_ns + local_route_tunsrc + local_route_tundst + local_route_trace_type + local_route_trace_size + local_route_trace_type_bits + local_route_trace_size_values +" TESTS_OUTPUT=" - out_undef_ns - out_no_room - out_bits - out_full_supp_trace + output_undef_ns + output_no_room + output_no_room_oss + output_bits + output_sizes + output_full_supp_trace " TESTS_INPUT=" - in_undef_ns - in_no_room - in_oflag - in_bits - in_full_supp_trace + input_undef_ns + input_no_room + input_no_room_oss + input_disabled + input_oflag + input_bits + input_sizes + input_full_supp_trace " -TESTS_GLOBAL=" - fwd_full_supp_trace -" - - ################################################################################ # # # LIBRARY # @@ -194,66 +186,64 @@ TESTS_GLOBAL=" check_kernel_compatibility() { - setup_ns ioam_tmp_node - ip link add name veth0 netns $ioam_tmp_node type veth \ - peer name veth1 netns $ioam_tmp_node + setup_ns ioam_tmp_node &>/dev/null + local ret=$? - ip -netns $ioam_tmp_node link set veth0 up - ip -netns $ioam_tmp_node link set veth1 up + ip link add name veth0 netns $ioam_tmp_node type veth \ + peer name veth1 netns $ioam_tmp_node &>/dev/null + ret=$((ret + $?)) - ip -netns $ioam_tmp_node ioam namespace add 0 - ns_ad=$? + ip -netns $ioam_tmp_node link set veth0 up &>/dev/null + ret=$((ret + $?)) - ip -netns $ioam_tmp_node ioam namespace show | grep -q "namespace 0" - ns_sh=$? + ip -netns $ioam_tmp_node link set veth1 up &>/dev/null + ret=$((ret + $?)) - if [[ $ns_ad != 0 || $ns_sh != 0 ]] + if [ $ret != 0 ] then - echo "SKIP: kernel version probably too old, missing ioam support" - ip link del veth0 2>/dev/null || true - cleanup_ns $ioam_tmp_node || true + echo "SKIP: Setup failed." + cleanup_ns $ioam_tmp_node exit $ksft_skip fi - ip -netns $ioam_tmp_node route add db02::/64 encap ioam6 mode inline \ - trace prealloc type 0x800000 ns 0 size 4 dev veth0 - tr_ad=$? + ip -netns $ioam_tmp_node route add 2001:db8:2::/64 \ + encap ioam6 trace prealloc type 0x800000 ns 0 size 4 dev veth0 &>/dev/null + ret=$? - ip -netns $ioam_tmp_node -6 route | grep -q "encap ioam6" - tr_sh=$? + ip -netns $ioam_tmp_node -6 route 2>/dev/null | grep -q "encap ioam6" + ret=$((ret + $?)) - if [[ $tr_ad != 0 || $tr_sh != 0 ]] + if [ $ret != 0 ] then - echo "SKIP: cannot attach an ioam trace to a route, did you compile" \ - "without CONFIG_IPV6_IOAM6_LWTUNNEL?" - ip link del veth0 2>/dev/null || true - cleanup_ns $ioam_tmp_node || true + echo "SKIP: Cannot attach an IOAM trace to a route. Was your kernel" \ + "compiled without CONFIG_IPV6_IOAM6_LWTUNNEL? Are you running an" \ + "old kernel? Are you using an old version of iproute2?" + cleanup_ns $ioam_tmp_node exit $ksft_skip fi - ip link del veth0 2>/dev/null || true - cleanup_ns $ioam_tmp_node || true + cleanup_ns $ioam_tmp_node - lsmod | grep -q "ip6_tunnel" + lsmod 2>/dev/null | grep -q "ip6_tunnel" ip6tnl_loaded=$? - if [ $ip6tnl_loaded = 0 ] + if [ $ip6tnl_loaded == 0 ] then encap_tests=0 else modprobe ip6_tunnel &>/dev/null - lsmod | grep -q "ip6_tunnel" + lsmod 2>/dev/null | grep -q "ip6_tunnel" encap_tests=$? if [ $encap_tests != 0 ] then - ip a | grep -q "ip6tnl0" + ip a 2>/dev/null | grep -q "ip6tnl0" encap_tests=$? if [ $encap_tests != 0 ] then echo "Note: ip6_tunnel not found neither as a module nor inside the" \ - "kernel, tests that require it (encap mode) will be omitted" + "kernel. Any tests that require it will be skipped." fi fi fi @@ -261,477 +251,1400 @@ check_kernel_compatibility() cleanup() { - ip link del ioam-veth-alpha 2>/dev/null || true - ip link del ioam-veth-gamma 2>/dev/null || true - - cleanup_ns $ioam_node_alpha $ioam_node_beta $ioam_node_gamma || true + cleanup_ns $ioam_node_alpha $ioam_node_beta $ioam_node_gamma if [ $ip6tnl_loaded != 0 ] then - modprobe -r ip6_tunnel 2>/dev/null || true + modprobe -r ip6_tunnel &>/dev/null fi } setup() { - setup_ns ioam_node_alpha ioam_node_beta ioam_node_gamma + setup_ns ioam_node_alpha ioam_node_beta ioam_node_gamma &>/dev/null ip link add name ioam-veth-alpha netns $ioam_node_alpha type veth \ - peer name ioam-veth-betaL netns $ioam_node_beta + peer name ioam-veth-betaL netns $ioam_node_beta &>/dev/null ip link add name ioam-veth-betaR netns $ioam_node_beta type veth \ - peer name ioam-veth-gamma netns $ioam_node_gamma - - ip -netns $ioam_node_alpha link set ioam-veth-alpha name veth0 - ip -netns $ioam_node_beta link set ioam-veth-betaL name veth0 - ip -netns $ioam_node_beta link set ioam-veth-betaR name veth1 - ip -netns $ioam_node_gamma link set ioam-veth-gamma name veth0 - - ip -netns $ioam_node_alpha addr add db01::2/64 dev veth0 - ip -netns $ioam_node_alpha link set veth0 up - ip -netns $ioam_node_alpha link set lo up - ip -netns $ioam_node_alpha route add db02::/64 via db01::1 dev veth0 - ip -netns $ioam_node_alpha route del db01::/64 - ip -netns $ioam_node_alpha route add db01::/64 dev veth0 - - ip -netns $ioam_node_beta addr add db01::1/64 dev veth0 - ip -netns $ioam_node_beta addr add db02::1/64 dev veth1 - ip -netns $ioam_node_beta link set veth0 up - ip -netns $ioam_node_beta link set veth1 up - ip -netns $ioam_node_beta link set lo up - - ip -netns $ioam_node_gamma addr add db02::2/64 dev veth0 - ip -netns $ioam_node_gamma link set veth0 up - ip -netns $ioam_node_gamma link set lo up - ip -netns $ioam_node_gamma route add db01::/64 via db02::1 dev veth0 - - # - IOAM config - - ip netns exec $ioam_node_alpha sysctl -wq net.ipv6.ioam6_id=${ALPHA[0]} - ip netns exec $ioam_node_alpha sysctl -wq net.ipv6.ioam6_id_wide=${ALPHA[1]} - ip netns exec $ioam_node_alpha sysctl -wq net.ipv6.conf.veth0.ioam6_id=${ALPHA[4]} - ip netns exec $ioam_node_alpha sysctl -wq net.ipv6.conf.veth0.ioam6_id_wide=${ALPHA[5]} - ip -netns $ioam_node_alpha ioam namespace add 123 data ${ALPHA[6]} wide ${ALPHA[7]} - ip -netns $ioam_node_alpha ioam schema add ${ALPHA[8]} "${ALPHA[9]}" - ip -netns $ioam_node_alpha ioam namespace set 123 schema ${ALPHA[8]} - - ip netns exec $ioam_node_beta sysctl -wq net.ipv6.conf.all.forwarding=1 - ip netns exec $ioam_node_beta sysctl -wq net.ipv6.ioam6_id=${BETA[0]} - ip netns exec $ioam_node_beta sysctl -wq net.ipv6.ioam6_id_wide=${BETA[1]} - ip netns exec $ioam_node_beta sysctl -wq net.ipv6.conf.veth0.ioam6_enabled=1 - ip netns exec $ioam_node_beta sysctl -wq net.ipv6.conf.veth0.ioam6_id=${BETA[2]} - ip netns exec $ioam_node_beta sysctl -wq net.ipv6.conf.veth0.ioam6_id_wide=${BETA[3]} - ip netns exec $ioam_node_beta sysctl -wq net.ipv6.conf.veth1.ioam6_id=${BETA[4]} - ip netns exec $ioam_node_beta sysctl -wq net.ipv6.conf.veth1.ioam6_id_wide=${BETA[5]} - ip -netns $ioam_node_beta ioam namespace add 123 data ${BETA[6]} wide ${BETA[7]} - ip -netns $ioam_node_beta ioam schema add ${BETA[8]} "${BETA[9]}" - ip -netns $ioam_node_beta ioam namespace set 123 schema ${BETA[8]} - - ip netns exec $ioam_node_gamma sysctl -wq net.ipv6.ioam6_id=${GAMMA[0]} - ip netns exec $ioam_node_gamma sysctl -wq net.ipv6.ioam6_id_wide=${GAMMA[1]} - ip netns exec $ioam_node_gamma sysctl -wq net.ipv6.conf.veth0.ioam6_enabled=1 - ip netns exec $ioam_node_gamma sysctl -wq net.ipv6.conf.veth0.ioam6_id=${GAMMA[2]} - ip netns exec $ioam_node_gamma sysctl -wq net.ipv6.conf.veth0.ioam6_id_wide=${GAMMA[3]} - ip -netns $ioam_node_gamma ioam namespace add 123 data ${GAMMA[6]} wide ${GAMMA[7]} + peer name ioam-veth-gamma netns $ioam_node_gamma &>/dev/null + + ip -netns $ioam_node_alpha link set ioam-veth-alpha name veth0 &>/dev/null + ip -netns $ioam_node_beta link set ioam-veth-betaL name veth0 &>/dev/null + ip -netns $ioam_node_beta link set ioam-veth-betaR name veth1 &>/dev/null + ip -netns $ioam_node_gamma link set ioam-veth-gamma name veth0 &>/dev/null + + ip -netns $ioam_node_alpha addr add 2001:db8:1::50/64 dev veth0 &>/dev/null + ip -netns $ioam_node_alpha addr add 2001:db8:1::2/64 dev veth0 &>/dev/null + ip -netns $ioam_node_alpha link set veth0 up &>/dev/null + ip -netns $ioam_node_alpha link set lo up &>/dev/null + ip -netns $ioam_node_alpha route add 2001:db8:2::/64 \ + via 2001:db8:1::1 dev veth0 &>/dev/null + + ip -netns $ioam_node_beta addr add 2001:db8:1::1/64 dev veth0 &>/dev/null + ip -netns $ioam_node_beta addr add 2001:db8:2::1/64 dev veth1 &>/dev/null + ip -netns $ioam_node_beta link set veth0 up &>/dev/null + ip -netns $ioam_node_beta link set veth1 up &>/dev/null + ip -netns $ioam_node_beta link set lo up &>/dev/null + + ip -netns $ioam_node_gamma addr add 2001:db8:2::2/64 dev veth0 &>/dev/null + ip -netns $ioam_node_gamma link set veth0 up &>/dev/null + ip -netns $ioam_node_gamma link set lo up &>/dev/null + ip -netns $ioam_node_gamma route add 2001:db8:1::/64 \ + via 2001:db8:2::1 dev veth0 &>/dev/null + + # - Alpha: IOAM config - + ip netns exec $ioam_node_alpha \ + sysctl -wq net.ipv6.ioam6_id=${ALPHA[0]} &>/dev/null + ip netns exec $ioam_node_alpha \ + sysctl -wq net.ipv6.ioam6_id_wide=${ALPHA[1]} &>/dev/null + ip netns exec $ioam_node_alpha \ + sysctl -wq net.ipv6.conf.veth0.ioam6_id=${ALPHA[4]} &>/dev/null + ip netns exec $ioam_node_alpha \ + sysctl -wq net.ipv6.conf.veth0.ioam6_id_wide=${ALPHA[5]} &>/dev/null + ip -netns $ioam_node_alpha \ + ioam namespace add 123 data ${ALPHA[6]} wide ${ALPHA[7]} &>/dev/null + ip -netns $ioam_node_alpha \ + ioam schema add ${ALPHA[8]} "${ALPHA[9]}" &>/dev/null + ip -netns $ioam_node_alpha \ + ioam namespace set 123 schema ${ALPHA[8]} &>/dev/null + + # - Beta: IOAM config - + ip netns exec $ioam_node_beta \ + sysctl -wq net.ipv6.conf.all.forwarding=1 &>/dev/null + ip netns exec $ioam_node_beta \ + sysctl -wq net.ipv6.ioam6_id=${BETA[0]} &>/dev/null + ip netns exec $ioam_node_beta \ + sysctl -wq net.ipv6.ioam6_id_wide=${BETA[1]} &>/dev/null + ip netns exec $ioam_node_beta \ + sysctl -wq net.ipv6.conf.veth0.ioam6_enabled=1 &>/dev/null + ip netns exec $ioam_node_beta \ + sysctl -wq net.ipv6.conf.veth0.ioam6_id=${BETA[2]} &>/dev/null + ip netns exec $ioam_node_beta \ + sysctl -wq net.ipv6.conf.veth0.ioam6_id_wide=${BETA[3]} &>/dev/null + ip netns exec $ioam_node_beta \ + sysctl -wq net.ipv6.conf.veth1.ioam6_id=${BETA[4]} &>/dev/null + ip netns exec $ioam_node_beta \ + sysctl -wq net.ipv6.conf.veth1.ioam6_id_wide=${BETA[5]} &>/dev/null + ip -netns $ioam_node_beta ioam namespace add 123 &>/dev/null sleep 1 - ip netns exec $ioam_node_alpha ping6 -c 5 -W 1 db02::2 &>/dev/null + ip netns exec $ioam_node_alpha ping6 -c 5 -W 1 2001:db8:2::2 &>/dev/null if [ $? != 0 ] then - echo "Setup FAILED" - cleanup &>/dev/null - exit 0 + echo "SKIP: Setup failed." + cleanup + exit $ksft_skip fi } log_test_passed() { - local desc=$1 - printf "TEST: %-60s [ OK ]\n" "${desc}" + printf " - TEST: %-57s [ OK ]\n" "$1" + npassed=$((npassed+1)) } -log_test_failed() +log_test_skipped() { - local desc=$1 - printf "TEST: %-60s [FAIL]\n" "${desc}" + printf " - TEST: %-57s [SKIP]\n" "$1" + nskipped=$((nskipped+1)) } -log_results() +log_test_failed() { - echo "- Tests passed: ${npassed}" - echo "- Tests failed: ${nfailed}" + printf " - TEST: %-57s [FAIL]\n" "$1" + nfailed=$((nfailed+1)) } run_test() { local name=$1 local desc=$2 - local node_src=$3 - local node_dst=$4 - local ip6_dst=$5 - local trace_type=$6 - local ioam_ns=$7 - local type=$8 - - ip netns exec $node_dst ./ioam6_parser $name $trace_type $ioam_ns $type & + local ip6_src=$3 + local trace_type=$4 + local trace_size=$5 + local ioam_ns=$6 + local type=$7 + + ip netns exec $ioam_node_gamma \ + ./ioam6_parser veth0 $name $ip6_src 2001:db8:2::2 \ + $trace_type $trace_size $ioam_ns $type & local spid=$! sleep 0.1 - ip netns exec $node_src ping6 -t 64 -c 1 -W 1 $ip6_dst &>/dev/null + ip netns exec $ioam_node_alpha ping6 -t 64 -c 1 -W 1 2001:db8:2::2 &>/dev/null if [ $? != 0 ] then - nfailed=$((nfailed+1)) log_test_failed "${desc}" kill -2 $spid &>/dev/null else wait $spid - if [ $? = 0 ] - then - npassed=$((npassed+1)) - log_test_passed "${desc}" - else - nfailed=$((nfailed+1)) - log_test_failed "${desc}" - fi + [ $? == 0 ] && log_test_passed "${desc}" || log_test_failed "${desc}" fi } run() { + local test + + echo + printf "+" + printf "%0.s-" {1..72} + printf "+" + echo + printf "| %-28s LOCAL tests %-29s |" echo - printf "%0.s-" {1..74} + printf "+" + printf "%0.s-" {1..72} + printf "+" echo - echo "OUTPUT tests" - printf "%0.s-" {1..74} + + echo + echo "Global config" + for test in $TESTS_LOCAL + do + $test + done + + echo + echo "Inline mode" + for test in $TESTS_LOCAL + do + $test "inline" + done + + echo + echo "Encap mode" + for test in $TESTS_LOCAL + do + $test "encap" + done + + echo + printf "+" + printf "%0.s-" {1..72} + printf "+" + echo + printf "| %-28s OUTPUT tests %-28s |" + echo + printf "+" + printf "%0.s-" {1..72} + printf "+" echo # set OUTPUT settings - ip netns exec $ioam_node_beta sysctl -wq net.ipv6.conf.veth0.ioam6_enabled=0 + ip netns exec $ioam_node_beta \ + sysctl -wq net.ipv6.conf.veth0.ioam6_enabled=0 &>/dev/null - for t in $TESTS_OUTPUT + echo + echo "Inline mode" + for test in $TESTS_OUTPUT do - $t "inline" - [ $encap_tests = 0 ] && $t "encap" + $test "inline" done - # clean OUTPUT settings - ip netns exec $ioam_node_beta sysctl -wq net.ipv6.conf.veth0.ioam6_enabled=1 - ip -netns $ioam_node_alpha route change db01::/64 dev veth0 + echo + echo "Encap mode" + for test in $TESTS_OUTPUT + do + $test "encap" + done + echo + echo "Encap mode (with tunsrc)" + for test in $TESTS_OUTPUT + do + $test "encap" "tunsrc" + done + + # clean OUTPUT settings + ip netns exec $ioam_node_beta \ + sysctl -wq net.ipv6.conf.veth0.ioam6_enabled=1 &>/dev/null echo - printf "%0.s-" {1..74} + printf "+" + printf "%0.s-" {1..72} + printf "+" echo - echo "INPUT tests" - printf "%0.s-" {1..74} + printf "| %-28s INPUT tests %-29s |" + echo + printf "+" + printf "%0.s-" {1..72} + printf "+" echo # set INPUT settings - ip -netns $ioam_node_alpha ioam namespace del 123 + ip -netns $ioam_node_alpha ioam namespace del 123 &>/dev/null - for t in $TESTS_INPUT + echo + echo "Inline mode" + for test in $TESTS_INPUT do - $t "inline" - [ $encap_tests = 0 ] && $t "encap" + $test "inline" + done + + echo + echo "Encap mode" + for test in $TESTS_INPUT + do + $test "encap" done # clean INPUT settings - ip -netns $ioam_node_alpha ioam namespace add 123 \ - data ${ALPHA[6]} wide ${ALPHA[7]} - ip -netns $ioam_node_alpha ioam namespace set 123 schema ${ALPHA[8]} - ip -netns $ioam_node_alpha route change db01::/64 dev veth0 + ip -netns $ioam_node_alpha \ + ioam namespace add 123 data ${ALPHA[6]} wide ${ALPHA[7]} &>/dev/null + ip -netns $ioam_node_alpha \ + ioam namespace set 123 schema ${ALPHA[8]} &>/dev/null echo - printf "%0.s-" {1..74} + printf "+" + printf "%0.s-" {1..72} + printf "+" echo - echo "GLOBAL tests" - printf "%0.s-" {1..74} + printf "| %-30s Results %-31s |" + echo + printf "+" + printf "%0.s-" {1..72} + printf "+" echo - for t in $TESTS_GLOBAL - do - $t "inline" - [ $encap_tests = 0 ] && $t "encap" - done - echo - log_results + echo "- Passed: ${npassed}" + echo "- Skipped: ${nskipped}" + echo "- Failed: ${nfailed}" + echo } bit2type=( 0x800000 0x400000 0x200000 0x100000 0x080000 0x040000 0x020000 0x010000 0x008000 0x004000 0x002000 0x001000 0x000800 0x000400 0x000200 0x000100 - 0x000080 0x000040 0x000020 0x000010 0x000008 0x000004 0x000002 + 0x000080 0x000040 0x000020 0x000010 0x000008 0x000004 0x000002 0x000001 ) -bit2size=( 4 4 4 4 4 4 4 4 8 8 8 4 4 4 4 4 4 4 4 4 4 4 4 ) +bit2size=( 4 4 4 4 4 4 4 4 8 8 8 4 4 4 4 4 4 4 4 4 4 4 4 0 ) ################################################################################ # # -# OUTPUT tests # +# LOCAL tests # # # -# Two nodes (sender/receiver), IOAM disabled on ingress for the receiver. # ################################################################################ -out_undef_ns() +local_sysctl_ioam_id() +{ + ############################################################################## + # Make sure the sysctl "net.ipv6.ioam6_id" works as expected. # + ############################################################################## + local desc="Sysctl net.ipv6.ioam6_id" + + [ ! -z $1 ] && return + + ip netns exec $ioam_node_alpha \ + sysctl net.ipv6.ioam6_id 2>/dev/null | grep -wq ${ALPHA[0]} + + [ $? == 0 ] && log_test_passed "${desc}" || log_test_failed "${desc}" +} + +local_sysctl_ioam_id_wide() { ############################################################################## - # Make sure that the encap node won't fill the trace if the chosen IOAM # - # namespace is not configured locally. # + # Make sure the sysctl "net.ipv6.ioam6_id_wide" works as expected. # ############################################################################## - local desc="Unknown IOAM namespace" + local desc="Sysctl net.ipv6.ioam6_id_wide" - [ "$1" = "encap" ] && mode="$1 tundst db01::1" || mode="$1" - [ "$1" = "encap" ] && ip -netns $ioam_node_beta link set ip6tnl0 up + [ ! -z $1 ] && return - ip -netns $ioam_node_alpha route change db01::/64 encap ioam6 mode $mode \ - trace prealloc type 0x800000 ns 0 size 4 dev veth0 + ip netns exec $ioam_node_alpha \ + sysctl net.ipv6.ioam6_id_wide 2>/dev/null | grep -wq ${ALPHA[1]} - run_test ${FUNCNAME[0]} "${desc} ($1 mode)" $ioam_node_alpha $ioam_node_beta \ - db01::1 0x800000 0 $1 + [ $? == 0 ] && log_test_passed "${desc}" || log_test_failed "${desc}" +} - [ "$1" = "encap" ] && ip -netns $ioam_node_beta link set ip6tnl0 down +local_sysctl_ioam_intf_id() +{ + ############################################################################## + # Make sure the sysctl "net.ipv6.conf.XX.ioam6_id" works as expected. # + ############################################################################## + local desc="Sysctl net.ipv6.conf.XX.ioam6_id" + + [ ! -z $1 ] && return + + ip netns exec $ioam_node_alpha \ + sysctl net.ipv6.conf.veth0.ioam6_id 2>/dev/null | grep -wq ${ALPHA[4]} + + [ $? == 0 ] && log_test_passed "${desc}" || log_test_failed "${desc}" } -out_no_room() +local_sysctl_ioam_intf_id_wide() { ############################################################################## - # Make sure that the encap node won't fill the trace and will set the # - # Overflow flag since there is no room enough for its data. # + # Make sure the sysctl "net.ipv6.conf.XX.ioam6_id_wide" works as expected. # ############################################################################## - local desc="Missing trace room" + local desc="Sysctl net.ipv6.conf.XX.ioam6_id_wide" + + [ ! -z $1 ] && return + + ip netns exec $ioam_node_alpha \ + sysctl net.ipv6.conf.veth0.ioam6_id_wide 2>/dev/null | grep -wq ${ALPHA[5]} + + [ $? == 0 ] && log_test_passed "${desc}" || log_test_failed "${desc}" +} - [ "$1" = "encap" ] && mode="$1 tundst db01::1" || mode="$1" - [ "$1" = "encap" ] && ip -netns $ioam_node_beta link set ip6tnl0 up +local_sysctl_ioam_intf_enabled() +{ + ############################################################################## + # Make sure the sysctl "net.ipv6.conf.XX.ioam6_enabled" works as expected. # + ############################################################################## + local desc="Sysctl net.ipv6.conf.XX.ioam6_enabled" - ip -netns $ioam_node_alpha route change db01::/64 encap ioam6 mode $mode \ - trace prealloc type 0xc00000 ns 123 size 4 dev veth0 + [ ! -z $1 ] && return - run_test ${FUNCNAME[0]} "${desc} ($1 mode)" $ioam_node_alpha $ioam_node_beta \ - db01::1 0xc00000 123 $1 + ip netns exec $ioam_node_beta \ + sysctl net.ipv6.conf.veth0.ioam6_enabled 2>/dev/null | grep -wq 1 - [ "$1" = "encap" ] && ip -netns $ioam_node_beta link set ip6tnl0 down + [ $? == 0 ] && log_test_passed "${desc}" || log_test_failed "${desc}" } -out_bits() +local_ioam_namespace() { ############################################################################## - # Make sure that, for each trace type bit, the encap node will either: # - # (i) fill the trace with its data when it is a supported bit # - # (ii) not fill the trace with its data when it is an unsupported bit # + # Make sure the creation of an IOAM Namespace works as expected. # ############################################################################## - local desc="Trace type with bit <n> only" + local desc="Create an IOAM Namespace" - local tmp=${bit2size[22]} - bit2size[22]=$(( $tmp + ${#ALPHA[9]} + ((4 - (${#ALPHA[9]} % 4)) % 4) )) + [ ! -z $1 ] && return - [ "$1" = "encap" ] && mode="$1 tundst db01::1" || mode="$1" - [ "$1" = "encap" ] && ip -netns $ioam_node_beta link set ip6tnl0 up + ip -netns $ioam_node_alpha \ + ioam namespace show 2>/dev/null | grep -wq 123 + local ret=$? - for i in {0..22} - do - ip -netns $ioam_node_alpha route change db01::/64 encap ioam6 mode $mode \ - trace prealloc type ${bit2type[$i]} ns 123 size ${bit2size[$i]} \ - dev veth0 &>/dev/null + ip -netns $ioam_node_alpha \ + ioam namespace show 2>/dev/null | grep -wq ${ALPHA[6]} + ret=$((ret + $?)) - local cmd_res=$? - local descr="${desc/<n>/$i}" + ip -netns $ioam_node_alpha \ + ioam namespace show 2>/dev/null | grep -wq ${ALPHA[7]} + ret=$((ret + $?)) + + [ $ret == 0 ] && log_test_passed "${desc}" || log_test_failed "${desc}" +} + +local_ioam_schema() +{ + ############################################################################## + # Make sure the creation of an IOAM Schema works as expected. # + ############################################################################## + local desc="Create an IOAM Schema" + + [ ! -z $1 ] && return + + ip -netns $ioam_node_alpha \ + ioam schema show 2>/dev/null | grep -wq ${ALPHA[8]} + local ret=$? + + local sc_data=$( + for i in `seq 0 $((${#ALPHA[9]}-1))` + do + chr=${ALPHA[9]:i:1} + printf "%x " "'${chr}" + done + ) + + ip -netns $ioam_node_alpha \ + ioam schema show 2>/dev/null | grep -q "$sc_data" + ret=$((ret + $?)) + + [ $ret == 0 ] && log_test_passed "${desc}" || log_test_failed "${desc}" +} + +local_ioam_schema_namespace() +{ + ############################################################################## + # Make sure the binding of a Schema to a Namespace works as expected. # + ############################################################################## + local desc="Bind an IOAM Schema to an IOAM Namespace" + + [ ! -z $1 ] && return + + ip -netns $ioam_node_alpha \ + ioam namespace show 2>/dev/null | grep -wq ${ALPHA[8]} + local ret=$? + + ip -netns $ioam_node_alpha \ + ioam schema show 2>/dev/null | grep -wq 123 + ret=$((ret + $?)) + + [ $ret == 0 ] && log_test_passed "${desc}" || log_test_failed "${desc}" +} + +local_route_ns() +{ + ############################################################################## + # Make sure the Namespace-ID is always provided, whatever the mode. # + ############################################################################## + local desc="Mandatory Namespace-ID" + local mode + + [ -z $1 ] && return + + [ "$1" == "encap" ] && mode="$1 tundst 2001:db8:2::2" || mode="$1" + + ip -netns $ioam_node_alpha \ + route change 2001:db8:2::/64 \ + encap ioam6 mode $mode trace prealloc type 0x800000 size 4 \ + via 2001:db8:1::1 dev veth0 &>/dev/null + local ret1=$? + + ip -netns $ioam_node_alpha \ + route change 2001:db8:2::/64 \ + encap ioam6 mode $mode trace prealloc type 0x800000 ns 0 size 4 \ + via 2001:db8:1::1 dev veth0 &>/dev/null + local ret2=$? + + [[ $ret1 == 0 || $ret2 != 0 ]] && log_test_failed "${desc}" \ + || log_test_passed "${desc}" + + ip -netns $ioam_node_alpha \ + route change 2001:db8:2::/64 via 2001:db8:1::1 dev veth0 &>/dev/null +} + +local_route_tunsrc() +{ + ############################################################################## + # Make sure the Tunnel Source is only (and possibly) used with encap mode. # + ############################################################################## + local desc + local mode + local mode_tunsrc - if [[ $i -ge 12 && $i -le 21 ]] + [ -z $1 ] && return + + if [ "$1" == "encap" ] + then + desc="Optional Tunnel Source" + mode="$1 tundst 2001:db8:2::2" + mode_tunsrc="$1 tunsrc 2001:db8:1::50 tundst 2001:db8:2::2" + else + desc="Unneeded Tunnel Source" + mode="$1" + mode_tunsrc="$1 tunsrc 2001:db8:1::50" + fi + + ip -netns $ioam_node_alpha \ + route change 2001:db8:2::/64 \ + encap ioam6 mode $mode trace prealloc type 0x800000 ns 0 size 4 \ + via 2001:db8:1::1 dev veth0 &>/dev/null + local ret1=$? + + ip -netns $ioam_node_alpha \ + route change 2001:db8:2::/64 \ + encap ioam6 mode $mode_tunsrc trace prealloc type 0x800000 ns 0 size 4 \ + via 2001:db8:1::1 dev veth0 &>/dev/null + local ret2=$? + + if [ "$1" == "encap" ] + then + [[ $ret1 != 0 || $ret2 != 0 ]] && log_test_failed "${desc}" \ + || log_test_passed "${desc}" + else + [[ $ret1 != 0 || $ret2 == 0 ]] && log_test_failed "${desc}" \ + || log_test_passed "${desc}" + fi + + ip -netns $ioam_node_alpha \ + route change 2001:db8:2::/64 via 2001:db8:1::1 dev veth0 &>/dev/null +} + +local_route_tundst() +{ + ############################################################################## + # Make sure the Tunnel Destination is only (and always) used with encap mode.# + ############################################################################## + local desc + + [ -z $1 ] && return + + [ "$1" == "encap" ] && desc="Mandatory Tunnel Destination" \ + || desc="Unneeded Tunnel Destination" + + local mode="$1" + local mode_tundst="$1 tundst 2001:db8:2::2" + + ip -netns $ioam_node_alpha \ + route change 2001:db8:2::/64 \ + encap ioam6 mode $mode trace prealloc type 0x800000 ns 0 size 4 \ + via 2001:db8:1::1 dev veth0 &>/dev/null + local ret1=$? + + ip -netns $ioam_node_alpha \ + route change 2001:db8:2::/64 \ + encap ioam6 mode $mode_tundst trace prealloc type 0x800000 ns 0 size 4 \ + via 2001:db8:1::1 dev veth0 &>/dev/null + local ret2=$? + + if [ "$1" == "encap" ] + then + [[ $ret1 == 0 || $ret2 != 0 ]] && log_test_failed "${desc}" \ + || log_test_passed "${desc}" + else + [[ $ret1 != 0 || $ret2 == 0 ]] && log_test_failed "${desc}" \ + || log_test_passed "${desc}" + fi + + ip -netns $ioam_node_alpha \ + route change 2001:db8:2::/64 via 2001:db8:1::1 dev veth0 &>/dev/null +} + +local_route_trace_type() +{ + ############################################################################## + # Make sure the Trace Type is always provided, whatever the mode. # + ############################################################################## + local desc="Mandatory Trace Type" + local mode + + [ -z $1 ] && return + + [ "$1" == "encap" ] && mode="$1 tundst 2001:db8:2::2" || mode="$1" + + ip -netns $ioam_node_alpha \ + route change 2001:db8:2::/64 \ + encap ioam6 mode $mode trace prealloc ns 0 size 4 \ + via 2001:db8:1::1 dev veth0 &>/dev/null + local ret1=$? + + ip -netns $ioam_node_alpha \ + route change 2001:db8:2::/64 \ + encap ioam6 mode $mode trace prealloc type 0x800000 ns 0 size 4 \ + via 2001:db8:1::1 dev veth0 &>/dev/null + local ret2=$? + + [[ $ret1 == 0 || $ret2 != 0 ]] && log_test_failed "${desc}" \ + || log_test_passed "${desc}" + + ip -netns $ioam_node_alpha \ + route change 2001:db8:2::/64 via 2001:db8:1::1 dev veth0 &>/dev/null +} + +local_route_trace_size() +{ + ############################################################################## + # Make sure the Trace Size is always provided, whatever the mode. # + ############################################################################## + local desc="Mandatory Trace Size" + local mode + + [ -z $1 ] && return + + [ "$1" == "encap" ] && mode="$1 tundst 2001:db8:2::2" || mode="$1" + + ip -netns $ioam_node_alpha \ + route change 2001:db8:2::/64 \ + encap ioam6 mode $mode trace prealloc type 0x800000 ns 0 \ + via 2001:db8:1::1 dev veth0 &>/dev/null + local ret1=$? + + ip -netns $ioam_node_alpha \ + route change 2001:db8:2::/64 \ + encap ioam6 mode $mode trace prealloc type 0x800000 ns 0 size 4 \ + via 2001:db8:1::1 dev veth0 &>/dev/null + local ret2=$? + + [[ $ret1 == 0 || $ret2 != 0 ]] && log_test_failed "${desc}" \ + || log_test_passed "${desc}" + + ip -netns $ioam_node_alpha \ + route change 2001:db8:2::/64 via 2001:db8:1::1 dev veth0 &>/dev/null +} + +local_route_trace_type_bits() +{ + ############################################################################## + # Make sure only allowed bits (0-11 and 22) are accepted. # + ############################################################################## + local desc="Trace Type bits" + local mode + + [ -z $1 ] && return + + [ "$1" == "encap" ] && mode="$1 tundst 2001:db8:2::2" || mode="$1" + + local i + for i in {0..23} + do + ip -netns $ioam_node_alpha \ + route change 2001:db8:2::/64 \ + encap ioam6 mode $mode trace prealloc type ${bit2type[$i]} ns 0 size 4 \ + via 2001:db8:1::1 dev veth0 &>/dev/null + + if [[ ($? == 0 && (($i -ge 12 && $i -le 21) || $i == 23)) || + ($? != 0 && (($i -ge 0 && $i -le 11) || $i == 22)) ]] then - if [ $cmd_res != 0 ] - then - npassed=$((npassed+1)) - log_test_passed "$descr ($1 mode)" - else - nfailed=$((nfailed+1)) - log_test_failed "$descr ($1 mode)" - fi - else - run_test "out_bit$i" "$descr ($1 mode)" $ioam_node_alpha \ - $ioam_node_beta db01::1 ${bit2type[$i]} 123 $1 + local err=1 + break fi done - [ "$1" = "encap" ] && ip -netns $ioam_node_beta link set ip6tnl0 down + [ -z $err ] && log_test_passed "${desc}" || log_test_failed "${desc}" - bit2size[22]=$tmp + ip -netns $ioam_node_alpha \ + route change 2001:db8:2::/64 via 2001:db8:1::1 dev veth0 &>/dev/null } -out_full_supp_trace() +local_route_trace_size_values() { ############################################################################## - # Make sure that the encap node will correctly fill a full trace. Be careful,# - # "full trace" here does NOT mean all bits (only supported ones). # + # Make sure only allowed sizes (multiples of four in [4,244]) are accepted. # ############################################################################## - local desc="Full supported trace" + local desc="Trace Size values" + local mode - [ "$1" = "encap" ] && mode="$1 tundst db01::1" || mode="$1" - [ "$1" = "encap" ] && ip -netns $ioam_node_beta link set ip6tnl0 up + [ -z $1 ] && return - ip -netns $ioam_node_alpha route change db01::/64 encap ioam6 mode $mode \ - trace prealloc type 0xfff002 ns 123 size 100 dev veth0 + [ "$1" == "encap" ] && mode="$1 tundst 2001:db8:2::2" || mode="$1" - run_test ${FUNCNAME[0]} "${desc} ($1 mode)" $ioam_node_alpha $ioam_node_beta \ - db01::1 0xfff002 123 $1 + # we also try the next multiple of four after the MAX to check it's refused + local i + for i in {0..248} + do + ip -netns $ioam_node_alpha \ + route change 2001:db8:2::/64 \ + encap ioam6 mode $mode trace prealloc type 0x800000 ns 0 size $i \ + via 2001:db8:1::1 dev veth0 &>/dev/null - [ "$1" = "encap" ] && ip -netns $ioam_node_beta link set ip6tnl0 down + if [[ ($? == 0 && ($i == 0 || $i == 248 || $(( $i % 4 )) != 0)) || + ($? != 0 && $i != 0 && $i != 248 && $(( $i % 4 )) == 0) ]] + then + local err=1 + break + fi + done + + [ -z $err ] && log_test_passed "${desc}" || log_test_failed "${desc}" + + ip -netns $ioam_node_alpha \ + route change 2001:db8:2::/64 via 2001:db8:1::1 dev veth0 &>/dev/null } ################################################################################ # # -# INPUT tests # +# OUTPUT tests # # # -# Two nodes (sender/receiver), the sender MUST NOT fill the trace upon # -# insertion -> the IOAM namespace configured on the sender is removed # -# and is used in the inserted trace to force the sender not to fill it. # ################################################################################ -in_undef_ns() +output_undef_ns() { ############################################################################## - # Make sure that the receiving node won't fill the trace if the related IOAM # - # namespace is not configured locally. # + # Make sure an IOAM encapsulating node does NOT fill the trace when the # + # corresponding IOAM Namespace-ID is not configured locally. # ############################################################################## - local desc="Unknown IOAM namespace" + local desc="Unknown IOAM Namespace-ID" + local ns=0 + local tr_type=0x800000 + local tr_size=4 + local mode="$1" + local saddr="2001:db8:1::2" + + if [ "$1" == "encap" ] + then + if [ $encap_tests != 0 ] + then + log_test_skipped "${desc}" + return + fi + + if [ "$2" == "tunsrc" ] + then + saddr="2001:db8:1::50" + mode+=" tunsrc 2001:db8:1::50" + fi + + mode+=" tundst 2001:db8:2::2" + ip -netns $ioam_node_gamma link set ip6tnl0 up &>/dev/null + fi - [ "$1" = "encap" ] && mode="$1 tundst db01::1" || mode="$1" - [ "$1" = "encap" ] && ip -netns $ioam_node_beta link set ip6tnl0 up + ip -netns $ioam_node_alpha \ + route change 2001:db8:2::/64 \ + encap ioam6 mode $mode trace prealloc type $tr_type ns $ns size $tr_size \ + via 2001:db8:1::1 dev veth0 &>/dev/null - ip -netns $ioam_node_alpha route change db01::/64 encap ioam6 mode $mode \ - trace prealloc type 0x800000 ns 0 size 4 dev veth0 + if [ $? == 0 ] + then + run_test ${FUNCNAME[0]} "${desc}" $saddr $tr_type $tr_size $ns $1 + else + log_test_failed "${desc}" + fi - run_test ${FUNCNAME[0]} "${desc} ($1 mode)" $ioam_node_alpha $ioam_node_beta \ - db01::1 0x800000 0 $1 + ip -netns $ioam_node_alpha \ + route change 2001:db8:2::/64 via 2001:db8:1::1 dev veth0 &>/dev/null - [ "$1" = "encap" ] && ip -netns $ioam_node_beta link set ip6tnl0 down + [ "$1" == "encap" ] && ip -netns $ioam_node_gamma \ + link set ip6tnl0 down &>/dev/null } -in_no_room() +output_no_room() { ############################################################################## - # Make sure that the receiving node won't fill the trace and will set the # - # Overflow flag if there is no room enough for its data. # + # Make sure an IOAM encapsulating node does NOT fill the trace AND sets the # + # Overflow flag when there is not enough room for its data. # ############################################################################## - local desc="Missing trace room" + local desc="Missing room for data" + local ns=123 + local tr_type=0xc00000 + local tr_size=4 + local mode="$1" + local saddr="2001:db8:1::2" + + if [ "$1" == "encap" ] + then + if [ $encap_tests != 0 ] + then + log_test_skipped "${desc}" + return + fi - [ "$1" = "encap" ] && mode="$1 tundst db01::1" || mode="$1" - [ "$1" = "encap" ] && ip -netns $ioam_node_beta link set ip6tnl0 up + if [ "$2" == "tunsrc" ] + then + saddr="2001:db8:1::50" + mode+=" tunsrc 2001:db8:1::50" + fi - ip -netns $ioam_node_alpha route change db01::/64 encap ioam6 mode $mode \ - trace prealloc type 0xc00000 ns 123 size 4 dev veth0 + mode+=" tundst 2001:db8:2::2" + ip -netns $ioam_node_gamma link set ip6tnl0 up &>/dev/null + fi - run_test ${FUNCNAME[0]} "${desc} ($1 mode)" $ioam_node_alpha $ioam_node_beta \ - db01::1 0xc00000 123 $1 + ip -netns $ioam_node_alpha \ + route change 2001:db8:2::/64 \ + encap ioam6 mode $mode trace prealloc type $tr_type ns $ns size $tr_size \ + via 2001:db8:1::1 dev veth0 &>/dev/null - [ "$1" = "encap" ] && ip -netns $ioam_node_beta link set ip6tnl0 down + if [ $? == 0 ] + then + run_test ${FUNCNAME[0]} "${desc}" $saddr $tr_type $tr_size $ns $1 + else + log_test_failed "${desc}" + fi + + ip -netns $ioam_node_alpha \ + route change 2001:db8:2::/64 via 2001:db8:1::1 dev veth0 &>/dev/null + + [ "$1" == "encap" ] && ip -netns $ioam_node_gamma \ + link set ip6tnl0 down &>/dev/null } -in_bits() +output_no_room_oss() { ############################################################################## - # Make sure that, for each trace type bit, the receiving node will either: # - # (i) fill the trace with its data when it is a supported bit # - # (ii) not fill the trace with its data when it is an unsupported bit # + # Make sure an IOAM encapsulating node does NOT fill the trace AND sets the # + # Overflow flag when there is not enough room for the Opaque State Snapshot. # ############################################################################## - local desc="Trace type with bit <n> only" + local desc="Missing room for Opaque State Snapshot" + local ns=123 + local tr_type=0x000002 + local tr_size=4 + local mode="$1" + local saddr="2001:db8:1::2" + + if [ "$1" == "encap" ] + then + if [ $encap_tests != 0 ] + then + log_test_skipped "${desc}" + return + fi - local tmp=${bit2size[22]} - bit2size[22]=$(( $tmp + ${#BETA[9]} + ((4 - (${#BETA[9]} % 4)) % 4) )) + if [ "$2" == "tunsrc" ] + then + saddr="2001:db8:1::50" + mode+=" tunsrc 2001:db8:1::50" + fi + + mode+=" tundst 2001:db8:2::2" + ip -netns $ioam_node_gamma link set ip6tnl0 up &>/dev/null + fi + + ip -netns $ioam_node_alpha \ + route change 2001:db8:2::/64 \ + encap ioam6 mode $mode trace prealloc type $tr_type ns $ns size $tr_size \ + via 2001:db8:1::1 dev veth0 &>/dev/null + + if [ $? == 0 ] + then + run_test ${FUNCNAME[0]} "${desc}" $saddr $tr_type $tr_size $ns $1 + else + log_test_failed "${desc}" + fi + + ip -netns $ioam_node_alpha \ + route change 2001:db8:2::/64 via 2001:db8:1::1 dev veth0 &>/dev/null + + [ "$1" == "encap" ] && ip -netns $ioam_node_gamma \ + link set ip6tnl0 down &>/dev/null +} + +output_bits() +{ + ############################################################################## + # Make sure an IOAM encapsulating node implements all supported bits by # + # checking it correctly fills the trace with its data. # + ############################################################################## + local desc="Trace Type with supported bit <n> only" + local ns=123 + local mode="$1" + local saddr="2001:db8:1::2" - [ "$1" = "encap" ] && mode="$1 tundst db01::1" || mode="$1" - [ "$1" = "encap" ] && ip -netns $ioam_node_beta link set ip6tnl0 up + if [ "$1" == "encap" ] + then + if [ "$2" == "tunsrc" ] + then + saddr="2001:db8:1::50" + mode+=" tunsrc 2001:db8:1::50" + fi + + mode+=" tundst 2001:db8:2::2" + ip -netns $ioam_node_gamma link set ip6tnl0 up &>/dev/null + fi + + local tmp=${bit2size[22]} + bit2size[22]=$(( $tmp + ${#ALPHA[9]} + ((4 - (${#ALPHA[9]} % 4)) % 4) )) + local i for i in {0..11} {22..22} do - ip -netns $ioam_node_alpha route change db01::/64 encap ioam6 mode $mode \ - trace prealloc type ${bit2type[$i]} ns 123 size ${bit2size[$i]} \ - dev veth0 + local descr="${desc/<n>/$i}" + + if [[ "$1" == "encap" && $encap_tests != 0 ]] + then + log_test_skipped "${descr}" + continue + fi - run_test "in_bit$i" "${desc/<n>/$i} ($1 mode)" $ioam_node_alpha \ - $ioam_node_beta db01::1 ${bit2type[$i]} 123 $1 + ip -netns $ioam_node_alpha \ + route change 2001:db8:2::/64 \ + encap ioam6 mode $mode trace prealloc \ + type ${bit2type[$i]} ns $ns size ${bit2size[$i]} \ + via 2001:db8:1::1 dev veth0 &>/dev/null + + if [ $? == 0 ] + then + run_test "output_bit$i" "${descr}" $saddr \ + ${bit2type[$i]} ${bit2size[$i]} $ns $1 + else + log_test_failed "${descr}" + fi done - [ "$1" = "encap" ] && ip -netns $ioam_node_beta link set ip6tnl0 down + ip -netns $ioam_node_alpha \ + route change 2001:db8:2::/64 via 2001:db8:1::1 dev veth0 &>/dev/null + + [ "$1" == "encap" ] && ip -netns $ioam_node_gamma \ + link set ip6tnl0 down &>/dev/null bit2size[22]=$tmp } -in_oflag() +output_sizes() { ############################################################################## - # Make sure that the receiving node won't fill the trace since the Overflow # - # flag is set. # + # Make sure an IOAM encapsulating node allocates supported sizes correctly. # ############################################################################## - local desc="Overflow flag is set" + local desc="Trace Size of <n> bytes" + local ns=0 + local tr_type=0x800000 + local mode="$1" + local saddr="2001:db8:1::2" - # Exception: - # Here, we need the sender to set the Overflow flag. For that, we will add - # back the IOAM namespace that was previously configured on the sender. - ip -netns $ioam_node_alpha ioam namespace add 123 + if [ "$1" == "encap" ] + then + if [ "$2" == "tunsrc" ] + then + saddr="2001:db8:1::50" + mode+=" tunsrc 2001:db8:1::50" + fi - [ "$1" = "encap" ] && mode="$1 tundst db01::1" || mode="$1" - [ "$1" = "encap" ] && ip -netns $ioam_node_beta link set ip6tnl0 up + mode+=" tundst 2001:db8:2::2" + ip -netns $ioam_node_gamma link set ip6tnl0 up &>/dev/null + fi - ip -netns $ioam_node_alpha route change db01::/64 encap ioam6 mode $mode \ - trace prealloc type 0xc00000 ns 123 size 4 dev veth0 + local i + for i in $(seq 4 4 244) + do + local descr="${desc/<n>/$i}" - run_test ${FUNCNAME[0]} "${desc} ($1 mode)" $ioam_node_alpha $ioam_node_beta \ - db01::1 0xc00000 123 $1 + if [[ "$1" == "encap" && $encap_tests != 0 ]] + then + log_test_skipped "${descr}" + continue + fi - [ "$1" = "encap" ] && ip -netns $ioam_node_beta link set ip6tnl0 down + ip -netns $ioam_node_alpha \ + route change 2001:db8:2::/64 \ + encap ioam6 mode $mode trace prealloc type $tr_type ns $ns size $i \ + via 2001:db8:1::1 dev veth0 &>/dev/null - # And we clean the exception for this test to get things back to normal for - # other INPUT tests - ip -netns $ioam_node_alpha ioam namespace del 123 + if [ $? == 0 ] + then + run_test "output_size$i" "${descr}" $saddr $tr_type $i $ns $1 + else + log_test_failed "${descr}" + fi + done + + ip -netns $ioam_node_alpha \ + route change 2001:db8:2::/64 via 2001:db8:1::1 dev veth0 &>/dev/null + + [ "$1" == "encap" ] && ip -netns $ioam_node_gamma \ + link set ip6tnl0 down &>/dev/null } -in_full_supp_trace() +output_full_supp_trace() { ############################################################################## - # Make sure that the receiving node will correctly fill a full trace. Be # - # careful, "full trace" here does NOT mean all bits (only supported ones). # + # Make sure an IOAM encapsulating node correctly fills a trace when all # + # supported bits are set. # ############################################################################## local desc="Full supported trace" + local ns=123 + local tr_type=0xfff002 + local tr_size + local mode="$1" + local saddr="2001:db8:1::2" - [ "$1" = "encap" ] && mode="$1 tundst db01::1" || mode="$1" - [ "$1" = "encap" ] && ip -netns $ioam_node_beta link set ip6tnl0 up + if [ "$1" == "encap" ] + then + if [ $encap_tests != 0 ] + then + log_test_skipped "${desc}" + return + fi - ip -netns $ioam_node_alpha route change db01::/64 encap ioam6 mode $mode \ - trace prealloc type 0xfff002 ns 123 size 80 dev veth0 + if [ "$2" == "tunsrc" ] + then + saddr="2001:db8:1::50" + mode+=" tunsrc 2001:db8:1::50" + fi - run_test ${FUNCNAME[0]} "${desc} ($1 mode)" $ioam_node_alpha $ioam_node_beta \ - db01::1 0xfff002 123 $1 + mode+=" tundst 2001:db8:2::2" + ip -netns $ioam_node_gamma link set ip6tnl0 up &>/dev/null + fi - [ "$1" = "encap" ] && ip -netns $ioam_node_beta link set ip6tnl0 down + local i + tr_size=$(( ${#ALPHA[9]} + ((4 - (${#ALPHA[9]} % 4)) % 4) )) + for i in {0..11} {22..22} + do + tr_size=$((tr_size + bit2size[$i])) + done + + ip -netns $ioam_node_alpha \ + route change 2001:db8:2::/64 \ + encap ioam6 mode $mode trace prealloc type $tr_type ns $ns size $tr_size \ + via 2001:db8:1::1 dev veth0 &>/dev/null + + if [ $? == 0 ] + then + run_test ${FUNCNAME[0]} "${desc}" $saddr $tr_type $tr_size $ns $1 + else + log_test_failed "${desc}" + fi + + ip -netns $ioam_node_alpha \ + route change 2001:db8:2::/64 via 2001:db8:1::1 dev veth0 &>/dev/null + + [ "$1" == "encap" ] && ip -netns $ioam_node_gamma \ + link set ip6tnl0 down &>/dev/null } ################################################################################ # # -# GLOBAL tests # +# INPUT tests # # # -# Three nodes (sender/router/receiver), IOAM fully enabled on every node. # ################################################################################ -fwd_full_supp_trace() +input_undef_ns() +{ + ############################################################################## + # Make sure an IOAM node does NOT fill the trace when the corresponding IOAM # + # Namespace-ID is not configured locally. # + ############################################################################## + local desc="Unknown IOAM Namespace-ID" + local ns=0 + local tr_type=0x800000 + local tr_size=4 + local mode="$1" + + if [ "$1" == "encap" ] + then + if [ $encap_tests != 0 ] + then + log_test_skipped "${desc}" + return + fi + + mode+=" tundst 2001:db8:2::2" + ip -netns $ioam_node_gamma link set ip6tnl0 up &>/dev/null + fi + + ip -netns $ioam_node_alpha \ + route change 2001:db8:2::/64 \ + encap ioam6 mode $mode trace prealloc type $tr_type ns $ns size $tr_size \ + via 2001:db8:1::1 dev veth0 &>/dev/null + + if [ $? == 0 ] + then + run_test ${FUNCNAME[0]} "${desc}" 2001:db8:1::2 $tr_type $tr_size $ns $1 + else + log_test_failed "${desc}" + fi + + ip -netns $ioam_node_alpha \ + route change 2001:db8:2::/64 via 2001:db8:1::1 dev veth0 &>/dev/null + + [ "$1" == "encap" ] && ip -netns $ioam_node_gamma \ + link set ip6tnl0 down &>/dev/null +} + +input_no_room() +{ + ############################################################################## + # Make sure an IOAM node does NOT fill the trace AND sets the Overflow flag # + # when there is not enough room for its data. # + ############################################################################## + local desc="Missing room for data" + local ns=123 + local tr_type=0xc00000 + local tr_size=4 + local mode="$1" + + if [ "$1" == "encap" ] + then + if [ $encap_tests != 0 ] + then + log_test_skipped "${desc}" + return + fi + + mode+=" tundst 2001:db8:2::2" + ip -netns $ioam_node_gamma link set ip6tnl0 up &>/dev/null + fi + + ip -netns $ioam_node_alpha \ + route change 2001:db8:2::/64 \ + encap ioam6 mode $mode trace prealloc type $tr_type ns $ns size $tr_size \ + via 2001:db8:1::1 dev veth0 &>/dev/null + + if [ $? == 0 ] + then + run_test ${FUNCNAME[0]} "${desc}" 2001:db8:1::2 $tr_type $tr_size $ns $1 + else + log_test_failed "${desc}" + fi + + ip -netns $ioam_node_alpha \ + route change 2001:db8:2::/64 via 2001:db8:1::1 dev veth0 &>/dev/null + + [ "$1" == "encap" ] && ip -netns $ioam_node_gamma \ + link set ip6tnl0 down &>/dev/null +} + +input_no_room_oss() +{ + ############################################################################## + # Make sure an IOAM node does NOT fill the trace AND sets the Overflow flag # + # when there is not enough room for the Opaque State Snapshot. # + ############################################################################## + local desc="Missing room for Opaque State Snapshot" + local ns=123 + local tr_type=0x000002 + local tr_size=4 + local mode="$1" + + if [ "$1" == "encap" ] + then + if [ $encap_tests != 0 ] + then + log_test_skipped "${desc}" + return + fi + + mode+=" tundst 2001:db8:2::2" + ip -netns $ioam_node_gamma link set ip6tnl0 up &>/dev/null + fi + + ip -netns $ioam_node_alpha \ + route change 2001:db8:2::/64 \ + encap ioam6 mode $mode trace prealloc type $tr_type ns $ns size $tr_size \ + via 2001:db8:1::1 dev veth0 &>/dev/null + + if [ $? == 0 ] + then + run_test ${FUNCNAME[0]} "${desc}" 2001:db8:1::2 $tr_type $tr_size $ns $1 + else + log_test_failed "${desc}" + fi + + ip -netns $ioam_node_alpha \ + route change 2001:db8:2::/64 via 2001:db8:1::1 dev veth0 &>/dev/null + + [ "$1" == "encap" ] && ip -netns $ioam_node_gamma \ + link set ip6tnl0 down &>/dev/null +} + +input_disabled() +{ + ############################################################################## + # Make sure an IOAM node does NOT fill the trace when IOAM is not enabled on # + # the corresponding (ingress) interface. # + ############################################################################## + local desc="IOAM disabled on ingress interface" + local ns=123 + local tr_type=0x800000 + local tr_size=4 + local mode="$1" + + if [ "$1" == "encap" ] + then + if [ $encap_tests != 0 ] + then + log_test_skipped "${desc}" + return + fi + + mode+=" tundst 2001:db8:2::2" + ip -netns $ioam_node_gamma link set ip6tnl0 up &>/dev/null + fi + + # Exception: disable IOAM on ingress interface + ip netns exec $ioam_node_beta \ + sysctl -wq net.ipv6.conf.veth0.ioam6_enabled=0 &>/dev/null + local ret=$? + + ip -netns $ioam_node_alpha \ + route change 2001:db8:2::/64 \ + encap ioam6 mode $mode trace prealloc type $tr_type ns $ns size $tr_size \ + via 2001:db8:1::1 dev veth0 &>/dev/null + ret=$((ret + $?)) + + if [ $ret == 0 ] + then + run_test ${FUNCNAME[0]} "${desc}" 2001:db8:1::2 $tr_type $tr_size $ns $1 + else + log_test_failed "${desc}" + fi + + # Clean Exception + ip netns exec $ioam_node_beta \ + sysctl -wq net.ipv6.conf.veth0.ioam6_enabled=1 &>/dev/null + + ip -netns $ioam_node_alpha \ + route change 2001:db8:2::/64 via 2001:db8:1::1 dev veth0 &>/dev/null + + [ "$1" == "encap" ] && ip -netns $ioam_node_gamma \ + link set ip6tnl0 down &>/dev/null +} + +input_oflag() +{ + ############################################################################## + # Make sure an IOAM node does NOT fill the trace when the Overflow flag is # + # set. # + ############################################################################## + local desc="Overflow flag is set" + local ns=123 + local tr_type=0xc00000 + local tr_size=4 + local mode="$1" + + if [ "$1" == "encap" ] + then + if [ $encap_tests != 0 ] + then + log_test_skipped "${desc}" + return + fi + + mode+=" tundst 2001:db8:2::2" + ip -netns $ioam_node_gamma link set ip6tnl0 up &>/dev/null + fi + + # Exception: + # Here, we need the sender to set the Overflow flag. For that, we will add + # back the IOAM namespace that was previously configured on the sender. + ip -netns $ioam_node_alpha ioam namespace add 123 &>/dev/null + local ret=$? + + ip -netns $ioam_node_alpha \ + route change 2001:db8:2::/64 \ + encap ioam6 mode $mode trace prealloc type $tr_type ns $ns size $tr_size \ + via 2001:db8:1::1 dev veth0 &>/dev/null + ret=$((ret + $?)) + + if [ $ret == 0 ] + then + run_test ${FUNCNAME[0]} "${desc}" 2001:db8:1::2 $tr_type $tr_size $ns $1 + else + log_test_failed "${desc}" + fi + + # Clean Exception + ip -netns $ioam_node_alpha ioam namespace del 123 &>/dev/null + + ip -netns $ioam_node_alpha \ + route change 2001:db8:2::/64 via 2001:db8:1::1 dev veth0 &>/dev/null + + [ "$1" == "encap" ] && ip -netns $ioam_node_gamma \ + link set ip6tnl0 down &>/dev/null +} + +input_bits() +{ + ############################################################################## + # Make sure an IOAM node implements all supported bits by checking it # + # correctly fills the trace with its data. # + ############################################################################## + local desc="Trace Type with supported bit <n> only" + local ns=123 + local mode="$1" + + if [ "$1" == "encap" ] + then + mode+=" tundst 2001:db8:2::2" + ip -netns $ioam_node_gamma link set ip6tnl0 up &>/dev/null + fi + + local tmp=${bit2size[22]} + bit2size[22]=$(( $tmp + ${#BETA[9]} + ((4 - (${#BETA[9]} % 4)) % 4) )) + + local i + for i in {0..11} {22..22} + do + local descr="${desc/<n>/$i}" + + if [[ "$1" == "encap" && $encap_tests != 0 ]] + then + log_test_skipped "${descr}" + continue + fi + + ip -netns $ioam_node_alpha \ + route change 2001:db8:2::/64 \ + encap ioam6 mode $mode trace prealloc \ + type ${bit2type[$i]} ns $ns size ${bit2size[$i]} \ + via 2001:db8:1::1 dev veth0 &>/dev/null + + if [ $? == 0 ] + then + run_test "input_bit$i" "${descr}" 2001:db8:1::2 \ + ${bit2type[$i]} ${bit2size[$i]} $ns $1 + else + log_test_failed "${descr}" + fi + done + + ip -netns $ioam_node_alpha \ + route change 2001:db8:2::/64 via 2001:db8:1::1 dev veth0 &>/dev/null + + [ "$1" == "encap" ] && ip -netns $ioam_node_gamma \ + link set ip6tnl0 down &>/dev/null + + bit2size[22]=$tmp +} + +input_sizes() { ############################################################################## - # Make sure that all three nodes correctly filled the full supported trace # - # by checking that the trace data is consistent with the predefined config. # + # Make sure an IOAM node handles all supported sizes correctly. # ############################################################################## - local desc="Forward - Full supported trace" + local desc="Trace Size of <n> bytes" + local ns=123 + local tr_type=0x800000 + local mode="$1" + + if [ "$1" == "encap" ] + then + mode+=" tundst 2001:db8:2::2" + ip -netns $ioam_node_gamma link set ip6tnl0 up &>/dev/null + fi - [ "$1" = "encap" ] && mode="$1 tundst db02::2" || mode="$1" - [ "$1" = "encap" ] && ip -netns $ioam_node_gamma link set ip6tnl0 up + local i + for i in $(seq 4 4 244) + do + local descr="${desc/<n>/$i}" - ip -netns $ioam_node_alpha route change db02::/64 encap ioam6 mode $mode \ - trace prealloc type 0xfff002 ns 123 size 244 via db01::1 dev veth0 + if [[ "$1" == "encap" && $encap_tests != 0 ]] + then + log_test_skipped "${descr}" + continue + fi - run_test ${FUNCNAME[0]} "${desc} ($1 mode)" $ioam_node_alpha $ioam_node_gamma \ - db02::2 0xfff002 123 $1 + ip -netns $ioam_node_alpha \ + route change 2001:db8:2::/64 \ + encap ioam6 mode $mode trace prealloc type $tr_type ns $ns size $i \ + via 2001:db8:1::1 dev veth0 &>/dev/null - [ "$1" = "encap" ] && ip -netns $ioam_node_gamma link set ip6tnl0 down + if [ $? == 0 ] + then + run_test "input_size$i" "${descr}" 2001:db8:1::2 $tr_type $i $ns $1 + else + log_test_failed "${descr}" + fi + done + + ip -netns $ioam_node_alpha \ + route change 2001:db8:2::/64 via 2001:db8:1::1 dev veth0 &>/dev/null + + [ "$1" == "encap" ] && ip -netns $ioam_node_gamma \ + link set ip6tnl0 down &>/dev/null +} + +input_full_supp_trace() +{ + ############################################################################## + # Make sure an IOAM node correctly fills a trace when all supported bits are # + # set. # + ############################################################################## + local desc="Full supported trace" + local ns=123 + local tr_type=0xfff002 + local tr_size + local mode="$1" + + if [ "$1" == "encap" ] + then + if [ $encap_tests != 0 ] + then + log_test_skipped "${desc}" + return + fi + + mode+=" tundst 2001:db8:2::2" + ip -netns $ioam_node_gamma link set ip6tnl0 up &>/dev/null + fi + + local i + tr_size=$(( ${#BETA[9]} + ((4 - (${#BETA[9]} % 4)) % 4) )) + for i in {0..11} {22..22} + do + tr_size=$((tr_size + bit2size[$i])) + done + + ip -netns $ioam_node_alpha \ + route change 2001:db8:2::/64 \ + encap ioam6 mode $mode trace prealloc type $tr_type ns $ns size $tr_size \ + via 2001:db8:1::1 dev veth0 &>/dev/null + + if [ $? == 0 ] + then + run_test ${FUNCNAME[0]} "${desc}" 2001:db8:1::2 $tr_type $tr_size $ns $1 + else + log_test_failed "${desc}" + fi + + ip -netns $ioam_node_alpha \ + route change 2001:db8:2::/64 via 2001:db8:1::1 dev veth0 &>/dev/null + + [ "$1" == "encap" ] && ip -netns $ioam_node_gamma \ + link set ip6tnl0 down &>/dev/null } @@ -742,30 +1655,29 @@ fwd_full_supp_trace() ################################################################################ npassed=0 +nskipped=0 nfailed=0 if [ "$(id -u)" -ne 0 ] then - echo "SKIP: Need root privileges" + echo "SKIP: Need root privileges." exit $ksft_skip fi if [ ! -x "$(command -v ip)" ] then - echo "SKIP: Could not run test without ip tool" - exit $ksft_skip -fi - -ip ioam &>/dev/null -if [ $? = 1 ] -then - echo "SKIP: iproute2 too old, missing ioam command" + echo "SKIP: Could not run test without ip tool." exit $ksft_skip fi check_kernel_compatibility - -cleanup &>/dev/null setup run -cleanup &>/dev/null +cleanup + +if [ $nfailed != 0 ] +then + exit $ksft_fail +fi + +exit $ksft_pass diff --git a/tools/testing/selftests/net/ioam6_parser.c b/tools/testing/selftests/net/ioam6_parser.c index 895e5bb5044b..de4b5c9e8a74 100644 --- a/tools/testing/selftests/net/ioam6_parser.c +++ b/tools/testing/selftests/net/ioam6_parser.c @@ -8,8 +8,10 @@ #include <errno.h> #include <limits.h> #include <linux/const.h> +#include <linux/if_ether.h> #include <linux/ioam6.h> #include <linux/ipv6.h> +#include <stdbool.h> #include <stdlib.h> #include <string.h> #include <unistd.h> @@ -40,7 +42,7 @@ static struct ioam_config node1 = { .egr_id = 101, .ingr_wide = 0xffffffff, /* default value */ .egr_wide = 101101, - .ns_data = 0xdeadbee0, + .ns_data = 0xdeadbeef, .ns_wide = 0xcafec0caf00dc0de, .sc_id = 777, .sc_data = "something that will be 4n-aligned", @@ -54,33 +56,22 @@ static struct ioam_config node2 = { .egr_id = 202, .ingr_wide = 201201, .egr_wide = 202202, - .ns_data = 0xdeadbee1, - .ns_wide = 0xcafec0caf11dc0de, - .sc_id = 666, - .sc_data = "Hello there -Obi", - .hlim = 63, -}; - -static struct ioam_config node3 = { - .id = 3, - .wide = 33333333, - .ingr_id = 301, - .egr_id = 0xffff, /* default value */ - .ingr_wide = 301301, - .egr_wide = 0xffffffff, /* default value */ - .ns_data = 0xdeadbee2, - .ns_wide = 0xcafec0caf22dc0de, + .ns_data = 0xffffffff, /* default value */ + .ns_wide = 0xffffffffffffffff, /* default value */ .sc_id = 0xffffff, /* default value */ .sc_data = NULL, - .hlim = 62, + .hlim = 63, }; enum { /********** * OUTPUT * **********/ + __TEST_OUT_MIN, + TEST_OUT_UNDEF_NS, TEST_OUT_NO_ROOM, + TEST_OUT_NO_ROOM_OSS, TEST_OUT_BIT0, TEST_OUT_BIT1, TEST_OUT_BIT2, @@ -94,13 +85,80 @@ enum { TEST_OUT_BIT10, TEST_OUT_BIT11, TEST_OUT_BIT22, + TEST_OUT_SIZE4, + TEST_OUT_SIZE8, + TEST_OUT_SIZE12, + TEST_OUT_SIZE16, + TEST_OUT_SIZE20, + TEST_OUT_SIZE24, + TEST_OUT_SIZE28, + TEST_OUT_SIZE32, + TEST_OUT_SIZE36, + TEST_OUT_SIZE40, + TEST_OUT_SIZE44, + TEST_OUT_SIZE48, + TEST_OUT_SIZE52, + TEST_OUT_SIZE56, + TEST_OUT_SIZE60, + TEST_OUT_SIZE64, + TEST_OUT_SIZE68, + TEST_OUT_SIZE72, + TEST_OUT_SIZE76, + TEST_OUT_SIZE80, + TEST_OUT_SIZE84, + TEST_OUT_SIZE88, + TEST_OUT_SIZE92, + TEST_OUT_SIZE96, + TEST_OUT_SIZE100, + TEST_OUT_SIZE104, + TEST_OUT_SIZE108, + TEST_OUT_SIZE112, + TEST_OUT_SIZE116, + TEST_OUT_SIZE120, + TEST_OUT_SIZE124, + TEST_OUT_SIZE128, + TEST_OUT_SIZE132, + TEST_OUT_SIZE136, + TEST_OUT_SIZE140, + TEST_OUT_SIZE144, + TEST_OUT_SIZE148, + TEST_OUT_SIZE152, + TEST_OUT_SIZE156, + TEST_OUT_SIZE160, + TEST_OUT_SIZE164, + TEST_OUT_SIZE168, + TEST_OUT_SIZE172, + TEST_OUT_SIZE176, + TEST_OUT_SIZE180, + TEST_OUT_SIZE184, + TEST_OUT_SIZE188, + TEST_OUT_SIZE192, + TEST_OUT_SIZE196, + TEST_OUT_SIZE200, + TEST_OUT_SIZE204, + TEST_OUT_SIZE208, + TEST_OUT_SIZE212, + TEST_OUT_SIZE216, + TEST_OUT_SIZE220, + TEST_OUT_SIZE224, + TEST_OUT_SIZE228, + TEST_OUT_SIZE232, + TEST_OUT_SIZE236, + TEST_OUT_SIZE240, + TEST_OUT_SIZE244, TEST_OUT_FULL_SUPP_TRACE, + __TEST_OUT_MAX, + /********* * INPUT * *********/ + __TEST_IN_MIN, + TEST_IN_UNDEF_NS, TEST_IN_NO_ROOM, + TEST_IN_NO_ROOM_OSS, + TEST_IN_DISABLED, TEST_IN_OFLAG, TEST_IN_BIT0, TEST_IN_BIT1, @@ -115,36 +173,107 @@ enum { TEST_IN_BIT10, TEST_IN_BIT11, TEST_IN_BIT22, + TEST_IN_SIZE4, + TEST_IN_SIZE8, + TEST_IN_SIZE12, + TEST_IN_SIZE16, + TEST_IN_SIZE20, + TEST_IN_SIZE24, + TEST_IN_SIZE28, + TEST_IN_SIZE32, + TEST_IN_SIZE36, + TEST_IN_SIZE40, + TEST_IN_SIZE44, + TEST_IN_SIZE48, + TEST_IN_SIZE52, + TEST_IN_SIZE56, + TEST_IN_SIZE60, + TEST_IN_SIZE64, + TEST_IN_SIZE68, + TEST_IN_SIZE72, + TEST_IN_SIZE76, + TEST_IN_SIZE80, + TEST_IN_SIZE84, + TEST_IN_SIZE88, + TEST_IN_SIZE92, + TEST_IN_SIZE96, + TEST_IN_SIZE100, + TEST_IN_SIZE104, + TEST_IN_SIZE108, + TEST_IN_SIZE112, + TEST_IN_SIZE116, + TEST_IN_SIZE120, + TEST_IN_SIZE124, + TEST_IN_SIZE128, + TEST_IN_SIZE132, + TEST_IN_SIZE136, + TEST_IN_SIZE140, + TEST_IN_SIZE144, + TEST_IN_SIZE148, + TEST_IN_SIZE152, + TEST_IN_SIZE156, + TEST_IN_SIZE160, + TEST_IN_SIZE164, + TEST_IN_SIZE168, + TEST_IN_SIZE172, + TEST_IN_SIZE176, + TEST_IN_SIZE180, + TEST_IN_SIZE184, + TEST_IN_SIZE188, + TEST_IN_SIZE192, + TEST_IN_SIZE196, + TEST_IN_SIZE200, + TEST_IN_SIZE204, + TEST_IN_SIZE208, + TEST_IN_SIZE212, + TEST_IN_SIZE216, + TEST_IN_SIZE220, + TEST_IN_SIZE224, + TEST_IN_SIZE228, + TEST_IN_SIZE232, + TEST_IN_SIZE236, + TEST_IN_SIZE240, + TEST_IN_SIZE244, TEST_IN_FULL_SUPP_TRACE, - /********** - * GLOBAL * - **********/ - TEST_FWD_FULL_SUPP_TRACE, + __TEST_IN_MAX, __TEST_MAX, }; -static int check_ioam_header(int tid, struct ioam6_trace_hdr *ioam6h, - __u32 trace_type, __u16 ioam_ns) +static int check_header(int tid, struct ioam6_trace_hdr *trace, + __u32 trace_type, __u8 trace_size, __u16 ioam_ns) { - if (__be16_to_cpu(ioam6h->namespace_id) != ioam_ns || - __be32_to_cpu(ioam6h->type_be32) != (trace_type << 8)) + if (__be16_to_cpu(trace->namespace_id) != ioam_ns || + __be32_to_cpu(trace->type_be32) != (trace_type << 8)) return 1; switch (tid) { case TEST_OUT_UNDEF_NS: case TEST_IN_UNDEF_NS: - return ioam6h->overflow || - ioam6h->nodelen != 1 || - ioam6h->remlen != 1; + case TEST_IN_DISABLED: + return trace->overflow == 1 || + trace->nodelen != 1 || + trace->remlen != 1; case TEST_OUT_NO_ROOM: case TEST_IN_NO_ROOM: case TEST_IN_OFLAG: - return !ioam6h->overflow || - ioam6h->nodelen != 2 || - ioam6h->remlen != 1; + return trace->overflow == 0 || + trace->nodelen != 2 || + trace->remlen != 1; + + case TEST_OUT_NO_ROOM_OSS: + return trace->overflow == 0 || + trace->nodelen != 0 || + trace->remlen != 1; + + case TEST_IN_NO_ROOM_OSS: + case TEST_OUT_BIT22: + case TEST_IN_BIT22: + return trace->overflow == 1 || + trace->nodelen != 0 || + trace->remlen != 0; case TEST_OUT_BIT0: case TEST_IN_BIT0: @@ -164,9 +293,9 @@ static int check_ioam_header(int tid, struct ioam6_trace_hdr *ioam6h, case TEST_IN_BIT7: case TEST_OUT_BIT11: case TEST_IN_BIT11: - return ioam6h->overflow || - ioam6h->nodelen != 1 || - ioam6h->remlen; + return trace->overflow == 1 || + trace->nodelen != 1 || + trace->remlen != 0; case TEST_OUT_BIT8: case TEST_IN_BIT8: @@ -174,22 +303,145 @@ static int check_ioam_header(int tid, struct ioam6_trace_hdr *ioam6h, case TEST_IN_BIT9: case TEST_OUT_BIT10: case TEST_IN_BIT10: - return ioam6h->overflow || - ioam6h->nodelen != 2 || - ioam6h->remlen; - - case TEST_OUT_BIT22: - case TEST_IN_BIT22: - return ioam6h->overflow || - ioam6h->nodelen || - ioam6h->remlen; + return trace->overflow == 1 || + trace->nodelen != 2 || + trace->remlen != 0; + + case TEST_OUT_SIZE4: + case TEST_OUT_SIZE8: + case TEST_OUT_SIZE12: + case TEST_OUT_SIZE16: + case TEST_OUT_SIZE20: + case TEST_OUT_SIZE24: + case TEST_OUT_SIZE28: + case TEST_OUT_SIZE32: + case TEST_OUT_SIZE36: + case TEST_OUT_SIZE40: + case TEST_OUT_SIZE44: + case TEST_OUT_SIZE48: + case TEST_OUT_SIZE52: + case TEST_OUT_SIZE56: + case TEST_OUT_SIZE60: + case TEST_OUT_SIZE64: + case TEST_OUT_SIZE68: + case TEST_OUT_SIZE72: + case TEST_OUT_SIZE76: + case TEST_OUT_SIZE80: + case TEST_OUT_SIZE84: + case TEST_OUT_SIZE88: + case TEST_OUT_SIZE92: + case TEST_OUT_SIZE96: + case TEST_OUT_SIZE100: + case TEST_OUT_SIZE104: + case TEST_OUT_SIZE108: + case TEST_OUT_SIZE112: + case TEST_OUT_SIZE116: + case TEST_OUT_SIZE120: + case TEST_OUT_SIZE124: + case TEST_OUT_SIZE128: + case TEST_OUT_SIZE132: + case TEST_OUT_SIZE136: + case TEST_OUT_SIZE140: + case TEST_OUT_SIZE144: + case TEST_OUT_SIZE148: + case TEST_OUT_SIZE152: + case TEST_OUT_SIZE156: + case TEST_OUT_SIZE160: + case TEST_OUT_SIZE164: + case TEST_OUT_SIZE168: + case TEST_OUT_SIZE172: + case TEST_OUT_SIZE176: + case TEST_OUT_SIZE180: + case TEST_OUT_SIZE184: + case TEST_OUT_SIZE188: + case TEST_OUT_SIZE192: + case TEST_OUT_SIZE196: + case TEST_OUT_SIZE200: + case TEST_OUT_SIZE204: + case TEST_OUT_SIZE208: + case TEST_OUT_SIZE212: + case TEST_OUT_SIZE216: + case TEST_OUT_SIZE220: + case TEST_OUT_SIZE224: + case TEST_OUT_SIZE228: + case TEST_OUT_SIZE232: + case TEST_OUT_SIZE236: + case TEST_OUT_SIZE240: + case TEST_OUT_SIZE244: + return trace->overflow == 1 || + trace->nodelen != 1 || + trace->remlen != trace_size / 4; + + case TEST_IN_SIZE4: + case TEST_IN_SIZE8: + case TEST_IN_SIZE12: + case TEST_IN_SIZE16: + case TEST_IN_SIZE20: + case TEST_IN_SIZE24: + case TEST_IN_SIZE28: + case TEST_IN_SIZE32: + case TEST_IN_SIZE36: + case TEST_IN_SIZE40: + case TEST_IN_SIZE44: + case TEST_IN_SIZE48: + case TEST_IN_SIZE52: + case TEST_IN_SIZE56: + case TEST_IN_SIZE60: + case TEST_IN_SIZE64: + case TEST_IN_SIZE68: + case TEST_IN_SIZE72: + case TEST_IN_SIZE76: + case TEST_IN_SIZE80: + case TEST_IN_SIZE84: + case TEST_IN_SIZE88: + case TEST_IN_SIZE92: + case TEST_IN_SIZE96: + case TEST_IN_SIZE100: + case TEST_IN_SIZE104: + case TEST_IN_SIZE108: + case TEST_IN_SIZE112: + case TEST_IN_SIZE116: + case TEST_IN_SIZE120: + case TEST_IN_SIZE124: + case TEST_IN_SIZE128: + case TEST_IN_SIZE132: + case TEST_IN_SIZE136: + case TEST_IN_SIZE140: + case TEST_IN_SIZE144: + case TEST_IN_SIZE148: + case TEST_IN_SIZE152: + case TEST_IN_SIZE156: + case TEST_IN_SIZE160: + case TEST_IN_SIZE164: + case TEST_IN_SIZE168: + case TEST_IN_SIZE172: + case TEST_IN_SIZE176: + case TEST_IN_SIZE180: + case TEST_IN_SIZE184: + case TEST_IN_SIZE188: + case TEST_IN_SIZE192: + case TEST_IN_SIZE196: + case TEST_IN_SIZE200: + case TEST_IN_SIZE204: + case TEST_IN_SIZE208: + case TEST_IN_SIZE212: + case TEST_IN_SIZE216: + case TEST_IN_SIZE220: + case TEST_IN_SIZE224: + case TEST_IN_SIZE228: + case TEST_IN_SIZE232: + case TEST_IN_SIZE236: + case TEST_IN_SIZE240: + case TEST_IN_SIZE244: + return trace->overflow == 1 || + trace->nodelen != 1 || + trace->remlen != (trace_size / 4) - trace->nodelen; case TEST_OUT_FULL_SUPP_TRACE: case TEST_IN_FULL_SUPP_TRACE: - case TEST_FWD_FULL_SUPP_TRACE: - return ioam6h->overflow || - ioam6h->nodelen != 15 || - ioam6h->remlen; + return trace->overflow == 1 || + trace->nodelen != 15 || + trace->remlen != 0; default: break; @@ -198,167 +450,137 @@ static int check_ioam_header(int tid, struct ioam6_trace_hdr *ioam6h, return 1; } -static int check_ioam6_data(__u8 **p, struct ioam6_trace_hdr *ioam6h, - const struct ioam_config cnf) +static int check_data(struct ioam6_trace_hdr *trace, __u8 trace_size, + const struct ioam_config cnf, bool is_output) { - unsigned int len; + unsigned int len, i; __u8 aligned; __u64 raw64; __u32 raw32; + __u8 *p; - if (ioam6h->type.bit0) { - raw32 = __be32_to_cpu(*((__u32 *)*p)); - if (cnf.hlim != (raw32 >> 24) || cnf.id != (raw32 & 0xffffff)) - return 1; - *p += sizeof(__u32); - } - - if (ioam6h->type.bit1) { - raw32 = __be32_to_cpu(*((__u32 *)*p)); - if (cnf.ingr_id != (raw32 >> 16) || - cnf.egr_id != (raw32 & 0xffff)) - return 1; - *p += sizeof(__u32); - } - - if (ioam6h->type.bit2) - *p += sizeof(__u32); - - if (ioam6h->type.bit3) - *p += sizeof(__u32); - - if (ioam6h->type.bit4) { - if (__be32_to_cpu(*((__u32 *)*p)) != 0xffffffff) - return 1; - *p += sizeof(__u32); - } - - if (ioam6h->type.bit5) { - if (__be32_to_cpu(*((__u32 *)*p)) != cnf.ns_data) - return 1; - *p += sizeof(__u32); - } - - if (ioam6h->type.bit6) - *p += sizeof(__u32); + if (trace->type.bit12 | trace->type.bit13 | trace->type.bit14 | + trace->type.bit15 | trace->type.bit16 | trace->type.bit17 | + trace->type.bit18 | trace->type.bit19 | trace->type.bit20 | + trace->type.bit21 | trace->type.bit23) + return 1; - if (ioam6h->type.bit7) { - if (__be32_to_cpu(*((__u32 *)*p)) != 0xffffffff) + for (i = 0; i < trace->remlen * 4; i++) { + if (trace->data[i] != 0) return 1; - *p += sizeof(__u32); } - if (ioam6h->type.bit8) { - raw64 = __be64_to_cpu(*((__u64 *)*p)); - if (cnf.hlim != (raw64 >> 56) || - cnf.wide != (raw64 & 0xffffffffffffff)) - return 1; - *p += sizeof(__u64); - } + if (trace->remlen * 4 == trace_size) + return 0; - if (ioam6h->type.bit9) { - if (__be32_to_cpu(*((__u32 *)*p)) != cnf.ingr_wide) - return 1; - *p += sizeof(__u32); + p = trace->data + trace->remlen * 4; - if (__be32_to_cpu(*((__u32 *)*p)) != cnf.egr_wide) + if (trace->type.bit0) { + raw32 = __be32_to_cpu(*((__u32 *)p)); + if (cnf.hlim != (raw32 >> 24) || cnf.id != (raw32 & 0xffffff)) return 1; - *p += sizeof(__u32); + p += sizeof(__u32); } - if (ioam6h->type.bit10) { - if (__be64_to_cpu(*((__u64 *)*p)) != cnf.ns_wide) + if (trace->type.bit1) { + raw32 = __be32_to_cpu(*((__u32 *)p)); + if (cnf.ingr_id != (raw32 >> 16) || + cnf.egr_id != (raw32 & 0xffff)) return 1; - *p += sizeof(__u64); + p += sizeof(__u32); } - if (ioam6h->type.bit11) { - if (__be32_to_cpu(*((__u32 *)*p)) != 0xffffffff) + if (trace->type.bit2) { + raw32 = __be32_to_cpu(*((__u32 *)p)); + if ((is_output && raw32 != 0xffffffff) || + (!is_output && (raw32 == 0 || raw32 == 0xffffffff))) return 1; - *p += sizeof(__u32); + p += sizeof(__u32); } - if (ioam6h->type.bit12) { - if (__be32_to_cpu(*((__u32 *)*p)) != 0xffffffff) + if (trace->type.bit3) { + raw32 = __be32_to_cpu(*((__u32 *)p)); + if ((is_output && raw32 != 0xffffffff) || + (!is_output && (raw32 == 0 || raw32 == 0xffffffff))) return 1; - *p += sizeof(__u32); + p += sizeof(__u32); } - if (ioam6h->type.bit13) { - if (__be32_to_cpu(*((__u32 *)*p)) != 0xffffffff) + if (trace->type.bit4) { + if (__be32_to_cpu(*((__u32 *)p)) != 0xffffffff) return 1; - *p += sizeof(__u32); + p += sizeof(__u32); } - if (ioam6h->type.bit14) { - if (__be32_to_cpu(*((__u32 *)*p)) != 0xffffffff) + if (trace->type.bit5) { + if (__be32_to_cpu(*((__u32 *)p)) != cnf.ns_data) return 1; - *p += sizeof(__u32); + p += sizeof(__u32); } - if (ioam6h->type.bit15) { - if (__be32_to_cpu(*((__u32 *)*p)) != 0xffffffff) + if (trace->type.bit6) { + if (__be32_to_cpu(*((__u32 *)p)) == 0xffffffff) return 1; - *p += sizeof(__u32); + p += sizeof(__u32); } - if (ioam6h->type.bit16) { - if (__be32_to_cpu(*((__u32 *)*p)) != 0xffffffff) + if (trace->type.bit7) { + if (__be32_to_cpu(*((__u32 *)p)) != 0xffffffff) return 1; - *p += sizeof(__u32); + p += sizeof(__u32); } - if (ioam6h->type.bit17) { - if (__be32_to_cpu(*((__u32 *)*p)) != 0xffffffff) + if (trace->type.bit8) { + raw64 = __be64_to_cpu(*((__u64 *)p)); + if (cnf.hlim != (raw64 >> 56) || + cnf.wide != (raw64 & 0xffffffffffffff)) return 1; - *p += sizeof(__u32); + p += sizeof(__u64); } - if (ioam6h->type.bit18) { - if (__be32_to_cpu(*((__u32 *)*p)) != 0xffffffff) + if (trace->type.bit9) { + if (__be32_to_cpu(*((__u32 *)p)) != cnf.ingr_wide) return 1; - *p += sizeof(__u32); - } + p += sizeof(__u32); - if (ioam6h->type.bit19) { - if (__be32_to_cpu(*((__u32 *)*p)) != 0xffffffff) + if (__be32_to_cpu(*((__u32 *)p)) != cnf.egr_wide) return 1; - *p += sizeof(__u32); + p += sizeof(__u32); } - if (ioam6h->type.bit20) { - if (__be32_to_cpu(*((__u32 *)*p)) != 0xffffffff) + if (trace->type.bit10) { + if (__be64_to_cpu(*((__u64 *)p)) != cnf.ns_wide) return 1; - *p += sizeof(__u32); + p += sizeof(__u64); } - if (ioam6h->type.bit21) { - if (__be32_to_cpu(*((__u32 *)*p)) != 0xffffffff) + if (trace->type.bit11) { + if (__be32_to_cpu(*((__u32 *)p)) != 0xffffffff) return 1; - *p += sizeof(__u32); + p += sizeof(__u32); } - if (ioam6h->type.bit22) { + if (trace->type.bit22) { len = cnf.sc_data ? strlen(cnf.sc_data) : 0; aligned = cnf.sc_data ? __ALIGN_KERNEL(len, 4) : 0; - raw32 = __be32_to_cpu(*((__u32 *)*p)); + raw32 = __be32_to_cpu(*((__u32 *)p)); if (aligned != (raw32 >> 24) * 4 || cnf.sc_id != (raw32 & 0xffffff)) return 1; - *p += sizeof(__u32); + p += sizeof(__u32); if (cnf.sc_data) { - if (strncmp((char *)*p, cnf.sc_data, len)) + if (strncmp((char *)p, cnf.sc_data, len)) return 1; - *p += len; + p += len; aligned -= len; while (aligned--) { - if (**p != '\0') + if (*p != '\0') return 1; - *p += sizeof(__u8); + p += sizeof(__u8); } } } @@ -366,151 +588,351 @@ static int check_ioam6_data(__u8 **p, struct ioam6_trace_hdr *ioam6h, return 0; } -static int check_ioam_header_and_data(int tid, struct ioam6_trace_hdr *ioam6h, - __u32 trace_type, __u16 ioam_ns) +static int check_ioam_trace(int tid, struct ioam6_trace_hdr *trace, + __u32 trace_type, __u8 trace_size, __u16 ioam_ns) { - __u8 *p; - - if (check_ioam_header(tid, ioam6h, trace_type, ioam_ns)) + if (check_header(tid, trace, trace_type, trace_size, ioam_ns)) return 1; - p = ioam6h->data + ioam6h->remlen * 4; - - switch (tid) { - case TEST_OUT_BIT0: - case TEST_OUT_BIT1: - case TEST_OUT_BIT2: - case TEST_OUT_BIT3: - case TEST_OUT_BIT4: - case TEST_OUT_BIT5: - case TEST_OUT_BIT6: - case TEST_OUT_BIT7: - case TEST_OUT_BIT8: - case TEST_OUT_BIT9: - case TEST_OUT_BIT10: - case TEST_OUT_BIT11: - case TEST_OUT_BIT22: - case TEST_OUT_FULL_SUPP_TRACE: - return check_ioam6_data(&p, ioam6h, node1); - - case TEST_IN_BIT0: - case TEST_IN_BIT1: - case TEST_IN_BIT2: - case TEST_IN_BIT3: - case TEST_IN_BIT4: - case TEST_IN_BIT5: - case TEST_IN_BIT6: - case TEST_IN_BIT7: - case TEST_IN_BIT8: - case TEST_IN_BIT9: - case TEST_IN_BIT10: - case TEST_IN_BIT11: - case TEST_IN_BIT22: - case TEST_IN_FULL_SUPP_TRACE: - { - __u32 tmp32 = node2.egr_wide; - __u16 tmp16 = node2.egr_id; - int res; - - node2.egr_id = 0xffff; - node2.egr_wide = 0xffffffff; + if (tid > __TEST_OUT_MIN && tid < __TEST_OUT_MAX) + return check_data(trace, trace_size, node1, true); - res = check_ioam6_data(&p, ioam6h, node2); - - node2.egr_id = tmp16; - node2.egr_wide = tmp32; - - return res; - } - - case TEST_FWD_FULL_SUPP_TRACE: - if (check_ioam6_data(&p, ioam6h, node3)) - return 1; - if (check_ioam6_data(&p, ioam6h, node2)) - return 1; - return check_ioam6_data(&p, ioam6h, node1); - - default: - break; - } + if (tid > __TEST_IN_MIN && tid < __TEST_IN_MAX) + return check_data(trace, trace_size, node2, false); return 1; } static int str2id(const char *tname) { - if (!strcmp("out_undef_ns", tname)) + if (!strcmp("output_undef_ns", tname)) return TEST_OUT_UNDEF_NS; - if (!strcmp("out_no_room", tname)) + if (!strcmp("output_no_room", tname)) return TEST_OUT_NO_ROOM; - if (!strcmp("out_bit0", tname)) + if (!strcmp("output_no_room_oss", tname)) + return TEST_OUT_NO_ROOM_OSS; + if (!strcmp("output_bit0", tname)) return TEST_OUT_BIT0; - if (!strcmp("out_bit1", tname)) + if (!strcmp("output_bit1", tname)) return TEST_OUT_BIT1; - if (!strcmp("out_bit2", tname)) + if (!strcmp("output_bit2", tname)) return TEST_OUT_BIT2; - if (!strcmp("out_bit3", tname)) + if (!strcmp("output_bit3", tname)) return TEST_OUT_BIT3; - if (!strcmp("out_bit4", tname)) + if (!strcmp("output_bit4", tname)) return TEST_OUT_BIT4; - if (!strcmp("out_bit5", tname)) + if (!strcmp("output_bit5", tname)) return TEST_OUT_BIT5; - if (!strcmp("out_bit6", tname)) + if (!strcmp("output_bit6", tname)) return TEST_OUT_BIT6; - if (!strcmp("out_bit7", tname)) + if (!strcmp("output_bit7", tname)) return TEST_OUT_BIT7; - if (!strcmp("out_bit8", tname)) + if (!strcmp("output_bit8", tname)) return TEST_OUT_BIT8; - if (!strcmp("out_bit9", tname)) + if (!strcmp("output_bit9", tname)) return TEST_OUT_BIT9; - if (!strcmp("out_bit10", tname)) + if (!strcmp("output_bit10", tname)) return TEST_OUT_BIT10; - if (!strcmp("out_bit11", tname)) + if (!strcmp("output_bit11", tname)) return TEST_OUT_BIT11; - if (!strcmp("out_bit22", tname)) + if (!strcmp("output_bit22", tname)) return TEST_OUT_BIT22; - if (!strcmp("out_full_supp_trace", tname)) + if (!strcmp("output_size4", tname)) + return TEST_OUT_SIZE4; + if (!strcmp("output_size8", tname)) + return TEST_OUT_SIZE8; + if (!strcmp("output_size12", tname)) + return TEST_OUT_SIZE12; + if (!strcmp("output_size16", tname)) + return TEST_OUT_SIZE16; + if (!strcmp("output_size20", tname)) + return TEST_OUT_SIZE20; + if (!strcmp("output_size24", tname)) + return TEST_OUT_SIZE24; + if (!strcmp("output_size28", tname)) + return TEST_OUT_SIZE28; + if (!strcmp("output_size32", tname)) + return TEST_OUT_SIZE32; + if (!strcmp("output_size36", tname)) + return TEST_OUT_SIZE36; + if (!strcmp("output_size40", tname)) + return TEST_OUT_SIZE40; + if (!strcmp("output_size44", tname)) + return TEST_OUT_SIZE44; + if (!strcmp("output_size48", tname)) + return TEST_OUT_SIZE48; + if (!strcmp("output_size52", tname)) + return TEST_OUT_SIZE52; + if (!strcmp("output_size56", tname)) + return TEST_OUT_SIZE56; + if (!strcmp("output_size60", tname)) + return TEST_OUT_SIZE60; + if (!strcmp("output_size64", tname)) + return TEST_OUT_SIZE64; + if (!strcmp("output_size68", tname)) + return TEST_OUT_SIZE68; + if (!strcmp("output_size72", tname)) + return TEST_OUT_SIZE72; + if (!strcmp("output_size76", tname)) + return TEST_OUT_SIZE76; + if (!strcmp("output_size80", tname)) + return TEST_OUT_SIZE80; + if (!strcmp("output_size84", tname)) + return TEST_OUT_SIZE84; + if (!strcmp("output_size88", tname)) + return TEST_OUT_SIZE88; + if (!strcmp("output_size92", tname)) + return TEST_OUT_SIZE92; + if (!strcmp("output_size96", tname)) + return TEST_OUT_SIZE96; + if (!strcmp("output_size100", tname)) + return TEST_OUT_SIZE100; + if (!strcmp("output_size104", tname)) + return TEST_OUT_SIZE104; + if (!strcmp("output_size108", tname)) + return TEST_OUT_SIZE108; + if (!strcmp("output_size112", tname)) + return TEST_OUT_SIZE112; + if (!strcmp("output_size116", tname)) + return TEST_OUT_SIZE116; + if (!strcmp("output_size120", tname)) + return TEST_OUT_SIZE120; + if (!strcmp("output_size124", tname)) + return TEST_OUT_SIZE124; + if (!strcmp("output_size128", tname)) + return TEST_OUT_SIZE128; + if (!strcmp("output_size132", tname)) + return TEST_OUT_SIZE132; + if (!strcmp("output_size136", tname)) + return TEST_OUT_SIZE136; + if (!strcmp("output_size140", tname)) + return TEST_OUT_SIZE140; + if (!strcmp("output_size144", tname)) + return TEST_OUT_SIZE144; + if (!strcmp("output_size148", tname)) + return TEST_OUT_SIZE148; + if (!strcmp("output_size152", tname)) + return TEST_OUT_SIZE152; + if (!strcmp("output_size156", tname)) + return TEST_OUT_SIZE156; + if (!strcmp("output_size160", tname)) + return TEST_OUT_SIZE160; + if (!strcmp("output_size164", tname)) + return TEST_OUT_SIZE164; + if (!strcmp("output_size168", tname)) + return TEST_OUT_SIZE168; + if (!strcmp("output_size172", tname)) + return TEST_OUT_SIZE172; + if (!strcmp("output_size176", tname)) + return TEST_OUT_SIZE176; + if (!strcmp("output_size180", tname)) + return TEST_OUT_SIZE180; + if (!strcmp("output_size184", tname)) + return TEST_OUT_SIZE184; + if (!strcmp("output_size188", tname)) + return TEST_OUT_SIZE188; + if (!strcmp("output_size192", tname)) + return TEST_OUT_SIZE192; + if (!strcmp("output_size196", tname)) + return TEST_OUT_SIZE196; + if (!strcmp("output_size200", tname)) + return TEST_OUT_SIZE200; + if (!strcmp("output_size204", tname)) + return TEST_OUT_SIZE204; + if (!strcmp("output_size208", tname)) + return TEST_OUT_SIZE208; + if (!strcmp("output_size212", tname)) + return TEST_OUT_SIZE212; + if (!strcmp("output_size216", tname)) + return TEST_OUT_SIZE216; + if (!strcmp("output_size220", tname)) + return TEST_OUT_SIZE220; + if (!strcmp("output_size224", tname)) + return TEST_OUT_SIZE224; + if (!strcmp("output_size228", tname)) + return TEST_OUT_SIZE228; + if (!strcmp("output_size232", tname)) + return TEST_OUT_SIZE232; + if (!strcmp("output_size236", tname)) + return TEST_OUT_SIZE236; + if (!strcmp("output_size240", tname)) + return TEST_OUT_SIZE240; + if (!strcmp("output_size244", tname)) + return TEST_OUT_SIZE244; + if (!strcmp("output_full_supp_trace", tname)) return TEST_OUT_FULL_SUPP_TRACE; - if (!strcmp("in_undef_ns", tname)) + if (!strcmp("input_undef_ns", tname)) return TEST_IN_UNDEF_NS; - if (!strcmp("in_no_room", tname)) + if (!strcmp("input_no_room", tname)) return TEST_IN_NO_ROOM; - if (!strcmp("in_oflag", tname)) + if (!strcmp("input_no_room_oss", tname)) + return TEST_IN_NO_ROOM_OSS; + if (!strcmp("input_disabled", tname)) + return TEST_IN_DISABLED; + if (!strcmp("input_oflag", tname)) return TEST_IN_OFLAG; - if (!strcmp("in_bit0", tname)) + if (!strcmp("input_bit0", tname)) return TEST_IN_BIT0; - if (!strcmp("in_bit1", tname)) + if (!strcmp("input_bit1", tname)) return TEST_IN_BIT1; - if (!strcmp("in_bit2", tname)) + if (!strcmp("input_bit2", tname)) return TEST_IN_BIT2; - if (!strcmp("in_bit3", tname)) + if (!strcmp("input_bit3", tname)) return TEST_IN_BIT3; - if (!strcmp("in_bit4", tname)) + if (!strcmp("input_bit4", tname)) return TEST_IN_BIT4; - if (!strcmp("in_bit5", tname)) + if (!strcmp("input_bit5", tname)) return TEST_IN_BIT5; - if (!strcmp("in_bit6", tname)) + if (!strcmp("input_bit6", tname)) return TEST_IN_BIT6; - if (!strcmp("in_bit7", tname)) + if (!strcmp("input_bit7", tname)) return TEST_IN_BIT7; - if (!strcmp("in_bit8", tname)) + if (!strcmp("input_bit8", tname)) return TEST_IN_BIT8; - if (!strcmp("in_bit9", tname)) + if (!strcmp("input_bit9", tname)) return TEST_IN_BIT9; - if (!strcmp("in_bit10", tname)) + if (!strcmp("input_bit10", tname)) return TEST_IN_BIT10; - if (!strcmp("in_bit11", tname)) + if (!strcmp("input_bit11", tname)) return TEST_IN_BIT11; - if (!strcmp("in_bit22", tname)) + if (!strcmp("input_bit22", tname)) return TEST_IN_BIT22; - if (!strcmp("in_full_supp_trace", tname)) + if (!strcmp("input_size4", tname)) + return TEST_IN_SIZE4; + if (!strcmp("input_size8", tname)) + return TEST_IN_SIZE8; + if (!strcmp("input_size12", tname)) + return TEST_IN_SIZE12; + if (!strcmp("input_size16", tname)) + return TEST_IN_SIZE16; + if (!strcmp("input_size20", tname)) + return TEST_IN_SIZE20; + if (!strcmp("input_size24", tname)) + return TEST_IN_SIZE24; + if (!strcmp("input_size28", tname)) + return TEST_IN_SIZE28; + if (!strcmp("input_size32", tname)) + return TEST_IN_SIZE32; + if (!strcmp("input_size36", tname)) + return TEST_IN_SIZE36; + if (!strcmp("input_size40", tname)) + return TEST_IN_SIZE40; + if (!strcmp("input_size44", tname)) + return TEST_IN_SIZE44; + if (!strcmp("input_size48", tname)) + return TEST_IN_SIZE48; + if (!strcmp("input_size52", tname)) + return TEST_IN_SIZE52; + if (!strcmp("input_size56", tname)) + return TEST_IN_SIZE56; + if (!strcmp("input_size60", tname)) + return TEST_IN_SIZE60; + if (!strcmp("input_size64", tname)) + return TEST_IN_SIZE64; + if (!strcmp("input_size68", tname)) + return TEST_IN_SIZE68; + if (!strcmp("input_size72", tname)) + return TEST_IN_SIZE72; + if (!strcmp("input_size76", tname)) + return TEST_IN_SIZE76; + if (!strcmp("input_size80", tname)) + return TEST_IN_SIZE80; + if (!strcmp("input_size84", tname)) + return TEST_IN_SIZE84; + if (!strcmp("input_size88", tname)) + return TEST_IN_SIZE88; + if (!strcmp("input_size92", tname)) + return TEST_IN_SIZE92; + if (!strcmp("input_size96", tname)) + return TEST_IN_SIZE96; + if (!strcmp("input_size100", tname)) + return TEST_IN_SIZE100; + if (!strcmp("input_size104", tname)) + return TEST_IN_SIZE104; + if (!strcmp("input_size108", tname)) + return TEST_IN_SIZE108; + if (!strcmp("input_size112", tname)) + return TEST_IN_SIZE112; + if (!strcmp("input_size116", tname)) + return TEST_IN_SIZE116; + if (!strcmp("input_size120", tname)) + return TEST_IN_SIZE120; + if (!strcmp("input_size124", tname)) + return TEST_IN_SIZE124; + if (!strcmp("input_size128", tname)) + return TEST_IN_SIZE128; + if (!strcmp("input_size132", tname)) + return TEST_IN_SIZE132; + if (!strcmp("input_size136", tname)) + return TEST_IN_SIZE136; + if (!strcmp("input_size140", tname)) + return TEST_IN_SIZE140; + if (!strcmp("input_size144", tname)) + return TEST_IN_SIZE144; + if (!strcmp("input_size148", tname)) + return TEST_IN_SIZE148; + if (!strcmp("input_size152", tname)) + return TEST_IN_SIZE152; + if (!strcmp("input_size156", tname)) + return TEST_IN_SIZE156; + if (!strcmp("input_size160", tname)) + return TEST_IN_SIZE160; + if (!strcmp("input_size164", tname)) + return TEST_IN_SIZE164; + if (!strcmp("input_size168", tname)) + return TEST_IN_SIZE168; + if (!strcmp("input_size172", tname)) + return TEST_IN_SIZE172; + if (!strcmp("input_size176", tname)) + return TEST_IN_SIZE176; + if (!strcmp("input_size180", tname)) + return TEST_IN_SIZE180; + if (!strcmp("input_size184", tname)) + return TEST_IN_SIZE184; + if (!strcmp("input_size188", tname)) + return TEST_IN_SIZE188; + if (!strcmp("input_size192", tname)) + return TEST_IN_SIZE192; + if (!strcmp("input_size196", tname)) + return TEST_IN_SIZE196; + if (!strcmp("input_size200", tname)) + return TEST_IN_SIZE200; + if (!strcmp("input_size204", tname)) + return TEST_IN_SIZE204; + if (!strcmp("input_size208", tname)) + return TEST_IN_SIZE208; + if (!strcmp("input_size212", tname)) + return TEST_IN_SIZE212; + if (!strcmp("input_size216", tname)) + return TEST_IN_SIZE216; + if (!strcmp("input_size220", tname)) + return TEST_IN_SIZE220; + if (!strcmp("input_size224", tname)) + return TEST_IN_SIZE224; + if (!strcmp("input_size228", tname)) + return TEST_IN_SIZE228; + if (!strcmp("input_size232", tname)) + return TEST_IN_SIZE232; + if (!strcmp("input_size236", tname)) + return TEST_IN_SIZE236; + if (!strcmp("input_size240", tname)) + return TEST_IN_SIZE240; + if (!strcmp("input_size244", tname)) + return TEST_IN_SIZE244; + if (!strcmp("input_full_supp_trace", tname)) return TEST_IN_FULL_SUPP_TRACE; - if (!strcmp("fwd_full_supp_trace", tname)) - return TEST_FWD_FULL_SUPP_TRACE; return -1; } +static int ipv6_addr_equal(const struct in6_addr *a1, const struct in6_addr *a2) +{ + return ((a1->s6_addr32[0] ^ a2->s6_addr32[0]) | + (a1->s6_addr32[1] ^ a2->s6_addr32[1]) | + (a1->s6_addr32[2] ^ a2->s6_addr32[2]) | + (a1->s6_addr32[3] ^ a2->s6_addr32[3])) == 0; +} + static int get_u32(__u32 *val, const char *arg, int base) { unsigned long res; @@ -555,119 +977,124 @@ static int get_u16(__u16 *val, const char *arg, int base) return 0; } -static int (*func[__TEST_MAX])(int, struct ioam6_trace_hdr *, __u32, __u16) = { - [TEST_OUT_UNDEF_NS] = check_ioam_header, - [TEST_OUT_NO_ROOM] = check_ioam_header, - [TEST_OUT_BIT0] = check_ioam_header_and_data, - [TEST_OUT_BIT1] = check_ioam_header_and_data, - [TEST_OUT_BIT2] = check_ioam_header_and_data, - [TEST_OUT_BIT3] = check_ioam_header_and_data, - [TEST_OUT_BIT4] = check_ioam_header_and_data, - [TEST_OUT_BIT5] = check_ioam_header_and_data, - [TEST_OUT_BIT6] = check_ioam_header_and_data, - [TEST_OUT_BIT7] = check_ioam_header_and_data, - [TEST_OUT_BIT8] = check_ioam_header_and_data, - [TEST_OUT_BIT9] = check_ioam_header_and_data, - [TEST_OUT_BIT10] = check_ioam_header_and_data, - [TEST_OUT_BIT11] = check_ioam_header_and_data, - [TEST_OUT_BIT22] = check_ioam_header_and_data, - [TEST_OUT_FULL_SUPP_TRACE] = check_ioam_header_and_data, - [TEST_IN_UNDEF_NS] = check_ioam_header, - [TEST_IN_NO_ROOM] = check_ioam_header, - [TEST_IN_OFLAG] = check_ioam_header, - [TEST_IN_BIT0] = check_ioam_header_and_data, - [TEST_IN_BIT1] = check_ioam_header_and_data, - [TEST_IN_BIT2] = check_ioam_header_and_data, - [TEST_IN_BIT3] = check_ioam_header_and_data, - [TEST_IN_BIT4] = check_ioam_header_and_data, - [TEST_IN_BIT5] = check_ioam_header_and_data, - [TEST_IN_BIT6] = check_ioam_header_and_data, - [TEST_IN_BIT7] = check_ioam_header_and_data, - [TEST_IN_BIT8] = check_ioam_header_and_data, - [TEST_IN_BIT9] = check_ioam_header_and_data, - [TEST_IN_BIT10] = check_ioam_header_and_data, - [TEST_IN_BIT11] = check_ioam_header_and_data, - [TEST_IN_BIT22] = check_ioam_header_and_data, - [TEST_IN_FULL_SUPP_TRACE] = check_ioam_header_and_data, - [TEST_FWD_FULL_SUPP_TRACE] = check_ioam_header_and_data, -}; +static int get_u8(__u8 *val, const char *arg, int base) +{ + unsigned long res; + char *ptr; + + if (!arg || !*arg) + return -1; + res = strtoul(arg, &ptr, base); + + if (!ptr || ptr == arg || *ptr) + return -1; + + if (res == ULONG_MAX && errno == ERANGE) + return -1; + + if (res > 0xFFUL) + return -1; + + *val = res; + return 0; +} int main(int argc, char **argv) { - int fd, size, hoplen, tid, ret = 1, on = 1; - struct ioam6_hdr *opt; - struct cmsghdr *cmsg; - struct msghdr msg; - struct iovec iov; - __u8 buffer[512]; + __u8 buffer[512], *ptr, nexthdr, tr_size; + struct ioam6_trace_hdr *trace; + unsigned int hoplen, ret = 1; + struct ipv6_hopopt_hdr *hbh; + int fd, size, testname_id; + struct in6_addr src, dst; + struct ioam6_hdr *ioam6; + struct timeval timeout; + struct ipv6hdr *ipv6; __u32 tr_type; __u16 ioam_ns; - __u8 *ptr; - if (argc != 5) + if (argc != 9) goto out; - tid = str2id(argv[1]); - if (tid < 0 || !func[tid]) - goto out; + testname_id = str2id(argv[2]); - if (get_u32(&tr_type, argv[2], 16) || - get_u16(&ioam_ns, argv[3], 0)) + if (testname_id < 0 || + inet_pton(AF_INET6, argv[3], &src) != 1 || + inet_pton(AF_INET6, argv[4], &dst) != 1 || + get_u32(&tr_type, argv[5], 16) || + get_u8(&tr_size, argv[6], 0) || + get_u16(&ioam_ns, argv[7], 0)) goto out; - fd = socket(PF_INET6, SOCK_RAW, - !strcmp(argv[4], "encap") ? IPPROTO_IPV6 : IPPROTO_ICMPV6); + nexthdr = (!strcmp(argv[8], "encap") ? IPPROTO_IPV6 : IPPROTO_ICMPV6); + + hoplen = sizeof(*hbh); + hoplen += 2; // 2-byte padding for alignment + hoplen += sizeof(*ioam6); // IOAM option header + hoplen += sizeof(*trace); // IOAM trace header + hoplen += tr_size; // IOAM trace size + hoplen += (tr_size % 8); // optional padding + + fd = socket(AF_PACKET, SOCK_DGRAM, __cpu_to_be16(ETH_P_IPV6)); if (fd < 0) goto out; - setsockopt(fd, IPPROTO_IPV6, IPV6_RECVHOPOPTS, &on, sizeof(on)); + if (setsockopt(fd, SOL_SOCKET, SO_BINDTODEVICE, + argv[1], strlen(argv[1]))) + goto close; - iov.iov_len = 1; - iov.iov_base = malloc(CMSG_SPACE(sizeof(buffer))); - if (!iov.iov_base) + timeout.tv_sec = 1; + timeout.tv_usec = 0; + if (setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, + (const char *)&timeout, sizeof(timeout))) goto close; recv: - memset(&msg, 0, sizeof(msg)); - msg.msg_iov = &iov; - msg.msg_iovlen = 1; - msg.msg_control = buffer; - msg.msg_controllen = CMSG_SPACE(sizeof(buffer)); - - size = recvmsg(fd, &msg, 0); + size = recv(fd, buffer, sizeof(buffer), 0); if (size <= 0) goto close; - for (cmsg = CMSG_FIRSTHDR(&msg); cmsg; cmsg = CMSG_NXTHDR(&msg, cmsg)) { - if (cmsg->cmsg_level != IPPROTO_IPV6 || - cmsg->cmsg_type != IPV6_HOPOPTS || - cmsg->cmsg_len < sizeof(struct ipv6_hopopt_hdr)) - continue; + ipv6 = (struct ipv6hdr *)buffer; + + /* Skip packets that do not have the expected src/dst address or that + * do not have a Hop-by-hop. + */ + if (!ipv6_addr_equal(&ipv6->saddr, &src) || + !ipv6_addr_equal(&ipv6->daddr, &dst) || + ipv6->nexthdr != IPPROTO_HOPOPTS) + goto recv; + + /* Check Hbh's Next Header and Size. */ + hbh = (struct ipv6_hopopt_hdr *)(buffer + sizeof(*ipv6)); + if (hbh->nexthdr != nexthdr || hbh->hdrlen != (hoplen >> 3) - 1) + goto close; - ptr = (__u8 *)CMSG_DATA(cmsg); + /* Check we have a 2-byte padding for alignment. */ + ptr = (__u8 *)hbh + sizeof(*hbh); + if (ptr[0] != IPV6_TLV_PADN && ptr[1] != 0) + goto close; - hoplen = (ptr[1] + 1) << 3; - ptr += sizeof(struct ipv6_hopopt_hdr); + /* Check we now have the IOAM option. */ + ptr += 2; + if (ptr[0] != IPV6_TLV_IOAM) + goto close; - while (hoplen > 0) { - opt = (struct ioam6_hdr *)ptr; + /* Check its size and the IOAM option type. */ + ioam6 = (struct ioam6_hdr *)ptr; + if (ioam6->opt_len != sizeof(*ioam6) - 2 + sizeof(*trace) + tr_size || + ioam6->type != IOAM6_TYPE_PREALLOC) + goto close; - if (opt->opt_type == IPV6_TLV_IOAM && - opt->type == IOAM6_TYPE_PREALLOC) { - ptr += sizeof(*opt); - ret = func[tid](tid, - (struct ioam6_trace_hdr *)ptr, - tr_type, ioam_ns); - goto close; - } + trace = (struct ioam6_trace_hdr *)(ptr + sizeof(*ioam6)); - ptr += opt->opt_len + 2; - hoplen -= opt->opt_len + 2; - } - } + /* Check the trailing 4-byte padding (potentially). */ + ptr = (__u8 *)trace + sizeof(*trace) + tr_size; + if (tr_size % 8 && ptr[0] != IPV6_TLV_PADN && ptr[1] != 2 && + ptr[2] != 0 && ptr[3] != 0) + goto close; - goto recv; + /* Check the IOAM header and data. */ + ret = check_ioam_trace(testname_id, trace, tr_type, tr_size, ioam_ns); close: - free(iov.iov_base); close(fd); out: return ret; diff --git a/tools/testing/selftests/net/ipv6_route_update_soft_lockup.sh b/tools/testing/selftests/net/ipv6_route_update_soft_lockup.sh new file mode 100755 index 000000000000..a6b2b1f9c641 --- /dev/null +++ b/tools/testing/selftests/net/ipv6_route_update_soft_lockup.sh @@ -0,0 +1,262 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 +# +# Testing for potential kernel soft lockup during IPv6 routing table +# refresh under heavy outgoing IPv6 traffic. If a kernel soft lockup +# occurs, a kernel panic will be triggered to prevent associated issues. +# +# +# Test Environment Layout +# +# ┌----------------┐ ┌----------------┐ +# | SOURCE_NS | | SINK_NS | +# | NAMESPACE | | NAMESPACE | +# |(iperf3 clients)| |(iperf3 servers)| +# | | | | +# | | | | +# | ┌-----------| nexthops |---------┐ | +# | |veth_source|<--------------------------------------->|veth_sink|<┐ | +# | └-----------|2001:0DB8:1::0:1/96 2001:0DB8:1::1:1/96 |---------┘ | | +# | | ^ 2001:0DB8:1::1:2/96 | | | +# | | . . | fwd | | +# | ┌---------┐ | . . | | | +# | | IPv6 | | . . | V | +# | | routing | | . 2001:0DB8:1::1:80/96| ┌-----┐ | +# | | table | | . | | lo | | +# | | nexthop | | . └--------┴-----┴-┘ +# | | update | | ............................> 2001:0DB8:2::1:1/128 +# | └-------- ┘ | +# └----------------┘ +# +# The test script sets up two network namespaces, source_ns and sink_ns, +# connected via a veth link. Within source_ns, it continuously updates the +# IPv6 routing table by flushing and inserting IPV6_NEXTHOP_ADDR_COUNT nexthop +# IPs destined for SINK_LOOPBACK_IP_ADDR in sink_ns. This refresh occurs at a +# rate of 1/ROUTING_TABLE_REFRESH_PERIOD per second for TEST_DURATION seconds. +# +# Simultaneously, multiple iperf3 clients within source_ns generate heavy +# outgoing IPv6 traffic. Each client is assigned a unique port number starting +# at 5000 and incrementing sequentially. Each client targets a unique iperf3 +# server running in sink_ns, connected to the SINK_LOOPBACK_IFACE interface +# using the same port number. +# +# The number of iperf3 servers and clients is set to half of the total +# available cores on each machine. +# +# NOTE: We have tested this script on machines with various CPU specifications, +# ranging from lower to higher performance as listed below. The test script +# effectively triggered a kernel soft lockup on machines running an unpatched +# kernel in under a minute: +# +# - 1x Intel Xeon E-2278G 8-Core Processor @ 3.40GHz +# - 1x Intel Xeon E-2378G Processor 8-Core @ 2.80GHz +# - 1x AMD EPYC 7401P 24-Core Processor @ 2.00GHz +# - 1x AMD EPYC 7402P 24-Core Processor @ 2.80GHz +# - 2x Intel Xeon Gold 5120 14-Core Processor @ 2.20GHz +# - 1x Ampere Altra Q80-30 80-Core Processor @ 3.00GHz +# - 2x Intel Xeon Gold 5120 14-Core Processor @ 2.20GHz +# - 2x Intel Xeon Silver 4214 24-Core Processor @ 2.20GHz +# - 1x AMD EPYC 7502P 32-Core @ 2.50GHz +# - 1x Intel Xeon Gold 6314U 32-Core Processor @ 2.30GHz +# - 2x Intel Xeon Gold 6338 32-Core Processor @ 2.00GHz +# +# On less performant machines, you may need to increase the TEST_DURATION +# parameter to enhance the likelihood of encountering a race condition leading +# to a kernel soft lockup and avoid a false negative result. +# +# NOTE: The test may not produce the expected result in virtualized +# environments (e.g., qemu) due to differences in timing and CPU handling, +# which can affect the conditions needed to trigger a soft lockup. + +source lib.sh +source net_helper.sh + +TEST_DURATION=300 +ROUTING_TABLE_REFRESH_PERIOD=0.01 + +IPERF3_BITRATE="300m" + + +IPV6_NEXTHOP_ADDR_COUNT="128" +IPV6_NEXTHOP_ADDR_MASK="96" +IPV6_NEXTHOP_PREFIX="2001:0DB8:1" + + +SOURCE_TEST_IFACE="veth_source" +SOURCE_TEST_IP_ADDR="2001:0DB8:1::0:1/96" + +SINK_TEST_IFACE="veth_sink" +# ${SINK_TEST_IFACE} is populated with the following range of IPv6 addresses: +# 2001:0DB8:1::1:1 to 2001:0DB8:1::1:${IPV6_NEXTHOP_ADDR_COUNT} +SINK_LOOPBACK_IFACE="lo" +SINK_LOOPBACK_IP_MASK="128" +SINK_LOOPBACK_IP_ADDR="2001:0DB8:2::1:1" + +nexthop_ip_list="" +termination_signal="" +kernel_softlokup_panic_prev_val="" + +terminate_ns_processes_by_pattern() { + local ns=$1 + local pattern=$2 + + for pid in $(ip netns pids ${ns}); do + [ -e /proc/$pid/cmdline ] && grep -qe "${pattern}" /proc/$pid/cmdline && kill -9 $pid + done +} + +cleanup() { + echo "info: cleaning up namespaces and terminating all processes within them..." + + + # Terminate iperf3 instances running in the source_ns. To avoid race + # conditions, first iterate over the PIDs and terminate those + # associated with the bash shells running the + # `while true; do iperf3 -c ...; done` loops. In a second iteration, + # terminate the individual `iperf3 -c ...` instances. + terminate_ns_processes_by_pattern ${source_ns} while + terminate_ns_processes_by_pattern ${source_ns} iperf3 + + # Repeat the same process for sink_ns + terminate_ns_processes_by_pattern ${sink_ns} while + terminate_ns_processes_by_pattern ${sink_ns} iperf3 + + # Check if any iperf3 instances are still running. This could happen + # if a core has entered an infinite loop and the timeout for detecting + # the soft lockup has not expired, but either the test interval has + # already elapsed or the test was terminated manually (e.g., with ^C) + for pid in $(ip netns pids ${source_ns}); do + if [ -e /proc/$pid/cmdline ] && grep -qe 'iperf3' /proc/$pid/cmdline; then + echo "FAIL: unable to terminate some iperf3 instances. Soft lockup is underway. A kernel panic is on the way!" + exit ${ksft_fail} + fi + done + + if [ "$termination_signal" == "SIGINT" ]; then + echo "SKIP: Termination due to ^C (SIGINT)" + elif [ "$termination_signal" == "SIGALRM" ]; then + echo "PASS: No kernel soft lockup occurred during this ${TEST_DURATION} second test" + fi + + cleanup_ns ${source_ns} ${sink_ns} + + sysctl -qw kernel.softlockup_panic=${kernel_softlokup_panic_prev_val} +} + +setup_prepare() { + setup_ns source_ns sink_ns + + ip -n ${source_ns} link add name ${SOURCE_TEST_IFACE} type veth peer name ${SINK_TEST_IFACE} netns ${sink_ns} + + # Setting up the Source namespace + ip -n ${source_ns} addr add ${SOURCE_TEST_IP_ADDR} dev ${SOURCE_TEST_IFACE} + ip -n ${source_ns} link set dev ${SOURCE_TEST_IFACE} qlen 10000 + ip -n ${source_ns} link set dev ${SOURCE_TEST_IFACE} up + ip netns exec ${source_ns} sysctl -qw net.ipv6.fib_multipath_hash_policy=1 + + # Setting up the Sink namespace + ip -n ${sink_ns} addr add ${SINK_LOOPBACK_IP_ADDR}/${SINK_LOOPBACK_IP_MASK} dev ${SINK_LOOPBACK_IFACE} + ip -n ${sink_ns} link set dev ${SINK_LOOPBACK_IFACE} up + ip netns exec ${sink_ns} sysctl -qw net.ipv6.conf.${SINK_LOOPBACK_IFACE}.forwarding=1 + + ip -n ${sink_ns} link set ${SINK_TEST_IFACE} up + ip netns exec ${sink_ns} sysctl -qw net.ipv6.conf.${SINK_TEST_IFACE}.forwarding=1 + + + # Populate nexthop IPv6 addresses on the test interface in the sink_ns + echo "info: populating ${IPV6_NEXTHOP_ADDR_COUNT} IPv6 addresses on the ${SINK_TEST_IFACE} interface ..." + for IP in $(seq 1 ${IPV6_NEXTHOP_ADDR_COUNT}); do + ip -n ${sink_ns} addr add ${IPV6_NEXTHOP_PREFIX}::$(printf "1:%x" "${IP}")/${IPV6_NEXTHOP_ADDR_MASK} dev ${SINK_TEST_IFACE}; + done + + # Preparing list of nexthops + for IP in $(seq 1 ${IPV6_NEXTHOP_ADDR_COUNT}); do + nexthop_ip_list=$nexthop_ip_list" nexthop via ${IPV6_NEXTHOP_PREFIX}::$(printf "1:%x" $IP) dev ${SOURCE_TEST_IFACE} weight 1" + done +} + + +test_soft_lockup_during_routing_table_refresh() { + # Start num_of_iperf_servers iperf3 servers in the sink_ns namespace, + # each listening on ports starting at 5001 and incrementing + # sequentially. Since iperf3 instances may terminate unexpectedly, a + # while loop is used to automatically restart them in such cases. + echo "info: starting ${num_of_iperf_servers} iperf3 servers in the sink_ns namespace ..." + for i in $(seq 1 ${num_of_iperf_servers}); do + cmd="iperf3 --bind ${SINK_LOOPBACK_IP_ADDR} -s -p $(printf '5%03d' ${i}) --rcv-timeout 200 &>/dev/null" + ip netns exec ${sink_ns} bash -c "while true; do ${cmd}; done &" &>/dev/null + done + + # Wait for the iperf3 servers to be ready + for i in $(seq ${num_of_iperf_servers}); do + port=$(printf '5%03d' ${i}); + wait_local_port_listen ${sink_ns} ${port} tcp + done + + # Continuously refresh the routing table in the background within + # the source_ns namespace + ip netns exec ${source_ns} bash -c " + while \$(ip netns list | grep -q ${source_ns}); do + ip -6 route add ${SINK_LOOPBACK_IP_ADDR}/${SINK_LOOPBACK_IP_MASK} ${nexthop_ip_list}; + sleep ${ROUTING_TABLE_REFRESH_PERIOD}; + ip -6 route delete ${SINK_LOOPBACK_IP_ADDR}/${SINK_LOOPBACK_IP_MASK}; + done &" + + # Start num_of_iperf_servers iperf3 clients in the source_ns namespace, + # each sending TCP traffic on sequential ports starting at 5001. + # Since iperf3 instances may terminate unexpectedly (e.g., if the route + # to the server is deleted in the background during a route refresh), a + # while loop is used to automatically restart them in such cases. + echo "info: starting ${num_of_iperf_servers} iperf3 clients in the source_ns namespace ..." + for i in $(seq 1 ${num_of_iperf_servers}); do + cmd="iperf3 -c ${SINK_LOOPBACK_IP_ADDR} -p $(printf '5%03d' ${i}) --length 64 --bitrate ${IPERF3_BITRATE} -t 0 --connect-timeout 150 &>/dev/null" + ip netns exec ${source_ns} bash -c "while true; do ${cmd}; done &" &>/dev/null + done + + echo "info: IPv6 routing table is being updated at the rate of $(echo "1/${ROUTING_TABLE_REFRESH_PERIOD}" | bc)/s for ${TEST_DURATION} seconds ..." + echo "info: A kernel soft lockup, if detected, results in a kernel panic!" + + wait +} + +# Make sure 'iperf3' is installed, skip the test otherwise +if [ ! -x "$(command -v "iperf3")" ]; then + echo "SKIP: 'iperf3' is not installed. Skipping the test." + exit ${ksft_skip} +fi + +# Determine the number of cores on the machine +num_of_iperf_servers=$(( $(nproc)/2 )) + +# Check if we are running on a multi-core machine, skip the test otherwise +if [ "${num_of_iperf_servers}" -eq 0 ]; then + echo "SKIP: This test is not valid on a single core machine!" + exit ${ksft_skip} +fi + +# Since the kernel soft lockup we're testing causes at least one core to enter +# an infinite loop, destabilizing the host and likely affecting subsequent +# tests, we trigger a kernel panic instead of reporting a failure and +# continuing +kernel_softlokup_panic_prev_val=$(sysctl -n kernel.softlockup_panic) +sysctl -qw kernel.softlockup_panic=1 + +handle_sigint() { + termination_signal="SIGINT" + cleanup + exit ${ksft_skip} +} + +handle_sigalrm() { + termination_signal="SIGALRM" + cleanup + exit ${ksft_pass} +} + +trap handle_sigint SIGINT +trap handle_sigalrm SIGALRM + +(sleep ${TEST_DURATION} && kill -s SIGALRM $$)& + +setup_prepare +test_soft_lockup_during_routing_table_refresh diff --git a/tools/testing/selftests/net/lib.sh b/tools/testing/selftests/net/lib.sh index be8707bfb46e..8994fec1c38f 100644 --- a/tools/testing/selftests/net/lib.sh +++ b/tools/testing/selftests/net/lib.sh @@ -1,11 +1,17 @@ #!/bin/bash # SPDX-License-Identifier: GPL-2.0 +net_dir=$(dirname "$(readlink -e "${BASH_SOURCE[0]}")") +source "$net_dir/lib/sh/defer.sh" + ############################################################################## # Defines : "${WAIT_TIMEOUT:=20}" +# Whether to pause on after a failure. +: "${PAUSE_ON_FAIL:=no}" + BUSYWAIT_TIMEOUT=$((WAIT_TIMEOUT * 1000)) # ms # Kselftest framework constants. @@ -17,6 +23,11 @@ ksft_skip=4 # namespace list created by setup_ns NS_LIST=() +# Exit status to return at the end. Set in case one of the tests fails. +EXIT_STATUS=0 +# Per-test return value. Clear at the beginning of each test. +RET=0 + ############################################################################## # Helpers @@ -233,3 +244,218 @@ tc_rule_handle_stats_get() | jq ".[] | select(.options.handle == $handle) | \ .options.actions[0].stats$selector" } + +ret_set_ksft_status() +{ + local ksft_status=$1; shift + local msg=$1; shift + + RET=$(ksft_status_merge $RET $ksft_status) + if (( $? )); then + retmsg=$msg + fi +} + +log_test_result() +{ + local test_name=$1; shift + local opt_str=$1; shift + local result=$1; shift + local retmsg=$1; shift + + printf "TEST: %-60s [%s]\n" "$test_name $opt_str" "$result" + if [[ $retmsg ]]; then + printf "\t%s\n" "$retmsg" + fi +} + +pause_on_fail() +{ + if [[ $PAUSE_ON_FAIL == yes ]]; then + echo "Hit enter to continue, 'q' to quit" + read a + [[ $a == q ]] && exit 1 + fi +} + +handle_test_result_pass() +{ + local test_name=$1; shift + local opt_str=$1; shift + + log_test_result "$test_name" "$opt_str" " OK " +} + +handle_test_result_fail() +{ + local test_name=$1; shift + local opt_str=$1; shift + + log_test_result "$test_name" "$opt_str" FAIL "$retmsg" + pause_on_fail +} + +handle_test_result_xfail() +{ + local test_name=$1; shift + local opt_str=$1; shift + + log_test_result "$test_name" "$opt_str" XFAIL "$retmsg" + pause_on_fail +} + +handle_test_result_skip() +{ + local test_name=$1; shift + local opt_str=$1; shift + + log_test_result "$test_name" "$opt_str" SKIP "$retmsg" +} + +log_test() +{ + local test_name=$1 + local opt_str=$2 + + if [[ $# -eq 2 ]]; then + opt_str="($opt_str)" + fi + + if ((RET == ksft_pass)); then + handle_test_result_pass "$test_name" "$opt_str" + elif ((RET == ksft_xfail)); then + handle_test_result_xfail "$test_name" "$opt_str" + elif ((RET == ksft_skip)); then + handle_test_result_skip "$test_name" "$opt_str" + else + handle_test_result_fail "$test_name" "$opt_str" + fi + + EXIT_STATUS=$(ksft_exit_status_merge $EXIT_STATUS $RET) + return $RET +} + +log_test_skip() +{ + RET=$ksft_skip retmsg= log_test "$@" +} + +log_test_xfail() +{ + RET=$ksft_xfail retmsg= log_test "$@" +} + +log_info() +{ + local msg=$1 + + echo "INFO: $msg" +} + +tests_run() +{ + local current_test + + for current_test in ${TESTS:-$ALL_TESTS}; do + in_defer_scope \ + $current_test + done +} + +# Whether FAILs should be interpreted as XFAILs. Internal. +FAIL_TO_XFAIL= + +check_err() +{ + local err=$1 + local msg=$2 + + if ((err)); then + if [[ $FAIL_TO_XFAIL = yes ]]; then + ret_set_ksft_status $ksft_xfail "$msg" + else + ret_set_ksft_status $ksft_fail "$msg" + fi + fi +} + +check_fail() +{ + local err=$1 + local msg=$2 + + check_err $((!err)) "$msg" +} + +check_err_fail() +{ + local should_fail=$1; shift + local err=$1; shift + local what=$1; shift + + if ((should_fail)); then + check_fail $err "$what succeeded, but should have failed" + else + check_err $err "$what failed" + fi +} + +xfail() +{ + FAIL_TO_XFAIL=yes "$@" +} + +xfail_on_slow() +{ + if [[ $KSFT_MACHINE_SLOW = yes ]]; then + FAIL_TO_XFAIL=yes "$@" + else + "$@" + fi +} + +omit_on_slow() +{ + if [[ $KSFT_MACHINE_SLOW != yes ]]; then + "$@" + fi +} + +xfail_on_veth() +{ + local dev=$1; shift + local kind + + kind=$(ip -j -d link show dev $dev | + jq -r '.[].linkinfo.info_kind') + if [[ $kind = veth ]]; then + FAIL_TO_XFAIL=yes "$@" + else + "$@" + fi +} + +kill_process() +{ + local pid=$1; shift + + # Suppress noise from killing the process. + { kill $pid && wait $pid; } 2>/dev/null +} + +ip_link_add() +{ + local name=$1; shift + + ip link add name "$name" "$@" + defer ip link del dev "$name" +} + +ip_link_master() +{ + local member=$1; shift + local master=$1; shift + + ip link set dev "$member" master "$master" + defer ip link set dev "$member" nomaster +} diff --git a/tools/testing/selftests/net/lib/Makefile b/tools/testing/selftests/net/lib/Makefile index 82c3264b115e..18b9443454a9 100644 --- a/tools/testing/selftests/net/lib/Makefile +++ b/tools/testing/selftests/net/lib/Makefile @@ -10,6 +10,6 @@ TEST_FILES += ../../../../net/ynl TEST_GEN_FILES += csum -TEST_INCLUDES := $(wildcard py/*.py) +TEST_INCLUDES := $(wildcard py/*.py sh/*.sh) include ../../lib.mk diff --git a/tools/testing/selftests/net/lib/csum.c b/tools/testing/selftests/net/lib/csum.c index e0a34e5e8dd5..27437590eeb5 100644 --- a/tools/testing/selftests/net/lib/csum.c +++ b/tools/testing/selftests/net/lib/csum.c @@ -675,22 +675,20 @@ static int recv_verify_packet_ipv6(void *nh, int len) { struct ipv6hdr *ip6h = nh; uint16_t proto = cfg_encap ? IPPROTO_UDP : cfg_proto; - uint16_t ip_len; + uint16_t payload_len; if (len < sizeof(*ip6h) || ip6h->nexthdr != proto) return -1; - ip_len = ntohs(ip6h->payload_len); - if (ip_len > len - sizeof(*ip6h)) + payload_len = ntohs(ip6h->payload_len); + if (payload_len > len - sizeof(*ip6h)) return -1; - len = ip_len; iph_addr_p = &ip6h->saddr; - if (proto == IPPROTO_TCP) - return recv_verify_packet_tcp(ip6h + 1, len); + return recv_verify_packet_tcp(ip6h + 1, payload_len); else - return recv_verify_packet_udp(ip6h + 1, len); + return recv_verify_packet_udp(ip6h + 1, payload_len); } /* return whether auxdata includes TP_STATUS_CSUM_VALID */ diff --git a/tools/testing/selftests/net/lib/py/__init__.py b/tools/testing/selftests/net/lib/py/__init__.py index b6d498d125fe..54d8f5eba810 100644 --- a/tools/testing/selftests/net/lib/py/__init__.py +++ b/tools/testing/selftests/net/lib/py/__init__.py @@ -6,3 +6,4 @@ from .netns import NetNS from .nsim import * from .utils import * from .ynl import NlError, YnlFamily, EthtoolFamily, NetdevFamily, RtnlFamily +from .ynl import NetshaperFamily diff --git a/tools/testing/selftests/net/lib/py/ynl.py b/tools/testing/selftests/net/lib/py/ynl.py index 1ace58370c06..a0d689d58c57 100644 --- a/tools/testing/selftests/net/lib/py/ynl.py +++ b/tools/testing/selftests/net/lib/py/ynl.py @@ -47,3 +47,8 @@ class NetdevFamily(YnlFamily): def __init__(self): super().__init__((SPEC_PATH / Path('netdev.yaml')).as_posix(), schema='') + +class NetshaperFamily(YnlFamily): + def __init__(self): + super().__init__((SPEC_PATH / Path('net_shaper.yaml')).as_posix(), + schema='') diff --git a/tools/testing/selftests/net/lib/sh/defer.sh b/tools/testing/selftests/net/lib/sh/defer.sh new file mode 100644 index 000000000000..082f5d38321b --- /dev/null +++ b/tools/testing/selftests/net/lib/sh/defer.sh @@ -0,0 +1,115 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 + +# map[(scope_id,track,cleanup_id) -> cleanup_command] +# track={d=default | p=priority} +declare -A __DEFER__JOBS + +# map[(scope_id,track) -> # cleanup_commands] +declare -A __DEFER__NJOBS + +# scope_id of the topmost scope. +__DEFER__SCOPE_ID=0 + +__defer__ndefer_key() +{ + local track=$1; shift + + echo $__DEFER__SCOPE_ID,$track +} + +__defer__defer_key() +{ + local track=$1; shift + local defer_ix=$1; shift + + echo $__DEFER__SCOPE_ID,$track,$defer_ix +} + +__defer__ndefers() +{ + local track=$1; shift + + echo ${__DEFER__NJOBS[$(__defer__ndefer_key $track)]} +} + +__defer__run() +{ + local track=$1; shift + local defer_ix=$1; shift + local defer_key=$(__defer__defer_key $track $defer_ix) + + ${__DEFER__JOBS[$defer_key]} + unset __DEFER__JOBS[$defer_key] +} + +__defer__schedule() +{ + local track=$1; shift + local ndefers=$(__defer__ndefers $track) + local ndefers_key=$(__defer__ndefer_key $track) + local defer_key=$(__defer__defer_key $track $ndefers) + local defer="$@" + + __DEFER__JOBS[$defer_key]="$defer" + __DEFER__NJOBS[$ndefers_key]=$((ndefers + 1)) +} + +__defer__scope_wipe() +{ + __DEFER__NJOBS[$(__defer__ndefer_key d)]=0 + __DEFER__NJOBS[$(__defer__ndefer_key p)]=0 +} + +defer_scope_push() +{ + ((__DEFER__SCOPE_ID++)) + __defer__scope_wipe +} + +defer_scope_pop() +{ + local defer_ix + + for ((defer_ix=$(__defer__ndefers p); defer_ix-->0; )); do + __defer__run p $defer_ix + done + + for ((defer_ix=$(__defer__ndefers d); defer_ix-->0; )); do + __defer__run d $defer_ix + done + + __defer__scope_wipe + ((__DEFER__SCOPE_ID--)) +} + +defer() +{ + __defer__schedule d "$@" +} + +defer_prio() +{ + __defer__schedule p "$@" +} + +defer_scopes_cleanup() +{ + while ((__DEFER__SCOPE_ID >= 0)); do + defer_scope_pop + done +} + +in_defer_scope() +{ + local ret + + defer_scope_push + "$@" + ret=$? + defer_scope_pop + + return $ret +} + +__defer__scope_wipe diff --git a/tools/testing/selftests/net/mptcp/Makefile b/tools/testing/selftests/net/mptcp/Makefile index 5d796622e730..8e3fc05a5397 100644 --- a/tools/testing/selftests/net/mptcp/Makefile +++ b/tools/testing/selftests/net/mptcp/Makefile @@ -11,7 +11,7 @@ TEST_GEN_FILES = mptcp_connect pm_nl_ctl mptcp_sockopt mptcp_inq TEST_FILES := mptcp_lib.sh settings -TEST_INCLUDES := ../lib.sh ../net_helper.sh +TEST_INCLUDES := ../lib.sh $(wildcard ../lib/sh/*.sh) ../net_helper.sh EXTRA_CLEAN := *.pcap diff --git a/tools/testing/selftests/net/ncdevmem.c b/tools/testing/selftests/net/ncdevmem.c deleted file mode 100644 index 64d6805381c5..000000000000 --- a/tools/testing/selftests/net/ncdevmem.c +++ /dev/null @@ -1,570 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0 -#define _GNU_SOURCE -#define __EXPORTED_HEADERS__ - -#include <linux/uio.h> -#include <stdio.h> -#include <stdlib.h> -#include <unistd.h> -#include <stdbool.h> -#include <string.h> -#include <errno.h> -#define __iovec_defined -#include <fcntl.h> -#include <malloc.h> -#include <error.h> - -#include <arpa/inet.h> -#include <sys/socket.h> -#include <sys/mman.h> -#include <sys/ioctl.h> -#include <sys/syscall.h> - -#include <linux/memfd.h> -#include <linux/dma-buf.h> -#include <linux/udmabuf.h> -#include <libmnl/libmnl.h> -#include <linux/types.h> -#include <linux/netlink.h> -#include <linux/genetlink.h> -#include <linux/netdev.h> -#include <time.h> -#include <net/if.h> - -#include "netdev-user.h" -#include <ynl.h> - -#define PAGE_SHIFT 12 -#define TEST_PREFIX "ncdevmem" -#define NUM_PAGES 16000 - -#ifndef MSG_SOCK_DEVMEM -#define MSG_SOCK_DEVMEM 0x2000000 -#endif - -/* - * tcpdevmem netcat. Works similarly to netcat but does device memory TCP - * instead of regular TCP. Uses udmabuf to mock a dmabuf provider. - * - * Usage: - * - * On server: - * ncdevmem -s <server IP> -c <client IP> -f eth1 -l -p 5201 -v 7 - * - * On client: - * yes $(echo -e \\x01\\x02\\x03\\x04\\x05\\x06) | \ - * tr \\n \\0 | \ - * head -c 5G | \ - * nc <server IP> 5201 -p 5201 - * - * Note this is compatible with regular netcat. i.e. the sender or receiver can - * be replaced with regular netcat to test the RX or TX path in isolation. - */ - -static char *server_ip = "192.168.1.4"; -static char *client_ip = "192.168.1.2"; -static char *port = "5201"; -static size_t do_validation; -static int start_queue = 8; -static int num_queues = 8; -static char *ifname = "eth1"; -static unsigned int ifindex; -static unsigned int dmabuf_id; - -void print_bytes(void *ptr, size_t size) -{ - unsigned char *p = ptr; - int i; - - for (i = 0; i < size; i++) - printf("%02hhX ", p[i]); - printf("\n"); -} - -void print_nonzero_bytes(void *ptr, size_t size) -{ - unsigned char *p = ptr; - unsigned int i; - - for (i = 0; i < size; i++) - putchar(p[i]); - printf("\n"); -} - -void validate_buffer(void *line, size_t size) -{ - static unsigned char seed = 1; - unsigned char *ptr = line; - int errors = 0; - size_t i; - - for (i = 0; i < size; i++) { - if (ptr[i] != seed) { - fprintf(stderr, - "Failed validation: expected=%u, actual=%u, index=%lu\n", - seed, ptr[i], i); - errors++; - if (errors > 20) - error(1, 0, "validation failed."); - } - seed++; - if (seed == do_validation) - seed = 0; - } - - fprintf(stdout, "Validated buffer\n"); -} - -#define run_command(cmd, ...) \ - ({ \ - char command[256]; \ - memset(command, 0, sizeof(command)); \ - snprintf(command, sizeof(command), cmd, ##__VA_ARGS__); \ - printf("Running: %s\n", command); \ - system(command); \ - }) - -static int reset_flow_steering(void) -{ - int ret = 0; - - ret = run_command("sudo ethtool -K %s ntuple off", ifname); - if (ret) - return ret; - - return run_command("sudo ethtool -K %s ntuple on", ifname); -} - -static int configure_headersplit(bool on) -{ - return run_command("sudo ethtool -G %s tcp-data-split %s", ifname, - on ? "on" : "off"); -} - -static int configure_rss(void) -{ - return run_command("sudo ethtool -X %s equal %d", ifname, start_queue); -} - -static int configure_channels(unsigned int rx, unsigned int tx) -{ - return run_command("sudo ethtool -L %s rx %u tx %u", ifname, rx, tx); -} - -static int configure_flow_steering(void) -{ - return run_command("sudo ethtool -N %s flow-type tcp4 src-ip %s dst-ip %s src-port %s dst-port %s queue %d", - ifname, client_ip, server_ip, port, port, start_queue); -} - -static int bind_rx_queue(unsigned int ifindex, unsigned int dmabuf_fd, - struct netdev_queue_id *queues, - unsigned int n_queue_index, struct ynl_sock **ys) -{ - struct netdev_bind_rx_req *req = NULL; - struct netdev_bind_rx_rsp *rsp = NULL; - struct ynl_error yerr; - - *ys = ynl_sock_create(&ynl_netdev_family, &yerr); - if (!*ys) { - fprintf(stderr, "YNL: %s\n", yerr.msg); - return -1; - } - - req = netdev_bind_rx_req_alloc(); - netdev_bind_rx_req_set_ifindex(req, ifindex); - netdev_bind_rx_req_set_fd(req, dmabuf_fd); - __netdev_bind_rx_req_set_queues(req, queues, n_queue_index); - - rsp = netdev_bind_rx(*ys, req); - if (!rsp) { - perror("netdev_bind_rx"); - goto err_close; - } - - if (!rsp->_present.id) { - perror("id not present"); - goto err_close; - } - - printf("got dmabuf id=%d\n", rsp->id); - dmabuf_id = rsp->id; - - netdev_bind_rx_req_free(req); - netdev_bind_rx_rsp_free(rsp); - - return 0; - -err_close: - fprintf(stderr, "YNL failed: %s\n", (*ys)->err.msg); - netdev_bind_rx_req_free(req); - ynl_sock_destroy(*ys); - return -1; -} - -static void create_udmabuf(int *devfd, int *memfd, int *buf, size_t dmabuf_size) -{ - struct udmabuf_create create; - int ret; - - *devfd = open("/dev/udmabuf", O_RDWR); - if (*devfd < 0) { - error(70, 0, - "%s: [skip,no-udmabuf: Unable to access DMA buffer device file]\n", - TEST_PREFIX); - } - - *memfd = memfd_create("udmabuf-test", MFD_ALLOW_SEALING); - if (*memfd < 0) - error(70, 0, "%s: [skip,no-memfd]\n", TEST_PREFIX); - - /* Required for udmabuf */ - ret = fcntl(*memfd, F_ADD_SEALS, F_SEAL_SHRINK); - if (ret < 0) - error(73, 0, "%s: [skip,fcntl-add-seals]\n", TEST_PREFIX); - - ret = ftruncate(*memfd, dmabuf_size); - if (ret == -1) - error(74, 0, "%s: [FAIL,memfd-truncate]\n", TEST_PREFIX); - - memset(&create, 0, sizeof(create)); - - create.memfd = *memfd; - create.offset = 0; - create.size = dmabuf_size; - *buf = ioctl(*devfd, UDMABUF_CREATE, &create); - if (*buf < 0) - error(75, 0, "%s: [FAIL, create udmabuf]\n", TEST_PREFIX); -} - -int do_server(void) -{ - char ctrl_data[sizeof(int) * 20000]; - struct netdev_queue_id *queues; - size_t non_page_aligned_frags = 0; - struct sockaddr_in client_addr; - struct sockaddr_in server_sin; - size_t page_aligned_frags = 0; - int devfd, memfd, buf, ret; - size_t total_received = 0; - socklen_t client_addr_len; - bool is_devmem = false; - char *buf_mem = NULL; - struct ynl_sock *ys; - size_t dmabuf_size; - char iobuf[819200]; - char buffer[256]; - int socket_fd; - int client_fd; - size_t i = 0; - int opt = 1; - - dmabuf_size = getpagesize() * NUM_PAGES; - - create_udmabuf(&devfd, &memfd, &buf, dmabuf_size); - - if (reset_flow_steering()) - error(1, 0, "Failed to reset flow steering\n"); - - /* Configure RSS to divert all traffic from our devmem queues */ - if (configure_rss()) - error(1, 0, "Failed to configure rss\n"); - - /* Flow steer our devmem flows to start_queue */ - if (configure_flow_steering()) - error(1, 0, "Failed to configure flow steering\n"); - - sleep(1); - - queues = malloc(sizeof(*queues) * num_queues); - - for (i = 0; i < num_queues; i++) { - queues[i]._present.type = 1; - queues[i]._present.id = 1; - queues[i].type = NETDEV_QUEUE_TYPE_RX; - queues[i].id = start_queue + i; - } - - if (bind_rx_queue(ifindex, buf, queues, num_queues, &ys)) - error(1, 0, "Failed to bind\n"); - - buf_mem = mmap(NULL, dmabuf_size, PROT_READ | PROT_WRITE, MAP_SHARED, - buf, 0); - if (buf_mem == MAP_FAILED) - error(1, 0, "mmap()"); - - server_sin.sin_family = AF_INET; - server_sin.sin_port = htons(atoi(port)); - - ret = inet_pton(server_sin.sin_family, server_ip, &server_sin.sin_addr); - if (socket < 0) - error(79, 0, "%s: [FAIL, create socket]\n", TEST_PREFIX); - - socket_fd = socket(server_sin.sin_family, SOCK_STREAM, 0); - if (socket < 0) - error(errno, errno, "%s: [FAIL, create socket]\n", TEST_PREFIX); - - ret = setsockopt(socket_fd, SOL_SOCKET, SO_REUSEPORT, &opt, - sizeof(opt)); - if (ret) - error(errno, errno, "%s: [FAIL, set sock opt]\n", TEST_PREFIX); - - ret = setsockopt(socket_fd, SOL_SOCKET, SO_REUSEADDR, &opt, - sizeof(opt)); - if (ret) - error(errno, errno, "%s: [FAIL, set sock opt]\n", TEST_PREFIX); - - printf("binding to address %s:%d\n", server_ip, - ntohs(server_sin.sin_port)); - - ret = bind(socket_fd, &server_sin, sizeof(server_sin)); - if (ret) - error(errno, errno, "%s: [FAIL, bind]\n", TEST_PREFIX); - - ret = listen(socket_fd, 1); - if (ret) - error(errno, errno, "%s: [FAIL, listen]\n", TEST_PREFIX); - - client_addr_len = sizeof(client_addr); - - inet_ntop(server_sin.sin_family, &server_sin.sin_addr, buffer, - sizeof(buffer)); - printf("Waiting or connection on %s:%d\n", buffer, - ntohs(server_sin.sin_port)); - client_fd = accept(socket_fd, &client_addr, &client_addr_len); - - inet_ntop(client_addr.sin_family, &client_addr.sin_addr, buffer, - sizeof(buffer)); - printf("Got connection from %s:%d\n", buffer, - ntohs(client_addr.sin_port)); - - while (1) { - struct iovec iov = { .iov_base = iobuf, - .iov_len = sizeof(iobuf) }; - struct dmabuf_cmsg *dmabuf_cmsg = NULL; - struct dma_buf_sync sync = { 0 }; - struct cmsghdr *cm = NULL; - struct msghdr msg = { 0 }; - struct dmabuf_token token; - ssize_t ret; - - is_devmem = false; - printf("\n\n"); - - msg.msg_iov = &iov; - msg.msg_iovlen = 1; - msg.msg_control = ctrl_data; - msg.msg_controllen = sizeof(ctrl_data); - ret = recvmsg(client_fd, &msg, MSG_SOCK_DEVMEM); - printf("recvmsg ret=%ld\n", ret); - if (ret < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) - continue; - if (ret < 0) { - perror("recvmsg"); - continue; - } - if (ret == 0) { - printf("client exited\n"); - goto cleanup; - } - - i++; - for (cm = CMSG_FIRSTHDR(&msg); cm; cm = CMSG_NXTHDR(&msg, cm)) { - if (cm->cmsg_level != SOL_SOCKET || - (cm->cmsg_type != SCM_DEVMEM_DMABUF && - cm->cmsg_type != SCM_DEVMEM_LINEAR)) { - fprintf(stdout, "skipping non-devmem cmsg\n"); - continue; - } - - dmabuf_cmsg = (struct dmabuf_cmsg *)CMSG_DATA(cm); - is_devmem = true; - - if (cm->cmsg_type == SCM_DEVMEM_LINEAR) { - /* TODO: process data copied from skb's linear - * buffer. - */ - fprintf(stdout, - "SCM_DEVMEM_LINEAR. dmabuf_cmsg->frag_size=%u\n", - dmabuf_cmsg->frag_size); - - continue; - } - - token.token_start = dmabuf_cmsg->frag_token; - token.token_count = 1; - - total_received += dmabuf_cmsg->frag_size; - printf("received frag_page=%llu, in_page_offset=%llu, frag_offset=%llu, frag_size=%u, token=%u, total_received=%lu, dmabuf_id=%u\n", - dmabuf_cmsg->frag_offset >> PAGE_SHIFT, - dmabuf_cmsg->frag_offset % getpagesize(), - dmabuf_cmsg->frag_offset, dmabuf_cmsg->frag_size, - dmabuf_cmsg->frag_token, total_received, - dmabuf_cmsg->dmabuf_id); - - if (dmabuf_cmsg->dmabuf_id != dmabuf_id) - error(1, 0, - "received on wrong dmabuf_id: flow steering error\n"); - - if (dmabuf_cmsg->frag_size % getpagesize()) - non_page_aligned_frags++; - else - page_aligned_frags++; - - sync.flags = DMA_BUF_SYNC_READ | DMA_BUF_SYNC_START; - ioctl(buf, DMA_BUF_IOCTL_SYNC, &sync); - - if (do_validation) - validate_buffer( - ((unsigned char *)buf_mem) + - dmabuf_cmsg->frag_offset, - dmabuf_cmsg->frag_size); - else - print_nonzero_bytes( - ((unsigned char *)buf_mem) + - dmabuf_cmsg->frag_offset, - dmabuf_cmsg->frag_size); - - sync.flags = DMA_BUF_SYNC_READ | DMA_BUF_SYNC_END; - ioctl(buf, DMA_BUF_IOCTL_SYNC, &sync); - - ret = setsockopt(client_fd, SOL_SOCKET, - SO_DEVMEM_DONTNEED, &token, - sizeof(token)); - if (ret != 1) - error(1, 0, - "SO_DEVMEM_DONTNEED not enough tokens"); - } - if (!is_devmem) - error(1, 0, "flow steering error\n"); - - printf("total_received=%lu\n", total_received); - } - - fprintf(stdout, "%s: ok\n", TEST_PREFIX); - - fprintf(stdout, "page_aligned_frags=%lu, non_page_aligned_frags=%lu\n", - page_aligned_frags, non_page_aligned_frags); - - fprintf(stdout, "page_aligned_frags=%lu, non_page_aligned_frags=%lu\n", - page_aligned_frags, non_page_aligned_frags); - -cleanup: - - munmap(buf_mem, dmabuf_size); - close(client_fd); - close(socket_fd); - close(buf); - close(memfd); - close(devfd); - ynl_sock_destroy(ys); - - return 0; -} - -void run_devmem_tests(void) -{ - struct netdev_queue_id *queues; - int devfd, memfd, buf; - struct ynl_sock *ys; - size_t dmabuf_size; - size_t i = 0; - - dmabuf_size = getpagesize() * NUM_PAGES; - - create_udmabuf(&devfd, &memfd, &buf, dmabuf_size); - - /* Configure RSS to divert all traffic from our devmem queues */ - if (configure_rss()) - error(1, 0, "rss error\n"); - - queues = calloc(num_queues, sizeof(*queues)); - - if (configure_headersplit(1)) - error(1, 0, "Failed to configure header split\n"); - - if (!bind_rx_queue(ifindex, buf, queues, num_queues, &ys)) - error(1, 0, "Binding empty queues array should have failed\n"); - - for (i = 0; i < num_queues; i++) { - queues[i]._present.type = 1; - queues[i]._present.id = 1; - queues[i].type = NETDEV_QUEUE_TYPE_RX; - queues[i].id = start_queue + i; - } - - if (configure_headersplit(0)) - error(1, 0, "Failed to configure header split\n"); - - if (!bind_rx_queue(ifindex, buf, queues, num_queues, &ys)) - error(1, 0, "Configure dmabuf with header split off should have failed\n"); - - if (configure_headersplit(1)) - error(1, 0, "Failed to configure header split\n"); - - for (i = 0; i < num_queues; i++) { - queues[i]._present.type = 1; - queues[i]._present.id = 1; - queues[i].type = NETDEV_QUEUE_TYPE_RX; - queues[i].id = start_queue + i; - } - - if (bind_rx_queue(ifindex, buf, queues, num_queues, &ys)) - error(1, 0, "Failed to bind\n"); - - /* Deactivating a bound queue should not be legal */ - if (!configure_channels(num_queues, num_queues - 1)) - error(1, 0, "Deactivating a bound queue should be illegal.\n"); - - /* Closing the netlink socket does an implicit unbind */ - ynl_sock_destroy(ys); -} - -int main(int argc, char *argv[]) -{ - int is_server = 0, opt; - - while ((opt = getopt(argc, argv, "ls:c:p:v:q:t:f:")) != -1) { - switch (opt) { - case 'l': - is_server = 1; - break; - case 's': - server_ip = optarg; - break; - case 'c': - client_ip = optarg; - break; - case 'p': - port = optarg; - break; - case 'v': - do_validation = atoll(optarg); - break; - case 'q': - num_queues = atoi(optarg); - break; - case 't': - start_queue = atoi(optarg); - break; - case 'f': - ifname = optarg; - break; - case '?': - printf("unknown option: %c\n", optopt); - break; - } - } - - ifindex = if_nametoindex(ifname); - - for (; optind < argc; optind++) - printf("extra arguments: %s\n", argv[optind]); - - run_devmem_tests(); - - if (is_server) - return do_server(); - - return 0; -} diff --git a/tools/testing/selftests/net/netfilter/.gitignore b/tools/testing/selftests/net/netfilter/.gitignore index 0a64d6d0e29a..64c4f8d9aa6c 100644 --- a/tools/testing/selftests/net/netfilter/.gitignore +++ b/tools/testing/selftests/net/netfilter/.gitignore @@ -2,5 +2,6 @@ audit_logread connect_close conntrack_dump_flush +conntrack_reverse_clash sctp_collision nf_queue diff --git a/tools/testing/selftests/net/netfilter/Makefile b/tools/testing/selftests/net/netfilter/Makefile index 542f7886a0bc..ffe161fac8b5 100644 --- a/tools/testing/selftests/net/netfilter/Makefile +++ b/tools/testing/selftests/net/netfilter/Makefile @@ -8,6 +8,7 @@ MNL_LDLIBS := $(shell $(HOSTPKG_CONFIG) --libs libmnl 2>/dev/null || echo -lmnl) TEST_PROGS := br_netfilter.sh bridge_brouter.sh TEST_PROGS += br_netfilter_queue.sh +TEST_PROGS += conntrack_dump_flush.sh TEST_PROGS += conntrack_icmp_related.sh TEST_PROGS += conntrack_ipip_mtu.sh TEST_PROGS += conntrack_tcp_unreplied.sh @@ -36,10 +37,9 @@ TEST_PROGS += xt_string.sh TEST_PROGS_EXTENDED = nft_concat_range_perf.sh -TEST_GEN_PROGS = conntrack_dump_flush - TEST_GEN_FILES = audit_logread TEST_GEN_FILES += connect_close nf_queue +TEST_GEN_FILES += conntrack_dump_flush TEST_GEN_FILES += conntrack_reverse_clash TEST_GEN_FILES += sctp_collision @@ -55,4 +55,5 @@ TEST_FILES := lib.sh TEST_FILES += packetdrill TEST_INCLUDES := \ - ../lib.sh + ../lib.sh \ + $(wildcard ../lib/sh/*.sh) diff --git a/tools/testing/selftests/net/netfilter/conntrack_dump_flush.c b/tools/testing/selftests/net/netfilter/conntrack_dump_flush.c index 254ff03297f0..5f827e10717d 100644 --- a/tools/testing/selftests/net/netfilter/conntrack_dump_flush.c +++ b/tools/testing/selftests/net/netfilter/conntrack_dump_flush.c @@ -43,6 +43,8 @@ static int build_cta_tuple_v4(struct nlmsghdr *nlh, int type, mnl_attr_nest_end(nlh, nest_proto); mnl_attr_nest_end(nlh, nest); + + return 0; } static int build_cta_tuple_v6(struct nlmsghdr *nlh, int type, @@ -71,6 +73,8 @@ static int build_cta_tuple_v6(struct nlmsghdr *nlh, int type, mnl_attr_nest_end(nlh, nest_proto); mnl_attr_nest_end(nlh, nest); + + return 0; } static int build_cta_proto(struct nlmsghdr *nlh) @@ -90,6 +94,8 @@ static int build_cta_proto(struct nlmsghdr *nlh) mnl_attr_nest_end(nlh, nest_proto); mnl_attr_nest_end(nlh, nest); + + return 0; } static int conntrack_data_insert(struct mnl_socket *sock, struct nlmsghdr *nlh, diff --git a/tools/testing/selftests/net/netfilter/conntrack_dump_flush.sh b/tools/testing/selftests/net/netfilter/conntrack_dump_flush.sh new file mode 100755 index 000000000000..8b0935385849 --- /dev/null +++ b/tools/testing/selftests/net/netfilter/conntrack_dump_flush.sh @@ -0,0 +1,3 @@ +#!/bin/bash + +exec unshare -n ./conntrack_dump_flush diff --git a/tools/testing/selftests/net/netfilter/nft_queue.sh b/tools/testing/selftests/net/netfilter/nft_queue.sh index a9d109fcc15c..785e3875a6da 100755 --- a/tools/testing/selftests/net/netfilter/nft_queue.sh +++ b/tools/testing/selftests/net/netfilter/nft_queue.sh @@ -512,10 +512,10 @@ EOF :> "$TMPFILE1" :> "$TMPFILE2" - timeout 10 ip netns exec "$ns2" socat UDP-LISTEN:12345,fork OPEN:"$TMPFILE1",trunc & + timeout 10 ip netns exec "$ns2" socat UDP-LISTEN:12345,fork,pf=ipv4 OPEN:"$TMPFILE1",trunc & local rpid1=$! - timeout 10 ip netns exec "$ns3" socat UDP-LISTEN:12345,fork OPEN:"$TMPFILE2",trunc & + timeout 10 ip netns exec "$ns3" socat UDP-LISTEN:12345,fork,pf=ipv4 OPEN:"$TMPFILE2",trunc & local rpid2=$! ip netns exec "$nsrouter" ./nf_queue -q 12 -d 1000 & @@ -528,8 +528,8 @@ EOF # Send two packets, one should end up in ns1, other in ns2. # This is because nfqueue will delay packet for long enough so that # second packet will not find existing conntrack entry. - echo "Packet 1" | ip netns exec "$ns1" socat STDIN UDP-DATAGRAM:10.6.6.6:12345,bind=0.0.0.0:55221 - echo "Packet 2" | ip netns exec "$ns1" socat STDIN UDP-DATAGRAM:10.6.6.6:12345,bind=0.0.0.0:55221 + echo "Packet 1" | ip netns exec "$ns1" socat -u STDIN UDP-DATAGRAM:10.6.6.6:12345,bind=0.0.0.0:55221 + echo "Packet 2" | ip netns exec "$ns1" socat -u STDIN UDP-DATAGRAM:10.6.6.6:12345,bind=0.0.0.0:55221 busywait 10000 output_files_written "$TMPFILE1" "$TMPFILE2" diff --git a/tools/testing/selftests/net/netlink-dumps.c b/tools/testing/selftests/net/netlink-dumps.c index 7ee6dcd334df..84e29b7dffb6 100644 --- a/tools/testing/selftests/net/netlink-dumps.c +++ b/tools/testing/selftests/net/netlink-dumps.c @@ -56,10 +56,10 @@ TEST(test_sanity) ASSERT_EQ(n, sizeof(dump_policies)); n = recv(netlink_sock, buf, sizeof(buf), MSG_DONTWAIT); - ASSERT_GE(n, sizeof(struct nlmsghdr)); + ASSERT_GE(n, (ssize_t)sizeof(struct nlmsghdr)); n = recv(netlink_sock, buf, sizeof(buf), MSG_DONTWAIT); - ASSERT_GE(n, sizeof(struct nlmsghdr)); + ASSERT_GE(n, (ssize_t)sizeof(struct nlmsghdr)); close(netlink_sock); } diff --git a/tools/testing/selftests/net/pmtu.sh b/tools/testing/selftests/net/pmtu.sh index 569bce8b6383..66be7699c72c 100755 --- a/tools/testing/selftests/net/pmtu.sh +++ b/tools/testing/selftests/net/pmtu.sh @@ -197,6 +197,12 @@ # # - pmtu_ipv6_route_change # Same as above but with IPv6 +# +# - pmtu_ipv4_mp_exceptions +# Use the same topology as in pmtu_ipv4, but add routeable addresses +# on host A and B on lo reachable via both routers. Host A and B +# addresses have multipath routes to each other, b_r1 mtu = 1500. +# Check that PMTU exceptions are created for both paths. source lib.sh source net_helper.sh @@ -266,7 +272,8 @@ tests=" list_flush_ipv4_exception ipv4: list and flush cached exceptions 1 list_flush_ipv6_exception ipv6: list and flush cached exceptions 1 pmtu_ipv4_route_change ipv4: PMTU exception w/route replace 1 - pmtu_ipv6_route_change ipv6: PMTU exception w/route replace 1" + pmtu_ipv6_route_change ipv6: PMTU exception w/route replace 1 + pmtu_ipv4_mp_exceptions ipv4: PMTU multipath nh exceptions 1" # Addressing and routing for tests with routers: four network segments, with # index SEGMENT between 1 and 4, a common prefix (PREFIX4 or PREFIX6) and an @@ -343,6 +350,9 @@ tunnel6_a_addr="fd00:2::a" tunnel6_b_addr="fd00:2::b" tunnel6_mask="64" +host4_a_addr="192.168.99.99" +host4_b_addr="192.168.88.88" + dummy6_0_prefix="fc00:1000::" dummy6_1_prefix="fc00:1001::" dummy6_mask="64" @@ -984,6 +994,52 @@ setup_ovs_bridge() { run_cmd ip route add ${prefix6}:${b_r1}::1 via ${prefix6}:${a_r1}::2 } +setup_multipath_new() { + # Set up host A with multipath routes to host B host4_b_addr + run_cmd ${ns_a} ip addr add ${host4_a_addr} dev lo + run_cmd ${ns_a} ip nexthop add id 401 via ${prefix4}.${a_r1}.2 dev veth_A-R1 + run_cmd ${ns_a} ip nexthop add id 402 via ${prefix4}.${a_r2}.2 dev veth_A-R2 + run_cmd ${ns_a} ip nexthop add id 403 group 401/402 + run_cmd ${ns_a} ip route add ${host4_b_addr} src ${host4_a_addr} nhid 403 + + # Set up host B with multipath routes to host A host4_a_addr + run_cmd ${ns_b} ip addr add ${host4_b_addr} dev lo + run_cmd ${ns_b} ip nexthop add id 401 via ${prefix4}.${b_r1}.2 dev veth_B-R1 + run_cmd ${ns_b} ip nexthop add id 402 via ${prefix4}.${b_r2}.2 dev veth_B-R2 + run_cmd ${ns_b} ip nexthop add id 403 group 401/402 + run_cmd ${ns_b} ip route add ${host4_a_addr} src ${host4_b_addr} nhid 403 +} + +setup_multipath_old() { + # Set up host A with multipath routes to host B host4_b_addr + run_cmd ${ns_a} ip addr add ${host4_a_addr} dev lo + run_cmd ${ns_a} ip route add ${host4_b_addr} \ + src ${host4_a_addr} \ + nexthop via ${prefix4}.${a_r1}.2 weight 1 \ + nexthop via ${prefix4}.${a_r2}.2 weight 1 + + # Set up host B with multipath routes to host A host4_a_addr + run_cmd ${ns_b} ip addr add ${host4_b_addr} dev lo + run_cmd ${ns_b} ip route add ${host4_a_addr} \ + src ${host4_b_addr} \ + nexthop via ${prefix4}.${b_r1}.2 weight 1 \ + nexthop via ${prefix4}.${b_r2}.2 weight 1 +} + +setup_multipath() { + if [ "$USE_NH" = "yes" ]; then + setup_multipath_new + else + setup_multipath_old + fi + + # Set up routers with routes to dummies + run_cmd ${ns_r1} ip route add ${host4_a_addr} via ${prefix4}.${a_r1}.1 + run_cmd ${ns_r2} ip route add ${host4_a_addr} via ${prefix4}.${a_r2}.1 + run_cmd ${ns_r1} ip route add ${host4_b_addr} via ${prefix4}.${b_r1}.1 + run_cmd ${ns_r2} ip route add ${host4_b_addr} via ${prefix4}.${b_r2}.1 +} + setup() { [ "$(id -u)" -ne 0 ] && echo " need to run as root" && return $ksft_skip @@ -1076,23 +1132,15 @@ link_get_mtu() { } route_get_dst_exception() { - ns_cmd="${1}" - dst="${2}" - dsfield="${3}" + ns_cmd="${1}"; shift - if [ -z "${dsfield}" ]; then - dsfield=0 - fi - - ${ns_cmd} ip route get "${dst}" dsfield "${dsfield}" + ${ns_cmd} ip route get "$@" } route_get_dst_pmtu_from_exception() { - ns_cmd="${1}" - dst="${2}" - dsfield="${3}" + ns_cmd="${1}"; shift - mtu_parse "$(route_get_dst_exception "${ns_cmd}" "${dst}" "${dsfield}")" + mtu_parse "$(route_get_dst_exception "${ns_cmd}" "$@")" } check_pmtu_value() { @@ -1235,10 +1283,10 @@ test_pmtu_ipv4_dscp_icmp_exception() { run_cmd "${ns_a}" ping -q -M want -Q "${dsfield}" -c 1 -w 1 -s "${len}" "${dst2}" # Check that exceptions have been created with the correct PMTU - pmtu_1="$(route_get_dst_pmtu_from_exception "${ns_a}" "${dst1}" "${policy_mark}")" + pmtu_1="$(route_get_dst_pmtu_from_exception "${ns_a}" "${dst1}" dsfield "${policy_mark}")" check_pmtu_value "1400" "${pmtu_1}" "exceeding MTU" || return 1 - pmtu_2="$(route_get_dst_pmtu_from_exception "${ns_a}" "${dst2}" "${policy_mark}")" + pmtu_2="$(route_get_dst_pmtu_from_exception "${ns_a}" "${dst2}" dsfield "${policy_mark}")" check_pmtu_value "1500" "${pmtu_2}" "exceeding MTU" || return 1 } @@ -1285,9 +1333,9 @@ test_pmtu_ipv4_dscp_udp_exception() { UDP:"${dst2}":50000,tos="${dsfield}" # Check that exceptions have been created with the correct PMTU - pmtu_1="$(route_get_dst_pmtu_from_exception "${ns_a}" "${dst1}" "${policy_mark}")" + pmtu_1="$(route_get_dst_pmtu_from_exception "${ns_a}" "${dst1}" dsfield "${policy_mark}")" check_pmtu_value "1400" "${pmtu_1}" "exceeding MTU" || return 1 - pmtu_2="$(route_get_dst_pmtu_from_exception "${ns_a}" "${dst2}" "${policy_mark}")" + pmtu_2="$(route_get_dst_pmtu_from_exception "${ns_a}" "${dst2}" dsfield "${policy_mark}")" check_pmtu_value "1500" "${pmtu_2}" "exceeding MTU" || return 1 } @@ -2056,7 +2104,7 @@ check_running() { pid=${1} cmd=${2} - [ "$(cat /proc/${pid}/cmdline 2>/dev/null | tr -d '\0')" = "{cmd}" ] + [ "$(cat /proc/${pid}/cmdline 2>/dev/null | tr -d '\0')" = "${cmd}" ] } test_cleanup_vxlanX_exception() { @@ -2329,6 +2377,36 @@ test_pmtu_ipv6_route_change() { test_pmtu_ipvX_route_change 6 } +test_pmtu_ipv4_mp_exceptions() { + setup namespaces routing multipath || return $ksft_skip + + trace "${ns_a}" veth_A-R1 "${ns_r1}" veth_R1-A \ + "${ns_r1}" veth_R1-B "${ns_b}" veth_B-R1 \ + "${ns_a}" veth_A-R2 "${ns_r2}" veth_R2-A \ + "${ns_r2}" veth_R2-B "${ns_b}" veth_B-R2 + + # Set up initial MTU values + mtu "${ns_a}" veth_A-R1 2000 + mtu "${ns_r1}" veth_R1-A 2000 + mtu "${ns_r1}" veth_R1-B 1500 + mtu "${ns_b}" veth_B-R1 1500 + + mtu "${ns_a}" veth_A-R2 2000 + mtu "${ns_r2}" veth_R2-A 2000 + mtu "${ns_r2}" veth_R2-B 1500 + mtu "${ns_b}" veth_B-R2 1500 + + # Ping and expect two nexthop exceptions for two routes + run_cmd ${ns_a} ping -q -M want -i 0.1 -c 1 -s 1800 "${host4_b_addr}" + + # Check that exceptions have been created with the correct PMTU + pmtu_a_R1="$(route_get_dst_pmtu_from_exception "${ns_a}" "${host4_b_addr}" oif veth_A-R1)" + pmtu_a_R2="$(route_get_dst_pmtu_from_exception "${ns_a}" "${host4_b_addr}" oif veth_A-R2)" + + check_pmtu_value "1500" "${pmtu_a_R1}" "exceeding MTU (veth_A-R1)" || return 1 + check_pmtu_value "1500" "${pmtu_a_R2}" "exceeding MTU (veth_A-R2)" || return 1 +} + usage() { echo echo "$0 [OPTIONS] [TEST]..." diff --git a/tools/testing/selftests/net/psock_fanout.c b/tools/testing/selftests/net/psock_fanout.c index 4f31e92ebd96..84c524357075 100644 --- a/tools/testing/selftests/net/psock_fanout.c +++ b/tools/testing/selftests/net/psock_fanout.c @@ -48,6 +48,7 @@ #include <string.h> #include <sys/mman.h> #include <sys/socket.h> +#include <sys/ioctl.h> #include <sys/stat.h> #include <sys/types.h> #include <unistd.h> @@ -59,6 +60,33 @@ static uint32_t cfg_max_num_members; +static void loopback_set_up_down(int state_up) +{ + struct ifreq ifreq = {}; + int fd, err; + + fd = socket(AF_PACKET, SOCK_RAW, 0); + if (fd < 0) { + perror("socket loopback"); + exit(1); + } + strcpy(ifreq.ifr_name, "lo"); + err = ioctl(fd, SIOCGIFFLAGS, &ifreq); + if (err) { + perror("SIOCGIFFLAGS"); + exit(1); + } + if (state_up != !!(ifreq.ifr_flags & IFF_UP)) { + ifreq.ifr_flags ^= IFF_UP; + err = ioctl(fd, SIOCSIFFLAGS, &ifreq); + if (err) { + perror("SIOCSIFFLAGS"); + exit(1); + } + } + close(fd); +} + /* Open a socket in a given fanout mode. * @return -1 if mode is bad, a valid socket otherwise */ static int sock_fanout_open(uint16_t typeflags, uint16_t group_id) @@ -251,6 +279,41 @@ static int sock_fanout_read(int fds[], char *rings[], const int expect[]) return 0; } +/* Test that creating/joining a fanout group fails for unbound socket without + * a specified protocol + */ +static void test_unbound_fanout(void) +{ + int val, fd0, fd1, err; + + fprintf(stderr, "test: unbound fanout\n"); + fd0 = socket(PF_PACKET, SOCK_RAW, 0); + if (fd0 < 0) { + perror("socket packet"); + exit(1); + } + /* Try to create a new fanout group. Should fail. */ + val = (PACKET_FANOUT_HASH << 16) | 1; + err = setsockopt(fd0, SOL_PACKET, PACKET_FANOUT, &val, sizeof(val)); + if (!err) { + fprintf(stderr, "ERROR: unbound socket fanout create\n"); + exit(1); + } + fd1 = sock_fanout_open(PACKET_FANOUT_HASH, 1); + if (fd1 == -1) { + fprintf(stderr, "ERROR: failed to open HASH socket\n"); + exit(1); + } + /* Try to join an existing fanout group. Should fail. */ + err = setsockopt(fd0, SOL_PACKET, PACKET_FANOUT, &val, sizeof(val)); + if (!err) { + fprintf(stderr, "ERROR: unbound socket fanout join\n"); + exit(1); + } + close(fd0); + close(fd1); +} + /* Test illegal mode + flag combination */ static void test_control_single(void) { @@ -264,17 +327,22 @@ static void test_control_single(void) } /* Test illegal group with different modes or flags */ -static void test_control_group(void) +static void test_control_group(int toggle) { int fds[2]; - fprintf(stderr, "test: control multiple sockets\n"); + if (toggle) + fprintf(stderr, "test: control multiple sockets with link down toggle\n"); + else + fprintf(stderr, "test: control multiple sockets\n"); fds[0] = sock_fanout_open(PACKET_FANOUT_HASH, 0); if (fds[0] == -1) { fprintf(stderr, "ERROR: failed to open HASH socket\n"); exit(1); } + if (toggle) + loopback_set_up_down(0); if (sock_fanout_open(PACKET_FANOUT_HASH | PACKET_FANOUT_FLAG_DEFRAG, 0) != -1) { fprintf(stderr, "ERROR: joined group with wrong flag defrag\n"); @@ -294,6 +362,8 @@ static void test_control_group(void) fprintf(stderr, "ERROR: failed to join group\n"); exit(1); } + if (toggle) + loopback_set_up_down(1); if (close(fds[1]) || close(fds[0])) { fprintf(stderr, "ERROR: closing sockets\n"); exit(1); @@ -488,8 +558,10 @@ int main(int argc, char **argv) const int expect_uniqueid[2][2] = { { 20, 20}, { 20, 20 } }; int port_off = 2, tries = 20, ret; + test_unbound_fanout(); test_control_single(); - test_control_group(); + test_control_group(0); + test_control_group(1); test_control_group_max_num_members(); test_unique_fanout_group_ids(); diff --git a/tools/testing/selftests/net/rtnetlink.sh b/tools/testing/selftests/net/rtnetlink.sh index bdf6f10d0558..7f05b5f9b76f 100755 --- a/tools/testing/selftests/net/rtnetlink.sh +++ b/tools/testing/selftests/net/rtnetlink.sh @@ -21,10 +21,10 @@ ALL_TESTS=" kci_test_vrf kci_test_encap kci_test_macsec - kci_test_macsec_offload kci_test_ipsec kci_test_ipsec_offload kci_test_fdb_get + kci_test_fdb_del kci_test_neigh_get kci_test_bridge_parent_id kci_test_address_proto @@ -559,73 +559,6 @@ kci_test_macsec() end_test "PASS: macsec" } -kci_test_macsec_offload() -{ - sysfsd=/sys/kernel/debug/netdevsim/netdevsim0/ports/0/ - sysfsnet=/sys/bus/netdevsim/devices/netdevsim0/net/ - probed=false - local ret=0 - run_cmd_grep "^Usage: ip macsec" ip macsec help - if [ $? -ne 0 ]; then - end_test "SKIP: macsec: iproute2 too old" - return $ksft_skip - fi - - if ! mount | grep -q debugfs; then - mount -t debugfs none /sys/kernel/debug/ &> /dev/null - fi - - # setup netdevsim since dummydev doesn't have offload support - if [ ! -w /sys/bus/netdevsim/new_device ] ; then - run_cmd modprobe -q netdevsim - - if [ $ret -ne 0 ]; then - end_test "SKIP: macsec_offload can't load netdevsim" - return $ksft_skip - fi - probed=true - fi - - echo "0" > /sys/bus/netdevsim/new_device - while [ ! -d $sysfsnet ] ; do :; done - udevadm settle - dev=`ls $sysfsnet` - - ip link set $dev up - if [ ! -d $sysfsd ] ; then - end_test "FAIL: macsec_offload can't create device $dev" - return 1 - fi - run_cmd_grep 'macsec-hw-offload: on' ethtool -k $dev - if [ $? -eq 1 ] ; then - end_test "FAIL: macsec_offload netdevsim doesn't support MACsec offload" - return 1 - fi - run_cmd ip link add link $dev kci_macsec1 type macsec port 4 offload mac - run_cmd ip link add link $dev kci_macsec2 type macsec address "aa:bb:cc:dd:ee:ff" port 5 offload mac - run_cmd ip link add link $dev kci_macsec3 type macsec sci abbacdde01020304 offload mac - run_cmd_fail ip link add link $dev kci_macsec4 type macsec port 8 offload mac - - msname=kci_macsec1 - run_cmd ip macsec add "$msname" tx sa 0 pn 1024 on key 01 12345678901234567890123456789012 - run_cmd ip macsec add "$msname" rx port 1234 address "1c:ed:de:ad:be:ef" - run_cmd ip macsec add "$msname" rx port 1234 address "1c:ed:de:ad:be:ef" sa 0 pn 1 on \ - key 00 0123456789abcdef0123456789abcdef - run_cmd_fail ip macsec add "$msname" rx port 1235 address "1c:ed:de:ad:be:ef" - # clean up any leftovers - for msdev in kci_macsec{1,2,3,4} ; do - ip link del $msdev 2> /dev/null - done - echo 0 > /sys/bus/netdevsim/del_device - $probed && rmmod netdevsim - - if [ $ret -ne 0 ]; then - end_test "FAIL: macsec_offload" - return 1 - fi - end_test "PASS: macsec_offload" -} - #------------------------------------------------------------------- # Example commands # ip x s add proto esp src 14.0.0.52 dst 14.0.0.70 \ @@ -809,10 +742,10 @@ kci_test_ipsec_offload() # does driver have correct offload info run_cmd diff $sysfsf - << EOF SA count=2 tx=3 -sa[0] tx ipaddr=0x00000000 00000000 00000000 00000000 +sa[0] tx ipaddr=$dstip sa[0] spi=0x00000009 proto=0x32 salt=0x61626364 crypt=1 sa[0] key=0x34333231 38373635 32313039 36353433 -sa[1] rx ipaddr=0x00000000 00000000 00000000 037ba8c0 +sa[1] rx ipaddr=$srcip sa[1] spi=0x00000009 proto=0x32 salt=0x61626364 crypt=1 sa[1] key=0x34333231 38373635 32313039 36353433 EOF @@ -1065,6 +998,45 @@ kci_test_fdb_get() end_test "PASS: bridge fdb get" } +kci_test_fdb_del() +{ + local test_mac=de:ad:be:ef:13:37 + local dummydev="dummy1" + local brdev="test-br0" + local ret=0 + + run_cmd_grep 'bridge fdb get' bridge fdb help + if [ $? -ne 0 ]; then + end_test "SKIP: fdb del tests: iproute2 too old" + return $ksft_skip + fi + + setup_ns testns + if [ $? -ne 0 ]; then + end_test "SKIP fdb del tests: cannot add net namespace $testns" + return $ksft_skip + fi + IP="ip -netns $testns" + BRIDGE="bridge -netns $testns" + run_cmd $IP link add $dummydev type dummy + run_cmd $IP link add name $brdev type bridge vlan_filtering 1 + run_cmd $IP link set dev $dummydev master $brdev + run_cmd $BRIDGE fdb add $test_mac dev $dummydev master static vlan 1 + run_cmd $BRIDGE vlan del vid 1 dev $dummydev + run_cmd $BRIDGE fdb get $test_mac br $brdev vlan 1 + run_cmd $BRIDGE fdb del $test_mac dev $dummydev master vlan 1 + run_cmd_fail $BRIDGE fdb get $test_mac br $brdev vlan 1 + + ip netns del $testns &>/dev/null + + if [ $ret -ne 0 ]; then + end_test "FAIL: bridge fdb del" + return 1 + fi + + end_test "PASS: bridge fdb del" +} + kci_test_neigh_get() { dstmac=de:ad:be:ef:13:37 diff --git a/tools/testing/selftests/net/tcp_ao/lib/aolib.h b/tools/testing/selftests/net/tcp_ao/lib/aolib.h index db44e77428dd..5db2f65cddc4 100644 --- a/tools/testing/selftests/net/tcp_ao/lib/aolib.h +++ b/tools/testing/selftests/net/tcp_ao/lib/aolib.h @@ -46,6 +46,7 @@ static inline char *test_snprintf(const char *fmt, va_list vargs) va_copy(tmp, vargs); n = vsnprintf(ret, size, fmt, tmp); + va_end(tmp); if (n < 0) return NULL; diff --git a/tools/testing/selftests/net/tcp_ao/setsockopt-closed.c b/tools/testing/selftests/net/tcp_ao/setsockopt-closed.c index 084db4ecdff6..0abb9807d742 100644 --- a/tools/testing/selftests/net/tcp_ao/setsockopt-closed.c +++ b/tools/testing/selftests/net/tcp_ao/setsockopt-closed.c @@ -6,6 +6,8 @@ static union tcp_addr tcp_md5_client; +#define FILTER_TEST_NKEYS 16 + static int test_port = 7788; static void make_listen(int sk) { @@ -813,23 +815,197 @@ static void duplicate_tests(void) setsockopt_checked(sk, TCP_AO_ADD_KEY, &ao, EEXIST, "duplicate: SendID differs"); } +static void fetch_all_keys(int sk, struct tcp_ao_getsockopt *keys) +{ + socklen_t optlen = sizeof(struct tcp_ao_getsockopt); + + memset(keys, 0, sizeof(struct tcp_ao_getsockopt) * FILTER_TEST_NKEYS); + keys[0].get_all = 1; + keys[0].nkeys = FILTER_TEST_NKEYS; + if (getsockopt(sk, IPPROTO_TCP, TCP_AO_GET_KEYS, &keys[0], &optlen)) + test_error("getsockopt"); +} + +static int prepare_test_keys(struct tcp_ao_getsockopt *keys) +{ + const char *test_password = "Test password number "; + struct tcp_ao_add test_ao[FILTER_TEST_NKEYS]; + char test_password_scratch[64] = {}; + u8 rcvid = 100, sndid = 100; + int sk; + + sk = socket(test_family, SOCK_STREAM, IPPROTO_TCP); + if (sk < 0) + test_error("socket()"); + + for (int i = 0; i < FILTER_TEST_NKEYS; i++) { + snprintf(test_password_scratch, 64, "%s %d", test_password, i); + test_prepare_key(&test_ao[i], DEFAULT_TEST_ALGO, this_ip_dest, + false, false, DEFAULT_TEST_PREFIX, 0, sndid++, + rcvid++, 0, 0, strlen(test_password_scratch), + test_password_scratch); + } + test_ao[0].set_current = 1; + test_ao[1].set_rnext = 1; + /* One key with a different addr and overlapping sndid, rcvid */ + tcp_addr_to_sockaddr_in(&test_ao[2].addr, &this_ip_addr, 0); + test_ao[2].sndid = 100; + test_ao[2].rcvid = 100; + + /* Add keys in a random order */ + for (int i = 0; i < FILTER_TEST_NKEYS; i++) { + int randidx = rand() % (FILTER_TEST_NKEYS - i); + + if (setsockopt(sk, IPPROTO_TCP, TCP_AO_ADD_KEY, + &test_ao[randidx], sizeof(struct tcp_ao_add))) + test_error("setsockopt()"); + memcpy(&test_ao[randidx], &test_ao[FILTER_TEST_NKEYS - 1 - i], + sizeof(struct tcp_ao_add)); + } + + fetch_all_keys(sk, keys); + + return sk; +} + +/* Assumes passwords are unique */ +static int compare_mkts(struct tcp_ao_getsockopt *expected, int nexpected, + struct tcp_ao_getsockopt *actual, int nactual) +{ + int matches = 0; + + for (int i = 0; i < nexpected; i++) { + for (int j = 0; j < nactual; j++) { + if (memcmp(expected[i].key, actual[j].key, + TCP_AO_MAXKEYLEN) == 0) + matches++; + } + } + return nexpected - matches; +} + +static void filter_keys_checked(int sk, struct tcp_ao_getsockopt *filter, + struct tcp_ao_getsockopt *expected, + unsigned int nexpected, const char *tst) +{ + struct tcp_ao_getsockopt filtered_keys[FILTER_TEST_NKEYS] = {}; + struct tcp_ao_getsockopt all_keys[FILTER_TEST_NKEYS] = {}; + socklen_t len = sizeof(struct tcp_ao_getsockopt); + + fetch_all_keys(sk, all_keys); + memcpy(&filtered_keys[0], filter, sizeof(struct tcp_ao_getsockopt)); + filtered_keys[0].nkeys = FILTER_TEST_NKEYS; + if (getsockopt(sk, IPPROTO_TCP, TCP_AO_GET_KEYS, filtered_keys, &len)) + test_error("getsockopt"); + if (filtered_keys[0].nkeys != nexpected) { + test_fail("wrong nr of keys, expected %u got %u", nexpected, + filtered_keys[0].nkeys); + goto out_close; + } + if (compare_mkts(expected, nexpected, filtered_keys, + filtered_keys[0].nkeys)) { + test_fail("got wrong keys back"); + goto out_close; + } + test_ok("filter keys: %s", tst); + +out_close: + close(sk); + memset(filter, 0, sizeof(struct tcp_ao_getsockopt)); +} + +static void filter_tests(void) +{ + struct tcp_ao_getsockopt original_keys[FILTER_TEST_NKEYS]; + struct tcp_ao_getsockopt expected_keys[FILTER_TEST_NKEYS]; + struct tcp_ao_getsockopt filter = {}; + int sk, f, nmatches; + socklen_t len; + + f = 2; + sk = prepare_test_keys(original_keys); + filter.rcvid = original_keys[f].rcvid; + filter.sndid = original_keys[f].sndid; + memcpy(&filter.addr, &original_keys[f].addr, + sizeof(original_keys[f].addr)); + filter.prefix = original_keys[f].prefix; + filter_keys_checked(sk, &filter, &original_keys[f], 1, + "by sndid, rcvid, address"); + + f = -1; + sk = prepare_test_keys(original_keys); + for (int i = 0; i < original_keys[0].nkeys; i++) { + if (original_keys[i].is_current) { + f = i; + break; + } + } + if (f < 0) + test_error("No current key after adding one"); + filter.is_current = 1; + filter_keys_checked(sk, &filter, &original_keys[f], 1, "by is_current"); + + f = -1; + sk = prepare_test_keys(original_keys); + for (int i = 0; i < original_keys[0].nkeys; i++) { + if (original_keys[i].is_rnext) { + f = i; + break; + } + } + if (f < 0) + test_error("No rnext key after adding one"); + filter.is_rnext = 1; + filter_keys_checked(sk, &filter, &original_keys[f], 1, "by is_rnext"); + + f = -1; + nmatches = 0; + sk = prepare_test_keys(original_keys); + for (int i = 0; i < original_keys[0].nkeys; i++) { + if (original_keys[i].sndid == 100) { + f = i; + memcpy(&expected_keys[nmatches], &original_keys[i], + sizeof(struct tcp_ao_getsockopt)); + nmatches++; + } + } + if (f < 0) + test_error("No key for sndid 100"); + if (nmatches != 2) + test_error("Should have 2 keys with sndid 100"); + filter.rcvid = original_keys[f].rcvid; + filter.sndid = original_keys[f].sndid; + filter.addr.ss_family = test_family; + filter_keys_checked(sk, &filter, expected_keys, nmatches, + "by sndid, rcvid"); + + sk = prepare_test_keys(original_keys); + filter.get_all = 1; + filter.nkeys = FILTER_TEST_NKEYS / 2; + len = sizeof(struct tcp_ao_getsockopt); + if (getsockopt(sk, IPPROTO_TCP, TCP_AO_GET_KEYS, &filter, &len)) + test_error("getsockopt"); + if (filter.nkeys == FILTER_TEST_NKEYS) + test_ok("filter keys: correct nkeys when in.nkeys < matches"); + else + test_fail("filter keys: wrong nkeys, expected %u got %u", + FILTER_TEST_NKEYS, filter.nkeys); +} + static void *client_fn(void *arg) { if (inet_pton(TEST_FAMILY, __TEST_CLIENT_IP(2), &tcp_md5_client) != 1) test_error("Can't convert ip address"); extend_tests(); einval_tests(); + filter_tests(); duplicate_tests(); - /* - * TODO: check getsockopt(TCP_AO_GET_KEYS) with different filters - * returning proper nr & keys; - */ return NULL; } int main(int argc, char *argv[]) { - test_init(121, client_fn, NULL); + test_init(126, client_fn, NULL); return 0; } diff --git a/tools/testing/selftests/net/tls.c b/tools/testing/selftests/net/tls.c index f27a12d2a2c9..1a706d03bb6b 100644 --- a/tools/testing/selftests/net/tls.c +++ b/tools/testing/selftests/net/tls.c @@ -266,6 +266,25 @@ TEST_F(tls_basic, bad_cipher) EXPECT_EQ(setsockopt(self->fd, SOL_TLS, TLS_TX, &tls12, sizeof(struct tls12_crypto_info_aes_gcm_128)), -1); } +TEST_F(tls_basic, recseq_wrap) +{ + struct tls_crypto_info_keys tls12; + char const *test_str = "test_read"; + int send_len = 10; + + if (self->notls) + SKIP(return, "no TLS support"); + + tls_crypto_info_init(TLS_1_2_VERSION, TLS_CIPHER_AES_GCM_128, &tls12); + memset(&tls12.aes128.rec_seq, 0xff, sizeof(tls12.aes128.rec_seq)); + + ASSERT_EQ(setsockopt(self->fd, SOL_TLS, TLS_TX, &tls12, tls12.len), 0); + ASSERT_EQ(setsockopt(self->cfd, SOL_TLS, TLS_RX, &tls12, tls12.len), 0); + + EXPECT_EQ(send(self->fd, test_str, send_len, 0), -1); + EXPECT_EQ(errno, EBADMSG); +} + FIXTURE(tls) { int fd, cfd; diff --git a/tools/testing/selftests/net/txtimestamp.c b/tools/testing/selftests/net/txtimestamp.c index d626f22f9550..dae91eb97d69 100644 --- a/tools/testing/selftests/net/txtimestamp.c +++ b/tools/testing/selftests/net/txtimestamp.c @@ -77,6 +77,8 @@ static bool cfg_epollet; static bool cfg_do_listen; static uint16_t dest_port = 9000; static bool cfg_print_nsec; +static uint32_t ts_opt_id; +static bool cfg_use_cmsg_opt_id; static struct sockaddr_in daddr; static struct sockaddr_in6 daddr6; @@ -136,12 +138,13 @@ static void validate_key(int tskey, int tstype) /* compare key for each subsequent request * must only test for one type, the first one requested */ - if (saved_tskey == -1) + if (saved_tskey == -1 || cfg_use_cmsg_opt_id) saved_tskey_type = tstype; else if (saved_tskey_type != tstype) return; stepsize = cfg_proto == SOCK_STREAM ? cfg_payload_len : 1; + stepsize = cfg_use_cmsg_opt_id ? 0 : stepsize; if (tskey != saved_tskey + stepsize) { fprintf(stderr, "ERROR: key %d, expected %d\n", tskey, saved_tskey + stepsize); @@ -484,7 +487,7 @@ static void fill_header_udp(void *p, bool is_ipv4) static void do_test(int family, unsigned int report_opt) { - char control[CMSG_SPACE(sizeof(uint32_t))]; + char control[2 * CMSG_SPACE(sizeof(uint32_t))]; struct sockaddr_ll laddr; unsigned int sock_opt; struct cmsghdr *cmsg; @@ -624,18 +627,32 @@ static void do_test(int family, unsigned int report_opt) msg.msg_iov = &iov; msg.msg_iovlen = 1; - if (cfg_use_cmsg) { + if (cfg_use_cmsg || cfg_use_cmsg_opt_id) { memset(control, 0, sizeof(control)); msg.msg_control = control; - msg.msg_controllen = sizeof(control); + msg.msg_controllen = cfg_use_cmsg * CMSG_SPACE(sizeof(uint32_t)); + msg.msg_controllen += cfg_use_cmsg_opt_id * CMSG_SPACE(sizeof(uint32_t)); - cmsg = CMSG_FIRSTHDR(&msg); - cmsg->cmsg_level = SOL_SOCKET; - cmsg->cmsg_type = SO_TIMESTAMPING; - cmsg->cmsg_len = CMSG_LEN(sizeof(uint32_t)); + cmsg = NULL; + if (cfg_use_cmsg) { + cmsg = CMSG_FIRSTHDR(&msg); + cmsg->cmsg_level = SOL_SOCKET; + cmsg->cmsg_type = SO_TIMESTAMPING; + cmsg->cmsg_len = CMSG_LEN(sizeof(uint32_t)); + + *((uint32_t *)CMSG_DATA(cmsg)) = report_opt; + } + if (cfg_use_cmsg_opt_id) { + cmsg = cmsg ? CMSG_NXTHDR(&msg, cmsg) : CMSG_FIRSTHDR(&msg); + cmsg->cmsg_level = SOL_SOCKET; + cmsg->cmsg_type = SCM_TS_OPT_ID; + cmsg->cmsg_len = CMSG_LEN(sizeof(uint32_t)); + + *((uint32_t *)CMSG_DATA(cmsg)) = ts_opt_id; + saved_tskey = ts_opt_id; + } - *((uint32_t *) CMSG_DATA(cmsg)) = report_opt; } val = sendmsg(fd, &msg, 0); @@ -685,6 +702,7 @@ static void __attribute__((noreturn)) usage(const char *filepath) " -L listen on hostname and port\n" " -n: set no-payload option\n" " -N: print timestamps and durations in nsec (instead of usec)\n" + " -o N: use SCM_TS_OPT_ID control message to provide N as tskey\n" " -p N: connect to port N\n" " -P: use PF_PACKET\n" " -r: use raw\n" @@ -705,7 +723,7 @@ static void parse_opt(int argc, char **argv) int c; while ((c = getopt(argc, argv, - "46bc:CeEFhIl:LnNp:PrRS:t:uv:V:x")) != -1) { + "46bc:CeEFhIl:LnNo:p:PrRS:t:uv:V:x")) != -1) { switch (c) { case '4': do_ipv6 = 0; @@ -746,6 +764,10 @@ static void parse_opt(int argc, char **argv) case 'N': cfg_print_nsec = true; break; + case 'o': + ts_opt_id = strtoul(optarg, NULL, 10); + cfg_use_cmsg_opt_id = true; + break; case 'p': dest_port = strtoul(optarg, NULL, 10); break; @@ -803,6 +825,8 @@ static void parse_opt(int argc, char **argv) error(1, 0, "cannot ask for pktinfo over pf_packet"); if (cfg_busy_poll && cfg_use_epoll) error(1, 0, "pass epoll or busy_poll, not both"); + if (cfg_proto == SOCK_STREAM && cfg_use_cmsg_opt_id) + error(1, 0, "TCP sockets don't support SCM_TS_OPT_ID"); if (optind != argc - 1) error(1, 0, "missing required hostname argument"); diff --git a/tools/testing/selftests/net/txtimestamp.sh b/tools/testing/selftests/net/txtimestamp.sh index 25baca4b148e..fe4649bb8786 100755 --- a/tools/testing/selftests/net/txtimestamp.sh +++ b/tools/testing/selftests/net/txtimestamp.sh @@ -37,11 +37,13 @@ run_test_v4v6() { run_test_tcpudpraw() { local -r args=$@ - run_test_v4v6 ${args} # tcp - run_test_v4v6 ${args} -u # udp - run_test_v4v6 ${args} -r # raw - run_test_v4v6 ${args} -R # raw (IPPROTO_RAW) - run_test_v4v6 ${args} -P # pf_packet + run_test_v4v6 ${args} # tcp + run_test_v4v6 ${args} -u # udp + run_test_v4v6 ${args} -u -o 42 # udp with fixed tskey + run_test_v4v6 ${args} -r # raw + run_test_v4v6 ${args} -r -o 42 # raw + run_test_v4v6 ${args} -R # raw (IPPROTO_RAW) + run_test_v4v6 ${args} -P # pf_packet } run_test_all() { diff --git a/tools/testing/selftests/net/veth.sh b/tools/testing/selftests/net/veth.sh index 4f1edbafb946..6bb7dfaa30b6 100755 --- a/tools/testing/selftests/net/veth.sh +++ b/tools/testing/selftests/net/veth.sh @@ -46,8 +46,6 @@ create_ns() { ip -n $BASE$ns addr add dev veth$ns $BM_NET_V4$ns/24 ip -n $BASE$ns addr add dev veth$ns $BM_NET_V6$ns/64 nodad done - echo "#kernel" > $BASE - chmod go-rw $BASE } __chk_flag() { diff --git a/tools/testing/selftests/net/ynl.mk b/tools/testing/selftests/net/ynl.mk index 1ef24119def0..d43afe243779 100644 --- a/tools/testing/selftests/net/ynl.mk +++ b/tools/testing/selftests/net/ynl.mk @@ -9,6 +9,8 @@ # YNL_GEN_FILES: TEST_GEN_FILES which need YNL YNL_OUTPUTS := $(patsubst %,$(OUTPUT)/%,$(YNL_GEN_FILES)) +YNL_SPECS := \ + $(patsubst %,$(top_srcdir)/Documentation/netlink/specs/%.yaml,$(YNL_GENS)) $(YNL_OUTPUTS): $(OUTPUT)/libynl.a $(YNL_OUTPUTS): CFLAGS += \ @@ -16,10 +18,20 @@ $(YNL_OUTPUTS): CFLAGS += \ -I$(top_srcdir)/tools/net/ynl/lib/ \ -I$(top_srcdir)/tools/net/ynl/generated/ -$(OUTPUT)/libynl.a: +# Make sure we rebuild libynl if user added a new family. We can't easily +# depend on the contents of a variable so create a fake file with a hash. +YNL_GENS_HASH := $(shell echo $(YNL_GENS) | sha1sum | cut -c1-8) +$(OUTPUT)/.libynl-$(YNL_GENS_HASH).sig: + $(Q)rm -f $(OUTPUT)/.libynl-*.sig + $(Q)touch $(OUTPUT)/.libynl-$(YNL_GENS_HASH).sig + +$(OUTPUT)/libynl.a: $(YNL_SPECS) $(OUTPUT)/.libynl-$(YNL_GENS_HASH).sig + $(Q)rm -f $(top_srcdir)/tools/net/ynl/libynl.a $(Q)$(MAKE) -C $(top_srcdir)/tools/net/ynl GENS="$(YNL_GENS)" libynl.a $(Q)cp $(top_srcdir)/tools/net/ynl/libynl.a $(OUTPUT)/libynl.a EXTRA_CLEAN += \ $(top_srcdir)/tools/net/ynl/lib/__pycache__ \ - $(top_srcdir)/tools/net/ynl/lib/*.[ado] + $(top_srcdir)/tools/net/ynl/lib/*.[ado] \ + $(OUTPUT)/.libynl-*.sig \ + $(OUTPUT)/libynl.a diff --git a/tools/testing/selftests/ptp/testptp.c b/tools/testing/selftests/ptp/testptp.c index 011252fe238c..58064151f2c8 100644 --- a/tools/testing/selftests/ptp/testptp.c +++ b/tools/testing/selftests/ptp/testptp.c @@ -146,6 +146,7 @@ static void usage(char *progname) " -T val set the ptp clock time to 'val' seconds\n" " -x val get an extended ptp clock time with the desired number of samples (up to %d)\n" " -X get a ptp clock cross timestamp\n" + " -y val pre/post tstamp timebase to use {realtime|monotonic|monotonic-raw}\n" " -z test combinations of rising/falling external time stamp flags\n", progname, PTP_MAX_SAMPLES); } @@ -189,6 +190,7 @@ int main(int argc, char *argv[]) int seconds = 0; int settime = 0; int channel = -1; + clockid_t ext_clockid = CLOCK_REALTIME; int64_t t1, t2, tp; int64_t interval, offset; @@ -198,7 +200,7 @@ int main(int argc, char *argv[]) progname = strrchr(argv[0], '/'); progname = progname ? 1+progname : argv[0]; - while (EOF != (c = getopt(argc, argv, "cd:e:f:F:ghH:i:k:lL:n:o:p:P:sSt:T:w:x:Xz"))) { + while (EOF != (c = getopt(argc, argv, "cd:e:f:F:ghH:i:k:lL:n:o:p:P:sSt:T:w:x:Xy:z"))) { switch (c) { case 'c': capabilities = 1; @@ -278,6 +280,21 @@ int main(int argc, char *argv[]) case 'X': getcross = 1; break; + case 'y': + if (!strcasecmp(optarg, "realtime")) + ext_clockid = CLOCK_REALTIME; + else if (!strcasecmp(optarg, "monotonic")) + ext_clockid = CLOCK_MONOTONIC; + else if (!strcasecmp(optarg, "monotonic-raw")) + ext_clockid = CLOCK_MONOTONIC_RAW; + else { + fprintf(stderr, + "type needs to be realtime, monotonic or monotonic-raw; was given %s\n", + optarg); + return -1; + } + break; + case 'z': flagtest = 1; break; @@ -566,6 +583,7 @@ int main(int argc, char *argv[]) } soe->n_samples = getextended; + soe->clockid = ext_clockid; if (ioctl(fd, PTP_SYS_OFFSET_EXTENDED, soe)) { perror("PTP_SYS_OFFSET_EXTENDED"); @@ -574,12 +592,46 @@ int main(int argc, char *argv[]) getextended); for (i = 0; i < getextended; i++) { - printf("sample #%2d: system time before: %lld.%09u\n", - i, soe->ts[i][0].sec, soe->ts[i][0].nsec); + switch (ext_clockid) { + case CLOCK_REALTIME: + printf("sample #%2d: real time before: %lld.%09u\n", + i, soe->ts[i][0].sec, + soe->ts[i][0].nsec); + break; + case CLOCK_MONOTONIC: + printf("sample #%2d: monotonic time before: %lld.%09u\n", + i, soe->ts[i][0].sec, + soe->ts[i][0].nsec); + break; + case CLOCK_MONOTONIC_RAW: + printf("sample #%2d: monotonic-raw time before: %lld.%09u\n", + i, soe->ts[i][0].sec, + soe->ts[i][0].nsec); + break; + default: + break; + } printf(" phc time: %lld.%09u\n", soe->ts[i][1].sec, soe->ts[i][1].nsec); - printf(" system time after: %lld.%09u\n", - soe->ts[i][2].sec, soe->ts[i][2].nsec); + switch (ext_clockid) { + case CLOCK_REALTIME: + printf(" real time after: %lld.%09u\n", + soe->ts[i][2].sec, + soe->ts[i][2].nsec); + break; + case CLOCK_MONOTONIC: + printf(" monotonic time after: %lld.%09u\n", + soe->ts[i][2].sec, + soe->ts[i][2].nsec); + break; + case CLOCK_MONOTONIC_RAW: + printf(" monotonic-raw time after: %lld.%09u\n", + soe->ts[i][2].sec, + soe->ts[i][2].nsec); + break; + default: + break; + } } } diff --git a/tools/testing/selftests/tc-testing/tc-tests/filters/basic.json b/tools/testing/selftests/tc-testing/tc-tests/filters/basic.json index d1278de8ebc3..c9309a44a87e 100644 --- a/tools/testing/selftests/tc-testing/tc-tests/filters/basic.json +++ b/tools/testing/selftests/tc-testing/tc-tests/filters/basic.json @@ -67,7 +67,7 @@ }, { "id": "4943", - "name": "Add basic filter with cmp ematch u32/link layer and miltiple actions", + "name": "Add basic filter with cmp ematch u32/link layer and multiple actions", "category": [ "filter", "basic" @@ -155,7 +155,7 @@ }, { "id": "32d8", - "name": "Add basic filter with cmp ematch u32/network layer and miltiple actions", + "name": "Add basic filter with cmp ematch u32/network layer and multiple actions", "category": [ "filter", "basic" @@ -243,7 +243,7 @@ }, { "id": "62d7", - "name": "Add basic filter with cmp ematch u32/transport layer and miltiple actions", + "name": "Add basic filter with cmp ematch u32/transport layer and multiple actions", "category": [ "filter", "basic" diff --git a/tools/testing/selftests/tc-testing/tc-tests/filters/cgroup.json b/tools/testing/selftests/tc-testing/tc-tests/filters/cgroup.json index 03723cf84379..35c9a7dbe1c4 100644 --- a/tools/testing/selftests/tc-testing/tc-tests/filters/cgroup.json +++ b/tools/testing/selftests/tc-testing/tc-tests/filters/cgroup.json @@ -67,7 +67,7 @@ }, { "id": "0234", - "name": "Add cgroup filter with cmp ematch u32/link layer and miltiple actions", + "name": "Add cgroup filter with cmp ematch u32/link layer and multiple actions", "category": [ "filter", "cgroup" @@ -155,7 +155,7 @@ }, { "id": "2733", - "name": "Add cgroup filter with cmp ematch u32/network layer and miltiple actions", + "name": "Add cgroup filter with cmp ematch u32/network layer and multiple actions", "category": [ "filter", "cgroup" @@ -1189,7 +1189,7 @@ }, { "id": "4319", - "name": "Replace cgroup filter with diffferent match", + "name": "Replace cgroup filter with different match", "category": [ "filter", "cgroup" diff --git a/tools/testing/selftests/tc-testing/tc-tests/filters/flow.json b/tools/testing/selftests/tc-testing/tc-tests/filters/flow.json index 58189327f644..996448afe31b 100644 --- a/tools/testing/selftests/tc-testing/tc-tests/filters/flow.json +++ b/tools/testing/selftests/tc-testing/tc-tests/filters/flow.json @@ -507,7 +507,7 @@ }, { "id": "4341", - "name": "Add flow filter with muliple ops", + "name": "Add flow filter with multiple ops", "category": [ "filter", "flow" diff --git a/tools/testing/selftests/tc-testing/tc-tests/filters/route.json b/tools/testing/selftests/tc-testing/tc-tests/filters/route.json index 8d8de8f65aef..05cedca67cca 100644 --- a/tools/testing/selftests/tc-testing/tc-tests/filters/route.json +++ b/tools/testing/selftests/tc-testing/tc-tests/filters/route.json @@ -111,7 +111,7 @@ }, { "id": "7994", - "name": "Add route filter with miltiple actions", + "name": "Add route filter with multiple actions", "category": [ "filter", "route" diff --git a/tools/testing/selftests/tc-testing/tc-tests/infra/qdiscs.json b/tools/testing/selftests/tc-testing/tc-tests/infra/qdiscs.json new file mode 100644 index 000000000000..d3dd65b05b5f --- /dev/null +++ b/tools/testing/selftests/tc-testing/tc-tests/infra/qdiscs.json @@ -0,0 +1,98 @@ +[ + { + "id": "ca5e", + "name": "Check class delete notification for ffff:", + "category": [ + "qdisc" + ], + "plugins": { + "requires": "nsPlugin" + }, + "setup": [ + "$IP link set dev $DUMMY up || true", + "$IP addr add 10.10.10.10/24 dev $DUMMY || true", + "$TC qdisc add dev $DUMMY root handle 1: drr", + "$TC filter add dev $DUMMY parent 1: basic classid 1:1", + "$TC class add dev $DUMMY parent 1: classid 1:1 drr", + "$TC qdisc add dev $DUMMY parent 1:1 handle ffff: drr", + "$TC filter add dev $DUMMY parent ffff: basic classid ffff:1", + "$TC class add dev $DUMMY parent ffff: classid ffff:1 drr", + "$TC qdisc add dev $DUMMY parent ffff:1 netem delay 1s", + "ping -c1 -W0.01 -I $DUMMY 10.10.10.1 || true", + "$TC class del dev $DUMMY classid ffff:1", + "$TC class add dev $DUMMY parent ffff: classid ffff:1 drr" + ], + "cmdUnderTest": "ping -c1 -W0.01 -I $DUMMY 10.10.10.1", + "expExitCode": "1", + "verifyCmd": "$TC -s qdisc ls dev $DUMMY", + "matchPattern": "drr 1: root", + "matchCount": "1", + "teardown": [ + "$TC qdisc del dev $DUMMY root handle 1: drr", + "$IP addr del 10.10.10.10/24 dev $DUMMY" + ] + }, + { + "id": "e4b7", + "name": "Check class delete notification for root ffff:", + "category": [ + "qdisc" + ], + "plugins": { + "requires": "nsPlugin" + }, + "setup": [ + "$IP link set dev $DUMMY up || true", + "$IP addr add 10.10.10.10/24 dev $DUMMY || true", + "$TC qdisc add dev $DUMMY root handle ffff: drr", + "$TC filter add dev $DUMMY parent ffff: basic classid ffff:1", + "$TC class add dev $DUMMY parent ffff: classid ffff:1 drr", + "$TC qdisc add dev $DUMMY parent ffff:1 netem delay 1s", + "ping -c1 -W0.01 -I $DUMMY 10.10.10.1 || true", + "$TC class del dev $DUMMY classid ffff:1", + "$TC class add dev $DUMMY parent ffff: classid ffff:1 drr" + ], + "cmdUnderTest": "ping -c1 -W0.01 -I $DUMMY 10.10.10.1", + "expExitCode": "1", + "verifyCmd": "$TC qdisc ls dev $DUMMY", + "matchPattern": "drr ffff: root", + "matchCount": "1", + "teardown": [ + "$TC qdisc del dev $DUMMY root handle ffff: drr", + "$IP addr del 10.10.10.10/24 dev $DUMMY" + ] + }, + { + "id": "33a9", + "name": "Check ingress is not searchable on backlog update", + "category": [ + "qdisc" + ], + "plugins": { + "requires": "nsPlugin" + }, + "setup": [ + "$IP link set dev $DUMMY up || true", + "$IP addr add 10.10.10.10/24 dev $DUMMY || true", + "$TC qdisc add dev $DUMMY ingress", + "$TC qdisc add dev $DUMMY root handle 1: drr", + "$TC filter add dev $DUMMY parent 1: basic classid 1:1", + "$TC class add dev $DUMMY parent 1: classid 1:1 drr", + "$TC qdisc add dev $DUMMY parent 1:1 handle 2: drr", + "$TC filter add dev $DUMMY parent 2: basic classid 2:1", + "$TC class add dev $DUMMY parent 2: classid 2:1 drr", + "$TC qdisc add dev $DUMMY parent 2:1 netem delay 1s", + "ping -c1 -W0.01 -I $DUMMY 10.10.10.1 || true" + ], + "cmdUnderTest": "$TC class del dev $DUMMY classid 2:1", + "expExitCode": "0", + "verifyCmd": "$TC qdisc ls dev $DUMMY", + "matchPattern": "drr 1: root", + "matchCount": "1", + "teardown": [ + "$TC qdisc del dev $DUMMY root handle 1: drr", + "$TC qdisc del dev $DUMMY ingress", + "$IP addr del 10.10.10.10/24 dev $DUMMY" + ] + } +] diff --git a/tools/testing/selftests/wireguard/netns.sh b/tools/testing/selftests/wireguard/netns.sh index 405ff262ca93..55500f901fbc 100755 --- a/tools/testing/selftests/wireguard/netns.sh +++ b/tools/testing/selftests/wireguard/netns.sh @@ -332,6 +332,7 @@ waitiface $netns1 vethc waitiface $netns2 veths n0 bash -c 'printf 1 > /proc/sys/net/ipv4/ip_forward' +[[ -e /proc/sys/net/netfilter/nf_conntrack_udp_timeout ]] || modprobe nf_conntrack n0 bash -c 'printf 2 > /proc/sys/net/netfilter/nf_conntrack_udp_timeout' n0 bash -c 'printf 2 > /proc/sys/net/netfilter/nf_conntrack_udp_timeout_stream' n0 iptables -t nat -A POSTROUTING -s 192.168.1.0/24 -d 10.0.0.0/24 -j SNAT --to 10.0.0.1 |