Age  Commit message  Author
2020-05-09  bpf: Add task and task/file iterator targets  Yonghong Song
Only the tasks belonging to "current" pid namespace are enumerated.

For task/file target, the bpf program will have access to
  struct task_struct *task
  u32 fd
  struct file *file
where fd/file is an open file for the task.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200509175911.2476407-1-yhs@fb.com
2020-05-09  net: bpf: Add netlink and ipv6_route bpf_iter targets  Yonghong Song
This patch added netlink and ipv6_route targets, using the same seq_ops (except show() and minor changes for stop()) for /proc/net/{netlink,ipv6_route}.

The net namespace for these targets is the current net namespace at file open stage, similar to /proc/net/{netlink,ipv6_route} reference counting the net namespace at seq_file open stage.

Since modules are not supported for now, ipv6_route is supported only if IPV6 is built in, i.e., not compiled as a module. The restriction can be lifted once modules are properly supported for bpf_iter.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200509175910.2476329-1-yhs@fb.com
2020-05-09  bpf: Add bpf_map iterator  Yonghong Song
Implement seq_file operations to traverse all bpf_maps. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200509175909.2476096-1-yhs@fb.com
2020-05-09  bpf: Implement common macros/helpers for target iterators  Yonghong Song
Macro DEFINE_BPF_ITER_FUNC is implemented so a target can define an init function to capture the BTF type which represents the target.

The bpf_iter_meta is a structure holding meta data, common to all targets in the bpf program.

Additional marker functions are called before or after the bpf_seq_read() show()/next()/stop() callback functions to help calculate a precise seq_num and to decide whether to call bpf_prog inside stop().

Two functions, bpf_iter_get_info() and bpf_iter_run_prog(), are implemented so a target can get needed information from the bpf_iter infrastructure and can run the program.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200509175907.2475956-1-yhs@fb.com
2020-05-09  bpf: Create file bpf iterator  Yonghong Song
To produce a file bpf iterator, the fd must correspond to a link_fd associated with a trace/iter program. When the pinned file is opened, a seq_file will be generated.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200509175906.2475893-1-yhs@fb.com
2020-05-09  bpf: Create anonymous bpf iterator  Yonghong Song
A new bpf command BPF_ITER_CREATE is added.

The anonymous bpf iterator is seq_file based. The seq_file private data are referenced by targets. The bpf_iter infrastructure allocated additional space at seq_file->private before the space used by targets to store some meta data, e.g.,
  prog:       prog to run
  session_id: a unique id for each opened seq_file
  seq_num:    how many times bpf programs are queried in this session
  done_stop:  an internal state to decide whether bpf program should be called in seq_ops->stop() or not

The seq_num will start from 0 for valid objects. The bpf program may see the same seq_num more than once if
  - seq_file buffer overflow happens and the same object is retried by bpf_seq_read(), or
  - the bpf program explicitly requests a retry of the same object

Since modules are not supported for bpf_iter, all target registration happens at __init time, so there is no need to change bpf_iter_unreg_target(), as it is used mostly in the error path of the init function, at which time no bpf iterators have been created yet.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200509175905.2475770-1-yhs@fb.com
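The layout described above, with infrastructure metadata placed in front of the target's private area, can be sketched in user-space C. This is only an illustration under assumptions: the struct name, field types, and helper names below are hypothetical and are not taken from the kernel source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the metadata kept in front of the target's data. */
struct iter_priv_hdr {
	const void *prog;     /* program to run (opaque in this sketch) */
	uint64_t session_id;  /* unique per opened seq_file */
	uint64_t seq_num;     /* how many times the program was queried */
	int done_stop;        /* whether the program already ran in stop() */
	/* Target-private bytes follow, aligned for any object type. */
	_Alignas(max_align_t) unsigned char target_private[];
};

/* Allocate header plus target_size bytes; return the pointer the target sees. */
static void *iter_priv_alloc(size_t target_size, uint64_t session_id)
{
	struct iter_priv_hdr *hdr = calloc(1, sizeof(*hdr) + target_size);

	if (!hdr)
		return NULL;
	hdr->session_id = session_id;
	return hdr->target_private;
}

/* Step back from the target's pointer to the enclosing header. */
static struct iter_priv_hdr *iter_priv_hdr_of(void *target_private)
{
	assert(target_private != NULL);
	return (struct iter_priv_hdr *)((char *)target_private -
			offsetof(struct iter_priv_hdr, target_private));
}

static void iter_priv_free(void *target_private)
{
	free(iter_priv_hdr_of(target_private));
}
```

The target only ever handles the pointer returned by iter_priv_alloc(), while the infrastructure recovers its own fields with iter_priv_hdr_of(); neither side needs to know the other's layout beyond the fixed offset.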
2020-05-09  bpf: Implement bpf_seq_read() for bpf iterator  Yonghong Song
bpf iterator uses seq_file to provide a lossless way to transfer data to user space. But we want to call the bpf program after all objects have been traversed, and the bpf program may write additional data to the seq_file buffer. The current seq_read() does not work for this use case.

Besides allowing the stop() function to write to the buffer, bpf_seq_read() also fixes the buffer size to one page. If any single call of show() or stop() emits more than one page of data and causes an overflow, the -E2BIG error code is returned to user space.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200509175904.2475468-1-yhs@fb.com
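A rough user-space sketch of the one-page limit follows. It is not the kernel's bpf_seq_read(); the 4096-byte page size, the struct, and the use of -EAGAIN to signal "drain the buffer and retry this record" are assumptions of the sketch. Only the -E2BIG case comes from the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096  /* assumed page size for the sketch */

struct page_buf {
	char data[SKETCH_PAGE_SIZE];
	size_t used;
};

/*
 * Append one record emitted by a single show() or stop() call.
 * Returns 0 on success, -E2BIG if the record alone can never fit in one
 * page, or -EAGAIN if it would fit once the buffer has been drained.
 */
static int page_buf_emit(struct page_buf *b, const char *rec, size_t len)
{
	assert(b->used <= sizeof(b->data));
	if (len > sizeof(b->data))
		return -E2BIG;
	if (len > sizeof(b->data) - b->used)
		return -EAGAIN;
	memcpy(b->data + b->used, rec, len);
	b->used += len;
	return 0;
}
```

Separating the two failure cases makes the point of the commit message visible: a record that merely does not fit right now can be retried, while a record larger than a page can never succeed.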
2020-05-09  bpf: Support bpf tracing/iter programs for BPF_LINK_UPDATE  Yonghong Song
Added BPF_LINK_UPDATE support for tracing/iter programs. This way, a file based bpf iterator, which holds a reference to the link, can have its bpf program updated without creating new files. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200509175902.2475262-1-yhs@fb.com
2020-05-09  bpf: Support bpf tracing/iter programs for BPF_LINK_CREATE  Yonghong Song
Given a bpf program, the steps to create an anonymous bpf iterator are:
  - create a bpf_iter_link, which combines the bpf program and the target. In the future, there could be more information recorded in the link. A link_fd will be returned to user space.
  - create an anonymous bpf iterator with the given link_fd.

The bpf_iter_link can be pinned to a bpffs mount file system to create a file based bpf iterator as well.

The benefits of using bpf_iter_link:
  - using bpf link simplifies design and implementation as bpf link is used for other tracing bpf programs.
  - for file based bpf iterator, bpf_iter_link provides a standard way to replace underlying bpf programs.
  - for both anonymous and file based iterators, bpf link query capability can be leveraged.

The patch added support of tracing/iter programs for BPF_LINK_CREATE. A new link type BPF_LINK_TYPE_ITER is added to facilitate link querying. Currently, only prog_id is needed, so no additional in-kernel show_fdinfo() or fill_link_info() hook is needed for the BPF_LINK_TYPE_ITER link.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200509175901.2475084-1-yhs@fb.com
2020-05-09  bpf: Allow loading of a bpf_iter program  Yonghong Song
A bpf_iter program is a tracing program with attach type BPF_TRACE_ITER. The load attribute attach_btf_id is used by the verifier against a particular kernel function, which represents a target, e.g., __bpf_iter__bpf_map for target bpf_map which is implemented later.

The program return value must be 0 or 1 for now.
  0 : successful, except potential seq_file buffer overflow which is handled by seq_file reader.
  1 : request to restart the same object

In the future, other return values may be used for filtering or terminating the iterator.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200509175900.2474947-1-yhs@fb.com
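The 0/1 return convention can be illustrated with a small user-space walk loop. This is a sketch under assumptions, not the kernel code: the callback type, iter_walk(), and the rule that seq_num advances only on a return value of 0 are modelled on the commit messages in this series.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: returns 0 (done with object) or 1 (retry it). */
typedef int (*iter_prog_fn)(int obj, unsigned long seq_num, void *ctx);

/*
 * Visit objs[0..n-1], calling prog once per attempt. seq_num only advances
 * when the program returns 0, so a retried object sees the same seq_num.
 * Returns the total number of program invocations.
 */
static unsigned long iter_walk(const int *objs, size_t n,
			       iter_prog_fn prog, void *ctx)
{
	unsigned long seq_num = 0, calls = 0;
	size_t i = 0;

	while (i < n) {
		int ret = prog(objs[i], seq_num, ctx);

		calls++;
		assert(ret == 0 || ret == 1);  /* the only values accepted for now */
		if (ret == 0) {
			seq_num++;
			i++;
		}
	}
	return calls;
}

/* Example program: asks for exactly one retry of the object with value 2. */
static int retry_two_once(int obj, unsigned long seq_num, void *ctx)
{
	int *retried = ctx;

	(void)seq_num;
	if (obj == 2 && !*retried) {
		*retried = 1;
		return 1;
	}
	return 0;
}
```

Note that a callback which always returns 1 would keep this loop on the same object forever; the sketch does not guard against that.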
2020-05-09  bpf: Implement an interface to register bpf_iter targets  Yonghong Song
The target can call bpf_iter_reg_target() to register itself. The needed information:
  target:           target name
  seq_ops:          the seq_file operations for the target
  init_seq_private: target callback to initialize seq_priv during file open
  fini_seq_private: target callback to clean up seq_priv during file release
  seq_priv_size:    the private_data size needed by the seq_file operations

The target name represents a target which provides a seq_ops for iterating objects.

The target can provide two callback functions, init_seq_private and fini_seq_private, called during file open/release time. For example, /proc/net/{tcp6, ipv6_route, netlink, ...}, net name space needs to be setup properly during file open and released properly during file release.

Function bpf_iter_unreg_target() is also implemented to unregister a particular target.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200509175859.2474669-1-yhs@fb.com
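A registration table with the fields listed above might look roughly like the following user-space sketch. The field names follow the commit message, but the types, the fixed-size table, the function names, and the error codes are assumptions made for illustration only.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MAX_TARGETS 8  /* arbitrary limit for the sketch */

/* Mirrors the fields listed in the commit message; types are simplified. */
struct iter_target {
	const char *target;                    /* target name */
	const void *seq_ops;                   /* opaque in this sketch */
	int (*init_seq_private)(void *priv);   /* called at file open */
	void (*fini_seq_private)(void *priv);  /* called at file release */
	size_t seq_priv_size;
};

static const struct iter_target *targets[MAX_TARGETS];

static const struct iter_target *iter_find_target(const char *name)
{
	for (size_t i = 0; i < MAX_TARGETS; i++)
		if (targets[i] && strcmp(targets[i]->target, name) == 0)
			return targets[i];
	return NULL;
}

/* Register a target; names must be unique. */
static int iter_reg_target(const struct iter_target *t)
{
	assert(t && t->target);
	if (iter_find_target(t->target))
		return -EEXIST;
	for (size_t i = 0; i < MAX_TARGETS; i++) {
		if (!targets[i]) {
			targets[i] = t;
			return 0;
		}
	}
	return -ENOSPC;
}

/* Remove a target by name; unknown names are ignored. */
static void iter_unreg_target(const char *name)
{
	for (size_t i = 0; i < MAX_TARGETS; i++)
		if (targets[i] && strcmp(targets[i]->target, name) == 0)
			targets[i] = NULL;
}
```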
2020-05-09  bpf: Allow any port in bpf_bind helper  Stanislav Fomichev
We want to have a tighter control on what ports we bind to in the BPF_CGROUP_INET{4,6}_CONNECT hooks even if it means connect() becomes slightly more expensive. The expensive part comes from the fact that we now need to call inet_csk_get_port() that verifies that the port is not used and allocates an entry in the hash table for it.

Since we can't rely on "snum || !bind_address_no_port" to prevent us from calling POST_BIND hook anymore, let's add another bind flag to indicate that the call site is a BPF program.

v5:
* fix wrong AF_INET (should be AF_INET6) in the bpf program for v6

v3:
* More bpf_bind documentation refinements (Martin KaFai Lau)
* Add UDP tests as well (Martin KaFai Lau)
* Don't start the thread, just do socket+bind+listen (Martin KaFai Lau)

v2:
* Update documentation (Andrey Ignatov)
* Pass BIND_FORCE_ADDRESS_NO_PORT conditionally (Andrey Ignatov)

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrey Ignatov <rdna@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200508174611.228805-5-sdf@google.com
2020-05-09  net: Refactor arguments of inet{,6}_bind  Stanislav Fomichev
The intent is to add an additional bind parameter in the next commit. Instead of adding another argument, let's convert all existing flag arguments into an extendable bit field. No functional changes. Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrey Ignatov <rdna@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200508174611.228805-4-sdf@google.com
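The general refactoring pattern, replacing several boolean parameters with one extendable flags word, can be sketched as follows. The flag names and helper functions here are hypothetical and chosen for illustration; they are not claimed to match the identifiers used in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag names; the actual kernel identifiers may differ. */
#define SK_BIND_FORCE_ADDR_NO_PORT (1u << 0)
#define SK_BIND_WITH_LOCK          (1u << 1)
#define SK_BIND_ALL_FLAGS \
	(SK_BIND_FORCE_ADDR_NO_PORT | SK_BIND_WITH_LOCK)

/* Translate the old style (one bool per option) into a single flags word. */
static uint32_t bind_flags_from_bools(bool force_addr_no_port, bool with_lock)
{
	uint32_t flags = 0;

	if (force_addr_no_port)
		flags |= SK_BIND_FORCE_ADDR_NO_PORT;
	if (with_lock)
		flags |= SK_BIND_WITH_LOCK;
	return flags;
}

/* Query one option; unknown bits are rejected to catch caller mistakes. */
static bool bind_flag_set(uint32_t flags, uint32_t flag)
{
	assert((flags & ~SK_BIND_ALL_FLAGS) == 0);
	return (flags & flag) != 0;
}
```

With this shape, a later option only needs a new bit definition; no function signature changes, which is the stated motivation for doing the conversion before adding a parameter.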
2020-05-09  selftests/bpf: Move existing common networking parts into network_helpers  Stanislav Fomichev
1. Move pkt_v4 and pkt_v6 into network_helpers and adjust the users. 2. Copy-paste spin_lock_thread into two tests that use it. Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Andrey Ignatov <rdna@fb.com> Link: https://lore.kernel.org/bpf/20200508174611.228805-3-sdf@google.com
2020-05-09  selftests/bpf: Generalize helpers to control background listener  Stanislav Fomichev
Move the following routines that let us start a background listener thread and connect to a server by fd to the test_prog:
* start_server - socket+bind+listen
* connect_to_fd - connect to the server identified by fd

These will be used in the next commit. Also, extend these helpers to support AF_INET6 and accept the family as an argument.

v5:
* drop pthread.h (Martin KaFai Lau)
* add SO_SNDTIMEO (Martin KaFai Lau)

v4:
* export extra helper to start server without a thread (Martin KaFai Lau)
* tcp_rtt is no longer starting background thread (Martin KaFai Lau)

v2:
* put helpers into network_helpers.c (Andrii Nakryiko)

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrey Ignatov <rdna@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200508174611.228805-2-sdf@google.com
2020-05-07  bpf, i386: Remove unneeded conversion to bool  Jason Yan
The '==' expression itself is bool, no need to convert it to bool again. This fixes the following coccicheck warning: arch/x86/net/bpf_jit_comp32.c:1478:50-55: WARNING: conversion to bool not needed here arch/x86/net/bpf_jit_comp32.c:1479:50-55: WARNING: conversion to bool not needed here Signed-off-by: Jason Yan <yanaijie@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200506140352.37154-1-yanaijie@huawei.com
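The kind of cleanup that coccicheck suggests here can be shown with a generic example. The function names are made up for this sketch, and the code is not the code from bpf_jit_comp32.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Redundant form: the comparison already yields 0 or 1. */
static bool same_reg_verbose(int a, int b)
{
	return (a == b) ? true : false;
}

/* Equivalent form after the cleanup. */
static bool same_reg(int a, int b)
{
	return a == b;
}

/* Both forms must agree for any pair of inputs. */
static void check_same_reg(int a, int b)
{
	assert(same_reg(a, b) == same_reg_verbose(a, b));
}
```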
2020-05-06  Merge tag 'perf-for-bpf-2020-05-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into bpf-next  Alexei Starovoitov
CAP_PERFMON for BPF
2020-05-06  Merge branch 'bpf-rv64-jit'  Daniel Borkmann
Luke Nelson says: ==================== This patch series introduces a set of optimizations to the BPF JIT on RV64. The optimizations are related to the verifier zero-extension optimization and BPF_JMP BPF_K. We tested the optimizations on a QEMU riscv64 virt machine, using lib/test_bpf and test_verifier, and formally verified their correctness using Serval. ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2020-05-06  bpf, riscv: Optimize BPF_JSET BPF_K using andi on RV64  Luke Nelson
This patch optimizes BPF_JSET BPF_K by using a RISC-V andi instruction when the BPF immediate fits in 12 bits, instead of first loading the immediate to a temporary register.

Examples of generated code with and without this optimization:

BPF_JMP_IMM(BPF_JSET, R1, 2, 1) without optimization:

  20: li t1,2
  24: and t1,a0,t1
  28: bnez t1,0x30

BPF_JMP_IMM(BPF_JSET, R1, 2, 1) with optimization:

  20: andi t1,a0,2
  24: bnez t1,0x2c

BPF_JMP32_IMM(BPF_JSET, R1, 2, 1) without optimization:

  20: li t1,2
  24: mv t2,a0
  28: slli t2,t2,0x20
  2c: srli t2,t2,0x20
  30: slli t1,t1,0x20
  34: srli t1,t1,0x20
  38: and t1,t2,t1
  3c: bnez t1,0x44

BPF_JMP32_IMM(BPF_JSET, R1, 2, 1) with optimization:

  20: andi t1,a0,2
  24: bnez t1,0x2c

In these examples, because the upper 32 bits of the sign-extended immediate are 0, BPF_JMP BPF_JSET and BPF_JMP32 BPF_JSET are equivalent and therefore the JIT produces identical code for them.

Co-developed-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Luke Nelson <luke.r.nels@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Björn Töpel <bjorn.topel@gmail.com>
Acked-by: Björn Töpel <bjorn.topel@gmail.com>
Link: https://lore.kernel.org/bpf/20200506000320.28965-5-luke.r.nels@gmail.com
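The "fits in 12 bits" condition can be made concrete with a short sketch. RISC-V I-type instructions such as andi carry a 12-bit sign-extended immediate, so the usable range is -2048 to 2047. The function and enum names below are hypothetical and do not claim to match the JIT source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if imm is representable as a 12-bit signed I-type immediate. */
static bool fits_imm12(int32_t imm)
{
	return imm >= -2048 && imm <= 2047;
}

enum jset_strategy {
	JSET_USE_ANDI,           /* andi tmp, rs, imm */
	JSET_LOAD_IMM_THEN_AND,  /* li tmp, imm; and tmp, rs, tmp */
};

/* Pick the code shape for "rs & imm" as the commit message describes it. */
static enum jset_strategy pick_jset_strategy(int32_t imm)
{
	return fits_imm12(imm) ? JSET_USE_ANDI : JSET_LOAD_IMM_THEN_AND;
}

/* The 12-bit field as it would be placed in the instruction word. */
static uint32_t encode_imm12(int32_t imm)
{
	assert(fits_imm12(imm));
	return (uint32_t)imm & 0xfffu;
}
```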
2020-05-06  bpf, riscv: Optimize BPF_JMP BPF_K when imm == 0 on RV64  Luke Nelson
This patch adds an optimization to BPF_JMP (32- and 64-bit) BPF_K for when the BPF immediate is zero. When the immediate is zero, the code can directly use the RISC-V zero register instead of loading a zero immediate to a temporary register first. Co-developed-by: Xi Wang <xi.wang@gmail.com> Signed-off-by: Xi Wang <xi.wang@gmail.com> Signed-off-by: Luke Nelson <luke.r.nels@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Björn Töpel <bjorn.topel@gmail.com> Acked-by: Björn Töpel <bjorn.topel@gmail.com> Link: https://lore.kernel.org/bpf/20200506000320.28965-4-luke.r.nels@gmail.com
2020-05-06  bpf, riscv: Optimize FROM_LE using verifier_zext on RV64  Luke Nelson
This patch adds two optimizations for BPF_ALU BPF_END BPF_FROM_LE in the RV64 BPF JIT. First, it enables the verifier zero-extension optimization to avoid zero extension when imm == 32. Second, it avoids generating code for imm == 64, since it is equivalent to a no-op. Co-developed-by: Xi Wang <xi.wang@gmail.com> Signed-off-by: Xi Wang <xi.wang@gmail.com> Signed-off-by: Luke Nelson <luke.r.nels@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Björn Töpel <bjorn.topel@gmail.com> Acked-by: Björn Töpel <bjorn.topel@gmail.com> Link: https://lore.kernel.org/bpf/20200506000320.28965-3-luke.r.nels@gmail.com
2020-05-06  bpf, riscv: Enable missing verifier_zext optimizations on RV64  Luke Nelson
Commit 66d0d5a854a6 ("riscv: bpf: eliminate zero extension code-gen") added support for the verifier zero-extension optimization on RV64 and commit 46dd3d7d287b ("bpf, riscv: Enable zext optimization for more RV64G ALU ops") enabled it for more instruction cases. However, BPF_LSH BPF_X and BPF_{LSH,RSH,ARSH} BPF_K are still missing the optimization. This patch enables the zero-extension optimization for these remaining cases. Co-developed-by: Xi Wang <xi.wang@gmail.com> Signed-off-by: Xi Wang <xi.wang@gmail.com> Signed-off-by: Luke Nelson <luke.r.nels@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Björn Töpel <bjorn.topel@gmail.com> Acked-by: Björn Töpel <bjorn.topel@gmail.com> Link: https://lore.kernel.org/bpf/20200506000320.28965-2-luke.r.nels@gmail.com
2020-05-05  sysctl: Fix unused function warning  Arnd Bergmann
The newly added bpf_stats_handler function has the wrong #ifdef check around it, leading to an unused-function warning when CONFIG_SYSCTL is disabled:

kernel/sysctl.c:205:12: error: unused function 'bpf_stats_handler' [-Werror,-Wunused-function]
static int bpf_stats_handler(struct ctl_table *table, int write,

Fix the check to match the reference.

Fixes: d46edd671a14 ("bpf: Sharing bpf runtime stats with BPF_ENABLE_STATS")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200505140734.503701-1-arnd@arndb.de
2020-05-04  xsk: Remove unnecessary member in xdp_umem  Magnus Karlsson
Remove the unnecessary member of address in struct xdp_umem as it is only used during the umem registration. No need to carry this around as it is not used during run-time nor when unregistering the umem. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Link: https://lore.kernel.org/bpf/1588599232-24897-3-git-send-email-magnus.karlsson@intel.com
2020-05-04  xsk: Change two variable names for increased clarity  Magnus Karlsson
Change two variables names so that it is clearer what they represent. The first one is xsk_list that in fact only contains the list of AF_XDP sockets with a Tx component. Change this to xsk_tx_list for improved clarity. The second variable is size in the ring structure. One might think that this is the size of the ring, but it is in fact the size of the umem, copied into the ring structure to improve performance. Rename this variable umem_size to avoid any confusion. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Link: https://lore.kernel.org/bpf/1588599232-24897-2-git-send-email-magnus.karlsson@intel.com
2020-05-04  bpf: Avoid gcc-10 stringop-overflow warning in struct bpf_prog  Arnd Bergmann
gcc-10 warns about accesses to zero-length arrays:

kernel/bpf/core.c: In function 'bpf_patch_insn_single':
cc1: warning: writing 8 bytes into a region of size 0 [-Wstringop-overflow=]
In file included from kernel/bpf/core.c:21:
include/linux/filter.h:550:20: note: at offset 0 to object 'insnsi' with size 0 declared here
  550 |   struct bpf_insn insnsi[0];
      |                   ^~~~~~

In this case, we really want to have two flexible-array members, but that is not possible. Removing the union to make insnsi a flexible-array member while leaving insns as a zero-length array fixes the warning, as nothing writes to the other one in that way.

This trick only works on linux-3.18 or higher, as older versions had additional members in the union.

Fixes: 60a3b2253c41 ("net: bpf: make eBPF interpreter images read-only")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200430213101.135134-6-arnd@arndb.de
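For readers unfamiliar with the distinction, a C99 flexible array member is declared with empty brackets and tells the compiler that the trailing storage is sized at allocation time. The sketch below uses made-up struct and function names; it is not struct bpf_prog.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct insn {
	uint64_t raw;  /* stand-in for an 8-byte instruction */
};

struct prog {
	uint32_t len;          /* number of instructions that follow */
	struct insn insnsi[];  /* flexible array member, not insnsi[0] */
};

/* Allocate a program with room for len instructions, zero-initialised. */
static struct prog *prog_alloc(uint32_t len)
{
	struct prog *p = calloc(1, sizeof(*p) +
				   (size_t)len * sizeof(p->insnsi[0]));

	if (p)
		p->len = len;
	return p;
}

/* Overwrite one instruction; the index must be within the allocation. */
static void prog_patch(struct prog *p, uint32_t idx, uint64_t raw)
{
	assert(idx < p->len);
	p->insnsi[idx].raw = raw;
}
```

A zero-length array (a GNU extension) has a declared size of 0, which is what lets the compiler conclude that an 8-byte write cannot fit; the flexible array member carries no such bound.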
2020-05-04  bpf, arm: Optimize ALU ARSH K using asr immediate instruction  Luke Nelson
This patch adds an optimization that uses the asr immediate instruction for BPF_ALU BPF_ARSH BPF_K, rather than loading the immediate to a temporary register. This is similar to existing code for handling BPF_ALU BPF_{LSH,RSH} BPF_K. This optimization saves two instructions and is more consistent with LSH and RSH.

Example of the code generated for BPF_ALU32_IMM(BPF_ARSH, BPF_REG_0, 5) before the optimization:

  2c: mov r8, #5
  30: mov r9, #0
  34: asr r0, r0, r8

and after optimization:

  2c: asr r0, r0, #5

Tested on QEMU using lib/test_bpf and test_verifier.

Co-developed-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Luke Nelson <luke.r.nels@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200501020210.32294-3-luke.r.nels@gmail.com
2020-05-04  bpf, arm: Optimize ALU64 ARSH X using orrpl conditional instruction  Luke Nelson
This patch optimizes the code generated by emit_a32_arsh_r64, which handles the BPF_ALU64 BPF_ARSH BPF_X instruction.

The original code uses a conditional B followed by an unconditional ORR. The optimization saves one instruction by removing the B instruction and using a conditional ORR (with an inverted condition).

Example of the code generated for BPF_ALU64_REG(BPF_ARSH, BPF_REG_0, BPF_REG_1), before optimization:

  34: rsb ip, r2, #32
  38: subs r9, r2, #32
  3c: lsr lr, r0, r2
  40: orr lr, lr, r1, lsl ip
  44: bmi 0x4c
  48: orr lr, lr, r1, asr r9
  4c: asr ip, r1, r2
  50: mov r0, lr
  54: mov r1, ip

and after optimization:

  34: rsb ip, r2, #32
  38: subs r9, r2, #32
  3c: lsr lr, r0, r2
  40: orr lr, lr, r1, lsl ip
  44: orrpl lr, lr, r1, asr r9
  48: asr ip, r1, r2
  4c: mov r0, lr
  50: mov r1, ip

Tested on QEMU using lib/test_bpf and test_verifier.

Co-developed-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Luke Nelson <luke.r.nels@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200501020210.32294-2-luke.r.nels@gmail.com
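To see why the conditional ORR is sufficient, the optimized sequence can be modelled in portable C. This sketch assumes the usual ARM behaviour for register-specified shifts (the low byte of the amount register is used; amounts of 32 or more give 0 for LSL/LSR and a sign fill for ASR) and assumes shift amounts below 64. The helper names are invented for the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Model of ARM LSL by register: amounts >= 32 produce 0. */
static uint32_t lsl_reg(uint32_t x, uint32_t amt)
{
	amt &= 0xff;
	return amt >= 32 ? 0 : x << amt;
}

/* Model of ARM LSR by register: amounts >= 32 produce 0. */
static uint32_t lsr_reg(uint32_t x, uint32_t amt)
{
	amt &= 0xff;
	return amt >= 32 ? 0 : x >> amt;
}

/* Model of ARM ASR by register: amounts >= 32 produce a sign fill. */
static uint32_t asr_reg(uint32_t x, uint32_t amt)
{
	uint32_t sign_fill = (x & 0x80000000u) ? 0xffffffffu : 0;

	amt &= 0xff;
	if (amt >= 32)
		return sign_fill;
	if (amt == 0)
		return x;
	return (x >> amt) | (sign_fill & ~(0xffffffffu >> amt));
}

/* 64-bit arithmetic shift right on two 32-bit halves (r0 = lo, r1 = hi). */
static uint64_t arsh64_halves(uint64_t v, uint32_t n)
{
	uint32_t lo = (uint32_t)v, hi = (uint32_t)(v >> 32);
	uint32_t ip, r9, lr;

	assert(n < 64);
	ip = 32u - n;                   /* rsb ip, r2, #32 */
	r9 = n - 32u;                   /* subs r9, r2, #32 */
	lr = lsr_reg(lo, n);            /* lsr lr, r0, r2 */
	lr |= lsl_reg(hi, ip);          /* orr lr, lr, r1, lsl ip */
	if (!(r9 & 0x80000000u))        /* "pl": n - 32 was not negative */
		lr |= asr_reg(hi, r9);  /* orrpl lr, lr, r1, asr r9 */
	ip = asr_reg(hi, n);            /* asr ip, r1, r2 */
	return ((uint64_t)ip << 32) | lr;
}

/* Straightforward reference, written without relying on signed shifts. */
static uint64_t arsh64_ref(uint64_t v, uint32_t n)
{
	uint64_t r = v >> n;

	if ((v >> 63) && n)
		r |= ~(UINT64_MAX >> n);
	return r;
}
```

For n below 32 the low word comes from the first two terms and the conditional ORR is skipped; for n of 32 or more those two terms are zero (or equal to the third at n = 32) and the conditional ORR supplies the low word.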
2020-05-03  Merge branch 'net-smc-add-and-delete-link-processing'  David S. Miller
Karsten Graul says:

====================
net/smc: add and delete link processing

These patches add the 'add link' and 'delete link' processing as SMC server and client. This processing allows links of a link group to be established and removed dynamically.

v2: Fix mess up with unused static functions. Merge patch 8 into patch 4. Postpone patch 13 to next series.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-03  net/smc: enqueue local LLC messages  Karsten Graul
As SMC server, when a second link was deleted, trigger the setup of an asymmetric link. Do this by enqueueing a local ADD_LINK message which is processed by the LLC layer as if it were received from peer. Do the same when a new IB port became active and a new link could be created. smc_llc_srv_add_link_local() enqueues a local ADD_LINK message. And smc_llc_srv_delete_link_local() is used the same way to enqueue a local DELETE_LINK message. This is used when an IB port is no longer active. Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Reviewed-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-03  net/smc: delete link processing as SMC server  Karsten Graul
Add smc_llc_process_srv_delete_link() to process a DELETE_LINK request as SMC server. When the request is to delete ALL links then terminate the whole link group. If not, find the link to delete by its link_id, send the DELETE_LINK request LLC message and wait for the response. No matter if a response was received, clear the deleted link and update the link group state. Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Reviewed-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
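The decision logic described above (delete ALL terminates the group, otherwise clear the link with the matching link_id) can be sketched with a toy data model. Everything below is hypothetical: the struct layout, the names, and the three-slot link array are inventions of the sketch, and the LLC message exchange itself is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_LINKS 3  /* arbitrary size for this sketch */

struct sketch_link {
	int link_id;
	bool active;
};

struct sketch_link_group {
	struct sketch_link links[SKETCH_MAX_LINKS];
	bool terminated;
};

/* Count the links that are still usable; this drives the group state. */
static size_t active_links(const struct sketch_link_group *g)
{
	size_t n = 0;

	for (size_t i = 0; i < SKETCH_MAX_LINKS; i++)
		n += g->links[i].active ? 1 : 0;
	return n;
}

/*
 * Handle a delete request. delete_all terminates the whole group;
 * otherwise only the link with the matching id is cleared.
 * Returns true if the request changed anything.
 */
static bool process_delete_link(struct sketch_link_group *g,
				bool delete_all, int link_id)
{
	assert(g != NULL);
	if (delete_all) {
		for (size_t i = 0; i < SKETCH_MAX_LINKS; i++)
			g->links[i].active = false;
		g->terminated = true;
		return true;
	}
	for (size_t i = 0; i < SKETCH_MAX_LINKS; i++) {
		if (g->links[i].active && g->links[i].link_id == link_id) {
			g->links[i].active = false;
			return true;
		}
	}
	return false;
}
```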
2020-05-03  net/smc: delete link processing as SMC client  Karsten Graul
Add smc_llc_process_cli_delete_link() to process a DELETE_LINK request as SMC client. When the request is to delete ALL links then terminate the whole link group. If not, find the link to delete by its link_id, send the DELETE_LINK response LLC message and then clear the deleted link. Finally determine and update the link group state. Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Reviewed-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-03  net/smc: llc_del_link_work and use the LLC flow for delete link  Karsten Graul
Introduce a work that is scheduled when a new DELETE_LINK LLC request is received. The work will call either the SMC client or SMC server DELETE_LINK processing. And use the LLC flow framework to process incoming DELETE_LINK LLC messages, scheduling the llc_del_link_work for those events. With these changes smc_lgr_forget() is only called by one function and can be migrated into smc_lgr_cleanup_early(). Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Reviewed-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-03  net/smc: delete an asymmetric link as SMC server  Karsten Graul
When a link group moved from asymmetric to symmetric state then the dangling asymmetric link can be deleted. Add smc_llc_find_asym_link() to find the respective link and add smc_llc_delete_asym_link() to delete it. Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Reviewed-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-03  net/smc: final part of add link processing as SMC server  Karsten Graul
This patch finalizes the ADD_LINK processing of new links. Send the CONFIRM_LINK request to the peer, receive the response and set link state to ACTIVE. Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Reviewed-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-03  net/smc: rkey processing for a new link as SMC server  Karsten Graul
Part of SMC server new link establishment is the exchange of rkeys for used buffers. Loop over all used RMB buffers and send ADD_LINK_CONTINUE LLC messages to the peer. Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Reviewed-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-03  net/smc: first part of add link processing as SMC server  Karsten Graul
First set of functions to process an ADD_LINK LLC request as an SMC server. Find an alternate IB device, determine the new link group type and get the index for the new link. Then initialize the link and send the ADD_LINK LLC message to the peer. Save the contents of the response, ready the link, map all used buffers and register the buffers with the IB device. If any error occurs, stop the processing and clear the link. And call smc_llc_srv_add_link() in af_smc.c to start second link establishment after the initial link of a link group was created. Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Reviewed-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-03  net/smc: final part of add link processing as SMC client  Karsten Graul
This patch finalizes the ADD_LINK processing of new links. Receive the CONFIRM_LINK request from peer, complete the link initialization, register all used buffers with the IB device and finally send the CONFIRM_LINK response, which completes the ADD_LINK processing. And activate smc_llc_cli_add_link() in af_smc.c. Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Reviewed-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-03  net/smc: rkey processing for a new link as SMC client  Karsten Graul
Part of the SMC client new link establishment process is the exchange of rkeys for all used buffers. Add new LLC message type ADD_LINK_CONTINUE which is used to exchange rkeys of all current RMB buffers. Add functions to iterate over all used RMB buffers of the link group, and implement the ADD_LINK_CONTINUE processing. Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Reviewed-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-03  net/smc: first part of add link processing as SMC client  Karsten Graul
First set of functions to process an ADD_LINK LLC request as an SMC client. Find an alternate IB device, determine the new link group type and get the index for the new link. Then ready the link, map the buffers and send an ADD_LINK LLC response. If any error occurs, send a reject LLC message and terminate the processing. Add smc_llc_alloc_alt_link() to find a free link index for a new link, depending on the new link group type. Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Reviewed-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-03  Merge branch 'Enhance-current-features-in-ena-driver'  David S. Miller
Sameeh Jubran says:

====================
Enhance current features in ena driver

Difference from v2:
* dropped patch "net: ena: move llq configuration from ena_probe to ena_device_init()"
* reworked patch "net: ena: implement ena_com_get_admin_polling_mode()" to drop the prototype

Difference from v1:
* reordered patches #01 and #02.
* dropped adding Rx/Tx drops to ethtool in patch #08

V1:
This patchset introduces the following:
* minor changes to RSS feature
* add total rx and tx drop counter
* add unmask_interrupt counter for ethtool statistics
* add missing implementation for ena_com_get_admin_polling_mode()
* some minor code clean-up and cosmetics
* use SHUTDOWN as reset reason when closing interface
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-03  net: ena: cosmetic: extract code to ena_indirection_table_set()  Arthur Kiyanovski
Extract code to ena_indirection_table_set() to make the code cleaner. Signed-off-by: Sameeh Jubran <sameehj@amazon.com> Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-03  net: ena: cosmetic: remove unnecessary spaces and tabs in ena_com.h macros  Sameeh Jubran
The macros in ena_com.h have inconsistent spaces between the macro name and its value. This commit sets all the macros to have a single space between the name and value.

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-03  net: ena: use SHUTDOWN as reset reason when closing interface  Sameeh Jubran
The 'ENA_REGS_RESET_SHUTDOWN' enum indicates a normal driver shutdown / removal procedure. Also, a comment is added to one of the reset reason assignments for code clarity. Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: Sameeh Jubran <sameehj@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-03  net: ena: drop superfluous prototype  Arthur Kiyanovski
Before this commit there was a function prototype named ena_com_get_ena_admin_polling_mode() that was never implemented. This patch simply deletes it. Signed-off-by: Igor Chauskin <igorch@amazon.com> Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-03  net: ena: add support for reporting of packet drops  Sameeh Jubran
1. Add support for getting tx drops from the device and saving them in the driver. 2. Report tx via netdev stats. Signed-off-by: Igor Chauskin <igorch@amazon.com> Signed-off-by: Guy Tzalik <gtzalik@amazon.com> Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: Sameeh Jubran <sameehj@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-03  net: ena: add unmask interrupts statistics to ethtool  Sameeh Jubran
Add unmask interrupts statistics to ethtool. Signed-off-by: Netanel Belgazal <netanel@amazon.com> Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: Sameeh Jubran <sameehj@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-03  net: ena: remove code that does nothing  Sameeh Jubran
Both key and func parameters are pointers on the stack. Setting them to NULL does nothing. The original intent was to leave the key and func unset in this case, but for this to happen nothing needs to be done as the calling function ethtool_get_rxfh() already clears key and func. This commit removes the above described useless code. Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: Sameeh Jubran <sameehj@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
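The underlying C rule is that arguments are passed by value, so assigning to a pointer parameter only changes the callee's local copy. A minimal sketch, with function names made up for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Assigning to the parameters only changes the callee's local copies. */
static void clear_locals_only(unsigned char *key, unsigned char *func)
{
	key = NULL;
	func = NULL;
	(void)key;
	(void)func;
}

/* To change what the caller holds, a pointer to the pointer is needed. */
static void clear_callers_pointer(unsigned char **key)
{
	assert(key != NULL);
	*key = NULL;
}
```

After clear_locals_only() returns, the caller's pointers are exactly what they were before the call, which is why code of that shape can be removed without any change in behaviour.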
2020-05-03  net: ena: changes to RSS hash key allocation  Sameeh Jubran
This commit contains 2 cosmetic changes: 1. Use ena_com_check_supported_feature_id() in ena_com_hash_key_fill_default_key() instead of rewriting its implementation. This also saves us a superfluous admin command by using the cached value. 2. Change if conditions in ena_com_rss_init() to be clearer. Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: Sameeh Jubran <sameehj@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-03  net: ena: change default RSS hash function to Toeplitz  Arthur Kiyanovski
Currently in the driver we are setting the hash function to be CRC32. Starting with this commit we want to change the default behaviour so that we set the hash function to be Toeplitz instead. Signed-off-by: Sameeh Jubran <sameehj@amazon.com> Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
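For context, a Toeplitz hash XORs together 32-bit windows of a secret key, one window per set bit of the input. The following is a generic software sketch of that technique, not the ENA driver or device implementation; the function name and the requirement that the key be at least four bytes longer than the input are choices of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Toeplitz hash: for each set bit of the input (most significant bit
 * first), XOR in the 32-bit window of the key starting at that bit
 * position. The key must be at least len + 4 bytes long.
 */
static uint32_t toeplitz_hash(const uint8_t *key, size_t key_len,
			      const uint8_t *data, size_t len)
{
	uint32_t window, hash = 0;

	assert(key_len >= len + 4);
	window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
		 ((uint32_t)key[2] << 8) | key[3];

	for (size_t i = 0; i < len; i++) {
		for (int bit = 7; bit >= 0; bit--) {
			if (data[i] & (1u << bit))
				hash ^= window;
			/* Slide the window one bit, pulling in the next key bit. */
			window = (window << 1) | ((key[i + 4] >> bit) & 1u);
		}
	}
	return hash;
}
```

Because the result is an XOR of key windows selected by input bits, the hash is linear: the hash of a XOR b equals the hash of a XOR the hash of b for inputs of equal length.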