summaryrefslogtreecommitdiff
path: root/tools/lib/bpf
AgeCommit message (Collapse)Author
2023-12-06libbpf: add BPF token support to bpf_map_create() APIAndrii Nakryiko
Add ability to provide token_fd for BPF_MAP_CREATE command through bpf_map_create() API. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231130185229.2688956-14-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-06libbpf: add bpf_token_create() APIAndrii Nakryiko
Add low-level wrapper API for BPF_TOKEN_CREATE command in bpf() syscall. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231130185229.2688956-13-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-28libbpf: Add st_type argument to elf_resolve_syms_offsets functionJiri Olsa
We need to get offsets for static variables in following changes, so making elf_resolve_syms_offsets to take st_type value as argument and passing it to elf_sym_iter_new. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/bpf/20231125193130.834322-2-jolsa@kernel.org
2023-11-23libbpf: Start v1.4 development cycleEduard Zingerman
Bump libbpf.map to v1.4.0 to start a new libbpf version cycle. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20231123000439.12025-1-eddyz87@gmail.com
2023-11-09libbpf: Fix potential uninitialized tail padding with LIBBPF_OPTS_RESETYonghong Song
Martin reported that there is a libbpf complaining of non-zero-value tail padding with LIBBPF_OPTS_RESET macro if struct bpf_netkit_opts is modified to have a 4-byte tail padding. This only happens to clang compiler. The commend line is: ./test_progs -t tc_netkit_multi_links Martin and I did some investigation and found this indeed the case and the following are the investigation details. Clang: clang version 18.0.0 <I tried clang15/16/17 and they all have similar results> tools/lib/bpf/libbpf_common.h: #define LIBBPF_OPTS_RESET(NAME, ...) \ do { \ memset(&NAME, 0, sizeof(NAME)); \ NAME = (typeof(NAME)) { \ .sz = sizeof(NAME), \ __VA_ARGS__ \ }; \ } while (0) #endif tools/lib/bpf/libbpf.h: struct bpf_netkit_opts { /* size of this struct, for forward/backward compatibility */ size_t sz; __u32 flags; __u32 relative_fd; __u32 relative_id; __u64 expected_revision; size_t :0; }; #define bpf_netkit_opts__last_field expected_revision In the above struct bpf_netkit_opts, there is no tail padding. prog_tests/tc_netkit.c: static void serial_test_tc_netkit_multi_links_target(int mode, int target) { ... LIBBPF_OPTS(bpf_netkit_opts, optl); ... LIBBPF_OPTS_RESET(optl, .flags = BPF_F_BEFORE, .relative_fd = bpf_program__fd(skel->progs.tc1), ); ... } Let us make the following source change, note that we have a 4-byte tailing padding now. diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h index 6cd9c501624f..0dd83910ae9a 100644 --- a/tools/lib/bpf/libbpf.h +++ b/tools/lib/bpf/libbpf.h @@ -803,13 +803,13 @@ bpf_program__attach_tcx(const struct bpf_program *prog, int ifindex, struct bpf_netkit_opts { /* size of this struct, for forward/backward compatibility */ size_t sz; - __u32 flags; __u32 relative_fd; __u32 relative_id; __u64 expected_revision; + __u32 flags; size_t :0; }; -#define bpf_netkit_opts__last_field expected_revision +#define bpf_netkit_opts__last_field flags The clang 18 generated asm code looks like below: ; LIBBPF_OPTS_RESET(optl, 55e3: 48 8d 7d 98 leaq -0x68(%rbp), %rdi 55e7: 31 f6 xorl %esi, %esi 55e9: ba 20 00 00 00 movl $0x20, %edx 55ee: e8 00 00 00 00 callq 0x55f3 <serial_test_tc_netkit_multi_links_target+0x18d3> 55f3: 48 c7 85 10 fd ff ff 20 00 00 00 movq $0x20, -0x2f0(%rbp) 55fe: 48 8b 85 68 ff ff ff movq -0x98(%rbp), %rax 5605: 48 8b 78 18 movq 0x18(%rax), %rdi 5609: e8 00 00 00 00 callq 0x560e <serial_test_tc_netkit_multi_links_target+0x18ee> 560e: 89 85 18 fd ff ff movl %eax, -0x2e8(%rbp) 5614: c7 85 1c fd ff ff 00 00 00 00 movl $0x0, -0x2e4(%rbp) 561e: 48 c7 85 20 fd ff ff 00 00 00 00 movq $0x0, -0x2e0(%rbp) 5629: c7 85 28 fd ff ff 08 00 00 00 movl $0x8, -0x2d8(%rbp) 5633: 48 8b 85 10 fd ff ff movq -0x2f0(%rbp), %rax 563a: 48 89 45 98 movq %rax, -0x68(%rbp) 563e: 48 8b 85 18 fd ff ff movq -0x2e8(%rbp), %rax 5645: 48 89 45 a0 movq %rax, -0x60(%rbp) 5649: 48 8b 85 20 fd ff ff movq -0x2e0(%rbp), %rax 5650: 48 89 45 a8 movq %rax, -0x58(%rbp) 5654: 48 8b 85 28 fd ff ff movq -0x2d8(%rbp), %rax 565b: 48 89 45 b0 movq %rax, -0x50(%rbp) ; link = bpf_program__attach_netkit(skel->progs.tc2, ifindex, &optl); At -O0 level, the clang compiler creates an intermediate copy. We have below to store 'flags' with 4-byte store and leave another 4 byte in the same 8-byte-aligned storage undefined, 5629: c7 85 28 fd ff ff 08 00 00 00 movl $0x8, -0x2d8(%rbp) and later we store 8-byte to the original zero'ed buffer 5654: 48 8b 85 28 fd ff ff movq -0x2d8(%rbp), %rax 565b: 48 89 45 b0 movq %rax, -0x50(%rbp) This caused a problem as the 4-byte value at [%rbp-0x2dc, %rbp-0x2e0) may be garbage. gcc (gcc 11.4) does not have this issue as it does zeroing struct first before doing assignments: ; LIBBPF_OPTS_RESET(optl, 50fd: 48 8d 85 40 fc ff ff leaq -0x3c0(%rbp), %rax 5104: ba 20 00 00 00 movl $0x20, %edx 5109: be 00 00 00 00 movl $0x0, %esi 510e: 48 89 c7 movq %rax, %rdi 5111: e8 00 00 00 00 callq 0x5116 <serial_test_tc_netkit_multi_links_target+0x1522> 5116: 48 8b 45 f0 movq -0x10(%rbp), %rax 511a: 48 8b 40 18 movq 0x18(%rax), %rax 511e: 48 89 c7 movq %rax, %rdi 5121: e8 00 00 00 00 callq 0x5126 <serial_test_tc_netkit_multi_links_target+0x1532> 5126: 48 c7 85 40 fc ff ff 00 00 00 00 movq $0x0, -0x3c0(%rbp) 5131: 48 c7 85 48 fc ff ff 00 00 00 00 movq $0x0, -0x3b8(%rbp) 513c: 48 c7 85 50 fc ff ff 00 00 00 00 movq $0x0, -0x3b0(%rbp) 5147: 48 c7 85 58 fc ff ff 00 00 00 00 movq $0x0, -0x3a8(%rbp) 5152: 48 c7 85 40 fc ff ff 20 00 00 00 movq $0x20, -0x3c0(%rbp) 515d: 89 85 48 fc ff ff movl %eax, -0x3b8(%rbp) 5163: c7 85 58 fc ff ff 08 00 00 00 movl $0x8, -0x3a8(%rbp) ; link = bpf_program__attach_netkit(skel->progs.tc2, ifindex, &optl); It is not clear how to resolve the compiler code generation as the compiler generates correct code w.r.t. how to handle unnamed padding in C standard. So this patch changed LIBBPF_OPTS_RESET macro to avoid uninitialized tail padding. We already knows LIBBPF_OPTS macro works on both gcc and clang, even with tail padding. So LIBBPF_OPTS_RESET is changed to be a LIBBPF_OPTS followed by a memcpy(), thus avoiding uninitialized tail padding. The below is asm code generated with this patch and with clang compiler: ; LIBBPF_OPTS_RESET(optl, 55e3: 48 8d bd 10 fd ff ff leaq -0x2f0(%rbp), %rdi 55ea: 31 f6 xorl %esi, %esi 55ec: ba 20 00 00 00 movl $0x20, %edx 55f1: e8 00 00 00 00 callq 0x55f6 <serial_test_tc_netkit_multi_links_target+0x18d6> 55f6: 48 c7 85 10 fd ff ff 20 00 00 00 movq $0x20, -0x2f0(%rbp) 5601: 48 8b 85 68 ff ff ff movq -0x98(%rbp), %rax 5608: 48 8b 78 18 movq 0x18(%rax), %rdi 560c: e8 00 00 00 00 callq 0x5611 <serial_test_tc_netkit_multi_links_target+0x18f1> 5611: 89 85 18 fd ff ff movl %eax, -0x2e8(%rbp) 5617: c7 85 1c fd ff ff 00 00 00 00 movl $0x0, -0x2e4(%rbp) 5621: 48 c7 85 20 fd ff ff 00 00 00 00 movq $0x0, -0x2e0(%rbp) 562c: c7 85 28 fd ff ff 08 00 00 00 movl $0x8, -0x2d8(%rbp) 5636: 48 8b 85 10 fd ff ff movq -0x2f0(%rbp), %rax 563d: 48 89 45 98 movq %rax, -0x68(%rbp) 5641: 48 8b 85 18 fd ff ff movq -0x2e8(%rbp), %rax 5648: 48 89 45 a0 movq %rax, -0x60(%rbp) 564c: 48 8b 85 20 fd ff ff movq -0x2e0(%rbp), %rax 5653: 48 89 45 a8 movq %rax, -0x58(%rbp) 5657: 48 8b 85 28 fd ff ff movq -0x2d8(%rbp), %rax 565e: 48 89 45 b0 movq %rax, -0x50(%rbp) ; link = bpf_program__attach_netkit(skel->progs.tc2, ifindex, &optl); In the above code, a temporary buffer is zeroed and then has proper value assigned. Finally, values in temporary buffer are copied to the original variable buffer, hence tail padding is guaranteed to be 0. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/bpf/20231107201511.2548645-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-10-24libbpf: Add link-based API for netkitDaniel Borkmann
This adds bpf_program__attach_netkit() API to libbpf. Overall it is very similar to tcx. The API looks as following: LIBBPF_API struct bpf_link * bpf_program__attach_netkit(const struct bpf_program *prog, int ifindex, const struct bpf_netkit_opts *opts); The struct bpf_netkit_opts is done in similar way as struct bpf_tcx_opts for supporting bpf_mprog control parameters. The attach location for the primary and peer device is derived from the program section "netkit/primary" and "netkit/peer", respectively. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20231024214904.29825-4-daniel@iogearbox.net Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-10-17libbpf: Don't assume SHT_GNU_verdef presence for SHT_GNU_versym sectionAndrii Nakryiko
Fix too eager assumption that SHT_GNU_verdef ELF section is going to be present whenever binary has SHT_GNU_versym section. It seems like either SHT_GNU_verdef or SHT_GNU_verneed can be used, so failing on missing SHT_GNU_verdef actually breaks use cases in production. One specific reported issue, which was used to manually test this fix, was trying to attach to `readline` function in BASH binary. Fixes: bb7fa09399b9 ("libbpf: Support symbol versioning for uprobe") Reported-by: Liam Wisehart <liamwisehart@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Manu Bretelle <chantr4@gmail.com> Reviewed-by: Fangrui Song <maskray@google.com> Acked-by: Hengqi Chen <hengqi.chen@gmail.com> Link: https://lore.kernel.org/bpf/20231016182840.4033346-1-andrii@kernel.org
2023-10-11libbpf: Add support for cgroup unix socket address hooksDaan De Meyer
Add the necessary plumbing to hook up the new cgroup unix sockaddr hooks into libbpf. Signed-off-by: Daan De Meyer <daan.j.demeyer@gmail.com> Link: https://lore.kernel.org/r/20231011185113.140426-6-daan.j.demeyer@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-10-04libbpf: Fix syscall access arguments on riscvAlexandre Ghiti
Since commit 08d0ce30e0e4 ("riscv: Implement syscall wrappers"), riscv selects ARCH_HAS_SYSCALL_WRAPPER so let's use the generic implementation of PT_REGS_SYSCALL_REGS(). Fixes: 08d0ce30e0e4 ("riscv: Implement syscall wrappers") Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Link: https://lore.kernel.org/bpf/20231004110905.49024-2-bjorn@kernel.org
2023-09-29libbpf: Allow Golang symbols in uprobe secdefHengqi Chen
Golang symbols in ELF files are different from C/C++ which contains special characters like '*', '(' and ')'. With generics, things get more complicated, there are symbols like: github.com/cilium/ebpf/internal.(*Deque[go.shape.interface { Format(fmt.State, int32); TypeName() string;github.com/cilium/ebpf/btf.copy() github.com/cilium/ebpf/btf.Type}]).Grow Matching such symbols using `%m[^\n]` in sscanf, this excludes newline which typically does not appear in ELF symbols. This should work in most use-cases and also work for unicode letters in identifiers. If newline do show up in ELF symbols, users can still attach to such symbol by specifying bpf_uprobe_opts::func_name. A working example can be found at this repo ([0]). [0]: https://github.com/chenhengqi/libbpf-go-symbols Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230929155954.92448-1-hengqi.chen@gmail.com
2023-09-25libbpf: Add ring__consumeMartin Kelly
Add ring__consume to consume a single ringbuffer, analogous to ring_buffer__consume. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-14-martin.kelly@crowdstrike.com
2023-09-25libbpf: Add ring__map_fdMartin Kelly
Add ring__map_fd to get the file descriptor underlying a given ringbuffer. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-12-martin.kelly@crowdstrike.com
2023-09-25libbpf: Add ring__sizeMartin Kelly
Add ring__size to get the total size of a given ringbuffer. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-10-martin.kelly@crowdstrike.com
2023-09-25libbpf: Add ring__avail_data_sizeMartin Kelly
Add ring__avail_data_size for querying the currently available data in the ringbuffer, similar to the BPF_RB_AVAIL_DATA flag in bpf_ringbuf_query. This is racy during ongoing operations but is still useful for overall information on how a ringbuffer is behaving. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-8-martin.kelly@crowdstrike.com
2023-09-25libbpf: Add ring__producer_pos, ring__consumer_posMartin Kelly
Add APIs to get the producer and consumer position for a given ringbuffer. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-6-martin.kelly@crowdstrike.com
2023-09-25libbpf: Add ring_buffer__ringMartin Kelly
Add a new function ring_buffer__ring, which exposes struct ring * to the user, representing a single ringbuffer. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-4-martin.kelly@crowdstrike.com
2023-09-25libbpf: Switch rings to array of pointersMartin Kelly
Switch rb->rings to be an array of pointers instead of a contiguous block. This allows for each ring pointer to be stable after ring_buffer__add is called, which allows us to expose struct ring * to the user without gotchas. Without this change, the realloc in ring_buffer__add could invalidate a struct ring *, making it unsafe to give to the user. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-3-martin.kelly@crowdstrike.com
2023-09-25libbpf: Refactor cleanup in ring_buffer__addMartin Kelly
Refactor the cleanup code in ring_buffer__add to use a unified err_out label. This reduces code duplication, as well as plugging a potential leak if mmap_sz != (__u64)(size_t)mmap_sz (currently this would miss unmapping tmp because ringbuf_unmap_ring isn't called). Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-2-martin.kelly@crowdstrike.com
2023-09-22libbpf: Support symbol versioning for uprobeHengqi Chen
In current implementation, we assume that symbol found in .dynsym section would have a version suffix and use it to compare with symbol user supplied. According to the spec ([0]), this assumption is incorrect, the version info of dynamic symbols are stored in .gnu.version and .gnu.version_d sections of ELF objects. For example: $ nm -D /lib/x86_64-linux-gnu/libc.so.6 | grep rwlock_wrlock 000000000009b1a0 T __pthread_rwlock_wrlock@GLIBC_2.2.5 000000000009b1a0 T pthread_rwlock_wrlock@@GLIBC_2.34 000000000009b1a0 T pthread_rwlock_wrlock@GLIBC_2.2.5 $ readelf -W --dyn-syms /lib/x86_64-linux-gnu/libc.so.6 | grep rwlock_wrlock 706: 000000000009b1a0 878 FUNC GLOBAL DEFAULT 15 __pthread_rwlock_wrlock@GLIBC_2.2.5 2568: 000000000009b1a0 878 FUNC GLOBAL DEFAULT 15 pthread_rwlock_wrlock@@GLIBC_2.34 2571: 000000000009b1a0 878 FUNC GLOBAL DEFAULT 15 pthread_rwlock_wrlock@GLIBC_2.2.5 In this case, specify pthread_rwlock_wrlock@@GLIBC_2.34 or pthread_rwlock_wrlock@GLIBC_2.2.5 in bpf_uprobe_opts::func_name won't work. Because the qualified name does NOT match `pthread_rwlock_wrlock` (without version suffix) in .dynsym sections. This commit implements the symbol versioning for dynsym and allows user to specify symbol in the following forms: - func - func@LIB_VERSION - func@@LIB_VERSION In case of symbol conflicts, error out and users should resolve it by specifying a qualified name. [0]: https://refspecs.linuxfoundation.org/LSB_5.0.0/LSB-Core-generic/LSB-Core-generic/symversion.html Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20230918024813.237475-3-hengqi.chen@gmail.com
2023-09-22libbpf: Resolve symbol conflicts at the same offset for uprobeHengqi Chen
Dynamic symbols in shared library may have the same name, for example: $ nm -D /lib/x86_64-linux-gnu/libc.so.6 | grep rwlock_wrlock 000000000009b1a0 T __pthread_rwlock_wrlock@GLIBC_2.2.5 000000000009b1a0 T pthread_rwlock_wrlock@@GLIBC_2.34 000000000009b1a0 T pthread_rwlock_wrlock@GLIBC_2.2.5 $ readelf -W --dyn-syms /lib/x86_64-linux-gnu/libc.so.6 | grep rwlock_wrlock 706: 000000000009b1a0 878 FUNC GLOBAL DEFAULT 15 __pthread_rwlock_wrlock@GLIBC_2.2.5 2568: 000000000009b1a0 878 FUNC GLOBAL DEFAULT 15 pthread_rwlock_wrlock@@GLIBC_2.34 2571: 000000000009b1a0 878 FUNC GLOBAL DEFAULT 15 pthread_rwlock_wrlock@GLIBC_2.2.5 Currently, users can't attach a uprobe to pthread_rwlock_wrlock because there are two symbols named pthread_rwlock_wrlock and both are global bind. And libbpf considers it as a conflict. Since both of them are at the same offset we could accept one of them harmlessly. Note that we already does this in elf_resolve_syms_offsets. Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20230918024813.237475-2-hengqi.chen@gmail.com
2023-09-16libbpf: Add support for custom exception callbacksKumar Kartikeya Dwivedi
Add support to libbpf to append exception callbacks when loading a program. The exception callback is found by discovering the declaration tag 'exception_callback:<value>' and finding the callback in the value of the tag. The process is done in two steps. First, for each main program, the bpf_object__sanitize_and_load_btf function finds and marks its corresponding exception callback as defined by the declaration tag on it. Second, bpf_object__reloc_code is modified to append the indicated exception callback at the end of the instruction iteration (since exception callback will never be appended in that loop, as it is not directly referenced). Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20230912233214.1518551-16-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-09-16libbpf: Refactor bpf_object__reloc_codeKumar Kartikeya Dwivedi
Refactor bpf_object__append_subprog_code out of bpf_object__reloc_code to be able to reuse it to append subprog related code for the exception callback to the main program. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20230912233214.1518551-15-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-09-08libbpf: Add __percpu_kptr macro definitionYonghong Song
Add __percpu_kptr macro definition in bpf_helpers.h. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230827152800.1998492-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-09-08libbpf: Add basic BTF sanity validationAndrii Nakryiko
Implement a simple and straightforward BTF sanity check when parsing BTF data. Right now it's very basic and just validates that all the string offsets and type IDs are within valid range. For FUNC we also check that it points to FUNC_PROTO kinds. Even with such simple checks it fixes a bunch of crashes found by OSS fuzzer ([0]-[5]) and will allow fuzzer to make further progress. Some other invariants will be checked in follow up patches (like ensuring there is no infinite type loops), but this seems like a good start already. Adding FUNC -> FUNC_PROTO check revealed that one of selftests has a problem with FUNC pointing to VAR instead, so fix it up in the same commit. [0] https://github.com/libbpf/libbpf/issues/482 [1] https://github.com/libbpf/libbpf/issues/483 [2] https://github.com/libbpf/libbpf/issues/485 [3] https://github.com/libbpf/libbpf/issues/613 [4] https://github.com/libbpf/libbpf/issues/618 [5] https://github.com/libbpf/libbpf/issues/619 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Reviewed-by: Song Liu <song@kernel.org> Closes: https://github.com/libbpf/libbpf/issues/617 Link: https://lore.kernel.org/bpf/20230825202152.1813394-1-andrii@kernel.org
2023-08-23libbpf: fix signedness determination in CO-RE relo handling logicAndrii Nakryiko
Extracting btf_int_encoding() is only meaningful for BTF_KIND_INT, so we need to check that first before inferring signedness. Closes: https://github.com/libbpf/libbpf/issues/704 Reported-by: Lorenz Bauer <lmb@isovalent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230824000016.2658017-2-andrii@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-08-23libbpf: Add bpf_object__unpin()Daniel Xu
For bpf_object__pin_programs() there is bpf_object__unpin_programs(). Likewise bpf_object__unpin_maps() for bpf_object__pin_maps(). But no bpf_object__unpin() for bpf_object__pin(). Adding the former adds symmetry to the API. It's also convenient for cleanup in application code. It's an API I would've used if it was available for a repro I was writing earlier. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/bpf/b2f9d41da4a350281a0b53a804d11b68327e14e5.1692832478.git.dxu@dxuuu.xyz
2023-08-22libbpf: Free btf_vmlinux when closing bpf_objectHao Luo
I hit a memory leak when testing bpf_program__set_attach_target(). Basically, set_attach_target() may allocate btf_vmlinux, for example, when setting attach target for bpf_iter programs. But btf_vmlinux is freed only in bpf_object_load(), which means if we only open bpf object but not load it, setting attach target may leak btf_vmlinux. So let's free btf_vmlinux in bpf_object__close() anyway. Signed-off-by: Hao Luo <haoluo@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230822193840.1509809-1-haoluo@google.com
2023-08-21libbpf: Add uprobe multi link support to bpf_program__attach_usdtJiri Olsa
Adding support for usdt_manager_attach_usdt to use uprobe_multi link to attach to usdt probes. The uprobe_multi support is detected before the usdt program is loaded and its expected_attach_type is set accordingly. If uprobe_multi support is detected the usdt_manager_attach_usdt gathers uprobes info and calls bpf_program__attach_uprobe to create all needed uprobes. If uprobe_multi support is not detected the old behaviour stays. Also adding usdt.s program section for sleepable usdt probes. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-18-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-21libbpf: Add uprobe multi link detectionJiri Olsa
Adding uprobe-multi link detection. It will be used later in bpf_program__attach_usdt function to check and use uprobe_multi link over standard uprobe links. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-17-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-21libbpf: Add support for u[ret]probe.multi[.s] program sectionsJiri Olsa
Adding support for several uprobe_multi program sections to allow auto attach of multi_uprobe programs. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-16-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-21libbpf: Add bpf_program__attach_uprobe_multi functionJiri Olsa
Adding bpf_program__attach_uprobe_multi function that allows to attach multiple uprobes with uprobe_multi link. The user can specify uprobes with direct arguments: binary_path/func_pattern/pid or with struct bpf_uprobe_multi_opts opts argument fields: const char **syms; const unsigned long *offsets; const unsigned long *ref_ctr_offsets; const __u64 *cookies; User can specify 2 mutually exclusive set of inputs: 1) use only path/func_pattern/pid arguments 2) use path/pid with allowed combinations of: syms/offsets/ref_ctr_offsets/cookies/cnt - syms and offsets are mutually exclusive - ref_ctr_offsets and cookies are optional Any other usage results in error. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-15-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-21libbpf: Add bpf_link_create support for multi uprobesJiri Olsa
Adding new uprobe_multi struct to bpf_link_create_opts object to pass multiple uprobe data to link_create attr uapi. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-14-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-21libbpf: Add elf_resolve_pattern_offsets functionJiri Olsa
Adding elf_resolve_pattern_offsets function that looks up offsets for symbols specified by pattern argument. The 'pattern' argument allows wildcards (*?' supported). Offsets are returned in allocated array together with its size and needs to be released by the caller. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-13-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-21libbpf: Add elf_resolve_syms_offsets functionJiri Olsa
Adding elf_resolve_syms_offsets function that looks up offsets for symbols specified in syms array argument. Offsets are returned in allocated array with the 'cnt' size, that needs to be released by the caller. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-12-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-21libbpf: Add elf symbol iteratorJiri Olsa
Adding elf symbol iterator object (and some functions) that follow open-coded iterator pattern and some functions to ease up iterating elf object symbols. The idea is to iterate single symbol section with: struct elf_sym_iter iter; struct elf_sym *sym; if (elf_sym_iter_new(&iter, elf, binary_path, SHT_DYNSYM)) goto error; while ((sym = elf_sym_iter_next(&iter))) { ... } I considered opening the elf inside the iterator and iterate all symbol sections, but then it gets more complicated wrt user checks for when the next section is processed. Plus side is the we don't need 'exit' function, because caller/user is in charge of that. The returned iterated symbol object from elf_sym_iter_next function is placed inside the struct elf_sym_iter, so no extra allocation or argument is needed. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-11-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-21libbpf: Add elf_open/elf_close functionsJiri Olsa
Adding elf_open/elf_close functions and using it in elf_find_func_offset_from_file function. It will be used in following changes to save some common code. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-10-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-21libbpf: Move elf_find_func_offset* functions to elf objectJiri Olsa
Adding new elf object that will contain elf related functions. There's no functional change. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-9-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-21libbpf: Add uprobe_multi attach type and link namesJiri Olsa
Adding new uprobe_multi attach type and link names, so the functions can resolve the new values. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-8-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-18libbpf: Support triple-underscore flavors for kfunc relocationDave Marchevsky
The function signature of kfuncs can change at any time due to their intentional lack of stability guarantees. As kfuncs become more widely used, BPF program writers will need facilities to support calling different versions of a kfunc from a single BPF object. Consider this simplified example based on a real scenario we ran into at Meta: /* initial kfunc signature */ int some_kfunc(void *ptr) /* Oops, we need to add some flag to modify behavior. No problem, change the kfunc. flags = 0 retains original behavior */ int some_kfunc(void *ptr, long flags) If the initial version of the kfunc is deployed on some portion of the fleet and the new version on the rest, a fleetwide service that uses some_kfunc will currently need to load different BPF programs depending on which some_kfunc is available. Luckily CO-RE provides a facility to solve a very similar problem, struct definition changes, by allowing program writers to declare my_struct___old and my_struct___new, with ___suffix being considered a 'flavor' of the non-suffixed name and being ignored by bpf_core_type_exists and similar calls. This patch extends the 'flavor' facility to the kfunc extern relocation process. BPF program writers can now declare extern int some_kfunc___old(void *ptr) extern int some_kfunc___new(void *ptr, int flags) then test which version of the kfunc exists with bpf_ksym_exists. Relocation and verifier's dead code elimination will work in concert as expected, allowing this pattern: if (bpf_ksym_exists(some_kfunc___old)) some_kfunc___old(ptr); else some_kfunc___new(ptr, 0); Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: David Vernet <void@manifault.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20230817225353.2570845-1-davemarchevsky@fb.com
2023-08-14libbpf: Set close-on-exec flag on gzopenMarco Vedovati
Enable the close-on-exec flag when using gzopen. This is especially important for multithreaded programs making use of libbpf, where a fork + exec could race with libbpf library calls, potentially resulting in a file descriptor leaked to the new process. This got missed in 59842c5451fe ("libbpf: Ensure libbpf always opens files with O_CLOEXEC"). Fixes: 59842c5451fe ("libbpf: Ensure libbpf always opens files with O_CLOEXEC") Signed-off-by: Marco Vedovati <marco.vedovati@crowdstrike.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230810214350.106301-1-martin.kelly@crowdstrike.com
2023-08-04libbpf: Use local includes inside the librarySergey Kacheev
In our monrepo, we try to minimize special processing when importing (aka vendor) third-party source code. Ideally, we try to import directly from the repositories with the code without changing it, we try to stick to the source code dependency instead of the artifact dependency. In the current situation, a patch has to be made for libbpf to fix the includes in bpf headers so that they work directly from libbpf/src. Signed-off-by: Sergey Kacheev <s.kacheev@gmail.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/CAJVhQqUg6OKq6CpVJP5ng04Dg+z=igevPpmuxTqhsR3dKvd9+Q@mail.gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-08-02libbpf: fix typos in MakefileRandy Dunlap
Capitalize ABI (acronym) and fix spelling of "destination". Fixes: 706819495921 ("libbpf: Improve usability of libbpf Makefile") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: bpf@vger.kernel.org Cc: Xin Liu <liuxin350@huawei.com> Link: https://lore.kernel.org/r/20230722065236.17010-1-rdunlap@infradead.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-19libbpf: Add helper macro to clear opts structsDaniel Borkmann
Add a small and generic LIBBPF_OPTS_RESET() helper macros which clears an opts structure and reinitializes its .sz member to place the structure size. Additionally, the user can pass option-specific data to reinitialize via varargs. I found this very useful when developing selftests, but it is also generic enough as a macro next to the existing LIBBPF_OPTS() which hides the .sz initialization, too. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/r/20230719140858.13224-6-daniel@iogearbox.net Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-19libbpf: Add link-based API for tcxDaniel Borkmann
Implement tcx BPF link support for libbpf. The bpf_program__attach_fd() API has been refactored slightly in order to pass bpf_link_create_opts pointer as input. A new bpf_program__attach_tcx() has been added on top of this which allows for passing all relevant data via extensible struct bpf_tcx_opts. The program sections tcx/ingress and tcx/egress correspond to the hook locations for tc ingress and egress, respectively. For concrete usage examples, see the extensive selftests that have been developed as part of this series. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230719140858.13224-5-daniel@iogearbox.net Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-19libbpf: Add opts-based attach/detach/query API for tcxDaniel Borkmann
Extend libbpf attach opts and add a new detach opts API so this can be used to add/remove fd-based tcx BPF programs. The old-style bpf_prog_detach() and bpf_prog_detach2() APIs are refactored to reuse the new bpf_prog_detach_opts() internally. The bpf_prog_query_opts() API got extended to be able to handle the new link_ids, link_attach_flags and revision fields. For concrete usage examples, see the extensive selftests that have been developed as part of this series. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230719140858.13224-4-daniel@iogearbox.net Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-19xsk: add new netlink attribute dedicated for ZC max fragsMaciej Fijalkowski
Introduce new netlink attribute NETDEV_A_DEV_XDP_ZC_MAX_SEGS that will carry maximum fragments that underlying ZC driver is able to handle on TX side. It is going to be included in netlink response only when driver supports ZC. Any value higher than 1 implies multi-buffer ZC support on underlying device. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/r/20230719132421.584801-11-maciej.fijalkowski@intel.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-11libbpf: Remove HASHMAP_INIT static initialization helperJohn Sanpe
Remove the wrong HASHMAP_INIT. It's not used anywhere in libbpf. Signed-off-by: John Sanpe <sanpeqf@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230711070712.2064144-1-sanpeqf@gmail.com
2023-07-11libbpf: Fix realloc API handling in zero-sized edge casesAndrii Nakryiko
realloc() and reallocarray() can either return NULL or a special non-NULL pointer, if their size argument is zero. This requires a bit more care to handle NULL-as-valid-result situation differently from NULL-as-error case. This has caused real issues before ([0]), and just recently bit again in production when performing bpf_program__attach_usdt(). This patch fixes 4 places that do or potentially could suffer from this mishandling of NULL, including the reported USDT-related one. There are many other places where realloc()/reallocarray() is used and NULL is always treated as an error value, but all those have guarantees that their size is always non-zero, so those spot don't need any extra handling. [0] d08ab82f59d5 ("libbpf: Fix double-free when linker processes empty sections") Fixes: 999783c8bbda ("libbpf: Wire up spec management and other arch-independent USDT logic") Fixes: b63b3c490eee ("libbpf: Add bpf_program__set_insns function") Fixes: 697f104db8a6 ("libbpf: Support custom SEC() handlers") Fixes: b12688267280 ("libbpf: Change the order of data and text relocations.") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230711024150.1566433-1-andrii@kernel.org
2023-07-08libbpf: only reset sec_def handler when necessaryAndrii Nakryiko
Don't reset recorded sec_def handler unconditionally on bpf_program__set_type(). There are two situations where this is wrong. First, if the program type didn't actually change. In that case original SEC handler should work just fine. Second, catch-all custom SEC handler is supposed to work with any BPF program type and SEC() annotation, so it also doesn't make sense to reset that. This patch fixes both issues. This was reported recently in the context of breaking perf tool, which uses custom catch-all handler for fancy BPF prologue generation logic. This patch should fix the issue. [0] https://lore.kernel.org/linux-perf-users/ab865e6d-06c5-078e-e404-7f90686db50d@amd.com/ Fixes: d6e6286a12e7 ("libbpf: disassociate section handler on explicit bpf_program__set_type() call") Reported-by: Ravi Bangoria <ravi.bangoria@amd.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20230707231156.1711948-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-06libbpf: Use available_filter_functions_addrs with multi-kprobesJackie Liu
Now that kernel provides a new available_filter_functions_addrs file which can help us avoid the need to cross-validate available_filter_functions and kallsyms, we can improve efficiency of multi-attach kprobes. For example, on my device, the sample program [1] of start time: $ sudo ./funccount "tcp_*" before after 1.2s 1.0s [1]: https://github.com/JackieLiu1/ketones/tree/master/src/funccount Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230705091209.3803873-2-liu.yun@linux.dev