path: root/tools/lib/bpf/libbpf.c
2022-07-28  libbpf: Support PPC in arch_specific_syscall_pfx  (Daniel Müller)
Commit 708ac5bea0ce ("libbpf: add ksyscall/kretsyscall sections support for syscall kprobes") added the arch_specific_syscall_pfx() function, which returns a string representing the architecture in use. As it turns out, this function is currently not aware of PowerPC, for which NULL is returned. That is flagged by the libbpf CI system, which builds for ppc64le, where the compiler sees a NULL pointer being passed to a %s format string. With this change we add representations for two more architectures, PowerPC and PowerPC 64, and also adjust the string format logic to handle NULL pointers gracefully, in an attempt to prevent similar issues with other architectures in the future. Fixes: 708ac5bea0ce ("libbpf: add ksyscall/kretsyscall sections support for syscall kprobes") Signed-off-by: Daniel Müller <deso@posteo.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220728222345.3125975-1-deso@posteo.net
2022-07-19  libbpf: make RINGBUF map size adjustments more eagerly  (Andrii Nakryiko)
Make libbpf adjust RINGBUF map size (rounding it up to the closest power-of-2 of page_size) more eagerly: during the open phase when initializing the map and on explicit calls to bpf_map__set_max_entries(). Such an approach allows the user to check the actual size of the BPF ringbuf even before it's created in the kernel, and also prevents various edge-case scenarios where the BPF ringbuf size can get out of sync with what it would be in the kernel. One of them (reported in [0]) is an attempt to pin/reuse a BPF ringbuf. Move the adjust_ringbuf_sz() helper closer to its first actual use. The implementation of the helper is unchanged. Also make detection of whether a bpf_object is already loaded more robust by checking obj->loaded explicitly, given that map->fd can be < 0 even if the bpf_object is already loaded, due to the ability to disable map creation with bpf_map__set_autocreate(map, false). [0] Closes: https://github.com/libbpf/libbpf/pull/530 Fixes: 0087a681fa8c ("libbpf: Automatically fix up BPF_MAP_TYPE_RINGBUF size, if necessary") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220715230952.2219271-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
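For illustration, a minimal user-space sketch of the behavior described above (the skeleton and map names are made up, not from this commit):

    struct my_bpf *skel = my_bpf__open();

    /* request ~100KB; libbpf normalizes it right away to satisfy the
     * kernel's ringbuf size requirement (power-of-2 multiple of page size) */
    bpf_map__set_max_entries(skel->maps.rb, 100 * 1024);

    /* the adjusted size is visible before the map exists in the kernel */
    printf("ringbuf size: %u\n", bpf_map__max_entries(skel->maps.rb));

    my_bpf__load(skel);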
2022-07-19  libbpf: fallback to tracefs mount point if debugfs is not mounted  (Andrii Nakryiko)
Teach libbpf to fallback to tracefs mount point (/sys/kernel/tracing) if debugfs (/sys/kernel/debug/tracing) isn't mounted. Acked-by: Yonghong Song <yhs@fb.com> Suggested-by: Connor O'Brien <connoro@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220715185736.898848-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
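A sketch of this kind of fallback logic (standalone helper for illustration; not libbpf's internal function):

    #include <unistd.h>

    static const char *tracing_fs_root(void)
    {
        /* prefer the debugfs-based location if it is available... */
        if (access("/sys/kernel/debug/tracing", F_OK) == 0)
            return "/sys/kernel/debug/tracing";
        /* ...otherwise fall back to the standalone tracefs mount point */
        return "/sys/kernel/tracing";
    }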
2022-07-19  libbpf: add ksyscall/kretsyscall sections support for syscall kprobes  (Andrii Nakryiko)
Add SEC("ksyscall")/SEC("ksyscall/<syscall_name>") and corresponding kretsyscall variants (for return kprobes) to allow users to kprobe syscall functions in the kernel. These special sections let users ignore the complexities and differences between kernel versions and host architectures when it comes to the syscall wrapper and the corresponding __<arch>_sys_<syscall> vs __se_sys_<syscall> differences, depending on whether the host kernel has CONFIG_ARCH_HAS_SYSCALL_WRAPPER (though libbpf itself doesn't rely on /proc/config.gz for detecting this, see the BPF_KSYSCALL patch for how it's done internally). Combined with the use of the BPF_KSYSCALL() macro, this lets users just specify the intended syscall name and expected input arguments and leave dealing with all the variations to libbpf. In addition to SEC("ksyscall+") and SEC("kretsyscall+"), add the bpf_program__attach_ksyscall() API, which allows specifying the syscall name at runtime and providing an associated BPF cookie value. At the moment SEC("ksyscall") and bpf_program__attach_ksyscall() do not handle all the calling convention quirks for mmap(), clone() and compat syscalls. They also only attach to "native" syscall interfaces. If the host system supports compat syscalls or defines 32-bit syscalls in a 64-bit kernel, such syscall interfaces won't be attached to by libbpf. These limitations may or may not change in the future. Therefore it is recommended to use SEC("kprobe") for these syscalls or if working with compat and 32-bit interfaces is required. Tested-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220714070755.3235561-5-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
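A user-space attach sketch for the runtime API (skeleton/program names and cookie value are illustrative):

    LIBBPF_OPTS(bpf_ksyscall_opts, opts, .bpf_cookie = 123);
    struct bpf_link *link;

    /* the program was declared with SEC("ksyscall") and no target in BPF code */
    link = bpf_program__attach_ksyscall(skel->progs.handle_unlinkat,
                                        "unlinkat", &opts);
    if (!link)
        fprintf(stderr, "attach failed: %d\n", -errno);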
2022-07-19  libbpf: improve BPF_KPROBE_SYSCALL macro and rename it to BPF_KSYSCALL  (Andrii Nakryiko)
Improve BPF_KPROBE_SYSCALL (and rename it to shorter BPF_KSYSCALL to match libbpf's SEC("ksyscall") section name, added in next patch) to use __kconfig variable to determine how to properly fetch syscall arguments. Instead of relying on hard-coded knowledge of whether kernel's architecture uses syscall wrapper or not (which only reflects the latest kernel versions, but is not necessarily true for older kernels and won't necessarily hold for later kernel versions on some particular host architecture), determine this at runtime by attempting to create perf_event (with fallback to kprobe event creation through tracefs on legacy kernels, just like kprobe attachment code is doing) for kernel function that would correspond to bpf() syscall on a system that has CONFIG_ARCH_HAS_SYSCALL_WRAPPER set (e.g., for x86-64 it would try '__x64_sys_bpf'). If host kernel uses syscall wrapper, syscall kernel function's first argument is a pointer to struct pt_regs that then contains syscall arguments. In such case we need to use bpf_probe_read_kernel() to fetch actual arguments (which we do through BPF_CORE_READ() macro) from inner pt_regs. But if the kernel doesn't use syscall wrapper approach, input arguments can be read from struct pt_regs directly with no probe reading. All this feature detection is done without requiring /proc/config.gz existence and parsing, and BPF-side helper code uses newly added LINUX_HAS_SYSCALL_WRAPPER virtual __kconfig extern to keep in sync with user-side feature detection of libbpf. BPF_KSYSCALL() macro can be used both with SEC("kprobe") programs that define syscall function explicitly (e.g., SEC("kprobe/__x64_sys_bpf")) and SEC("ksyscall") program added in the next patch (which are the same kprobe program with added benefit of libbpf determining correct kernel function name automatically). Kretprobe and kretsyscall (added in next patch) programs don't need BPF_KSYSCALL as they don't provide access to input arguments. Normal BPF_KRETPROBE is completely sufficient and is recommended. Tested-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220714070755.3235561-4-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
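A BPF-side sketch of SEC("ksyscall") combined with BPF_KSYSCALL() (syscall choice and program name are illustrative; assumes the usual vmlinux.h/bpf_tracing.h setup):

    #include "vmlinux.h"
    #include <bpf/bpf_helpers.h>
    #include <bpf/bpf_tracing.h>

    char LICENSE[] SEC("license") = "GPL";

    SEC("ksyscall/unlinkat")
    int BPF_KSYSCALL(handle_unlinkat, int dfd, const char *pathname)
    {
        char path[64] = {};

        /* pathname is a user-space pointer, read it explicitly */
        bpf_probe_read_user_str(path, sizeof(path), pathname);
        bpf_printk("unlinkat(%d, %s)", dfd, path);
        return 0;
    }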
2022-07-19  libbpf: generalize virtual __kconfig externs and use it for USDT  (Andrii Nakryiko)
Libbpf currently supports a single virtual __kconfig extern: LINUX_KERNEL_VERSION. LINUX_KERNEL_VERSION doesn't come from /proc/config.gz and is instead filled out by libbpf itself. This patch generalizes this approach to support more such virtual __kconfig externs. One such extern added in this patch is LINUX_HAS_BPF_COOKIE, which is used for BPF-side USDT supporting code in usdt.bpf.h instead of using a CO-RE-based enum detection approach for detecting the bpf_get_attach_cookie() BPF helper. This allows removing an otherwise unneeded CO-RE dependency and keeps the user-space and BPF-side parts of libbpf's USDT support strictly in sync in terms of their feature detection. We'll use a similar approach for syscall wrapper detection for the BPF_KSYSCALL() BPF-side macro in a follow-up patch. Generally, libbpf currently reserves the CONFIG_ prefix for Kconfig values and LINUX_ for virtual libbpf-backed externs. In the future we might extend the set of prefixes that are supported. This can be done without any breaking changes, as currently any __kconfig extern with an unrecognized name is rejected. For LINUX_xxx externs we support the normal "weak rule": if libbpf doesn't recognize a given LINUX_xxx extern but such an extern is marked as __weak, it is not rejected and defaults to zero. This follows CONFIG_xxx handling logic and will allow BPF applications to opportunistically use newer libbpf virtual externs without breaking on older libbpf versions unnecessarily. Tested-by: Alan Maguire <alan.maguire@oracle.com> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220714070755.3235561-2-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
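BPF-side sketch of declaring and using such a virtual extern (the exact declaration style is an assumption; __weak makes it degrade gracefully on older libbpf versions):

    #include "vmlinux.h"
    #include <bpf/bpf_helpers.h>

    /* filled out by libbpf itself, not read from Kconfig */
    extern bool LINUX_HAS_BPF_COOKIE __kconfig __weak;

    static __always_inline __u64 get_cookie(void *ctx)
    {
        /* defaults to 0 (false) if libbpf doesn't know this extern */
        if (LINUX_HAS_BPF_COOKIE)
            return bpf_get_attach_cookie(ctx);
        return 0;
    }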
2022-07-15  libbpf: perfbuf: Add API to get the ring buffer  (Jon Doron)
Add support for writing a custom event reader by exposing the ring buffer. With the new API perf_buffer__buffer() you get access to the raw mmap()'ed per-CPU underlying memory of the ring buffer. This region contains both the perf buffer data and the header (struct perf_event_mmap_page), which manages the ring buffer state (head/tail positions; when accessing the head/tail position it's important to take SMP ordering into consideration). With this type of low-level access one can implement different types of consumers; here are a few simple examples where this API helps: 1. perf_event_read_simple allocates using malloc; perhaps you want to handle the wrap-around in some other way. 2. Since the perf buffer is per-CPU, the order of the events is not guaranteed. For example, given 3 events where each event has a timestamp t0 < t1 < t2, and the events are spread over more than 1 CPU, we can end up with the following state in the ring buffer: CPU[0] => [t0, t2] CPU[1] => [t1] When you consume the events from CPU[0], you can tell that a t1 is missing (assuming there are no drops and your event data contains a sequential index). So now one can simply do the following: for CPU[0], store the addresses of t0 and t2 in an array (without moving the tail, so the data is not lost), then move on to CPU[1] and store the address of t1 in the same array. You end up with something like void **arr[] = [&t0, &t1, &t2]; now you can consume the events in order and move the tails as you process them. 3. Assuming there are multiple CPUs and we want to start draining the messages from them, we can "pick" which one to start with according to the remaining free space in the ring buffer. Signed-off-by: Jon Doron <jond@wiz.io> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220715181122.149224-1-arilou@gmail.com
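A consumer-side sketch of the new call (signature as understood from this description; struct perf_event_mmap_page comes from <linux/perf_event.h>):

    void *buf;
    size_t buf_size;
    int err;

    /* raw mmap()'ed region for per-CPU buffer index 0: header page
     * (struct perf_event_mmap_page) followed by the data pages */
    err = perf_buffer__buffer(pb, 0, &buf, &buf_size);
    if (err)
        return err;

    struct perf_event_mmap_page *header = buf;
    __u64 head = header->data_head;  /* mind SMP ordering when reading head/tail */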
2022-07-13  libbpf: Fix the name of a reused map  (Anquan Wu)
A BPF map name is limited to BPF_OBJ_NAME_LEN. If a map name is defined as longer than BPF_OBJ_NAME_LEN, it will be truncated to BPF_OBJ_NAME_LEN when a userspace program calls libbpf to create the map. A pinned map also generates a path in the BPF filesystem (e.g. /sys/fs/bpf). If a program then wants to reuse the map, it cannot look up the bpf_map by name, because the name of the map is only partially the same as the name obtained from the pinned path. The syscall information below shows that the map name "process_pinned_map" is truncated to "process_pinned_":

  bpf(BPF_OBJ_GET, {pathname="/sys/fs/bpf/process_pinned_map", bpf_fd=0, file_flags=0}, 144) = -1 ENOENT (No such file or directory)
  bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_HASH, key_size=4, value_size=4, max_entries=1024, map_flags=0, inner_map_fd=0, map_name="process_pinned_", map_ifindex=0, btf_fd=3, btf_key_type_id=6, btf_value_type_id=10, btf_vmlinux_value_type_id=0}, 72) = 4

This patch checks whether the name of the pinned map matches the actual name for the first (BPF_OBJ_NAME_LEN - 1) characters; if so, the BPF map still uses the name included in the bpf object. Fixes: 26736eb9a483 ("tools: libbpf: allow map reuse") Signed-off-by: Anquan Wu <leiqi96@hotmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/OSZP286MB1725CEA1C95C5CB8E7CCC53FB8869@OSZP286MB1725.JPNP286.PROD.OUTLOOK.COM
2022-07-13  libbpf: Error out when binary_path is NULL for uprobe and USDT  (Hengqi Chen)
binary_path is a required non-null parameter for bpf_program__attach_usdt and bpf_program__attach_uprobe_opts. Check it against NULL to prevent a coredump in strchr(). Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220712025745.2703995-1-hengqi.chen@gmail.com
2022-07-05  libbpf: Cleanup the legacy uprobe_event on failed add/attach_event()  (Chuang Wang)
There is a potential scenario where an error is returned after add_uprobe_event_legacy() in perf_event_uprobe_open_legacy(), or where bpf_program__attach_perf_event_opts() in bpf_program__attach_uprobe_opts() returns an error; in these cases the uprobe_event that was previously created is not cleaned up. Fix this by calling remove_uprobe_event_legacy() when an error is returned. Signed-off-by: Chuang Wang <nashuiliang@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220629151848.65587-4-nashuiliang@gmail.com
2022-07-05  libbpf: Fix wrong variable used in perf_event_uprobe_open_legacy()  (Chuang Wang)
Use "type" as opposed to "err" in pr_warn() after determine_uprobe_perf_type_legacy() returns an error. Signed-off-by: Chuang Wang <nashuiliang@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220629151848.65587-3-nashuiliang@gmail.com
2022-07-05  libbpf: Cleanup the legacy kprobe_event on failed add/attach_event()  (Chuang Wang)
Before commit 0bc11ed5ab60 ("kprobes: Allow kprobes coexist with livepatch"), in a scenario where livepatch and kprobe coexist on the same function entry, creating a kprobe_event using add_kprobe_event_legacy() succeeds, and a trace event (e.g. /debugfs/tracing/events/kprobe/XXX) will exist, but perf_event_open() returns an error because both livepatch and kprobe use FTRACE_OPS_FL_IPMODIFY. As follows:
  1) add a livepatch
     $ insmod livepatch-XXX.ko
  2) add a kprobe using the tracefs API (i.e. add_kprobe_event_legacy)
     $ echo 'p:mykprobe XXX' > /sys/kernel/debug/tracing/kprobe_events
  3) enable this kprobe (i.e. sys_perf_event_open); this returns an error, -EBUSY.
As Andrii Nakryiko pointed out, a few error paths in bpf_program__attach_kprobe_opts() should call remove_kprobe_event_legacy(). With this patch, whenever an error is returned after add_kprobe_event_legacy() or bpf_program__attach_perf_event_opts(), the created kprobe_event is cleaned up. Signed-off-by: Chuang Wang <nashuiliang@gmail.com> Signed-off-by: Jingren Zhou <zhoujingren@didiglobal.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220629151848.65587-2-nashuiliang@gmail.com
2022-07-05  bpf, libbpf: Add type match support  (Daniel Müller)
This patch adds support for the proposed type match relation to relo_core where it is shared between userspace and kernel. It plumbs through both kernel-side and libbpf-side support. The matching relation is defined as follows (copy from source):
- modifiers and typedefs are stripped (and, hence, effectively ignored)
- generally speaking types need to be of same kind (struct vs. struct, union vs. union, etc.)
- exceptions are struct/union behind a pointer which could also match a forward declaration of a struct or union, respectively, and enum vs. enum64 (see below)
Then, depending on type:
- integers:
  - match if size and signedness match
- arrays & pointers:
  - target types are recursively matched
- structs & unions:
  - local members need to exist in target with the same name
  - for each member we recursively check match unless it is already behind a pointer, in which case we only check matching names and compatible kind
- enums:
  - local variants have to have a match in target by symbolic name (but not numeric value)
  - size has to match (but enum may match enum64 and vice versa)
- function pointers:
  - number and position of arguments in local type has to match target
  - for each argument and the return value we recursively check match
Signed-off-by: Daniel Müller <deso@posteo.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220628160127.607834-5-deso@posteo.net
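On the BPF side this relation is typically exercised via a CO-RE type-match check; a hedged sketch, assuming the bpf_core_type_matches() helper macro that accompanies this feature in bpf_core_read.h:

    #include "vmlinux.h"
    #include <bpf/bpf_helpers.h>
    #include <bpf/bpf_core_read.h>

    char LICENSE[] SEC("license") = "GPL";

    /* local "view" of the kernel type; the ___local suffix is stripped for matching */
    struct task_struct___local {
        int pid;
    } __attribute__((preserve_access_index));

    SEC("tp_btf/sched_switch")
    int check_match(void *ctx)
    {
        /* true only if the local definition matches the kernel's
         * struct task_struct according to the rules listed above */
        if (bpf_core_type_matches(struct task_struct___local))
            bpf_printk("task_struct matches");
        return 0;
    }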
2022-06-29  libbpf: add lsm_cgroup_sock type  (Stanislav Fomichev)
lsm_cgroup/ is the prefix for BPF_LSM_CGROUP. Acked-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20220628174314.1216643-9-sdf@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-28  libbpf: enforce strict libbpf 1.0 behaviors  (Andrii Nakryiko)
Remove support for legacy features and behaviors that previously had to be disabled by calling libbpf_set_strict_mode():
- legacy BPF map definitions are not supported now;
- RLIMIT_MEMLOCK auto-setting, if necessary, is always on (but see libbpf_set_memlock_rlim());
- program name is used for program pinning (instead of section name);
- cleaned up error returning logic;
- entry BPF programs should have SEC() always.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220627211527.2245459-15-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-28  libbpf: clean up SEC() handling  (Andrii Nakryiko)
Get rid of sloppy prefix logic and remove deprecated xdp_{devmap,cpumap} sections. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220627211527.2245459-13-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-28  libbpf: remove internal multi-instance prog support  (Andrii Nakryiko)
Clean up internals that had to deal with the possibility of multi-instance bpf_programs. Libbpf 1.0 doesn't support this, so all this is not necessary now and can be simplified. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220627211527.2245459-12-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-28  libbpf: remove multi-instance and custom private data APIs  (Andrii Nakryiko)
Remove all the public APIs that are related to creating multi-instance bpf_programs through custom preprocessing callback and generally working with them. Also remove all the bpf_{object,map,program}__[set_]priv() APIs. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220627211527.2245459-10-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-28  libbpf: remove most other deprecated high-level APIs  (Andrii Nakryiko)
Remove a bunch of high-level bpf_object/bpf_map/bpf_program related APIs. All the APIs related to private per-object/map/prog state, program preprocessing callback, and generally everything multi-instance related is removed in a separate patch. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220627211527.2245459-9-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-28  libbpf: remove prog_info_linear APIs  (Andrii Nakryiko)
Remove prog_info_linear-related APIs previously used by perf. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220627211527.2245459-8-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-28  libbpf: clean up perfbuf APIs  (Andrii Nakryiko)
Remove deprecated perfbuf APIs and clean up opts structs. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220627211527.2245459-7-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-28  libbpf: remove deprecated BTF APIs  (Andrii Nakryiko)
Get rid of deprecated BTF-related APIs. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220627211527.2245459-6-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-28  libbpf: remove deprecated low-level APIs  (Andrii Nakryiko)
Drop low-level APIs as well as high-level (and very confusingly named) BPF object loading bpf_prog_load_xattr() and bpf_prog_load_deprecated() APIs. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220627211527.2245459-3-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-24  bpf: Merge "types_are_compat" logic into relo_core.c  (Daniel Müller)
BPF type compatibility checks (bpf_core_types_are_compat()) are currently duplicated between kernel and user space. That's a historical artifact more than an intentional decision and can lead to subtle bugs where one implementation is adjusted but the other is forgotten. That happened with the enum64 work, for example, where the libbpf side was changed (commit 23b2a3a8f63a ("libbpf: Add enum64 relocation support")) to use the btf_kind_core_compat() helper function but the kernel side was not (commit 6089fb325cf7 ("bpf: Add btf enum64 support")). This patch addresses both the duplication issue, by merging both implementations and moving them into relo_core.c, and fixes the alluded-to kind check (by giving preference to libbpf's already adjusted logic). For discussion of the topic, please refer to: https://lore.kernel.org/bpf/CAADnVQKbWR7oarBdewgOBZUPzryhRYvEbkhyPJQHHuxq=0K1gw@mail.gmail.com/T/#mcc99f4a33ad9a322afaf1b9276fb1f0b7add9665 Changelog: v1 -> v2:
- limited libbpf recursion limit to 32
- changed name to __bpf_core_types_are_compat
- included warning previously present in libbpf version
- merged kernel and user space changes into a single patch
Signed-off-by: Daniel Müller <deso@posteo.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220623182934.2582827-1-deso@posteo.net
2022-06-16  libbpf: add support for sleepable uprobe programs  (Delyan Kratunov)
Add section mappings for u(ret)probe.s programs. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Delyan Kratunov <delyank@fb.com> Link: https://lore.kernel.org/r/aedbc3b74f3523f00010a7b0df8f3388cca59f16.1655248076.git.delyank@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-14  libbpf: Fix an unsigned < 0 bug  (Yonghong Song)
Andrii reported a bug with the following information:

  2859        if (enum64_placeholder_id == 0) {
  2860                enum64_placeholder_id = btf__add_int(btf, "enum64_placeholder", 1, 0);
  >>>     CID 394804: Control flow issues (NO_EFFECT)
  >>>     This less-than-zero comparison of an unsigned value is never true. "enum64_placeholder_id < 0U".
  2861                if (enum64_placeholder_id < 0)
  2862                        return enum64_placeholder_id;
  2863        ...

Here enum64_placeholder_id is declared as '__u32', so enum64_placeholder_id < 0 is always false. Declare enum64_placeholder_id as 'int' in order to capture the potential error properly. Fixes: f2a625889bb8 ("libbpf: Add enum64 sanitization") Reported-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220613054314.1251905-1-yhs@fb.com
2022-06-09  libbpf: Fix uprobe symbol file offset calculation logic  (Andrii Nakryiko)
Fix libbpf's bpf_program__attach_uprobe() logic for determining a function's *file offset* (which is what the kernel is actually expecting) when attaching a uprobe/uretprobe by function name. Previously the calculation determined the virtual address offset relative to the base load address, which is not always the same as the file offset (though it frequently is, which is why this went unnoticed for a while). Fixes: 433966e3ae04 ("libbpf: Support function name-based attach uprobes") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Riham Selim <rihams@fb.com> Cc: Alan Maguire <alan.maguire@oracle.com> Link: https://lore.kernel.org/bpf/20220606220143.3796908-1-andrii@kernel.org
2022-06-07  libbpf: Add enum64 relocation support  (Yonghong Song)
Add enum64 relocation support. The BPF local type can be either enum or enum64, and the remote type can be either enum or enum64 too. All combinations of local enum/enum64 and remote enum/enum64 are supported. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220607062647.3721719-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-07  libbpf: Add enum64 sanitization  (Yonghong Song)
When an old kernel does not support enum64 but the user-space BTF contains a non-zero enum kflag or enum64, libbpf needs to do proper sanitization so the modified BTF can be accepted by the kernel. Sanitization for the enum kflag can be achieved by clearing the kflag bit. For enum64, the type is replaced with a union of integer member types, and the integer member size must be smaller than the enum64 size. If such an integer type cannot be found, a new type is created and used for the union members. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220607062636.3721375-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-03  libbpf: Fix is_pow_of_2  (Yuze Chi)
Move the correct definition from linker.c into libbpf_internal.h. Fixes: 0087a681fa8c ("libbpf: Automatically fix up BPF_MAP_TYPE_RINGBUF size, if necessary") Reported-by: Yuze Chi <chiyuze@google.com> Signed-off-by: Yuze Chi <chiyuze@google.com> Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220603055156.2830463-1-irogers@google.com
2022-06-02  libbpf: Introduce libbpf_bpf_link_type_str  (Daniel Müller)
This change introduces a new function, libbpf_bpf_link_type_str, to the public libbpf API. The function allows users to get a string representation for a bpf_link_type enum variant. Signed-off-by: Daniel Müller <deso@posteo.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Quentin Monnet <quentin@isovalent.com> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20220523230428.3077108-11-deso@posteo.net
2022-06-02  libbpf: Introduce libbpf_bpf_attach_type_str  (Daniel Müller)
This change introduces a new function, libbpf_bpf_attach_type_str, to the public libbpf API. The function allows users to get a string representation for a bpf_attach_type variant. Signed-off-by: Daniel Müller <deso@posteo.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Quentin Monnet <quentin@isovalent.com> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20220523230428.3077108-8-deso@posteo.net
2022-06-02  libbpf: Introduce libbpf_bpf_map_type_str  (Daniel Müller)
This change introduces a new function, libbpf_bpf_map_type_str, to the public libbpf API. The function allows users to get a string representation for a bpf_map_type enum variant. Signed-off-by: Daniel Müller <deso@posteo.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Quentin Monnet <quentin@isovalent.com> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20220523230428.3077108-5-deso@posteo.net
2022-06-02  libbpf: Introduce libbpf_bpf_prog_type_str  (Daniel Müller)
This change introduces a new function, libbpf_bpf_prog_type_str, to the public libbpf API. The function allows users to get a string representation for a bpf_prog_type variant. Signed-off-by: Daniel Müller <deso@posteo.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Quentin Monnet <quentin@isovalent.com> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20220523230428.3077108-2-deso@posteo.net
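A small usage sketch covering this family of helpers (the expected output strings in comments are assumptions):

    /* e.g. "kprobe" */
    printf("prog type: %s\n", libbpf_bpf_prog_type_str(BPF_PROG_TYPE_KPROBE));
    /* e.g. "ringbuf"; the attach_type and link_type variants work the same way */
    printf("map type:  %s\n", libbpf_bpf_map_type_str(BPF_MAP_TYPE_RINGBUF));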
2022-05-23  libbpf: Fix typo in comment  (Julia Lawall)
Spelling mistake (triple letters) in comment. Detected with the help of Coccinelle. Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Daniel Müller <deso@posteo.net> Link: https://lore.kernel.org/bpf/20220521111145.81697-71-Julia.Lawall@inria.fr
2022-05-16  libbpf: fix memory leak in attach_tp for target-less tracepoint program  (Andrii Nakryiko)
Fix sec_name memory leak if user defines target-less SEC("tp"). Fixes: 9af8efc45eb1 ("libbpf: Allow "incomplete" basic tracing SEC() definitions") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20220516184547.3204674-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-05-13  libbpf: Add safer high-level wrappers for map operations  (Andrii Nakryiko)
Add high-level API wrappers for the most common and typical BPF map operations that work directly on instances of struct bpf_map * (so you don't have to call bpf_map__fd()) and validate key/value size expectations. These helpers require users to specify key (and value, where appropriate) sizes when performing lookup/update/delete/etc. This forces users to actually think about and validate (for themselves) those sizes. This is a good thing, because the kernel expects the user to implicitly provide correct key/value buffer sizes and will just read/write the necessary amount of data. If it so happens that the user doesn't set up buffers correctly (which has bitten people especially for per-CPU maps), the kernel either randomly overwrites stack data or returns -EFAULT, depending on the user's luck and circumstances. These high-level APIs are meant to prevent such unpleasant and hard-to-debug bugs. This patch also adds the bpf_map_delete_elem_flags() low-level API and requires passing flags to the bpf_map__delete_elem() API for consistency across all similar APIs, even though currently the kernel doesn't expect any extra flags for the BPF_MAP_DELETE_ELEM operation. List of map operations that get these high-level APIs:
- bpf_map_lookup_elem;
- bpf_map_update_elem;
- bpf_map_delete_elem;
- bpf_map_lookup_and_delete_elem;
- bpf_map_get_next_key.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220512220713.2617964-1-andrii@kernel.org
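A sketch of the new wrappers in use (map name and key/value types are illustrative):

    __u32 key = 0;
    __u64 value = 1;
    int err;

    /* key/value sizes are passed explicitly and validated by libbpf */
    err = bpf_map__update_elem(skel->maps.counters, &key, sizeof(key),
                               &value, sizeof(value), BPF_ANY);
    if (!err)
        err = bpf_map__lookup_elem(skel->maps.counters, &key, sizeof(key),
                                   &value, sizeof(value), 0);
    /* delete now takes flags too, for consistency across the APIs */
    err = bpf_map__delete_elem(skel->maps.counters, &key, sizeof(key), 0);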
2022-05-11  libbpf: Add bpf_program__set_insns function  (Jiri Olsa)
Add bpf_program__set_insns(), which allows setting new instructions for a BPF program. This is a very advanced libbpf API and users need to know what they are doing. It should be used from the prog_prepare_load_fn callback only. Instructions can be changed after the prog_prepare_load_fn callback has been called, and they are then reloaded. One of the users of this new API will be perf's internal BPF prologue generation. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220510074659.2557731-2-jolsa@kernel.org
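A hedged sketch of intended usage from a prog_prepare_load_fn callback (the callback signature, the prologue instruction, and the buffer-copy assumption are illustrative, not taken from this commit):

    static int prepare_load(struct bpf_program *prog,
                            struct bpf_prog_load_opts *opts, long cookie)
    {
        size_t cnt = bpf_program__insn_cnt(prog);
        const struct bpf_insn *orig = bpf_program__insns(prog);
        struct bpf_insn *insns;
        int err;

        insns = calloc(cnt + 1, sizeof(*insns));
        if (!insns)
            return -ENOMEM;

        /* illustrative generated prologue: r2 = 0 */
        insns[0] = (struct bpf_insn) {
            .code = BPF_ALU64 | BPF_MOV | BPF_K,
            .dst_reg = BPF_REG_2,
        };
        memcpy(insns + 1, orig, cnt * sizeof(*insns));

        err = bpf_program__set_insns(prog, insns, cnt + 1);
        free(insns); /* assuming libbpf copies the instruction buffer */
        return err;
    }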
2022-05-11  libbpf: Clean up ringbuf size adjustment implementation  (Andrii Nakryiko)
Drop unused iteration variable, move overflow prevention check into the for loop. Fixes: 0087a681fa8c ("libbpf: Automatically fix up BPF_MAP_TYPE_RINGBUF size, if necessary") Reported-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220510185159.754299-1-andrii@kernel.org
2022-05-10  libbpf: Assign cookies to links in libbpf.  (Kui-Feng Lee)
Add a cookie field to the attributes of bpf_link_create(). Add bpf_program__attach_trace_opts() to attach a cookie to a link. Signed-off-by: Kui-Feng Lee <kuifeng@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220510205923.3206889-5-kuifeng@fb.com
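A sketch of attaching with a cookie through the new opts variant (program name and cookie value are made up):

    LIBBPF_OPTS(bpf_trace_opts, opts, .cookie = 0x1234);
    struct bpf_link *link;

    link = bpf_program__attach_trace_opts(skel->progs.fentry_probe, &opts);
    if (!link)
        fprintf(stderr, "attach failed: %d\n", -errno);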
2022-05-09  libbpf: Automatically fix up BPF_MAP_TYPE_RINGBUF size, if necessary  (Andrii Nakryiko)
The kernel imposes a pretty particular restriction on ringbuf map size: it has to be a power-of-2 multiple of page size. While generally this isn't hard for the user to satisfy, sometimes it's impossible to do declaratively in BPF source code or just plain inconvenient to do at runtime. One such example might be BPF libraries that are supposed to work on different architectures, which might not agree on what the common page size is. Let libbpf find the right size for the user instead, if the specified size turns out not to satisfy kernel requirements. If the user didn't set a size at all, that's most probably a mistake, so don't upsize such a zero size to a full page, though. Also we need to be careful about not overflowing __u32 max_entries. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220509004148.1801791-9-andrii@kernel.org
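The adjustment amounts to rounding up to the smallest power-of-2 multiple of the page size; a standalone sketch of that calculation (not libbpf's exact helper):

    #include <stdint.h>
    #include <unistd.h>
    #include <linux/types.h>

    static __u32 ringbuf_sz_fixup(__u32 sz)
    {
        __u32 page_sz = sysconf(_SC_PAGE_SIZE);
        __u32 adjusted;

        if (sz == 0)
            return 0; /* zero is most probably a mistake; leave it alone */

        /* smallest power-of-2 multiple of page size that is >= sz */
        for (adjusted = page_sz; adjusted < sz; adjusted *= 2) {
            if (adjusted > UINT32_MAX / 2)
                return sz; /* would overflow __u32 max_entries; keep as-is */
        }
        return adjusted;
    }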
2022-04-28  libbpf: Allow to opt-out from creating BPF maps  (Andrii Nakryiko)
Add a bpf_map__set_autocreate() API that allows the user to opt out from libbpf automatically creating a BPF map during BPF object load. This is a useful feature when building a CO-RE-enabled BPF application that takes advantage of some new-ish BPF map type (e.g., socket-local storage) if the kernel supports it, but otherwise uses some alternative way (e.g., an extra HASH map). In such a case, being able to disable the creation of a map that the kernel doesn't support makes it possible to successfully create and load the BPF object file with all its other maps and programs. It's still up to the user to make sure that no "live" code in any of their BPF programs references such a map instance, which can be achieved by guarding such code with a CO-RE relocation check or by using .rodata global variables. If the user fails to properly guard such code to turn it into "dead code", libbpf will helpfully post-process the BPF verifier log and will provide a more meaningful error and the map name that needs to be guarded properly. As such, instead of:

  ; value = bpf_map_lookup_elem(&missing_map, &zero);
  4: (85) call unknown#2001000000
  invalid func unknown#2001000000

...the user will see:

  ; value = bpf_map_lookup_elem(&missing_map, &zero);
  4: <invalid BPF map reference>
  BPF map 'missing_map' is referenced but wasn't created

Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220428041523.4089853-4-andrii@kernel.org
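Usage sketch for opting out (skeleton/map names and the feature probe are hypothetical):

    struct my_bpf *skel = my_bpf__open();
    int err;

    /* if the kernel lacks the fancy map type, don't even try to create it;
     * the fallback HASH map stays auto-created */
    if (!probe_sk_storage_support()) /* hypothetical feature probe */
        bpf_map__set_autocreate(skel->maps.sk_storage_map, false);

    err = my_bpf__load(skel);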
2022-04-28  libbpf: Use libbpf_mem_ensure() when allocating new map  (Andrii Nakryiko)
Reuse libbpf_mem_ensure() when adding a new map to the list of maps inside bpf_object. It takes care of proper resizing and reallocating of map array and zeroing out newly allocated memory. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220428041523.4089853-3-andrii@kernel.org
2022-04-28  libbpf: Append "..." in fixed up log if CO-RE spec is truncated  (Andrii Nakryiko)
Detect CO-RE spec truncation and append "..." to make user aware that there was supposed to be more of the spec there. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220428041523.4089853-2-andrii@kernel.org
2022-04-28  libbpf: Support target-less SEC() definitions for BTF-backed programs  (Andrii Nakryiko)
Similar to the previous patch, support target-less definitions like SEC("fentry"), SEC("freplace"), etc. For such BTF-backed program types it is expected that the user will specify the BTF target programmatically at runtime using bpf_program__set_attach_target() *before* the load phase. If not, libbpf will report this as an error. Also use the SEC_ATTACH_BTF flag instead of explicitly listing a set of types that are expected to require attach_btf_id. This was an accidental omission during the custom SEC() support refactoring. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20220428185349.3799599-3-andrii@kernel.org
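User-space sketch for such a target-less BTF-backed program (skeleton, program, and target function names are illustrative):

    struct my_bpf *skel = my_bpf__open();
    int err;

    /* SEC("fentry") with no target in BPF code; pick the target now,
     * before load, so libbpf can resolve the BTF ID */
    err = bpf_program__set_attach_target(skel->progs.handle_fentry,
                                         0 /* vmlinux */, "do_unlinkat");
    if (!err)
        err = my_bpf__load(skel);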
2022-04-28  libbpf: Allow "incomplete" basic tracing SEC() definitions  (Andrii Nakryiko)
In a lot of cases the target of a kprobe/kretprobe, tracepoint, raw tracepoint, etc. BPF program might not be known at compilation time and will be discovered at runtime. This has always been a supported case in libbpf, with APIs like bpf_program__attach_{kprobe,tracepoint,etc}() accepting a full target definition, regardless of what was defined in the SEC() definition in the BPF source code. Unfortunately, up till now libbpf still required users to specify at least something for the fake target, e.g., SEC("kprobe/whatever"), which is cumbersome and somewhat misleading. This patch allows target-less SEC() definitions for basic tracing BPF program types:
- kprobe/kretprobe;
- multi-kprobe/multi-kretprobe;
- tracepoints;
- raw tracepoints.
Such target-less SEC() definitions are meant to specify declaratively the proper BPF program type only. Their attachment will have to be handled programmatically using the correct APIs. As such, skeleton auto-attachment of such BPF programs is skipped and generic bpf_program__attach() will fail, if attempted, due to the lack of enough target information. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20220428185349.3799599-2-andrii@kernel.org
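A sketch of the resulting pattern, in two fragments (BPF side and user space; program, skeleton, and function names are illustrative):

    /* BPF side: program type only, no target */
    SEC("kprobe")
    int BPF_KPROBE(handle_kprobe)
    {
        return 0;
    }

    /* user space: skeleton auto-attach skips this program, attach explicitly */
    struct bpf_link *link;

    link = bpf_program__attach_kprobe(skel->progs.handle_kprobe,
                                      false /* !retprobe */, func_name);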
2022-04-26  libbpf: Fix up verifier log for unguarded failed CO-RE relos  (Andrii Nakryiko)
Teach libbpf to post-process the BPF verifier log on BPF program load failure and detect known error patterns to provide the user with more context. Currently there is one such common situation: an "unguarded" failed BPF CO-RE relocation. While a failing CO-RE relocation is expected, it is expected to be properly guarded in BPF code such that the BPF verifier always eliminates BPF instructions corresponding to such failed CO-RE relos as dead code. In cases when the user failed to take such precautions, the BPF verifier provides the best log it can:

  123: (85) call unknown#195896080
  invalid func unknown#195896080

Such an incomprehensible log error is due to libbpf "poisoning" the BPF instruction that corresponds to the failed CO-RE relocation by replacing it with an invalid `call 0xbad2310` instruction (195896080 == 0xbad2310 reads "bad relo" if you squint hard enough). Luckily, libbpf has all the necessary information to look up the CO-RE relocation that failed and provide a more human-readable description of what's going on:

  5: <invalid CO-RE relocation>
  failed to resolve CO-RE relocation <byte_off> [6] struct task_struct___bad.fake_field_subprog (0:2 @ offset 8)

This hopefully makes it much easier to understand what's wrong with the user's BPF program without googling magic constants. This BPF verifier log fixup is set up to be extensible and is going to be used for at least one other upcoming feature of libbpf in follow-up patches. Libbpf parses lines of the BPF verifier log starting from the very end. Currently it processes up to 10 lines of code looking for familiar patterns. This avoids wasting lots of CPU processing huge verifier logs (especially for log_level=2 verbosity level). The actual verification error should normally be found in the last few lines, so this should work reliably. If libbpf needs to expand the log beyond the available log_buf_size, it truncates the end of the verifier log. Given the verifier log normally ends with something like:

  processed 2 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0

...truncating this on program load error isn't too bad (the end user can always increase the log size if a complete log is needed). Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220426004511.2691730-10-andrii@kernel.org
2022-04-26  libbpf: Record subprog-resolved CO-RE relocations unconditionally  (Andrii Nakryiko)
Previously, libbpf recorded CO-RE relocations with insns_idx resolved according to finalized subprog locations (which are appended at the end of entry BPF program) to simplify the job of light skeleton generator. This is necessary because once subprogs' instructions are appended to main entry BPF program all the subprog instruction indices are shifted and that shift is different for each entry (main) BPF program, so it's generally impossible to map final absolute insn_idx of the finalized BPF program to their original locations inside subprograms. This information is now going to be used not only during light skeleton generation, but also to map absolute instruction index to subprog's instruction and its corresponding CO-RE relocation. So start recording these relocations always, not just when obj->gen_loader is set. This information is going to be freed at the end of bpf_object__load() step, as before (but this can change in the future if there will be a need for this information post load step). Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220426004511.2691730-7-andrii@kernel.org
2022-04-26  libbpf: Avoid joining .BTF.ext data with BPF programs by section name  (Andrii Nakryiko)
Instead of using ELF section names as a joining key between .BTF.ext and corresponding BPF programs, pre-build a .BTF.ext section number to ELF section index mapping during bpf_object__open() and use it later for matching .BTF.ext information (func/line info or CO-RE relocations) to their respective BPF programs and subprograms. This simplifies the corresponding joining logic and lets libbpf do manipulations with a BPF program's ELF sections, like dropping the leading '?' character for non-autoloaded programs. The original joining logic in bpf_object__relocate_core() (see the relevant comment that's now removed) was never elegant, so it's a good improvement regardless. But it also avoids unnecessary internal assumptions about preserving the original ELF section name as the BPF program's section name (which was broken when SEC("?abc") support was added). Fixes: a3820c481112 ("libbpf: Support opting out from autoloading BPF programs declaratively") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220426004511.2691730-5-andrii@kernel.org
2022-04-26  libbpf: Fix logic for finding matching program for CO-RE relocation  (Andrii Nakryiko)
Fix the bug in bpf_object__relocate_core() which can lead to finding an invalid matching BPF program when processing a CO-RE relocation. If a matching program is not found, the last encountered program will be assumed to be the correct program and thus error detection won't detect the problem. Fixes: 9c82a63cf370 ("libbpf: Fix CO-RE relocs against .text section") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220426004511.2691730-4-andrii@kernel.org