Age | Commit message (Collapse) | Author |
|
bpf_prog_pack_free() uses header->size to decide whether the header
should be freed with module_memfree() or the bpf_prog_pack logic.
However, in kvmalloc() failure path of bpf_jit_binary_pack_alloc(),
header->size is not set yet. As a result, bpf_prog_pack_free() may treat
a slice of a pack as a standalone kvmalloc'd header and call
module_memfree() on the whole pack. This in turn causes use-after-free by
other users of the pack.
Fix this by setting ro_header->size before freeing ro_header.
Fixes: 33c9805860e5 ("bpf: Introduce bpf_jit_binary_pack_[alloc|finalize|free]")
Reported-by: syzbot+2f649ec6d2eea1495a8f@syzkaller.appspotmail.com
Reported-by: syzbot+ecb1e7e51c52f68f7481@syzkaller.appspotmail.com
Reported-by: syzbot+87f65c75f4a72db05445@syzkaller.appspotmail.com
Signed-off-by: Song Liu <song@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220217183001.1876034-1-song@kernel.org
|
|
Avoid unnecessary goto cleanup, as there is nothing to clean up.
Signed-off-by: Yucong Sun <fallentree@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220217180210.2981502-1-fallentree@fb.com
|
|
Fix typo in vmtest.sh to make sure it launch proper vm with 8 cpus.
Signed-off-by: Yucong Sun <fallentree@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220217155212.2309672-1-fallentree@fb.com
|
|
Ensure that libbpf_netlink_recv() frees dynamically allocated buffer in
all code paths.
Fixes: 9c3de619e13e ("libbpf: Use dynamically allocated buffer when receiving netlink messages")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20220217073958.276959-1-andrii@kernel.org
|
|
Mark C++-specific T::open() and other methods as static inline to avoid
symbol redefinition when multiple files use the same skeleton header in
an application.
Fixes: bb8ffe61ea45 ("bpftool: Add C++-specific open/load/etc skeleton wrappers")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220216233540.216642-1-andrii@kernel.org
|
|
The commit e5043894b21f ("bpftool: Use libbpf_get_error() to check error")
fails to dump map without BTF loaded in pretty mode (-p option).
Fixing this by making sure get_map_kv_btf won't fail in case there's
no BTF available for the map.
Fixes: e5043894b21f ("bpftool: Use libbpf_get_error() to check error")
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220216092102.125448-1-jolsa@kernel.org
|
|
Mauricio Vásquez <mauricio@kinvolk.io> says:
====================
CO-RE requires to have BTF information describing the kernel types in
order to perform the relocations. This is usually provided by the kernel
itself when it's configured with CONFIG_DEBUG_INFO_BTF. However, this
configuration is not enabled in all the distributions and it's not
available on kernels before 5.12.
It's possible to use CO-RE in kernels without CONFIG_DEBUG_INFO_BTF
support by providing the BTF information from an external source.
BTFHub[0] contains BTF files to each released kernel not supporting BTF,
for the most popular distributions.
Providing this BTF file for a given kernel has some challenges:
1. Each BTF file is a few MBs big, then it's not possible to ship the
eBPF program with all the BTF files needed to run in different kernels.
(The BTF files will be in the order of GBs if you want to support a high
number of kernels)
2. Downloading the BTF file for the current kernel at runtime delays the
start of the program and it's not always possible to reach an external
host to download such a file.
Providing the BTF file with the information about all the data types of
the kernel for running an eBPF program is an overkill in many of the
cases. Usually the eBPF programs access only some kernel fields.
This series implements BTFGen support in bpftool. This idea was
discussed during the "Towards truly portable eBPF"[1] presentation at
Linux Plumbers 2021.
There is a good example[2] on how to use BTFGen and BTFHub together
to generate multiple BTF files, to each existing/supported kernel,
tailored to one application. For example: a complex bpf object might
support nearly 400 kernels by having BTF files summing only 1.5 MB.
[0]: https://github.com/aquasecurity/btfhub/
[1]: https://www.youtube.com/watch?v=igJLKyP1lFk&t=2418s
[2]: https://github.com/aquasecurity/btfhub/tree/main/tools
Changelog:
v6 > v7:
- use array instead of hashmap to store ids
- use btf__add_{struct,union}() instead of memcpy()
- don't use fixed path for testing BTF file
- update example to use DECLARE_LIBBPF_OPTS()
v5 > v6:
- use BTF structure to store used member/types instead of hashmaps
- remove support for input/output folders
- remove bpf_core_{created,free}_cand_cache()
- reorganize commits to avoid having unused static functions
- remove usage of libbpf_get_error()
- fix some errno propagation issues
- do not record full types for type-based relocations
- add support for BTF_KIND_FUNC_PROTO
- implement tests based on core_reloc ones
v4 > v5:
- move some checks before invoking prog->obj->gen_loader
- use p_info() instead of printf()
- improve command output
- fix issue with record_relo_core()
- implement bash completion
- write man page
- implement some tests
v3 > v4:
- parse BTF and BTF.ext sections in bpftool and use
bpf_core_calc_relo_insn() directly
- expose less internal details from libbpf to bpftool
- implement support for enum-based relocations
- split commits in a more granular way
v2 > v3:
- expose internal libbpf APIs to bpftool instead
- implement btfgen in bpftool
- drop btf__raw_data() from libbpf
v1 > v2:
- introduce bpf_object__prepare() and ‘record_core_relos’ to expose
CO-RE relocations instead of bpf_object__reloc_info_gen()
- rename btf__save_to_file() to btf__raw_data()
v1: https://lore.kernel.org/bpf/20211027203727.208847-1-mauricio@kinvolk.io/
v2: https://lore.kernel.org/bpf/20211116164208.164245-1-mauricio@kinvolk.io/
v3: https://lore.kernel.org/bpf/20211217185654.311609-1-mauricio@kinvolk.io/
v4: https://lore.kernel.org/bpf/20220112142709.102423-1-mauricio@kinvolk.io/
v5: https://lore.kernel.org/bpf/20220128223312.1253169-1-mauricio@kinvolk.io/
v6: https://lore.kernel.org/bpf/20220209222646.348365-1-mauricio@kinvolk.io/
Mauricio Vásquez (6):
libbpf: split bpf_core_apply_relo()
libbpf: Expose bpf_core_{add,free}_cands() to bpftool
bpftool: Add gen min_core_btf command
bpftool: Implement "gen min_core_btf" logic
bpftool: Implement btfgen_get_btf()
selftests/bpf: Test "bpftool gen min_core_btf"
====================
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
|
|
This commit reuses the core_reloc test to check if the BTF files
generated with "bpftool gen min_core_btf" are correct. This introduces
test_core_btfgen() that runs all the core_reloc tests, but this time
the source BTF files are generated by using "bpftool gen min_core_btf".
The goal of this test is to check that the generated files are usable,
and not to check if the algorithm is creating an optimized BTF file.
Signed-off-by: Mauricio Vásquez <mauricio@kinvolk.io>
Signed-off-by: Rafael David Tinoco <rafael.tinoco@aquasec.com>
Signed-off-by: Lorenzo Fontana <lorenzo.fontana@elastic.co>
Signed-off-by: Leonardo Di Donato <leonardo.didonato@elastic.co>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220215225856.671072-8-mauricio@kinvolk.io
|
|
Add "min_core_btf" feature explanation and one example of how to use it
to bpftool-gen man page.
Signed-off-by: Mauricio Vásquez <mauricio@kinvolk.io>
Signed-off-by: Rafael David Tinoco <rafael.tinoco@aquasec.com>
Signed-off-by: Lorenzo Fontana <lorenzo.fontana@elastic.co>
Signed-off-by: Leonardo Di Donato <leonardo.didonato@elastic.co>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220215225856.671072-7-mauricio@kinvolk.io
|
|
The last part of the BTFGen algorithm is to create a new BTF object with
all the types that were recorded in the previous steps.
This function performs two different steps:
1. Add the types to the new BTF object by using btf__add_type(). Some
special logic around struct and unions is implemented to only add the
members that are really used in the field-based relocations. The type
ID on the new and old BTF objects is stored on a map.
2. Fix all the type IDs on the new BTF object by using the IDs saved in
the previous step.
Signed-off-by: Mauricio Vásquez <mauricio@kinvolk.io>
Signed-off-by: Rafael David Tinoco <rafael.tinoco@aquasec.com>
Signed-off-by: Lorenzo Fontana <lorenzo.fontana@elastic.co>
Signed-off-by: Leonardo Di Donato <leonardo.didonato@elastic.co>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220215225856.671072-6-mauricio@kinvolk.io
|
|
This commit implements the logic for the gen min_core_btf command.
Specifically, it implements the following functions:
- minimize_btf(): receives the path of a source and destination BTF
files and a list of BPF objects. This function records the relocations
for all objects and then generates the BTF file by calling
btfgen_get_btf() (implemented in the following commit).
- btfgen_record_obj(): loads the BTF and BTF.ext sections of the BPF
objects and loops through all CO-RE relocations. It uses
bpf_core_calc_relo_insn() from libbpf and passes the target spec to
btfgen_record_reloc(), that calls one of the following functions
depending on the relocation kind.
- btfgen_record_field_relo(): uses the target specification to mark all
the types that are involved in a field-based CO-RE relocation. In this
case types resolved and marked recursively using btfgen_mark_type().
Only the struct and union members (and their types) involved in the
relocation are marked to optimize the size of the generated BTF file.
- btfgen_record_type_relo(): marks the types involved in a type-based
CO-RE relocation. In this case no members for the struct and union types
are marked as libbpf doesn't use them while performing this kind of
relocation. Pointed types are marked as they are used by libbpf in this
case.
- btfgen_record_enumval_relo(): marks the whole enum type for enum-based
relocations.
Signed-off-by: Mauricio Vásquez <mauricio@kinvolk.io>
Signed-off-by: Rafael David Tinoco <rafael.tinoco@aquasec.com>
Signed-off-by: Lorenzo Fontana <lorenzo.fontana@elastic.co>
Signed-off-by: Leonardo Di Donato <leonardo.didonato@elastic.co>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220215225856.671072-5-mauricio@kinvolk.io
|
|
This command is implemented under the "gen" command in bpftool and the
syntax is the following:
$ bpftool gen min_core_btf INPUT OUTPUT OBJECT [OBJECT...]
INPUT is the file that contains all the BTF types for a kernel and
OUTPUT is the path of the minimize BTF file that will be created with
only the types needed by the objects.
Signed-off-by: Mauricio Vásquez <mauricio@kinvolk.io>
Signed-off-by: Rafael David Tinoco <rafael.tinoco@aquasec.com>
Signed-off-by: Lorenzo Fontana <lorenzo.fontana@elastic.co>
Signed-off-by: Leonardo Di Donato <leonardo.didonato@elastic.co>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220215225856.671072-4-mauricio@kinvolk.io
|
|
Expose bpf_core_add_cands() and bpf_core_free_cands() to handle
candidates list.
Signed-off-by: Mauricio Vásquez <mauricio@kinvolk.io>
Signed-off-by: Rafael David Tinoco <rafael.tinoco@aquasec.com>
Signed-off-by: Lorenzo Fontana <lorenzo.fontana@elastic.co>
Signed-off-by: Leonardo Di Donato <leonardo.didonato@elastic.co>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220215225856.671072-3-mauricio@kinvolk.io
|
|
BTFGen needs to run the core relocation logic in order to understand
what are the types involved in a given relocation.
Currently bpf_core_apply_relo() calculates and **applies** a relocation
to an instruction. Having both operations in the same function makes it
difficult to only calculate the relocation without patching the
instruction. This commit splits that logic in two different phases: (1)
calculate the relocation and (2) patch the instruction.
For the first phase bpf_core_apply_relo() is renamed to
bpf_core_calc_relo_insn() who is now only on charge of calculating the
relocation, the second phase uses the already existing
bpf_core_patch_insn(). bpf_object__relocate_core() uses both of them and
the BTFGen will use only bpf_core_calc_relo_insn().
Signed-off-by: Mauricio Vásquez <mauricio@kinvolk.io>
Signed-off-by: Rafael David Tinoco <rafael.tinoco@aquasec.com>
Signed-off-by: Lorenzo Fontana <lorenzo.fontana@elastic.co>
Signed-off-by: Leonardo Di Donato <leonardo.didonato@elastic.co>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220215225856.671072-2-mauricio@kinvolk.io
|
|
Now kfunc call uses s32 to represent the offset between the address of
kfunc and __bpf_call_base, but it doesn't check whether or not s32 will
be overflowed. The overflow is possible when kfunc is in module and the
offset between module and kernel is greater than 2GB. Take arm64 as an
example, before commit b2eed9b58811 ("arm64/kernel: kaslr: reduce module
randomization range to 2 GB"), the offset between module symbol and
__bpf_call_base will in 4GB range due to KASLR and may overflow s32.
So add an extra checking to reject these invalid kfunc calls.
Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220215065732.3179408-1-houtao1@huawei.com
|
|
Andrii Nakryiko says:
====================
Add minimal C++-specific additions to BPF skeleton codegen to facilitate
easier use of C skeletons in C++ applications. These additions don't add any
extra ongoing maintenance and allows C++ users to fit pure C skeleton better
into their C++ code base. All that without the need to design, implement and
support a separate C++ BPF skeleton implementation.
v1->v2:
- use default argument values in T::open() (Alexei).
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add an example of how to build C++ template-based BPF skeleton wrapper.
It's an actually runnable valid use of skeleton through more C++-like
interface. Note that skeleton destuction happens implicitly through
Skeleton<T>'s destructor.
Also make test_cpp runnable as it would have crashed on invalid btf
passed into btf_dump__new().
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220212055733.539056-3-andrii@kernel.org
|
|
Add C++-specific static methods for code-generated BPF skeleton for each
skeleton operation: open, open_opts, open_and_load, load, attach,
detach, destroy, and elf_bytes. This is to facilitate easier C++
templating on top of pure C BPF skeleton.
In C, open/load/destroy/etc "methods" are of the form
<skeleton_name>__<method>() to avoid name collision with similar
"methods" of other skeletons withint the same application. This works
well, but is very inconvenient for C++ applications that would like to
write generic (templated) wrappers around BPF skeleton to fit in with
C++ code base and take advantage of destructors and other convenient C++
constructs.
This patch makes it easier to build such generic templated wrappers by
additionally defining C++ static methods for skeleton's struct with
fixed names. This allows to refer to, say, open method as `T::open()`
instead of having to somehow generate `T__open()` function call.
Next patch adds an example template to test_cpp selftest to demonstrate
how it's possible to have all the operations wrapped in a generic
Skeleton<my_skeleton> type without explicitly passing function references.
An example of generated declaration section without %1$s placeholders:
#ifdef __cplusplus
static struct test_attach_probe *open(const struct bpf_object_open_opts *opts = nullptr);
static struct test_attach_probe *open_and_load();
static int load(struct test_attach_probe *skel);
static int attach(struct test_attach_probe *skel);
static void detach(struct test_attach_probe *skel);
static void destroy(struct test_attach_probe *skel);
static const void *elf_bytes(size_t *sz);
#endif /* __cplusplus */
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220212055733.539056-2-andrii@kernel.org
|
|
When compiling selftests in -O2 mode with GCC1, we get three new
compilations warnings about potentially uninitialized variables.
Compiler is wrong 2 out of 3 times, but this patch makes GCC11 happy
anyways, as it doesn't cost us anything and makes optimized selftests
build less annoying.
The amazing one is tc_redirect case of token that is malloc()'ed before
ASSERT_OK_PTR() check is done on it. Seems like GCC pessimistically
assumes that libbpf_get_error() will dereference the contents of the
pointer (no it won't), so the only way I found to shut GCC up was to do
zero-initializaing calloc(). This one was new to me.
For linfo case, GCC didn't realize that linfo_size will be initialized
by the function that is returning linfo_size as out parameter.
core_reloc.c case was a real bug, we can goto cleanup before initializing
obj. But we don't need to do any clean up, so just continue iteration
intstead.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220211190927.1434329-1-andrii@kernel.org
|
|
When reworking btf__get_from_id() in commit a19f93cfafdf the error
handling when calling bpf_btf_get_fd_by_id() changed. Before the rework
if bpf_btf_get_fd_by_id() failed the error would not be propagated to
callers of btf__get_from_id(), after the rework it is. This lead to a
change in behavior in print_key_value() that now prints an error when
trying to lookup keys in maps with no btf available.
Fix this by following the way used in dumping maps to allow to look up
keys in no-btf maps, by which it decides whether and where to get the
btf info according to the btf value type.
Fixes: a19f93cfafdf ("libbpf: Add internal helper to load BTF data by FD")
Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/1644249625-22479-1-git-send-email-yinjun.zhang@corigine.com
|
|
When receiving netlink messages, libbpf was using a statically allocated
stack buffer of 4k bytes. This happened to work fine on systems with a 4k
page size, but on systems with larger page sizes it can lead to truncated
messages. The user-visible impact of this was that libbpf would insist no
XDP program was attached to some interfaces because that bit of the netlink
message got chopped off.
Fix this by switching to a dynamically allocated buffer; we borrow the
approach from iproute2 of using recvmsg() with MSG_PEEK|MSG_TRUNC to get
the actual size of the pending message before receiving it, adjusting the
buffer as necessary. While we're at it, also add retries on interrupted
system calls around the recvmsg() call.
v2:
- Move peek logic to libbpf_netlink_recv(), don't double free on ENOMEM.
Fixes: 8bbb77b7c7a2 ("libbpf: Add various netlink helpers")
Reported-by: Zhiqian Guan <zhguan@redhat.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/bpf/20220211234819.612288-1-toke@redhat.com
|
|
Ensure that LIBBPF_0.7.0 inherits everything from LIBBPF_0.6.0.
Fixes: dbdd2c7f8cec ("libbpf: Add API to get/set log_level at per-program level")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220211205235.2089104-1-andrii@kernel.org
|
|
Quentin Monnet says:
====================
Hi, this set aims at updating the way bpftool versions are numbered.
Instead of copying the version from the kernel (given that the sources for
the kernel and bpftool are shipped together), align it on libbpf's version
number, with a fixed offset (6) to avoid going backwards. Please refer to
the description of the second commit for details on the motivations.
The patchset also adds the number of the version of libbpf that was used to
compile to the output of "bpftool version". Bpftool makes such a heavy
usage of libbpf that it makes sense to indicate what version was used to
build it.
v3:
- Compute bpftool's version at compile time, but from the macros exposed by
libbpf instead of calling a shell to compute $(BPFTOOL_VERSION) in the
Makefile.
- Drop the commit which would add a "libbpfversion" target to libbpf's
Makefile. This is no longer necessary.
- Use libbpf's major, minor versions with jsonw_printf() to avoid
offsetting the version string to skip the "v" prefix.
- Reword documentation change.
v2:
- Align on libbpf's version number instead of creating an independent
versioning scheme.
- Use libbpf_version_string() to retrieve and display libbpf's version.
- Re-order patches (1 <-> 2).
====================
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
|
|
Since the notion of versions was introduced for bpftool, it has been
following the version number of the kernel (using the version number
corresponding to the tree in which bpftool's sources are located). The
rationale was that bpftool's features are loosely tied to BPF features
in the kernel, and that we could defer versioning to the kernel
repository itself.
But this versioning scheme is confusing today, because a bpftool binary
should be able to work with both older and newer kernels, even if some
of its recent features won't be available on older systems. Furthermore,
if bpftool is ported to other systems in the future, keeping a
Linux-based version number is not a good option.
Looking at other options, we could either have a totally independent
scheme for bpftool, or we could align it on libbpf's version number
(with an offset on the major version number, to avoid going backwards).
The latter comes with a few drawbacks:
- We may want bpftool releases in-between two libbpf versions. We can
always append pre-release numbers to distinguish versions, although
those won't look as "official" as something with a proper release
number. But at the same time, having bpftool with version numbers that
look "official" hasn't really been an issue so far.
- If no new feature lands in bpftool for some time, we may move from
e.g. 6.7.0 to 6.8.0 when libbpf levels up and have two different
versions which are in fact the same.
- Following libbpf's versioning scheme sounds better than kernel's, but
ultimately it doesn't make too much sense either, because even though
bpftool uses the lib a lot, its behaviour is not that much conditioned
by the internal evolution of the library (or by new APIs that it may
not use).
Having an independent versioning scheme solves the above, but at the
cost of heavier maintenance. Developers will likely forget to increase
the numbers when adding features or bug fixes, and we would take the
risk of having to send occasional "catch-up" patches just to update the
version number.
Based on these considerations, this patch aligns bpftool's version
number on libbpf's. This is not a perfect solution, but 1) it's
certainly an improvement over the current scheme, 2) the issues raised
above are all minor at the moment, and 3) we can still move to an
independent scheme in the future if we realise we need it.
Given that libbpf is currently at version 0.7.0, and bpftool, before
this patch, was at 5.16, we use an offset of 6 for the major version,
bumping bpftool to 6.7.0. Libbpf does not export its patch number;
leave bpftool's patch number at 0 for now.
It remains possible to manually override the version number by setting
BPFTOOL_VERSION when calling make.
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220210104237.11649-3-quentin@isovalent.com
|
|
To help users check what version of libbpf is being used with bpftool,
print the number along with bpftool's own version number.
Output:
$ ./bpftool version
./bpftool v5.16.0
using libbpf v0.7
features: libbfd, libbpf_strict, skeletons
$ ./bpftool version --json --pretty
{
"version": "5.16.0",
"libbpf_version": "0.7",
"features": {
"libbfd": true,
"libbpf_strict": true,
"skeletons": true
}
}
Note that libbpf does not expose its patch number.
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220210104237.11649-2-quentin@isovalent.com
|
|
bpf_prog_pack causes build error with powerpc ppc64_defconfig:
kernel/bpf/core.c:830:23: error: variably modified 'bitmap' at file scope
830 | unsigned long bitmap[BITS_TO_LONGS(BPF_PROG_CHUNK_COUNT)];
| ^~~~~~
This is because the marco expands as:
unsigned long bitmap[((((((1UL) << (16 + __pte_index_size)) / (1 << 6))) \
+ ((sizeof(long) * 8)) - 1) / ((sizeof(long) * 8)))];
where __pte_index_size is a global variable.
Fix it by turning bitmap into a 0-length array.
Fixes: 57631054fae6 ("bpf: Introduce bpf_prog_pack allocator")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Song Liu <song@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220211024939.2962537-1-song@kernel.org
|
|
Update test_xdp_update_frags adding a test for a buffer size
set to (MAX_SKB_FRAGS + 2) * PAGE_SIZE. The kernel is supposed
to return -ENOMEM.
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/3e4afa0ee4976854b2f0296998fe6754a80b62e5.1644366736.git.lorenzo@kernel.org
|
|
Alexei Starovoitov says:
====================
The libbpf performs a set of complex operations to load BPF programs.
With "loader program" and "CO-RE in the kernel" the loading job of
libbpf was diminished. The light skeleton became lean enough to perform
program loading and map creation tasks without libbpf.
It's now possible to tweak it further to make light skeleton usable
out of user space and out of kernel module.
This allows bpf_preload.ko to drop user-mode-driver usage,
drop host compiler dependency, allow cross compilation and simplify the
code. It's a building block toward safe and portable kernel modules.
v3->v4:
- inlined skel_prep_init_value() as direct assignment in lskel
v2->v3:
- dropped vm_mmap() and switched to bpf_loader_ctx->flags & KERNEL approach.
It allows bpf_preload.ko to be built-in.
The kernel is able to load bpf progs before init process starts.
- added comments (Yonghong's review)
- added error checks in lskel (Andrii's review)
- added Acks in all but 2nd patch.
v1->v2:
- removed redundant anon struct and added comments (Andrii's reivew)
- added Yonghong's ack
- fixed build warning when JIT is off
====================
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
The main change is a move of the single line
#include "iterators.lskel.h"
from iterators/iterators.c to bpf_preload_kern.c.
Which means that generated light skeleton can be used from user space or
user mode driver like iterators.c or from the kernel module or the kernel itself.
The direct use of light skeleton from the kernel module simplifies the code,
since UMD is no longer necessary. The libbpf.a required user space and UMD. The
CO-RE in the kernel and generated "loader bpf program" used by the light
skeleton are capable to perform complex loading operations traditionally
provided by libbpf. In addition UMD approach was launching UMD process
every time bpffs has to be mounted. With light skeleton in the kernel
the bpf_preload kernel module loads bpf iterators once and pins them
multiple times into different bpffs mounts.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220209232001.27490-6-alexei.starovoitov@gmail.com
|
|
Light skeleton and skel_internal.h have changed.
Update iterators.lskel.h.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220209232001.27490-5-alexei.starovoitov@gmail.com
|
|
Generealize light skeleton by hiding mmap details in skel_internal.h
In this form generated lskel.h is usable both by user space and by the kernel.
Note that previously #include <bpf/bpf.h> was in *.lskel.h file.
To avoid #ifdef-s in a generated lskel.h the include of bpf.h is moved
to skel_internal.h, but skel_internal.h is also used by gen_loader.c
which is part of libbpf. Therefore skel_internal.h does #include "bpf.h"
in case of user space, so gen_loader.c and lskel.h have necessary definitions.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220209232001.27490-4-alexei.starovoitov@gmail.com
|
|
Prepare light skeleton to be used in the kernel module and in the user space.
The look and feel of lskel.h is mostly the same with the difference that for
user space the skel->rodata is the same pointer before and after skel_load
operation, while in the kernel the skel->rodata after skel_open and the
skel->rodata after skel_load are different pointers.
Typical usage of skeleton remains the same for kernel and user space:
skel = my_bpf__open();
skel->rodata->my_global_var = init_val;
err = my_bpf__load(skel);
err = my_bpf__attach(skel);
// access skel->rodata->my_global_var;
// access skel->bss->another_var;
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220209232001.27490-3-alexei.starovoitov@gmail.com
|
|
bpf_sycall programs can be used directly by the kernel modules
to load programs and create maps via kernel skeleton.
. Export bpf_sys_bpf syscall wrapper to be used in kernel skeleton.
. Export bpf_map_get to be used in kernel skeleton.
. Allow prog_run cmd for bpf_syscall programs with recursion check.
. Enable link_create and raw_tp_open cmds.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220209232001.27490-2-alexei.starovoitov@gmail.com
|
|
drivers/net/dsa/qca8k.c:422:37-43: ERROR: application of sizeof to pointer
sizeof when applied to a pointer typed expression gives the size of
the pointer
Generated by: scripts/coccinelle/misc/noderef.cocci
Fixes: 90386223f44e ("net: dsa: qca8k: add support for larger read/write size with mgmt Ethernet")
CC: Ansuel Smith <ansuelsmth@gmail.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: kernel test robot <lkp@intel.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220209221304.GA17529@d2214a582157
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Replace zero-length array with flexible-array member and make use
of the struct_size() helper in kmalloc(). For example:
struct switchdev_deferred_item {
...
unsigned long data[];
};
Make use of the struct_size() helper instead of an open-coded version
in order to avoid any potential type mistakes.
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Minghao Chi (CGEL ZTE) <chi.minghao@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit 563f8e97e054 ("ipv4: Stop taking ECN bits into account in
fib4-rules") replaced the validation test on frh->tos. While the new
test is stricter for ECN bits, it doesn't detect the use of high order
DSCP bits. This would be fine if IPv4 could properly handle them. But
currently, most IPv4 lookups are done with the three high DSCP bits
masked. Therefore, using these bits doesn't lead to the expected
result.
Let's reject such configurations again, so that nobody starts to
use and make any assumption about how the stack handles the three high
order DSCP bits in fib4 rules.
Fixes: 563f8e97e054 ("ipv4: Stop taking ECN bits into account in fib4-rules")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch adds TC feature for VFs also. When MCAM
rules are allocated for a VF then either TC or ntuple
filters can be used. Below are the commands to use
TC feature for a VF(say lbk0):
devlink dev param set pci/0002:01:00.1 name mcam_count value 16 \
cmode runtime
ethtool -K lbk0 hw-tc-offload on
ifconfig lbk0 up
tc qdisc add dev lbk0 ingress
tc filter add dev lbk0 parent ffff: protocol ip flower skip_sw \
dst_mac 98:03:9b:83:aa:12 action police rate 100Mbit burst 5000
Also to modify any fields of the hardware context with
NIX_AQ_INSTOP_WRITE command then corresponding masks of those
fields must be set as per hardware. This was missing in
ingress ratelimiting context. This patch sets those masks also.
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Having to acquire rtnl from netdev_run_todo() for every dismantled
device is not desirable when/if rtnl is under stress.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Device firmware can assert if the device shutdown path in driver
encounters an async. events from mfw (processed in
qed_mcp_handle_events()) after qed_mcp_unload_req() returns.
A call to qed_mcp_unload_req() currently marks the device as inactive
and thus stops any new events, but there is a windows where in-flight
events might still be received by the driver.
To prevent this race condition, atomically set QED_MCP_BYPASS_PROC_BIT
in qed_mcp_unload_req() to make sure qed_mcp_handle_events() ignores all
events. Wait for any event that might already be in-process to complete
by monitoring QED_MCP_IN_PROCESSING_BIT.
Signed-off-by: Pravin Kumar Ganesh Dhende <pdhende@marvell.com>
Signed-off-by: Venkata Sudheer Kumar Bhavaraju <vbhavaraju@marvell.com>
Signed-off-by: Alok Prasad <palok@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Jakub Kicinski says:
====================
net: ping6: support basic socket cmsgs
Add support for common SOL_SOCKET cmsgs in ICMPv6 sockets.
Extend the cmsg tests to cover more cmsgs and socket types.
SOL_IPV6 cmsgs to follow.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Test TIMESTAMPING and TXTIME across UDP / ICMP and IP versions.
Before ICMPv6 support:
# ./tools/testing/selftests/net/cmsg_time.sh
Case ICMPv6 - ts cnt returned '0', expected '2'
Case ICMPv6 - ts0 SCHED returned '', expected 'OK'
Case ICMPv6 - ts0 SND returned '', expected 'OK'
Case ICMPv6 - TXTIME abs returned '', expected 'OK'
Case ICMPv6 - TXTIME rel returned '', expected 'OK'
FAIL - 5/36 cases failed
After:
# ./tools/testing/selftests/net/cmsg_time.sh
OK
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Support requesting Tx timestamps:
$ ./cmsg_sender -p i -t -4 $tgt 123 -d 1000
SCHED ts0 61us
SND ts0 1071us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add ability to send delayed packets.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Test if setting SO_MARK with setsockopt works and if cmsg
takes precedence over it.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use new capabilities of cmsg_sender to test ICMP and RAW sockets,
previously only UDP was tested.
Before SO_MARK support was added to ICMPv6:
# ./cmsg_so_mark.sh
Case ICMP rejection returned 0, expected 1
FAIL - 1/12 cases failed
After:
# ./cmsg_so_mark.sh
OK
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Support sending fake ICMP(v6) messages and UDP via RAW sockets.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Parametrize the code so that it can support UDP and ICMP
sockets in the future, and more cmsg types.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Rename the file in prep for generalization.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Minor reordering of the code and a call to sock_cmsg_send()
gives us support for setting the common socket options via
cmsg (the usual ones - SO_MARK, SO_TIMESTAMPING_OLD, SCM_TXTIME).
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Nothing prevents the user from requesting timestamping
on ping6 sockets, yet timestamps are not going to be reported.
Plumb the flags through.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|