Merge tag 'kbuild-v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild updates from Masahiro Yamada: - Add generic support for built-in boot DTB files - Enable TAB cycling for dialog buttons in nconfig - Fix issues in streamline_config.pl - Refactor Kconfig - Add support for Clang's AutoFDO (Automatic Feedback-Directed Optimization) - Add support for Clang's Propeller, a profile-guided optimization. - Change the working directory to the external module directory for M= builds - Support building external modules in a separate output directory - Enable objtool for *.mod.o and additional kernel objects - Use lz4 instead of deprecated lz4c - Work around a performance issue with "git describe" - Refactor modpost * tag 'kbuild-v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (85 commits) kbuild: rename .tmp_vmlinux.kallsyms0.syms to .tmp_vmlinux0.syms gitignore: Don't ignore 'tags' directory kbuild: add dependency from vmlinux to resolve_btfids modpost: replace tdb_hash() with hash_str() kbuild: deb-pkg: add python3:native to build dependency genksyms: reduce indentation in export_symbol() modpost: improve error messages in device_id_check() modpost: rename alias symbol for MODULE_DEVICE_TABLE() modpost: rename variables in handle_moddevtable() modpost: move strstarts() to modpost.h modpost: convert do_usb_table() to a generic handler modpost: convert do_of_table() to a generic handler modpost: convert do_pnp_device_entry() to a generic handler modpost: convert do_pnp_card_entries() to a generic handler modpost: call module_alias_printf() from all do_*_entry() functions modpost: pass (struct module *) to do_*_entry() functions modpost: remove DEF_FIELD_ADDR_VAR() macro modpost: deduplicate MODULE_ALIAS() for all drivers modpost: introduce module_alias_printf() helper modpost: remove unnecessary check in do_acpi_entry() ...
author: Linus Torvalds <torvalds@linux-foundation.org> 2024-11-30 13:41:50 -0800
committer: Linus Torvalds <torvalds@linux-foundation.org> 2024-11-30 13:41:50 -0800
commit: 6a34dfa15d6edf7e78b8118d862d2db0889cf669 (patch)
tree: f5bc36ccaa5251fa660cf2efdb00d6ffa3ace36b /Documentation
parent: 0e287d31b62bb53ad81d5e59778384a40f8b6f56 (diff)
parent: e6064da6461f989a357f2e280d7f8d4155267c4c (diff)
8 files changed, 390 insertions, 19 deletions
diff --git a/Documentation/dev-tools/autofdo.rst b/Documentation/dev-tools/autofdo.rst
new file mode 100644
index 000000000000..1f0a451e9ccd
--- /dev/null
+++ b/Documentation/dev-tools/autofdo.rst
@@ -0,0 +1,168 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===================================
+Using AutoFDO with the Linux kernel
+===================================
+
+This enables AutoFDO build support for the kernel when using
+the Clang compiler. AutoFDO (Auto-Feedback-Directed Optimization)
+is a type of profile-guided optimization (PGO) used to enhance the
+performance of binary executables. It gathers information about the
+frequency of execution of various code paths within a binary using
+hardware sampling. This data is then used to guide the compiler's
+optimization decisions, resulting in a more efficient binary. AutoFDO
+is a powerful optimization technique, and data indicates that it can
+significantly improve kernel performance. It's especially beneficial
+for workloads affected by front-end stalls.
+
+For AutoFDO builds, unlike non-FDO builds, the user must supply a
+profile. Acquiring an AutoFDO profile can be done in several ways.
+AutoFDO profiles are created by converting hardware sampling using
+the "perf" tool. It is crucial that the workload used to create these
+perf files is representative; they must exhibit runtime
+characteristics similar to the workloads that are intended to be
+optimized. Failure to do so will result in the compiler optimizing
+for the wrong objective.
+
+The AutoFDO profile often encapsulates the program's behavior. If the
+performance-critical codes are architecture-independent, the profile
+can be applied across platforms to achieve performance gains. For
+instance, using the profile generated on Intel architecture to build
+a kernel for AMD architecture can also yield performance improvements.
+
+There are two methods for acquiring a representative profile:
+(1) Sample real workloads using a production environment.
+(2) Generate the profile using a representative load test.
+When enabling the AutoFDO build configuration without providing an
+AutoFDO profile, the compiler only modifies the dwarf information in
+the kernel without impacting runtime performance. It's advisable to
+use a kernel binary built with the same AutoFDO configuration to
+collect the perf profile. While it's possible to use a kernel built
+with different options, it may result in inferior performance.
+
+One can collect profiles using AutoFDO build for the previous kernel.
+AutoFDO employs relative line numbers to match the profiles, offering
+some tolerance for source changes. This mode is commonly used in a
+production environment for profile collection.
+
+In a profile collection based on a load test, the AutoFDO collection
+process consists of the following steps:
+
+#. Initial build: The kernel is built with AutoFDO options
+   without a profile.
+
+#. Profiling: The above kernel is then run with a representative
+   workload to gather execution frequency data. This data is
+   collected using hardware sampling, via perf. AutoFDO is most
+   effective on platforms supporting advanced PMU features like
+   LBR on Intel machines.
+
+#. AutoFDO profile generation: Perf output file is converted to
+   the AutoFDO profile via offline tools.
+
+The support requires a Clang compiler LLVM 17 or later.
+
+Preparation
+===========
+
+Configure the kernel with::
+
+   CONFIG_AUTOFDO_CLANG=y
+
+Customization
+=============
+
+The default CONFIG_AUTOFDO_CLANG setting covers kernel space objects for
+AutoFDO builds. One can, however, enable or disable AutoFDO build for
+individual files and directories by adding a line similar to the following
+to the respective kernel Makefile:
+
+- For enabling a single file (e.g. foo.o) ::
+
+   AUTOFDO_PROFILE_foo.o := y
+
+- For enabling all files in one directory ::
+
+   AUTOFDO_PROFILE := y
+
+- For disabling one file ::
+
+   AUTOFDO_PROFILE_foo.o := n
+
+- For disabling all files in one directory ::
+
+   AUTOFDO_PROFILE := n
+
+Workflow
+========
+
+Here is an example workflow for AutoFDO kernel:
+
+1)  Build the kernel on the host machine with LLVM enabled,
+    for example, ::
+
+      $ make menuconfig LLVM=1
+
+    Turn on AutoFDO build config::
+
+      CONFIG_AUTOFDO_CLANG=y
+
+    With a configuration that with LLVM enabled, use the following command::
+
+      $ scripts/config -e AUTOFDO_CLANG
+
+    After getting the config, build with ::
+
+      $ make LLVM=1
+
+2) Install the kernel on the test machine.
+
+3) Run the load tests. The '-c' option in perf specifies the sample
+   event period. We suggest using a suitable prime number, like 500009,
+   for this purpose.
+
+   - For Intel platforms::
+
+      $ perf record -e BR_INST_RETIRED.NEAR_TAKEN:k -a -N -b -c <count> -o <perf_file> -- <loadtest>
+
+   - For AMD platforms:
+
+     The supported systems are: Zen3 with BRS, or Zen4 with amd_lbr_v2. To check,
+
+     For Zen3::
+
+      $ cat proc/cpuinfo | grep " brs"
+
+     For Zen4::
+
+      $ cat proc/cpuinfo | grep amd_lbr_v2
+
+     The following command generated the perf data file::
+
+      $ perf record --pfm-events RETIRED_TAKEN_BRANCH_INSTRUCTIONS:k -a -N -b -c <count> -o <perf_file> -- <loadtest>
+
+4) (Optional) Download the raw perf file to the host machine.
+
+5) To generate an AutoFDO profile, two offline tools are available:
+   create_llvm_prof and llvm_profgen. The create_llvm_prof tool is part
+   of the AutoFDO project and can be found on GitHub
+   (https://github.com/google/autofdo), version v0.30.1 or later.
+   The llvm_profgen tool is included in the LLVM compiler itself. It's
+   important to note that the version of llvm_profgen doesn't need to match
+   the version of Clang. It needs to be the LLVM 19 release of Clang
+   or later, or just from the LLVM trunk. ::
+
+      $ llvm-profgen --kernel --binary=<vmlinux> --perfdata=<perf_file> -o <profile_file>
+
+   or ::
+
+      $ create_llvm_prof --binary=<vmlinux> --profile=<perf_file> --format=extbinary --out=<profile_file>
+
+   Note that multiple AutoFDO profile files can be merged into one via::
+
+      $ llvm-profdata merge -o <profile_file> <profile_1> <profile_2> ... <profile_n>
+
+6) Rebuild the kernel using the AutoFDO profile file with the same config as step 1,
+   (Note CONFIG_AUTOFDO_CLANG needs to be enabled)::
+
+      $ make LLVM=1 CLANG_AUTOFDO_PROFILE=<profile_file>
diff --git a/Documentation/dev-tools/coccinelle.rst b/Documentation/dev-tools/coccinelle.rst
index 535ce126fb4f..6e70a1e9a3c0 100644
--- a/Documentation/dev-tools/coccinelle.rst
+++ b/Documentation/dev-tools/coccinelle.rst
@@ -250,25 +250,17 @@ variables for .cocciconfig is as follows:
 - Your directory from which spatch is called is processed next
 - The directory provided with the ``--dir`` option is processed last, if used
 
-Since coccicheck runs through make, it naturally runs from the kernel
-proper dir; as such the second rule above would be implied for picking up a
-.cocciconfig when using ``make coccicheck``.
-
 ``make coccicheck`` also supports using M= targets. If you do not supply
 any M= target, it is assumed you want to target the entire kernel.
 The kernel coccicheck script has::
 
-    if [ "$KBUILD_EXTMOD" = "" ] ; then
-        OPTIONS="--dir $srctree $COCCIINCLUDE"
-    else
-        OPTIONS="--dir $KBUILD_EXTMOD $COCCIINCLUDE"
-    fi
-
-KBUILD_EXTMOD is set when an explicit target with M= is used. For both cases
-the spatch ``--dir`` argument is used, as such third rule applies when whether
-M= is used or not, and when M= is used the target directory can have its own
-.cocciconfig file. When M= is not passed as an argument to coccicheck the
-target directory is the same as the directory from where spatch was called.
+    OPTIONS="--dir $srcroot $COCCIINCLUDE"
+
+Here, $srcroot refers to the source directory of the target: it points to the
+external module's source directory when M= used, and otherwise, to the kernel
+source directory. The third rule ensures the spatch reads the .cocciconfig from
+the target directory, allowing external modules to have their own .cocciconfig
+file.
 
 If not using the kernel's coccicheck target, keep the above precedence
 order logic of .cocciconfig reading. If using the kernel's coccicheck target,
diff --git a/Documentation/dev-tools/index.rst b/Documentation/dev-tools/index.rst
index 53d4d124f9c5..3c0ac08b2709 100644
--- a/Documentation/dev-tools/index.rst
+++ b/Documentation/dev-tools/index.rst
@@ -34,6 +34,8 @@ Documentation/dev-tools/testing-overview.rst
    ktap
    checkuapi
    gpio-sloppy-logic-analyzer
+   autofdo
+   propeller
 
 
 .. only::  subproject and html
diff --git a/Documentation/dev-tools/propeller.rst b/Documentation/dev-tools/propeller.rst
new file mode 100644
index 000000000000..92195958e3db
--- /dev/null
+++ b/Documentation/dev-tools/propeller.rst
@@ -0,0 +1,162 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=====================================
+Using Propeller with the Linux kernel
+=====================================
+
+This enables Propeller build support for the kernel when using Clang
+compiler. Propeller is a profile-guided optimization (PGO) method used
+to optimize binary executables. Like AutoFDO, it utilizes hardware
+sampling to gather information about the frequency of execution of
+different code paths within a binary. Unlike AutoFDO, this information
+is then used right before linking phase to optimize (among others)
+block layout within and across functions.
+
+A few important notes about adopting Propeller optimization:
+
+#. Although it can be used as a standalone optimization step, it is
+   strongly recommended to apply Propeller on top of AutoFDO,
+   AutoFDO+ThinLTO or Instrument FDO. The rest of this document
+   assumes this paradigm.
+
+#. Propeller uses another round of profiling on top of
+   AutoFDO/AutoFDO+ThinLTO/iFDO. The whole build process involves
+   "build-afdo - train-afdo - build-propeller - train-propeller -
+   build-optimized".
+
+#. Propeller requires LLVM 19 release or later for Clang/Clang++
+   and the linker(ld.lld).
+
+#. In addition to LLVM toolchain, Propeller requires a profiling
+   conversion tool: https://github.com/google/autofdo with a release
+   after v0.30.1: https://github.com/google/autofdo/releases/tag/v0.30.1.
+
+The Propeller optimization process involves the following steps:
+
+#. Initial building: Build the AutoFDO or AutoFDO+ThinLTO binary as
+   you would normally do, but with a set of compile-time / link-time
+   flags, so that a special metadata section is created within the
+   kernel binary. The special section is only intend to be used by the
+   profiling tool, it is not part of the runtime image, nor does it
+   change kernel run time text sections.
+
+#. Profiling: The above kernel is then run with a representative
+   workload to gather execution frequency data. This data is collected
+   using hardware sampling, via perf. Propeller is most effective on
+   platforms supporting advanced PMU features like LBR on Intel
+   machines. This step is the same as profiling the kernel for AutoFDO
+   (the exact perf parameters can be different).
+
+#. Propeller profile generation: Perf output file is converted to a
+   pair of Propeller profiles via an offline tool.
+
+#. Optimized build: Build the AutoFDO or AutoFDO+ThinLTO optimized
+   binary as you would normally do, but with a compile-time /
+   link-time flag to pick up the Propeller compile time and link time
+   profiles. This build step uses 3 profiles - the AutoFDO profile,
+   the Propeller compile-time profile and the Propeller link-time
+   profile.
+
+#. Deployment: The optimized kernel binary is deployed and used
+   in production environments, providing improved performance
+   and reduced latency.
+
+Preparation
+===========
+
+Configure the kernel with::
+
+   CONFIG_AUTOFDO_CLANG=y
+   CONFIG_PROPELLER_CLANG=y
+
+Customization
+=============
+
+The default CONFIG_PROPELLER_CLANG setting covers kernel space objects
+for Propeller builds. One can, however, enable or disable Propeller build
+for individual files and directories by adding a line similar to the
+following to the respective kernel Makefile:
+
+- For enabling a single file (e.g. foo.o)::
+
+   PROPELLER_PROFILE_foo.o := y
+
+- For enabling all files in one directory::
+
+   PROPELLER_PROFILE := y
+
+- For disabling one file::
+
+   PROPELLER_PROFILE_foo.o := n
+
+- For disabling all files in one directory::
+
+   PROPELLER__PROFILE := n
+
+
+Workflow
+========
+
+Here is an example workflow for building an AutoFDO+Propeller kernel:
+
+1) Assuming an AutoFDO profile is already collected following
+   instructions in the AutoFDO document, build the kernel on the host
+   machine, with AutoFDO and Propeller build configs ::
+
+      CONFIG_AUTOFDO_CLANG=y
+      CONFIG_PROPELLER_CLANG=y
+
+   and ::
+
+      $ make LLVM=1 CLANG_AUTOFDO_PROFILE=<autofdo-profile-name>
+
+2) Install the kernel on the test machine.
+
+3) Run the load tests. The '-c' option in perf specifies the sample
+   event period. We suggest using a suitable prime number, like 500009,
+   for this purpose.
+
+   - For Intel platforms::
+
+      $ perf record -e BR_INST_RETIRED.NEAR_TAKEN:k -a -N -b -c <count> -o <perf_file> -- <loadtest>
+
+   - For AMD platforms::
+
+      $ perf record --pfm-event RETIRED_TAKEN_BRANCH_INSTRUCTIONS:k -a -N -b -c <count> -o <perf_file> -- <loadtest>
+
+   Note you can repeat the above steps to collect multiple <perf_file>s.
+
+4) (Optional) Download the raw perf file(s) to the host machine.
+
+5) Use the create_llvm_prof tool (https://github.com/google/autofdo) to
+   generate Propeller profile. ::
+
+      $ create_llvm_prof --binary=<vmlinux> --profile=<perf_file>
+                         --format=propeller --propeller_output_module_name
+                         --out=<propeller_profile_prefix>_cc_profile.txt
+                         --propeller_symorder=<propeller_profile_prefix>_ld_profile.txt
+
+   "<propeller_profile_prefix>" can be something like "/home/user/dir/any_string".
+
+   This command generates a pair of Propeller profiles:
+   "<propeller_profile_prefix>_cc_profile.txt" and
+   "<propeller_profile_prefix>_ld_profile.txt".
+
+   If there are more than 1 perf_file collected in the previous step,
+   you can create a temp list file "<perf_file_list>" with each line
+   containing one perf file name and run::
+
+      $ create_llvm_prof --binary=<vmlinux> --profile=@<perf_file_list>
+                         --format=propeller --propeller_output_module_name
+                         --out=<propeller_profile_prefix>_cc_profile.txt
+                         --propeller_symorder=<propeller_profile_prefix>_ld_profile.txt
+
+6) Rebuild the kernel using the AutoFDO and Propeller
+   profiles. ::
+
+      CONFIG_AUTOFDO_CLANG=y
+      CONFIG_PROPELLER_CLANG=y
+
+   and ::
+
+      $ make LLVM=1 CLANG_AUTOFDO_PROFILE=<profile_file> CLANG_PROPELLER_PROFILE_PREFIX=<propeller_profile_prefix>
diff --git a/Documentation/kbuild/kbuild.rst b/Documentation/kbuild/kbuild.rst
index 1796b3eba37b..17c9f920f03d 100644
--- a/Documentation/kbuild/kbuild.rst
+++ b/Documentation/kbuild/kbuild.rst
@@ -137,12 +137,18 @@ Specify the output directory when building the kernel.
 This variable can also be used to point to the kernel output directory when
 building external modules against a pre-built kernel in a separate build
 directory. Please note that this does NOT specify the output directory for the
-external modules themselves.
+external modules themselves. (Use KBUILD_EXTMOD_OUTPUT for that purpose.)
 
 The output directory can also be specified using "O=...".
 
 Setting "O=..." takes precedence over KBUILD_OUTPUT.
 
+KBUILD_EXTMOD_OUTPUT
+--------------------
+Specify the output directory for external modules.
+
+Setting "MO=..." takes precedence over KBUILD_EXTMOD_OUTPUT.
+
 KBUILD_EXTRA_WARN
 -----------------
 Specify the extra build checks. The same value can be assigned by passing
diff --git a/Documentation/kbuild/kconfig-language.rst b/Documentation/kbuild/kconfig-language.rst
index 43037be96a16..2619fdf56e68 100644
--- a/Documentation/kbuild/kconfig-language.rst
+++ b/Documentation/kbuild/kconfig-language.rst
@@ -412,8 +412,8 @@ choices::
 	<choice block>
 	"endchoice"
 
-This defines a choice group and accepts any of the above attributes as
-options.
+This defines a choice group and accepts "prompt", "default", "depends on", and
+"help" attributes as options.
 
 A choice only allows a single config entry to be selected.
 
diff --git a/Documentation/kbuild/makefiles.rst b/Documentation/kbuild/makefiles.rst
index 7964e0c245ae..d36519f194dc 100644
--- a/Documentation/kbuild/makefiles.rst
+++ b/Documentation/kbuild/makefiles.rst
@@ -449,6 +449,20 @@ $(obj)
   to prerequisites are referenced with $(src) (because they are not
   generated files).
 
+$(srcroot)
+  $(srcroot) refers to the root of the source you are building, which can be
+  either the kernel source or the external modules source, depending on whether
+  KBUILD_EXTMOD is set. This can be either a relative or an absolute path, but
+  if KBUILD_ABS_SRCTREE=1 is set, it is always an absolute path.
+
+$(srctree)
+  $(srctree) refers to the root of the kernel source tree. When building the
+  kernel, this is the same as $(srcroot).
+
+$(objtree)
+  $(objtree) refers to the root of the kernel object tree. It is ``.`` when
+  building the kernel, but it is different when building external modules.
+
 $(kecho)
   echoing information to user in a rule is often a good practice
   but when execution ``make -s`` one does not expect to see any output
diff --git a/Documentation/kbuild/modules.rst b/Documentation/kbuild/modules.rst
index cd5a54d91e6d..101de236cd0c 100644
--- a/Documentation/kbuild/modules.rst
+++ b/Documentation/kbuild/modules.rst
@@ -59,6 +59,12 @@ Command Syntax
 
 		$ make -C /lib/modules/`uname -r`/build M=$PWD modules_install
 
+	Starting from Linux 6.13, you can use the -f option instead of -C. This
+	will avoid unnecessary change of the working directory. The external
+	module will be output to the directory where you invoke make.
+
+		$ make -f /lib/modules/`uname -r`/build/Makefile M=$PWD
+
 Options
 -------
 
@@ -66,7 +72,10 @@ Options
 	of the kernel output directory if the kernel was built in a separate
 	build directory.)
 
-	make -C $KDIR M=$PWD
+	You can optionally pass MO= option if you want to build the modules in
+	a separate directory.
+
+	make -C $KDIR M=$PWD [MO=$BUILD_DIR]
 
 	-C $KDIR
 		The directory that contains the kernel and relevant build
@@ -80,6 +89,9 @@ Options
 		directory where the external module (kbuild file) is
 		located.
 
+	MO=$BUILD_DIR
+		Specifies a separate output directory for the external module.
+
 Targets
 -------
 
@@ -215,6 +227,21 @@ Separate Kbuild File and Makefile
 	consisting of several hundred lines, and here it really pays
 	off to separate the kbuild part from the rest.
 
+	Linux 6.13 and later support another way. The external module Makefile
+	can include the kernel Makefile directly, rather than invoking sub Make.
+
+	Example 3::
+
+		--> filename: Kbuild
+		obj-m  := 8123.o
+		8123-y := 8123_if.o 8123_pci.o
+
+		--> filename: Makefile
+		KDIR ?= /lib/modules/$(shell uname -r)/build
+		export KBUILD_EXTMOD := $(realpath $(dir $(lastword $(MAKEFILE_LIST))))
+		include $(KDIR)/Makefile
+
+
 Building Multiple Modules
 -------------------------
author	Linus Torvalds <torvalds@linux-foundation.org>	2024-11-30 13:41:50 -0800
committer	Linus Torvalds <torvalds@linux-foundation.org>	2024-11-30 13:41:50 -0800
commit	6a34dfa15d6edf7e78b8118d862d2db0889cf669 (patch)
tree	f5bc36ccaa5251fa660cf2efdb00d6ffa3ace36b /Documentation
parent	0e287d31b62bb53ad81d5e59778384a40f8b6f56 (diff)
parent	e6064da6461f989a357f2e280d7f8d4155267c4c (diff)