summaryrefslogtreecommitdiff
path: root/Documentation/userspace-api
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/userspace-api')
-rw-r--r--Documentation/userspace-api/check_exec.rst144
-rw-r--r--Documentation/userspace-api/index.rst2
-rw-r--r--Documentation/userspace-api/ioctl/ioctl-number.rst2
-rw-r--r--Documentation/userspace-api/netlink/c-code-gen.rst4
-rw-r--r--Documentation/userspace-api/netlink/intro-specs.rst8
-rw-r--r--Documentation/userspace-api/ntsync.rst385
-rw-r--r--Documentation/userspace-api/sysfs-platform_profile.rst38
7 files changed, 578 insertions, 5 deletions
diff --git a/Documentation/userspace-api/check_exec.rst b/Documentation/userspace-api/check_exec.rst
new file mode 100644
index 000000000000..05dfe3b56f71
--- /dev/null
+++ b/Documentation/userspace-api/check_exec.rst
@@ -0,0 +1,144 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. Copyright © 2024 Microsoft Corporation
+
+===================
+Executability check
+===================
+
+The ``AT_EXECVE_CHECK`` :manpage:`execveat(2)` flag, and the
+``SECBIT_EXEC_RESTRICT_FILE`` and ``SECBIT_EXEC_DENY_INTERACTIVE`` securebits
+are intended for script interpreters and dynamic linkers to enforce a
+consistent execution security policy handled by the kernel. See the
+`samples/check-exec/inc.c`_ example.
+
+Whether an interpreter should check these securebits or not depends on the
+security risk of running malicious scripts with respect to the execution
+environment, and whether the kernel can check if a script is trustworthy or
+not. For instance, Python scripts running on a server can use arbitrary
+syscalls and access arbitrary files. Such interpreters should then be
+enlighten to use these securebits and let users define their security policy.
+However, a JavaScript engine running in a web browser should already be
+sandboxed and then should not be able to harm the user's environment.
+
+Script interpreters or dynamic linkers built for tailored execution environments
+(e.g. hardened Linux distributions or hermetic container images) could use
+``AT_EXECVE_CHECK`` without checking the related securebits if backward
+compatibility is handled by something else (e.g. atomic update ensuring that
+all legitimate libraries are allowed to be executed). It is then recommended
+for script interpreters and dynamic linkers to check the securebits at run time
+by default, but also to provide the ability for custom builds to behave like if
+``SECBIT_EXEC_RESTRICT_FILE`` or ``SECBIT_EXEC_DENY_INTERACTIVE`` were always
+set to 1 (i.e. always enforce restrictions).
+
+AT_EXECVE_CHECK
+===============
+
+Passing the ``AT_EXECVE_CHECK`` flag to :manpage:`execveat(2)` only performs a
+check on a regular file and returns 0 if execution of this file would be
+allowed, ignoring the file format and then the related interpreter dependencies
+(e.g. ELF libraries, script's shebang).
+
+Programs should always perform this check to apply kernel-level checks against
+files that are not directly executed by the kernel but passed to a user space
+interpreter instead. All files that contain executable code, from the point of
+view of the interpreter, should be checked. However the result of this check
+should only be enforced according to ``SECBIT_EXEC_RESTRICT_FILE`` or
+``SECBIT_EXEC_DENY_INTERACTIVE.``.
+
+The main purpose of this flag is to improve the security and consistency of an
+execution environment to ensure that direct file execution (e.g.
+``./script.sh``) and indirect file execution (e.g. ``sh script.sh``) lead to
+the same result. For instance, this can be used to check if a file is
+trustworthy according to the caller's environment.
+
+In a secure environment, libraries and any executable dependencies should also
+be checked. For instance, dynamic linking should make sure that all libraries
+are allowed for execution to avoid trivial bypass (e.g. using ``LD_PRELOAD``).
+For such secure execution environment to make sense, only trusted code should
+be executable, which also requires integrity guarantees.
+
+To avoid race conditions leading to time-of-check to time-of-use issues,
+``AT_EXECVE_CHECK`` should be used with ``AT_EMPTY_PATH`` to check against a
+file descriptor instead of a path.
+
+SECBIT_EXEC_RESTRICT_FILE and SECBIT_EXEC_DENY_INTERACTIVE
+==========================================================
+
+When ``SECBIT_EXEC_RESTRICT_FILE`` is set, a process should only interpret or
+execute a file if a call to :manpage:`execveat(2)` with the related file
+descriptor and the ``AT_EXECVE_CHECK`` flag succeed.
+
+This secure bit may be set by user session managers, service managers,
+container runtimes, sandboxer tools... Except for test environments, the
+related ``SECBIT_EXEC_RESTRICT_FILE_LOCKED`` bit should also be set.
+
+Programs should only enforce consistent restrictions according to the
+securebits but without relying on any other user-controlled configuration.
+Indeed, the use case for these securebits is to only trust executable code
+vetted by the system configuration (through the kernel), so we should be
+careful to not let untrusted users control this configuration.
+
+However, script interpreters may still use user configuration such as
+environment variables as long as it is not a way to disable the securebits
+checks. For instance, the ``PATH`` and ``LD_PRELOAD`` variables can be set by
+a script's caller. Changing these variables may lead to unintended code
+executions, but only from vetted executable programs, which is OK. For this to
+make sense, the system should provide a consistent security policy to avoid
+arbitrary code execution e.g., by enforcing a write xor execute policy.
+
+When ``SECBIT_EXEC_DENY_INTERACTIVE`` is set, a process should never interpret
+interactive user commands (e.g. scripts). However, if such commands are passed
+through a file descriptor (e.g. stdin), its content should be interpreted if a
+call to :manpage:`execveat(2)` with the related file descriptor and the
+``AT_EXECVE_CHECK`` flag succeed.
+
+For instance, script interpreters called with a script snippet as argument
+should always deny such execution if ``SECBIT_EXEC_DENY_INTERACTIVE`` is set.
+
+This secure bit may be set by user session managers, service managers,
+container runtimes, sandboxer tools... Except for test environments, the
+related ``SECBIT_EXEC_DENY_INTERACTIVE_LOCKED`` bit should also be set.
+
+Here is the expected behavior for a script interpreter according to combination
+of any exec securebits:
+
+1. ``SECBIT_EXEC_RESTRICT_FILE=0`` and ``SECBIT_EXEC_DENY_INTERACTIVE=0``
+
+ Always interpret scripts, and allow arbitrary user commands (default).
+
+ No threat, everyone and everything is trusted, but we can get ahead of
+ potential issues thanks to the call to :manpage:`execveat(2)` with
+ ``AT_EXECVE_CHECK`` which should always be performed but ignored by the
+ script interpreter. Indeed, this check is still important to enable systems
+ administrators to verify requests (e.g. with audit) and prepare for
+ migration to a secure mode.
+
+2. ``SECBIT_EXEC_RESTRICT_FILE=1`` and ``SECBIT_EXEC_DENY_INTERACTIVE=0``
+
+ Deny script interpretation if they are not executable, but allow
+ arbitrary user commands.
+
+ The threat is (potential) malicious scripts run by trusted (and not fooled)
+ users. That can protect against unintended script executions (e.g. ``sh
+ /tmp/*.sh``). This makes sense for (semi-restricted) user sessions.
+
+3. ``SECBIT_EXEC_RESTRICT_FILE=0`` and ``SECBIT_EXEC_DENY_INTERACTIVE=1``
+
+ Always interpret scripts, but deny arbitrary user commands.
+
+ This use case may be useful for secure services (i.e. without interactive
+ user session) where scripts' integrity is verified (e.g. with IMA/EVM or
+ dm-verity/IPE) but where access rights might not be ready yet. Indeed,
+ arbitrary interactive commands would be much more difficult to check.
+
+4. ``SECBIT_EXEC_RESTRICT_FILE=1`` and ``SECBIT_EXEC_DENY_INTERACTIVE=1``
+
+ Deny script interpretation if they are not executable, and also deny
+ any arbitrary user commands.
+
+ The threat is malicious scripts run by untrusted users (but trusted code).
+ This makes sense for system services that may only execute trusted scripts.
+
+.. Links
+.. _samples/check-exec/inc.c:
+ https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/samples/check-exec/inc.c
diff --git a/Documentation/userspace-api/index.rst b/Documentation/userspace-api/index.rst
index 274cc7546efc..b1395d94b3fd 100644
--- a/Documentation/userspace-api/index.rst
+++ b/Documentation/userspace-api/index.rst
@@ -35,6 +35,7 @@ Security-related interfaces
mfd_noexec
spec_ctrl
tee
+ check_exec
Devices and I/O
===============
@@ -63,6 +64,7 @@ Everything else
vduse
futex2
perf_ring_buffer
+ ntsync
.. only:: subproject and html
diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documentation/userspace-api/ioctl/ioctl-number.rst
index 243f1f1b554a..6d1465315df3 100644
--- a/Documentation/userspace-api/ioctl/ioctl-number.rst
+++ b/Documentation/userspace-api/ioctl/ioctl-number.rst
@@ -283,6 +283,7 @@ Code Seq# Include File Comments
'p' 80-9F linux/ppdev.h user-space parport
<mailto:tim@cyberelk.net>
'p' A1-A5 linux/pps.h LinuxPPS
+'p' B1-B3 linux/pps_gen.h LinuxPPS
<mailto:giometti@linux.it>
'q' 00-1F linux/serio.h
'q' 80-FF linux/telephony.h Internet PhoneJACK, Internet LineJACK
@@ -311,6 +312,7 @@ Code Seq# Include File Comments
<mailto:oe@port.de>
'z' 10-4F drivers/s390/crypto/zcrypt_api.h conflict!
'|' 00-7F linux/media.h
+'|' 80-9F samples/ Any sample and example drivers
0x80 00-1F linux/fb.h
0x81 00-1F linux/vduse.h
0x89 00-06 arch/x86/include/asm/sockios.h
diff --git a/Documentation/userspace-api/netlink/c-code-gen.rst b/Documentation/userspace-api/netlink/c-code-gen.rst
index 89de42c13350..46415e6d646d 100644
--- a/Documentation/userspace-api/netlink/c-code-gen.rst
+++ b/Documentation/userspace-api/netlink/c-code-gen.rst
@@ -56,7 +56,9 @@ If ``name-prefix`` is specified it replaces the ``$family-$enum``
portion of the entry name.
Boolean ``render-max`` controls creation of the max values
-(which are enabled by default for attribute enums).
+(which are enabled by default for attribute enums). These max
+values are named ``__$pfx-MAX`` and ``$pfx-MAX``. The name
+of the first value can be overridden via ``enum-cnt-name`` property.
Attributes
==========
diff --git a/Documentation/userspace-api/netlink/intro-specs.rst b/Documentation/userspace-api/netlink/intro-specs.rst
index bada89699455..a4435ae4628d 100644
--- a/Documentation/userspace-api/netlink/intro-specs.rst
+++ b/Documentation/userspace-api/netlink/intro-specs.rst
@@ -15,7 +15,7 @@ developing Netlink related code. The tool is implemented in Python
and can use a YAML specification to issue Netlink requests
to the kernel. Only Generic Netlink is supported.
-The tool is located at ``tools/net/ynl/cli.py``. It accepts
+The tool is located at ``tools/net/ynl/pyynl/cli.py``. It accepts
a handul of arguments, the most important ones are:
- ``--spec`` - point to the spec file
@@ -27,7 +27,7 @@ YAML specs can be found under ``Documentation/netlink/specs/``.
Example use::
- $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/ethtool.yaml \
+ $ ./tools/net/ynl/pyynl/cli.py --spec Documentation/netlink/specs/ethtool.yaml \
--do rings-get \
--json '{"header":{"dev-index": 18}}'
{'header': {'dev-index': 18, 'dev-name': 'eni1np1'},
@@ -75,7 +75,7 @@ the two marker lines like above to a file, add that file to git,
and run the regeneration tool. Grep the tree for ``YNL-GEN``
to see other examples.
-The code generation itself is performed by ``tools/net/ynl/ynl-gen-c.py``
+The code generation itself is performed by ``tools/net/ynl/pyynl/ynl_gen_c.py``
but it takes a few arguments so calling it directly for each file
quickly becomes tedious.
@@ -84,7 +84,7 @@ YNL lib
``tools/net/ynl/lib/`` contains an implementation of a C library
(based on libmnl) which integrates with code generated by
-``tools/net/ynl/ynl-gen-c.py`` to create easy to use netlink wrappers.
+``tools/net/ynl/pyynl/ynl_gen_c.py`` to create easy to use netlink wrappers.
YNL basics
----------
diff --git a/Documentation/userspace-api/ntsync.rst b/Documentation/userspace-api/ntsync.rst
new file mode 100644
index 000000000000..25e7c4aef968
--- /dev/null
+++ b/Documentation/userspace-api/ntsync.rst
@@ -0,0 +1,385 @@
+===================================
+NT synchronization primitive driver
+===================================
+
+This page documents the user-space API for the ntsync driver.
+
+ntsync is a support driver for emulation of NT synchronization
+primitives by user-space NT emulators. It exists because implementation
+in user-space, using existing tools, cannot match Windows performance
+while offering accurate semantics. It is implemented entirely in
+software, and does not drive any hardware device.
+
+This interface is meant as a compatibility tool only, and should not
+be used for general synchronization. Instead use generic, versatile
+interfaces such as futex(2) and poll(2).
+
+Synchronization primitives
+==========================
+
+The ntsync driver exposes three types of synchronization primitives:
+semaphores, mutexes, and events.
+
+A semaphore holds a single volatile 32-bit counter, and a static 32-bit
+integer denoting the maximum value. It is considered signaled (that is,
+can be acquired without contention, or will wake up a waiting thread)
+when the counter is nonzero. The counter is decremented by one when a
+wait is satisfied. Both the initial and maximum count are established
+when the semaphore is created.
+
+A mutex holds a volatile 32-bit recursion count, and a volatile 32-bit
+identifier denoting its owner. A mutex is considered signaled when its
+owner is zero (indicating that it is not owned). The recursion count is
+incremented when a wait is satisfied, and ownership is set to the given
+identifier.
+
+A mutex also holds an internal flag denoting whether its previous owner
+has died; such a mutex is said to be abandoned. Owner death is not
+tracked automatically based on thread death, but rather must be
+communicated using ``NTSYNC_IOC_MUTEX_KILL``. An abandoned mutex is
+inherently considered unowned.
+
+Except for the "unowned" semantics of zero, the actual value of the
+owner identifier is not interpreted by the ntsync driver at all. The
+intended use is to store a thread identifier; however, the ntsync
+driver does not actually validate that a calling thread provides
+consistent or unique identifiers.
+
+An event is similar to a semaphore with a maximum count of one. It holds
+a volatile boolean state denoting whether it is signaled or not. There
+are two types of events, auto-reset and manual-reset. An auto-reset
+event is designaled when a wait is satisfied; a manual-reset event is
+not. The event type is specified when the event is created.
+
+Unless specified otherwise, all operations on an object are atomic and
+totally ordered with respect to other operations on the same object.
+
+Objects are represented by files. When all file descriptors to an
+object are closed, that object is deleted.
+
+Char device
+===========
+
+The ntsync driver creates a single char device /dev/ntsync. Each file
+description opened on the device represents a unique instance intended
+to back an individual NT virtual machine. Objects created by one ntsync
+instance may only be used with other objects created by the same
+instance.
+
+ioctl reference
+===============
+
+All operations on the device are done through ioctls. There are four
+structures used in ioctl calls::
+
+ struct ntsync_sem_args {
+ __u32 count;
+ __u32 max;
+ };
+
+ struct ntsync_mutex_args {
+ __u32 owner;
+ __u32 count;
+ };
+
+ struct ntsync_event_args {
+ __u32 signaled;
+ __u32 manual;
+ };
+
+ struct ntsync_wait_args {
+ __u64 timeout;
+ __u64 objs;
+ __u32 count;
+ __u32 owner;
+ __u32 index;
+ __u32 alert;
+ __u32 flags;
+ __u32 pad;
+ };
+
+Depending on the ioctl, members of the structure may be used as input,
+output, or not at all.
+
+The ioctls on the device file are as follows:
+
+.. c:macro:: NTSYNC_IOC_CREATE_SEM
+
+ Create a semaphore object. Takes a pointer to struct
+ :c:type:`ntsync_sem_args`, which is used as follows:
+
+ .. list-table::
+
+ * - ``count``
+ - Initial count of the semaphore.
+ * - ``max``
+ - Maximum count of the semaphore.
+
+ Fails with ``EINVAL`` if ``count`` is greater than ``max``.
+ On success, returns a file descriptor the created semaphore.
+
+.. c:macro:: NTSYNC_IOC_CREATE_MUTEX
+
+ Create a mutex object. Takes a pointer to struct
+ :c:type:`ntsync_mutex_args`, which is used as follows:
+
+ .. list-table::
+
+ * - ``count``
+ - Initial recursion count of the mutex.
+ * - ``owner``
+ - Initial owner of the mutex.
+
+ If ``owner`` is nonzero and ``count`` is zero, or if ``owner`` is
+ zero and ``count`` is nonzero, the function fails with ``EINVAL``.
+ On success, returns a file descriptor the created mutex.
+
+.. c:macro:: NTSYNC_IOC_CREATE_EVENT
+
+ Create an event object. Takes a pointer to struct
+ :c:type:`ntsync_event_args`, which is used as follows:
+
+ .. list-table::
+
+ * - ``signaled``
+ - If nonzero, the event is initially signaled, otherwise
+ nonsignaled.
+ * - ``manual``
+ - If nonzero, the event is a manual-reset event, otherwise
+ auto-reset.
+
+ On success, returns a file descriptor the created event.
+
+The ioctls on the individual objects are as follows:
+
+.. c:macro:: NTSYNC_IOC_SEM_POST
+
+ Post to a semaphore object. Takes a pointer to a 32-bit integer,
+ which on input holds the count to be added to the semaphore, and on
+ output contains its previous count.
+
+ If adding to the semaphore's current count would raise the latter
+ past the semaphore's maximum count, the ioctl fails with
+ ``EOVERFLOW`` and the semaphore is not affected. If raising the
+ semaphore's count causes it to become signaled, eligible threads
+ waiting on this semaphore will be woken and the semaphore's count
+ decremented appropriately.
+
+.. c:macro:: NTSYNC_IOC_MUTEX_UNLOCK
+
+ Release a mutex object. Takes a pointer to struct
+ :c:type:`ntsync_mutex_args`, which is used as follows:
+
+ .. list-table::
+
+ * - ``owner``
+ - Specifies the owner trying to release this mutex.
+ * - ``count``
+ - On output, contains the previous recursion count.
+
+ If ``owner`` is zero, the ioctl fails with ``EINVAL``. If ``owner``
+ is not the current owner of the mutex, the ioctl fails with
+ ``EPERM``.
+
+ The mutex's count will be decremented by one. If decrementing the
+ mutex's count causes it to become zero, the mutex is marked as
+ unowned and signaled, and eligible threads waiting on it will be
+ woken as appropriate.
+
+.. c:macro:: NTSYNC_IOC_SET_EVENT
+
+ Signal an event object. Takes a pointer to a 32-bit integer, which on
+ output contains the previous state of the event.
+
+ Eligible threads will be woken, and auto-reset events will be
+ designaled appropriately.
+
+.. c:macro:: NTSYNC_IOC_RESET_EVENT
+
+ Designal an event object. Takes a pointer to a 32-bit integer, which
+ on output contains the previous state of the event.
+
+.. c:macro:: NTSYNC_IOC_PULSE_EVENT
+
+ Wake threads waiting on an event object while leaving it in an
+ unsignaled state. Takes a pointer to a 32-bit integer, which on
+ output contains the previous state of the event.
+
+ A pulse operation can be thought of as a set followed by a reset,
+ performed as a single atomic operation. If two threads are waiting on
+ an auto-reset event which is pulsed, only one will be woken. If two
+ threads are waiting a manual-reset event which is pulsed, both will
+ be woken. However, in both cases, the event will be unsignaled
+ afterwards, and a simultaneous read operation will always report the
+ event as unsignaled.
+
+.. c:macro:: NTSYNC_IOC_READ_SEM
+
+ Read the current state of a semaphore object. Takes a pointer to
+ struct :c:type:`ntsync_sem_args`, which is used as follows:
+
+ .. list-table::
+
+ * - ``count``
+ - On output, contains the current count of the semaphore.
+ * - ``max``
+ - On output, contains the maximum count of the semaphore.
+
+.. c:macro:: NTSYNC_IOC_READ_MUTEX
+
+ Read the current state of a mutex object. Takes a pointer to struct
+ :c:type:`ntsync_mutex_args`, which is used as follows:
+
+ .. list-table::
+
+ * - ``owner``
+ - On output, contains the current owner of the mutex, or zero
+ if the mutex is not currently owned.
+ * - ``count``
+ - On output, contains the current recursion count of the mutex.
+
+ If the mutex is marked as abandoned, the function fails with
+ ``EOWNERDEAD``. In this case, ``count`` and ``owner`` are set to
+ zero.
+
+.. c:macro:: NTSYNC_IOC_READ_EVENT
+
+ Read the current state of an event object. Takes a pointer to struct
+ :c:type:`ntsync_event_args`, which is used as follows:
+
+ .. list-table::
+
+ * - ``signaled``
+ - On output, contains the current state of the event.
+ * - ``manual``
+ - On output, contains 1 if the event is a manual-reset event,
+ and 0 otherwise.
+
+.. c:macro:: NTSYNC_IOC_KILL_OWNER
+
+ Mark a mutex as unowned and abandoned if it is owned by the given
+ owner. Takes an input-only pointer to a 32-bit integer denoting the
+ owner. If the owner is zero, the ioctl fails with ``EINVAL``. If the
+ owner does not own the mutex, the function fails with ``EPERM``.
+
+ Eligible threads waiting on the mutex will be woken as appropriate
+ (and such waits will fail with ``EOWNERDEAD``, as described below).
+
+.. c:macro:: NTSYNC_IOC_WAIT_ANY
+
+ Poll on any of a list of objects, atomically acquiring at most one.
+ Takes a pointer to struct :c:type:`ntsync_wait_args`, which is
+ used as follows:
+
+ .. list-table::
+
+ * - ``timeout``
+ - Absolute timeout in nanoseconds. If ``NTSYNC_WAIT_REALTIME``
+ is set, the timeout is measured against the REALTIME clock;
+ otherwise it is measured against the MONOTONIC clock. If the
+ timeout is equal to or earlier than the current time, the
+ function returns immediately without sleeping. If ``timeout``
+ is U64_MAX, the function will sleep until an object is
+ signaled, and will not fail with ``ETIMEDOUT``.
+ * - ``objs``
+ - Pointer to an array of ``count`` file descriptors
+ (specified as an integer so that the structure has the same
+ size regardless of architecture). If any object is
+ invalid, the function fails with ``EINVAL``.
+ * - ``count``
+ - Number of objects specified in the ``objs`` array.
+ If greater than ``NTSYNC_MAX_WAIT_COUNT``, the function fails
+ with ``EINVAL``.
+ * - ``owner``
+ - Mutex owner identifier. If any object in ``objs`` is a mutex,
+ the ioctl will attempt to acquire that mutex on behalf of
+ ``owner``. If ``owner`` is zero, the ioctl fails with
+ ``EINVAL``.
+ * - ``index``
+ - On success, contains the index (into ``objs``) of the object
+ which was signaled. If ``alert`` was signaled instead,
+ this contains ``count``.
+ * - ``alert``
+ - Optional event object file descriptor. If nonzero, this
+ specifies an "alert" event object which, if signaled, will
+ terminate the wait. If nonzero, the identifier must point to a
+ valid event.
+ * - ``flags``
+ - Zero or more flags. Currently the only flag is
+ ``NTSYNC_WAIT_REALTIME``, which causes the timeout to be
+ measured against the REALTIME clock instead of MONOTONIC.
+ * - ``pad``
+ - Unused, must be set to zero.
+
+ This function attempts to acquire one of the given objects. If unable
+ to do so, it sleeps until an object becomes signaled, subsequently
+ acquiring it, or the timeout expires. In the latter case the ioctl
+ fails with ``ETIMEDOUT``. The function only acquires one object, even
+ if multiple objects are signaled.
+
+ A semaphore is considered to be signaled if its count is nonzero, and
+ is acquired by decrementing its count by one. A mutex is considered
+ to be signaled if it is unowned or if its owner matches the ``owner``
+ argument, and is acquired by incrementing its recursion count by one
+ and setting its owner to the ``owner`` argument. An auto-reset event
+ is acquired by designaling it; a manual-reset event is not affected
+ by acquisition.
+
+ Acquisition is atomic and totally ordered with respect to other
+ operations on the same object. If two wait operations (with different
+ ``owner`` identifiers) are queued on the same mutex, only one is
+ signaled. If two wait operations are queued on the same semaphore,
+ and a value of one is posted to it, only one is signaled.
+
+ If an abandoned mutex is acquired, the ioctl fails with
+ ``EOWNERDEAD``. Although this is a failure return, the function may
+ otherwise be considered successful. The mutex is marked as owned by
+ the given owner (with a recursion count of 1) and as no longer
+ abandoned, and ``index`` is still set to the index of the mutex.
+
+ The ``alert`` argument is an "extra" event which can terminate the
+ wait, independently of all other objects.
+
+ It is valid to pass the same object more than once, including by
+ passing the same event in the ``objs`` array and in ``alert``. If a
+ wakeup occurs due to that object being signaled, ``index`` is set to
+ the lowest index corresponding to that object.
+
+ The function may fail with ``EINTR`` if a signal is received.
+
+.. c:macro:: NTSYNC_IOC_WAIT_ALL
+
+ Poll on a list of objects, atomically acquiring all of them. Takes a
+ pointer to struct :c:type:`ntsync_wait_args`, which is used
+ identically to ``NTSYNC_IOC_WAIT_ANY``, except that ``index`` is
+ always filled with zero on success if not woken via alert.
+
+ This function attempts to simultaneously acquire all of the given
+ objects. If unable to do so, it sleeps until all objects become
+ simultaneously signaled, subsequently acquiring them, or the timeout
+ expires. In the latter case the ioctl fails with ``ETIMEDOUT`` and no
+ objects are modified.
+
+ Objects may become signaled and subsequently designaled (through
+ acquisition by other threads) while this thread is sleeping. Only
+ once all objects are simultaneously signaled does the ioctl acquire
+ them and return. The entire acquisition is atomic and totally ordered
+ with respect to other operations on any of the given objects.
+
+ If an abandoned mutex is acquired, the ioctl fails with
+ ``EOWNERDEAD``. Similarly to ``NTSYNC_IOC_WAIT_ANY``, all objects are
+ nevertheless marked as acquired. Note that if multiple mutex objects
+ are specified, there is no way to know which were marked as
+ abandoned.
+
+ As with "any" waits, the ``alert`` argument is an "extra" event which
+ can terminate the wait. Critically, however, an "all" wait will
+ succeed if all members in ``objs`` are signaled, *or* if ``alert`` is
+ signaled. In the latter case ``index`` will be set to ``count``. As
+ with "any" waits, if both conditions are filled, the former takes
+ priority, and objects in ``objs`` will be acquired.
+
+ Unlike ``NTSYNC_IOC_WAIT_ANY``, it is not valid to pass the same
+ object more than once, nor is it valid to pass the same object in
+ ``objs`` and in ``alert``. If this is attempted, the function fails
+ with ``EINVAL``.
diff --git a/Documentation/userspace-api/sysfs-platform_profile.rst b/Documentation/userspace-api/sysfs-platform_profile.rst
index 4fccde2e4563..7f013356118a 100644
--- a/Documentation/userspace-api/sysfs-platform_profile.rst
+++ b/Documentation/userspace-api/sysfs-platform_profile.rst
@@ -40,3 +40,41 @@ added. Drivers which wish to introduce new profile names must:
1. Explain why the existing profile names cannot be used.
2. Add the new profile name, along with a clear description of the
expected behaviour, to the sysfs-platform_profile ABI documentation.
+
+"Custom" profile support
+========================
+The platform_profile class also supports profiles advertising a "custom"
+profile. This is intended to be set by drivers when the setttings in the
+driver have been modified in a way that a standard profile doesn't represent
+the current state.
+
+Multiple driver support
+=======================
+When multiple drivers on a system advertise a platform profile handler, the
+platform profile handler core will only advertise the profiles that are
+common between all drivers to the ``/sys/firmware/acpi`` interfaces.
+
+This is to ensure there is no ambiguity on what the profile names mean when
+all handlers don't support a profile.
+
+Individual drivers will register a 'platform_profile' class device that has
+similar semantics as the ``/sys/firmware/acpi/platform_profile`` interface.
+
+To discover which driver is associated with a platform profile handler the
+user can read the ``name`` attribute of the class device.
+
+To discover available profiles from the class interface the user can read the
+``choices`` attribute.
+
+If a user wants to select a profile for a specific driver, they can do so
+by writing to the ``profile`` attribute of the driver's class device.
+
+This will allow users to set different profiles for different drivers on the
+same system. If the selected profile by individual drivers differs the
+platform profile handler core will display the profile 'custom' to indicate
+that the profiles are not the same.
+
+While the ``platform_profile`` attribute has the value ``custom``, writing a
+common profile from ``platform_profile_choices`` to the platform_profile
+attribute of the platform profile handler core will set the profile for all
+drivers.