summaryrefslogtreecommitdiff
path: root/security
AgeCommit message (Collapse)Author
2024-09-16Merge tag 'lsm-pr-20240911' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm Pull lsm updates from Paul Moore: - Move the LSM framework to static calls This transitions the vast majority of the LSM callbacks into static calls. Those callbacks which haven't been converted were left as-is due to the general ugliness of the changes required to support the static call conversion; we can revisit those callbacks at a future date. - Add the Integrity Policy Enforcement (IPE) LSM This adds a new LSM, Integrity Policy Enforcement (IPE). There is plenty of documentation about IPE in this patches, so I'll refrain from going into too much detail here, but the basic motivation behind IPE is to provide a mechanism such that administrators can restrict execution to only those binaries which come from integrity protected storage, e.g. a dm-verity protected filesystem. You will notice that IPE requires additional LSM hooks in the initramfs, dm-verity, and fs-verity code, with the associated patches carrying ACK/review tags from the associated maintainers. We couldn't find an obvious maintainer for the initramfs code, but the IPE patchset has been widely posted over several years. Both Deven Bowers and Fan Wu have contributed to IPE's development over the past several years, with Fan Wu agreeing to serve as the IPE maintainer moving forward. Once IPE is accepted into your tree, I'll start working with Fan to ensure he has the necessary accounts, keys, etc. so that he can start submitting IPE pull requests to you directly during the next merge window. - Move the lifecycle management of the LSM blobs to the LSM framework Management of the LSM blobs (the LSM state buffers attached to various kernel structs, typically via a void pointer named "security" or similar) has been mixed, some blobs were allocated/managed by individual LSMs, others were managed by the LSM framework itself. Starting with this pull we move management of all the LSM blobs, minus the XFRM blob, into the framework itself, improving consistency across LSMs, and reducing the amount of duplicated code across LSMs. Due to some additional work required to migrate the XFRM blob, it has been left as a todo item for a later date; from a practical standpoint this omission should have little impact as only SELinux provides a XFRM LSM implementation. - Fix problems with the LSM's handling of F_SETOWN The LSM hook for the fcntl(F_SETOWN) operation had a couple of problems: it was racy with itself, and it was disconnected from the associated DAC related logic in such a way that the LSM state could be updated in cases where the DAC state would not. We fix both of these problems by moving the security_file_set_fowner() hook into the same section of code where the DAC attributes are updated. Not only does this resolve the DAC/LSM synchronization issue, but as that code block is protected by a lock, it also resolve the race condition. - Fix potential problems with the security_inode_free() LSM hook Due to use of RCU to protect inodes and the placement of the LSM hook associated with freeing the inode, there is a bit of a challenge when it comes to managing any LSM state associated with an inode. The VFS folks are not open to relocating the LSM hook so we have to get creative when it comes to releasing an inode's LSM state. Traditionally we have used a single LSM callback within the hook that is triggered when the inode is "marked for death", but not actually released due to RCU. Unfortunately, this causes problems for LSMs which want to take an action when the inode's associated LSM state is actually released; so we add an additional LSM callback, inode_free_security_rcu(), that is called when the inode's LSM state is released in the RCU free callback. - Refactor two LSM hooks to better fit the LSM return value patterns The vast majority of the LSM hooks follow the "return 0 on success, negative values on failure" pattern, however, there are a small handful that have unique return value behaviors which has caused confusion in the past and makes it difficult for the BPF verifier to properly vet BPF LSM programs. This includes patches to convert two of these"special" LSM hooks to the common 0/-ERRNO pattern. - Various cleanups and improvements A handful of patches to remove redundant code, better leverage the IS_ERR_OR_NULL() helper, add missing "static" markings, and do some minor style fixups. * tag 'lsm-pr-20240911' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: (40 commits) security: Update file_set_fowner documentation fs: Fix file_set_fowner LSM hook inconsistencies lsm: Use IS_ERR_OR_NULL() helper function lsm: remove LSM_COUNT and LSM_CONFIG_COUNT ipe: Remove duplicated include in ipe.c lsm: replace indirect LSM hook calls with static calls lsm: count the LSMs enabled at compile time kernel: Add helper macros for loop unrolling init/main.c: Initialize early LSMs after arch code, static keys and calls. MAINTAINERS: add IPE entry with Fan Wu as maintainer documentation: add IPE documentation ipe: kunit test for parser scripts: add boot policy generation program ipe: enable support for fs-verity as a trust provider fsverity: expose verified fsverity built-in signatures to LSMs lsm: add security_inode_setintegrity() hook ipe: add support for dm-verity as a trust provider dm-verity: expose root hash digest and signature data to LSMs block,lsm: add LSM blob and new LSM hooks for block devices ipe: add permissive toggle ...
2024-09-16Merge tag 'selinux-pr-20240911' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull selinux updates from Paul Moore: - Ensure that both IPv4 and IPv6 connections are properly initialized While we always properly initialized IPv4 connections early in their life, we missed the necessary IPv6 change when we were adding IPv6 support. - Annotate the SELinux inode revalidation function to quiet KCSAN KCSAN correctly identifies a race in __inode_security_revalidate() when we check to see if an inode's SELinux has been properly initialized. While KCSAN is correct, it is an intentional choice made for performance reasons; if necessary, we check the state a second time, this time with a lock held, before initializing the inode's state. - Code cleanups, simplification, etc. A handful of individual patches to simplify some SELinux kernel logic, improve return code granularity via ERR_PTR(), follow the guidance on using KMEM_CACHE(), and correct some minor style problems. * tag 'selinux-pr-20240911' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinux: fix style problems in security/selinux/include/audit.h selinux: simplify avc_xperms_audit_required() selinux: mark both IPv4 and IPv6 accepted connection sockets as labeled selinux: replace kmem_cache_create() with KMEM_CACHE() selinux: annotate false positive data race to avoid KCSAN warnings selinux: refactor code to return ERR_PTR in selinux_netlbl_sock_genattr selinux: Streamline type determination in security_compute_sid
2024-09-16Merge tag 'vfs-6.12.procfs' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull procfs updates from Christian Brauner: "This contains the following changes for procfs: - Add config options and parameters to block forcing memory writes. This adds a Kconfig option and boot param to allow removing the FOLL_FORCE flag from /proc/<pid>/mem write calls as this can be used in various attacks. The traditional forcing behavior is kept as default because it can break GDB and some other use cases. This is the simpler version that you had requested. - Restrict overmounting of ephemeral entities. It is currently possible to mount on top of various ephemeral entities in procfs. This specifically includes magic links. To recap, magic links are links of the form /proc/<pid>/fd/<nr>. They serve as references to a target file and during path lookup they cause a jump to the target path. Such magic links disappear if the corresponding file descriptor is closed. Currently it is possible to overmount such magic links. This is mostly interesting for an attacker that wants to somehow trick a process into e.g., reopening something that it didn't intend to reopen or to hide a malicious file descriptor. But also it risks leaking mounts for long-running processes. When overmounting a magic link like above, the mount will not be detached when the file descriptor is closed. Only the target mountpoint will disappear. Which has the consequence of making it impossible to unmount that mount afterwards. So the mount will stick around until the process exits and the /proc/<pid>/ directory is cleaned up during proc_flush_pid() when the dentries are pruned and invalidated. That in turn means it's possible for a program to accidentally leak mounts and it's also possible to make a task leak mounts without it's knowledge if the attacker just keeps overmounting things under /proc/<pid>/fd/<nr>. Disallow overmounting of such ephemeral entities. - Cleanup the readdir method naming in some procfs file operations. - Replace kmalloc() and strcpy() with a simple kmemdup() call" * tag 'vfs-6.12.procfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: proc: fold kmalloc() + strcpy() into kmemdup() proc: block mounting on top of /proc/<pid>/fdinfo/* proc: block mounting on top of /proc/<pid>/fd/* proc: block mounting on top of /proc/<pid>/map_files/* proc: add proc_splice_unmountable() proc: proc_readfdinfo() -> proc_fdinfo_iterate() proc: proc_readfd() -> proc_fd_iterate() proc: add config & param to block forcing mem writes
2024-09-16Merge tag 'vfs-6.12.file' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs file updates from Christian Brauner: "This is the work to cleanup and shrink struct file significantly. Right now, (focusing on x86) struct file is 232 bytes. After this series struct file will be 184 bytes aka 3 cacheline and a spare 8 bytes for future extensions at the end of the struct. With struct file being as ubiquitous as it is this should make a difference for file heavy workloads and allow further optimizations in the future. - struct fown_struct was embedded into struct file letting it take up 32 bytes in total when really it shouldn't even be embedded in struct file in the first place. Instead, actual users of struct fown_struct now allocate the struct on demand. This frees up 24 bytes. - Move struct file_ra_state into the union containg the cleanup hooks and move f_iocb_flags out of the union. This closes a 4 byte hole we created earlier and brings struct file to 192 bytes. Which means struct file is 3 cachelines and we managed to shrink it by 40 bytes. - Reorder struct file so that nothing crosses a cacheline. I suspect that in the future we will end up reordering some members to mitigate false sharing issues or just because someone does actually provide really good perf data. - Shrinking struct file to 192 bytes is only part of the work. Files use a slab that is SLAB_TYPESAFE_BY_RCU and when a kmem cache is created with SLAB_TYPESAFE_BY_RCU the free pointer must be located outside of the object because the cache doesn't know what part of the memory can safely be overwritten as it may be needed to prevent object recycling. That has the consequence that SLAB_TYPESAFE_BY_RCU may end up adding a new cacheline. So this also contains work to add a new kmem_cache_create_rcu() function that allows the caller to specify an offset where the freelist pointer is supposed to be placed. Thus avoiding the implicit addition of a fourth cacheline. - And finally this removes the f_version member in struct file. The f_version member isn't particularly well-defined. It is mainly used as a cookie to detect concurrent seeks when iterating directories. But it is also abused by some subsystems for completely unrelated things. It is mostly a directory and filesystem specific thing that doesn't really need to live in struct file and with its wonky semantics it really lacks a specific function. For pipes, f_version is (ab)used to defer poll notifications until a write has happened. And struct pipe_inode_info is used by multiple struct files in their ->private_data so there's no chance of pushing that down into file->private_data without introducing another pointer indirection. But pipes don't rely on f_pos_lock so this adds a union into struct file encompassing f_pos_lock and a pipe specific f_pipe member that pipes can use. This union of course can be extended to other file types and is similar to what we do in struct inode already" * tag 'vfs-6.12.file' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (26 commits) fs: remove f_version pipe: use f_pipe fs: add f_pipe ubifs: store cookie in private data ufs: store cookie in private data udf: store cookie in private data proc: store cookie in private data ocfs2: store cookie in private data input: remove f_version abuse ext4: store cookie in private data ext2: store cookie in private data affs: store cookie in private data fs: add generic_llseek_cookie() fs: use must_set_pos() fs: add must_set_pos() fs: add vfs_setpos_cookie() s390: remove unused f_version ceph: remove unused f_version adi: remove unused f_version mm: Removed @freeptr_offset to prevent doc warning ...
2024-09-09security: Update file_set_fowner documentationMickaël Salaün
Highlight that the file_set_fowner hook is now called with a lock held. Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Casey Schaufler <casey@schaufler-ca.com> Cc: Christian Brauner <brauner@kernel.org> Cc: James Morris <jmorris@namei.org> Cc: Jann Horn <jannh@google.com> Cc: Ondrej Mosnacek <omosnace@redhat.com> Cc: Paul Moore <paul@paul-moore.com> Cc: Serge E. Hallyn <serge@hallyn.com> Cc: Stephen Smalley <stephen.smalley.work@gmail.com> Signed-off-by: Mickaël Salaün <mic@digikod.net> Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-09-03selinux: fix style problems in security/selinux/include/audit.hPaul Moore
Remove the needless indent in the function comment header blocks. Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-09-01Merge branch 'fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull misc fixes from Guenter Roeck. These are fixes for regressions that Guenther has been reporting, and the maintainers haven't picked up and sent in. With rc6 fairly imminent, I'm taking them directly from Guenter. * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: apparmor: fix policy_unpack_test on big endian systems Revert "MIPS: csrc-r4k: Apply verification clocksource flags" microblaze: don't treat zero reserved memory regions as error
2024-08-31Merge tag 'lsm-pr-20240830' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm Pull lsm fix from Paul Moore: "One small patch to correct a NFS permissions problem with SELinux and Smack" * tag 'lsm-pr-20240830' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: selinux,smack: don't bypass permissions check in inode_setsecctx hook
2024-08-30proc: add config & param to block forcing mem writesAdrian Ratiu
This adds a Kconfig option and boot param to allow removing the FOLL_FORCE flag from /proc/pid/mem write calls because it can be abused. The traditional forcing behavior is kept as default because it can break GDB and some other use cases. Previously we tried a more sophisticated approach allowing distributions to fine-tune /proc/pid/mem behavior, however that got NAK-ed by Linus [1], who prefers this simpler approach with semantics also easier to understand for users. Link: https://lore.kernel.org/lkml/CAHk-=wiGWLChxYmUA5HrT5aopZrB7_2VTa0NLZcxORgkUe5tEQ@mail.gmail.com/ [1] Cc: Doug Anderson <dianders@chromium.org> Cc: Jeff Xu <jeffxu@google.com> Cc: Jann Horn <jannh@google.com> Cc: Kees Cook <kees@kernel.org> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Christian Brauner <brauner@kernel.org> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com> Link: https://lore.kernel.org/r/20240802080225.89408-1-adrian.ratiu@collabora.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-08-29lsm: Use IS_ERR_OR_NULL() helper functionHongbo Li
Use the IS_ERR_OR_NULL() helper instead of open-coding a NULL and an error pointer checks to simplify the code and improve readability. Signed-off-by: Hongbo Li <lihongbo22@huawei.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-28selinux,smack: don't bypass permissions check in inode_setsecctx hookScott Mayhew
Marek Gresko reports that the root user on an NFS client is able to change the security labels on files on an NFS filesystem that is exported with root squashing enabled. The end of the kerneldoc comment for __vfs_setxattr_noperm() states: * This function requires the caller to lock the inode's i_mutex before it * is executed. It also assumes that the caller will make the appropriate * permission checks. nfsd_setattr() does do permissions checking via fh_verify() and nfsd_permission(), but those don't do all the same permissions checks that are done by security_inode_setxattr() and its related LSM hooks do. Since nfsd_setattr() is the only consumer of security_inode_setsecctx(), simplest solution appears to be to replace the call to __vfs_setxattr_noperm() with a call to __vfs_setxattr_locked(). This fixes the above issue and has the added benefit of causing nfsd to recall conflicting delegations on a file when a client tries to change its security label. Cc: stable@kernel.org Reported-by: Marek Gresko <marek.gresko@protonmail.com> Link: https://bugzilla.kernel.org/show_bug.cgi?id=218809 Signed-off-by: Scott Mayhew <smayhew@redhat.com> Tested-by: Stephen Smalley <stephen.smalley.work@gmail.com> Reviewed-by: Stephen Smalley <stephen.smalley.work@gmail.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Acked-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-28selinux: simplify avc_xperms_audit_required()Zhen Lei
By associative and commutative laws, the result of the two 'audited' is zero. Take the second 'audited' as an example: 1) audited = requested & avd->auditallow; 2) audited &= ~requested; ==> audited = ~requested & (requested & avd->auditallow); ==> audited = (~requested & requested) & avd->auditallow; ==> audited = 0 & avd->auditallow; ==> audited = 0; In fact, it is more readable to directly write zero. The value of the first 'audited' is 0 because AUDIT is not allowed. The second 'audited' is zero because there is no AUDITALLOW permission. Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-28selinux: mark both IPv4 and IPv6 accepted connection sockets as labeledGuido Trentalancia
The current partial labeling was introduced in 389fb800ac8b ("netlabel: Label incoming TCP connections correctly in SELinux") due to the fact that IPv6 labeling was not supported yet at the time. Signed-off-by: Guido Trentalancia <guido@trentalancia.com> [PM: properly format the referenced commit ID, adjust subject] Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-28file: reclaim 24 bytes from f_ownerChristian Brauner
We do embedd struct fown_struct into struct file letting it take up 32 bytes in total. We could tweak struct fown_struct to be more compact but really it shouldn't even be embedded in struct file in the first place. Instead, actual users of struct fown_struct should allocate the struct on demand. This frees up 24 bytes in struct file. That will have some potentially user-visible changes for the ownership fcntl()s. Some of them can now fail due to allocation failures. Practically, that probably will almost never happen as the allocations are small and they only happen once per file. The fown_struct is used during kill_fasync() which is used by e.g., pipes to generate a SIGIO signal. Sending of such signals is conditional on userspace having set an owner for the file using one of the F_OWNER fcntl()s. Such users will be unaffected if struct fown_struct is allocated during the fcntl() call. There are a few subsystems that call __f_setown() expecting file->f_owner to be allocated: (1) tun devices file->f_op->fasync::tun_chr_fasync() -> __f_setown() There are no callers of tun_chr_fasync(). (2) tty devices file->f_op->fasync::tty_fasync() -> __tty_fasync() -> __f_setown() tty_fasync() has no additional callers but __tty_fasync() has. Note that __tty_fasync() only calls __f_setown() if the @on argument is true. It's called from: file->f_op->release::tty_release() -> tty_release() -> __tty_fasync() -> __f_setown() tty_release() calls __tty_fasync() with @on false => __f_setown() is never called from tty_release(). => All callers of tty_release() are safe as well. file->f_op->release::tty_open() -> tty_release() -> __tty_fasync() -> __f_setown() __tty_hangup() calls __tty_fasync() with @on false => __f_setown() is never called from tty_release(). => All callers of __tty_hangup() are safe as well. From the callchains it's obvious that (1) and (2) end up getting called via file->f_op->fasync(). That can happen either through the F_SETFL fcntl() with the FASYNC flag raised or via the FIOASYNC ioctl(). If FASYNC is requested and the file isn't already FASYNC then file->f_op->fasync() is called with @on true which ends up causing both (1) and (2) to call __f_setown(). (1) and (2) are the only subsystems that call __f_setown() from the file->f_op->fasync() handler. So both (1) and (2) have been updated to allocate a struct fown_struct prior to calling fasync_helper() to register with the fasync infrastructure. That's safe as they both call fasync_helper() which also does allocations if @on is true. The other interesting case are file leases: (3) file leases lease_manager_ops->lm_setup::lease_setup() -> __f_setown() Which in turn is called from: generic_add_lease() -> lease_manager_ops->lm_setup::lease_setup() -> __f_setown() So here again we can simply make generic_add_lease() allocate struct fown_struct prior to the lease_manager_ops->lm_setup::lease_setup() which happens under a spinlock. With that the two remaining subsystems that call __f_setown() are: (4) dnotify (5) sockets Both have their own custom ioctls to set struct fown_struct and both have been converted to allocate a struct fown_struct on demand from their respective ioctls. Interactions with O_PATH are fine as well e.g., when opening a /dev/tty as O_PATH then no file->f_op->open() happens thus no file->f_owner is allocated. That's fine as no file operation will be set for those and the device has never been opened. fcntl()s called on such things will just allocate a ->f_owner on demand. Although I have zero idea why'd you care about f_owner on an O_PATH fd. Link: https://lore.kernel.org/r/20240813-work-f_owner-v2-1-4e9343a79f9f@kernel.org Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-08-27selinux: replace kmem_cache_create() with KMEM_CACHE()Eric Suen
Based on guidance in include/linux/slab.h, replace kmem_cache_create() with KMEM_CACHE() for sources under security/selinux to simplify creation of SLAB caches. Signed-off-by: Eric Suen <ericsu@linux.microsoft.com> [PM: minor grammar nits in the description] Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-26lsm: remove LSM_COUNT and LSM_CONFIG_COUNTTetsuo Handa
Because these are equals to MAX_LSM_COUNT. Also, we can avoid dynamic memory allocation for ordered_lsms because MAX_LSM_COUNT is a constant. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-26selinux: annotate false positive data race to avoid KCSAN warningsStephen Smalley
KCSAN flags the check of isec->initialized by __inode_security_revalidate() as a data race. This is indeed a racy check, but inode_doinit_with_dentry() will recheck with isec->lock held. Annotate the check with the data_race() macro to silence the KCSAN false positive. Reported-by: syzbot+319ed1769c0078257262@syzkaller.appspotmail.com Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-25apparmor: fix policy_unpack_test on big endian systemsGuenter Roeck
policy_unpack_test fails on big endian systems because data byte order is expected to be little endian but is generated in host byte order. This results in test failures such as: # policy_unpack_test_unpack_array_with_null_name: EXPECTATION FAILED at security/apparmor/policy_unpack_test.c:150 Expected array_size == (u16)16, but array_size == 4096 (0x1000) (u16)16 == 16 (0x10) # policy_unpack_test_unpack_array_with_null_name: pass:0 fail:1 skip:0 total:1 not ok 3 policy_unpack_test_unpack_array_with_null_name # policy_unpack_test_unpack_array_with_name: EXPECTATION FAILED at security/apparmor/policy_unpack_test.c:164 Expected array_size == (u16)16, but array_size == 4096 (0x1000) (u16)16 == 16 (0x10) # policy_unpack_test_unpack_array_with_name: pass:0 fail:1 skip:0 total:1 Add the missing endianness conversions when generating test data. Fixes: 4d944bcd4e73 ("apparmor: add AppArmor KUnit tests for policy unpack") Cc: Brendan Higgins <brendanhiggins@google.com> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2024-08-22ipe: Remove duplicated include in ipe.cYang Li
The header files eval.h is included twice in ipe.c, so one inclusion of each can be removed. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=9796 Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-22lsm: replace indirect LSM hook calls with static callsKP Singh
LSM hooks are currently invoked from a linked list as indirect calls which are invoked using retpolines as a mitigation for speculative attacks (Branch History / Target injection) and add extra overhead which is especially bad in kernel hot paths: security_file_ioctl: 0xff...0320 <+0>: endbr64 0xff...0324 <+4>: push %rbp 0xff...0325 <+5>: push %r15 0xff...0327 <+7>: push %r14 0xff...0329 <+9>: push %rbx 0xff...032a <+10>: mov %rdx,%rbx 0xff...032d <+13>: mov %esi,%ebp 0xff...032f <+15>: mov %rdi,%r14 0xff...0332 <+18>: mov $0xff...7030,%r15 0xff...0339 <+25>: mov (%r15),%r15 0xff...033c <+28>: test %r15,%r15 0xff...033f <+31>: je 0xff...0358 <security_file_ioctl+56> 0xff...0341 <+33>: mov 0x18(%r15),%r11 0xff...0345 <+37>: mov %r14,%rdi 0xff...0348 <+40>: mov %ebp,%esi 0xff...034a <+42>: mov %rbx,%rdx 0xff...034d <+45>: call 0xff...2e0 <__x86_indirect_thunk_array+352> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Indirect calls that use retpolines leading to overhead, not just due to extra instruction but also branch misses. 0xff...0352 <+50>: test %eax,%eax 0xff...0354 <+52>: je 0xff...0339 <security_file_ioctl+25> 0xff...0356 <+54>: jmp 0xff...035a <security_file_ioctl+58> 0xff...0358 <+56>: xor %eax,%eax 0xff...035a <+58>: pop %rbx 0xff...035b <+59>: pop %r14 0xff...035d <+61>: pop %r15 0xff...035f <+63>: pop %rbp 0xff...0360 <+64>: jmp 0xff...47c4 <__x86_return_thunk> The indirect calls are not really needed as one knows the addresses of enabled LSM callbacks at boot time and only the order can possibly change at boot time with the lsm= kernel command line parameter. An array of static calls is defined per LSM hook and the static calls are updated at boot time once the order has been determined. With the hook now exposed as a static call, one can see that the retpolines are no longer there and the LSM callbacks are invoked directly: security_file_ioctl: 0xff...0ca0 <+0>: endbr64 0xff...0ca4 <+4>: nopl 0x0(%rax,%rax,1) 0xff...0ca9 <+9>: push %rbp 0xff...0caa <+10>: push %r14 0xff...0cac <+12>: push %rbx 0xff...0cad <+13>: mov %rdx,%rbx 0xff...0cb0 <+16>: mov %esi,%ebp 0xff...0cb2 <+18>: mov %rdi,%r14 0xff...0cb5 <+21>: jmp 0xff...0cc7 <security_file_ioctl+39> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Static key enabled for SELinux 0xffffffff818f0cb7 <+23>: jmp 0xff...0cde <security_file_ioctl+62> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Static key enabled for BPF LSM. This is something that is changed to default to false to avoid the existing side effect issues of BPF LSM [1] in a subsequent patch. 0xff...0cb9 <+25>: xor %eax,%eax 0xff...0cbb <+27>: xchg %ax,%ax 0xff...0cbd <+29>: pop %rbx 0xff...0cbe <+30>: pop %r14 0xff...0cc0 <+32>: pop %rbp 0xff...0cc1 <+33>: cs jmp 0xff...0000 <__x86_return_thunk> 0xff...0cc7 <+39>: endbr64 0xff...0ccb <+43>: mov %r14,%rdi 0xff...0cce <+46>: mov %ebp,%esi 0xff...0cd0 <+48>: mov %rbx,%rdx 0xff...0cd3 <+51>: call 0xff...3230 <selinux_file_ioctl> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Direct call to SELinux. 0xff...0cd8 <+56>: test %eax,%eax 0xff...0cda <+58>: jne 0xff...0cbd <security_file_ioctl+29> 0xff...0cdc <+60>: jmp 0xff...0cb7 <security_file_ioctl+23> 0xff...0cde <+62>: endbr64 0xff...0ce2 <+66>: mov %r14,%rdi 0xff...0ce5 <+69>: mov %ebp,%esi 0xff...0ce7 <+71>: mov %rbx,%rdx 0xff...0cea <+74>: call 0xff...e220 <bpf_lsm_file_ioctl> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Direct call to BPF LSM. 0xff...0cef <+79>: test %eax,%eax 0xff...0cf1 <+81>: jne 0xff...0cbd <security_file_ioctl+29> 0xff...0cf3 <+83>: jmp 0xff...0cb9 <security_file_ioctl+25> 0xff...0cf5 <+85>: endbr64 0xff...0cf9 <+89>: mov %r14,%rdi 0xff...0cfc <+92>: mov %ebp,%esi 0xff...0cfe <+94>: mov %rbx,%rdx 0xff...0d01 <+97>: pop %rbx 0xff...0d02 <+98>: pop %r14 0xff...0d04 <+100>: pop %rbp 0xff...0d05 <+101>: ret 0xff...0d06 <+102>: int3 0xff...0d07 <+103>: int3 0xff...0d08 <+104>: int3 0xff...0d09 <+105>: int3 While this patch uses static_branch_unlikely indicating that an LSM hook is likely to be not present. In most cases this is still a better choice as even when an LSM with one hook is added, empty slots are created for all LSM hooks (especially when many LSMs that do not initialize most hooks are present on the system). There are some hooks that don't use the call_int_hook or call_void_hook. These hooks are updated to use a new macro called lsm_for_each_hook where the lsm_callback is directly invoked as an indirect call. Below are results of the relevant Unixbench system benchmarks with BPF LSM and SELinux enabled with default policies enabled with and without these patches. Benchmark Delta(%): (+ is better) ========================================================================== Execl Throughput +1.9356 File Write 1024 bufsize 2000 maxblocks +6.5953 Pipe Throughput +9.5499 Pipe-based Context Switching +3.0209 Process Creation +2.3246 Shell Scripts (1 concurrent) +1.4975 System Call Overhead +2.7815 System Benchmarks Index Score (Partial Only): +3.4859 In the best case, some syscalls like eventfd_create benefitted to about ~10%. Tested-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Casey Schaufler <casey@schaufler-ca.com> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Song Liu <song@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: KP Singh <kpsingh@kernel.org> Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-20ipe: kunit test for parserDeven Bowers
Add various happy/unhappy unit tests for both IPE's policy parser. Besides, a test suite for IPE functionality is available at https://github.com/microsoft/ipe/tree/test-suite Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com> Signed-off-by: Fan Wu <wufan@linux.microsoft.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-20scripts: add boot policy generation programDeven Bowers
Enables an IPE policy to be enforced from kernel start, enabling access control based on trust from kernel startup. This is accomplished by transforming an IPE policy indicated by CONFIG_IPE_BOOT_POLICY into a c-string literal that is parsed at kernel startup as an unsigned policy. Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com> Signed-off-by: Fan Wu <wufan@linux.microsoft.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-20ipe: enable support for fs-verity as a trust providerFan Wu
Enable IPE policy authors to indicate trust for a singular fsverity file, identified by the digest information, through "fsverity_digest" and all files using valid fsverity builtin signatures via "fsverity_signature". This enables file-level integrity claims to be expressed in IPE, allowing individual files to be authorized, giving some flexibility for policy authors. Such file-level claims are important to be expressed for enforcing the integrity of packages, as well as address some of the scalability issues in a sole dm-verity based solution (# of loop back devices, etc). This solution cannot be done in userspace as the minimum threat that IPE should mitigate is an attacker downloads malicious payload with all required dependencies. These dependencies can lack the userspace check, bypassing the protection entirely. A similar attack succeeds if the userspace component is replaced with a version that does not perform the check. As a result, this can only be done in the common entry point - the kernel. Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com> Signed-off-by: Fan Wu <wufan@linux.microsoft.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-20lsm: add security_inode_setintegrity() hookFan Wu
This patch introduces a new hook to save inode's integrity data. For example, for fsverity enabled files, LSMs can use this hook to save the existence of verified fsverity builtin signature into the inode's security blob, and LSMs can make access decisions based on this data. Signed-off-by: Fan Wu <wufan@linux.microsoft.com> [PM: subject line tweak, removed changelog] Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-20ipe: add support for dm-verity as a trust providerDeven Bowers
Allows author of IPE policy to indicate trust for a singular dm-verity volume, identified by roothash, through "dmverity_roothash" and all signed and validated dm-verity volumes, through "dmverity_signature". Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com> Signed-off-by: Fan Wu <wufan@linux.microsoft.com> [PM: fixed some line length issues in the comments] Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-20block,lsm: add LSM blob and new LSM hooks for block devicesDeven Bowers
This patch introduces a new LSM blob to the block_device structure, enabling the security subsystem to store security-sensitive data related to block devices. Currently, for a device mapper's mapped device containing a dm-verity target, critical security information such as the roothash and its signing state are not readily accessible. Specifically, while the dm-verity volume creation process passes the dm-verity roothash and its signature from userspace to the kernel, the roothash is stored privately within the dm-verity target, and its signature is discarded post-verification. This makes it extremely hard for the security subsystem to utilize these data. With the addition of the LSM blob to the block_device structure, the security subsystem can now retain and manage important security metadata such as the roothash and the signing state of a dm-verity by storing them inside the blob. Access decisions can then be based on these stored data. The implementation follows the same approach used for security blobs in other structures like struct file, struct inode, and struct superblock. The initialization of the security blob occurs after the creation of the struct block_device, performed by the security subsystem. Similarly, the security blob is freed by the security subsystem before the struct block_device is deallocated or freed. This patch also introduces a new hook security_bdev_setintegrity() to save block device's integrity data to the new LSM blob. For example, for dm-verity, it can use this hook to expose its roothash and signing state to LSMs, then LSMs can save these data into the LSM blob. Please note that the new hook should be invoked every time the security information is updated to keep these data current. For example, in dm-verity, if the mapping table is reloaded and configured to use a different dm-verity target with a new roothash and signing information, the previously stored data in the LSM blob will become obsolete. It is crucial to re-invoke the hook to refresh these data and ensure they are up to date. This necessity arises from the design of device-mapper, where a device-mapper device is first created, and then targets are subsequently loaded into it. These targets can be modified multiple times during the device's lifetime. Therefore, while the LSM blob is allocated during the creation of the block device, its actual contents are not initialized at this stage and can change substantially over time. This includes alterations from data that the LSM 'trusts' to those it does not, making it essential to handle these changes correctly. Failure to address this dynamic aspect could potentially allow for bypassing LSM checks. Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com> Signed-off-by: Fan Wu <wufan@linux.microsoft.com> [PM: merge fuzz, subject line tweaks] Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-20ipe: add permissive toggleDeven Bowers
IPE, like SELinux, supports a permissive mode. This mode allows policy authors to test and evaluate IPE policy without it affecting their programs. When the mode is changed, a 1404 AUDIT_MAC_STATUS will be reported. This patch adds the following audit records: audit: MAC_STATUS enforcing=0 old_enforcing=1 auid=4294967295 ses=4294967295 enabled=1 old-enabled=1 lsm=ipe res=1 audit: MAC_STATUS enforcing=1 old_enforcing=0 auid=4294967295 ses=4294967295 enabled=1 old-enabled=1 lsm=ipe res=1 The audit record only emit when the value from the user input is different from the current enforce value. Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com> Signed-off-by: Fan Wu <wufan@linux.microsoft.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-20audit,ipe: add IPE auditing supportDeven Bowers
Users of IPE require a way to identify when and why an operation fails, allowing them to both respond to violations of policy and be notified of potentially malicious actions on their systems with respect to IPE itself. This patch introduces 3 new audit events. AUDIT_IPE_ACCESS(1420) indicates the result of an IPE policy evaluation of a resource. AUDIT_IPE_CONFIG_CHANGE(1421) indicates the current active IPE policy has been changed to another loaded policy. AUDIT_IPE_POLICY_LOAD(1422) indicates a new IPE policy has been loaded into the kernel. This patch also adds support for success auditing, allowing users to identify why an allow decision was made for a resource. However, it is recommended to use this option with caution, as it is quite noisy. Here are some examples of the new audit record types: AUDIT_IPE_ACCESS(1420): audit: AUDIT1420 ipe_op=EXECUTE ipe_hook=BPRM_CHECK enforcing=1 pid=297 comm="sh" path="/root/vol/bin/hello" dev="tmpfs" ino=3897 rule="op=EXECUTE boot_verified=TRUE action=ALLOW" audit: AUDIT1420 ipe_op=EXECUTE ipe_hook=BPRM_CHECK enforcing=1 pid=299 comm="sh" path="/mnt/ipe/bin/hello" dev="dm-0" ino=2 rule="DEFAULT action=DENY" audit: AUDIT1420 ipe_op=EXECUTE ipe_hook=BPRM_CHECK enforcing=1 pid=300 path="/tmp/tmpdp2h1lub/deny/bin/hello" dev="tmpfs" ino=131 rule="DEFAULT action=DENY" The above three records were generated when the active IPE policy only allows binaries from the initramfs to run. The three identical `hello` binary were placed at different locations, only the first hello from the rootfs(initramfs) was allowed. Field ipe_op followed by the IPE operation name associated with the log. Field ipe_hook followed by the name of the LSM hook that triggered the IPE event. Field enforcing followed by the enforcement state of IPE. (it will be introduced in the next commit) Field pid followed by the pid of the process that triggered the IPE event. Field comm followed by the command line program name of the process that triggered the IPE event. Field path followed by the file's path name. Field dev followed by the device name as found in /dev where the file is from. Note that for device mappers it will use the name `dm-X` instead of the name in /dev/mapper. For a file in a temp file system, which is not from a device, it will use `tmpfs` for the field. The implementation of this part is following another existing use case LSM_AUDIT_DATA_INODE in security/lsm_audit.c Field ino followed by the file's inode number. Field rule followed by the IPE rule made the access decision. The whole rule must be audited because the decision is based on the combination of all property conditions in the rule. Along with the syscall audit event, user can know why a blocked happened. For example: audit: AUDIT1420 ipe_op=EXECUTE ipe_hook=BPRM_CHECK enforcing=1 pid=2138 comm="bash" path="/mnt/ipe/bin/hello" dev="dm-0" ino=2 rule="DEFAULT action=DENY" audit[1956]: SYSCALL arch=c000003e syscall=59 success=no exit=-13 a0=556790138df0 a1=556790135390 a2=5567901338b0 a3=ab2a41a67f4f1f4e items=1 ppid=147 pid=1956 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts0 ses=4294967295 comm="bash" exe="/usr/bin/bash" key=(null) The above two records showed bash used execve to run "hello" and got blocked by IPE. Note that the IPE records are always prior to a SYSCALL record. AUDIT_IPE_CONFIG_CHANGE(1421): audit: AUDIT1421 old_active_pol_name="Allow_All" old_active_pol_version=0.0.0 old_policy_digest=sha256:E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649 new_active_pol_name="boot_verified" new_active_pol_version=0.0.0 new_policy_digest=sha256:820EEA5B40CA42B51F68962354BA083122A20BB846F auid=4294967295 ses=4294967295 lsm=ipe res=1 The above record showed the current IPE active policy switch from `Allow_All` to `boot_verified` along with the version and the hash digest of the two policies. Note IPE can only have one policy active at a time, all access decision evaluation is based on the current active policy. The normal procedure to deploy a policy is loading the policy to deploy into the kernel first, then switch the active policy to it. AUDIT_IPE_POLICY_LOAD(1422): audit: AUDIT1422 policy_name="boot_verified" policy_version=0.0.0 policy_digest=sha256:820EEA5B40CA42B51F68962354BA083122A20BB846F2676 auid=4294967295 ses=4294967295 lsm=ipe res=1 The above record showed a new policy has been loaded into the kernel with the policy name, policy version and policy hash. Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com> Signed-off-by: Fan Wu <wufan@linux.microsoft.com> [PM: subject line tweak] Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-20ipe: add userspace interfaceDeven Bowers
As is typical with LSMs, IPE uses securityfs as its interface with userspace. for a complete list of the interfaces and the respective inputs/outputs, please see the documentation under admin-guide/LSM/ipe.rst Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com> Signed-off-by: Fan Wu <wufan@linux.microsoft.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-20lsm: add new securityfs delete functionFan Wu
When deleting a directory in the security file system, the existing securityfs_remove requires the directory to be empty, otherwise it will do nothing. This leads to a potential risk that the security file system might be in an unclean state when the intended deletion did not happen. This commit introduces a new function securityfs_recursive_remove to recursively delete a directory without leaving an unclean state. Co-developed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Fan Wu <wufan@linux.microsoft.com> [PM: subject line tweak] Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-20ipe: introduce 'boot_verified' as a trust providerFan Wu
IPE is designed to provide system level trust guarantees, this usually implies that trust starts from bootup with a hardware root of trust, which validates the bootloader. After this, the bootloader verifies the kernel and the initramfs. As there's no currently supported integrity method for initramfs, and it's typically already verified by the bootloader. This patch introduces a new IPE property `boot_verified` which allows author of IPE policy to indicate trust for files from initramfs. The implementation of this feature utilizes the newly added `initramfs_populated` hook. This hook marks the superblock of the rootfs after the initramfs has been unpacked into it. Before mounting the real rootfs on top of the initramfs, initramfs script will recursively remove all files and directories on the initramfs. This is typically implemented by using switch_root(8) (https://man7.org/linux/man-pages/man8/switch_root.8.html). Therefore the initramfs will be empty and not accessible after the real rootfs takes over. It is advised to switch to a different policy that doesn't rely on the `boot_verified` property after this point. This ensures that the trust policies remain relevant and effective throughout the system's operation. Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com> Signed-off-by: Fan Wu <wufan@linux.microsoft.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-20initramfs,lsm: add a security hook to do_populate_rootfs()Fan Wu
This patch introduces a new hook to notify security system that the content of initramfs has been unpacked into the rootfs. Upon receiving this notification, the security system can activate a policy to allow only files that originated from the initramfs to execute or load into kernel during the early stages of booting. This approach is crucial for minimizing the attack surface by ensuring that only trusted files from the initramfs are operational in the critical boot phase. Signed-off-by: Fan Wu <wufan@linux.microsoft.com> [PM: subject line tweak] Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-20ipe: add LSM hooks on execution and kernel readDeven Bowers
IPE's initial goal is to control both execution and the loading of kernel modules based on the system's definition of trust. It accomplishes this by plugging into the security hooks for bprm_check_security, file_mprotect, mmap_file, kernel_load_data, and kernel_read_data. Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com> Signed-off-by: Fan Wu <wufan@linux.microsoft.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-20ipe: add evaluation loopDeven Bowers
Introduce a core evaluation function in IPE that will be triggered by various security hooks (e.g., mmap, bprm_check, kexec). This function systematically assesses actions against the defined IPE policy, by iterating over rules specific to the action being taken. This critical addition enables IPE to enforce its security policies effectively, ensuring that actions intercepted by these hooks are scrutinized for policy compliance before they are allowed to proceed. Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com> Signed-off-by: Fan Wu <wufan@linux.microsoft.com> Reviewed-by: Serge Hallyn <serge@hallyn.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-20ipe: add policy parserDeven Bowers
IPE's interpretation of the what the user trusts is accomplished through its policy. IPE's design is to not provide support for a single trust provider, but to support multiple providers to enable the end-user to choose the best one to seek their needs. This requires the policy to be rather flexible and modular so that integrity providers, like fs-verity, dm-verity, or some other system, can plug into the policy with minimal code changes. Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com> Signed-off-by: Fan Wu <wufan@linux.microsoft.com> [PM: added NULL check in parse_rule() as discussed] Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-19lsm: add IPE lsmDeven Bowers
Integrity Policy Enforcement (IPE) is an LSM that provides an complimentary approach to Mandatory Access Control than existing LSMs today. Existing LSMs have centered around the concept of access to a resource should be controlled by the current user's credentials. IPE's approach, is that access to a resource should be controlled by the system's trust of a current resource. The basis of this approach is defining a global policy to specify which resource can be trusted. Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com> Signed-off-by: Fan Wu <wufan@linux.microsoft.com> [PM: subject line tweak] Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-15KEYS: trusted: dcp: fix leak of blob encryption keyDavid Gstir
Trusted keys unseal the key blob on load, but keep the sealed payload in the blob field so that every subsequent read (export) will simply convert this field to hex and send it to userspace. With DCP-based trusted keys, we decrypt the blob encryption key (BEK) in the Kernel due hardware limitations and then decrypt the blob payload. BEK decryption is done in-place which means that the trusted key blob field is modified and it consequently holds the BEK in plain text. Every subsequent read of that key thus send the plain text BEK instead of the encrypted BEK to userspace. This issue only occurs when importing a trusted DCP-based key and then exporting it again. This should rarely happen as the common use cases are to either create a new trusted key and export it, or import a key blob and then just use it without exporting it again. Fix this by performing BEK decryption and encryption in a dedicated buffer. Further always wipe the plain text BEK buffer to prevent leaking the key via uninitialized memory. Cc: stable@vger.kernel.org # v6.10+ Fixes: 2e8a0f40a39c ("KEYS: trusted: Introduce NXP DCP-backed trusted keys") Signed-off-by: David Gstir <david@sigma-star.at> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2024-08-15KEYS: trusted: fix DCP blob payload length assignmentDavid Gstir
The DCP trusted key type uses the wrong helper function to store the blob's payload length which can lead to the wrong byte order being used in case this would ever run on big endian architectures. Fix by using correct helper function. Cc: stable@vger.kernel.org # v6.10+ Fixes: 2e8a0f40a39c ("KEYS: trusted: Introduce NXP DCP-backed trusted keys") Suggested-by: Richard Weinberger <richard@nod.at> Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202405240610.fj53EK0q-lkp@intel.com/ Signed-off-by: David Gstir <david@sigma-star.at> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2024-08-15lockdown: Make lockdown_lsmid staticYue Haibing
Fix sparse warning: security/lockdown/lockdown.c:79:21: warning: symbol 'lockdown_lsmid' was not declared. Should it be static? Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Kees Cook <kees@kernel.org> Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-12lsm: add the inode_free_security_rcu() LSM implementation hookPaul Moore
The LSM framework has an existing inode_free_security() hook which is used by LSMs that manage state associated with an inode, but due to the use of RCU to protect the inode, special care must be taken to ensure that the LSMs do not fully release the inode state until it is safe from a RCU perspective. This patch implements a new inode_free_security_rcu() implementation hook which is called when it is safe to free the LSM's internal inode state. Unfortunately, this new hook does not have access to the inode itself as it may already be released, so the existing inode_free_security() hook is retained for those LSMs which require access to the inode. Cc: stable@vger.kernel.org Reported-by: syzbot+5446fbf332b0602ede0b@syzkaller.appspotmail.com Closes: https://lore.kernel.org/r/00000000000076ba3b0617f65cc8@google.com Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-12lsm: cleanup lsm_hooks.hPaul Moore
Some cleanup and style corrections for lsm_hooks.h. * Drop the lsm_inode_alloc() extern declaration, it is not needed. * Relocate lsm_get_xattr_slot() and extern variables in the file to improve grouping of related objects. * Don't use tabs to needlessly align structure fields. Reviewed-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-08selinux: revert our use of vma_is_initial_heap()Paul Moore
Unfortunately it appears that vma_is_initial_heap() is currently broken for applications that do not currently have any heap allocated, e.g. brk == start_brk. The breakage is such that it will cause SELinux to check for the process/execheap permission on memory regions that cross brk/start_brk even when there is no heap. The proper fix would be to correct vma_is_initial_heap(), but as there are multiple callers I am hesitant to unilaterally modify the helper out of concern that I would end up breaking some other subsystem. The mm developers have been made aware of the situation and hopefully they will have a fix at some point in the future, but we need a fix soon so we are simply going to revert our use of vma_is_initial_heap() in favor of our old logic/code which works as expected, even in the face of a zero size heap. We can return to using vma_is_initial_heap() at some point in the future when it is fixed. Cc: stable@vger.kernel.org Reported-by: Marc Reisner <reisner.marc@gmail.com> Closes: https://lore.kernel.org/all/ZrPmoLKJEf1wiFmM@marcreisner.com Fixes: 68df1baf158f ("selinux: use vma_is_initial_stack() and vma_is_initial_heap()") Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-07selinux: add the processing of the failure of avc_add_xperms_decision()Zhen Lei
When avc_add_xperms_decision() fails, the information recorded by the new avc node is incomplete. In this case, the new avc node should be released instead of replacing the old avc node. Cc: stable@vger.kernel.org Fixes: fa1aa143ac4a ("selinux: extended permissions for ioctls") Suggested-by: Stephen Smalley <stephen.smalley.work@gmail.com> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-06selinux: fix potential counting error in avc_add_xperms_decision()Zhen Lei
The count increases only when a node is successfully added to the linked list. Cc: stable@vger.kernel.org Fixes: fa1aa143ac4a ("selinux: extended permissions for ioctls") Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-07-31lsm: Refactor return value of LSM hook inode_copy_up_xattrXu Kuohai
To be consistent with most LSM hooks, convert the return value of hook inode_copy_up_xattr to 0 or a negative error code. Before: - Hook inode_copy_up_xattr returns 0 when accepting xattr, 1 when discarding xattr, -EOPNOTSUPP if it does not know xattr, or any other negative error code otherwise. After: - Hook inode_copy_up_xattr returns 0 when accepting xattr, *-ECANCELED* when discarding xattr, -EOPNOTSUPP if it does not know xattr, or any other negative error code otherwise. Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Reviewed-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-07-31lsm: Refactor return value of LSM hook vm_enough_memoryXu Kuohai
To be consistent with most LSM hooks, convert the return value of hook vm_enough_memory to 0 or a negative error code. Before: - Hook vm_enough_memory returns 1 if permission is granted, 0 if not. - LSM_RET_DEFAULT(vm_enough_memory_mm) is 1. After: - Hook vm_enough_memory reutrns 0 if permission is granted, negative error code if not. - LSM_RET_DEFAULT(vm_enough_memory_mm) is 0. Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Reviewed-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-07-29lsm: infrastructure management of the perf_event security blobCasey Schaufler
Move management of the perf_event->security blob out of the individual security modules and into the security infrastructure. Instead of allocating the blobs from within the modules the modules tell the infrastructure how much space is required, and the space is allocated there. There are no longer any modules that require the perf_event_free() hook. The hook definition has been removed. Signed-off-by: Casey Schaufler <casey@schaufler-ca.com> Reviewed-by: John Johansen <john.johansen@canonical.com> [PM: subject tweak] Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-07-29lsm: infrastructure management of the infiniband blobCasey Schaufler
Move management of the infiniband security blob out of the individual security modules and into the LSM infrastructure. The security modules tell the infrastructure how much space they require at initialization. There are no longer any modules that require the ib_free() hook. The hook definition has been removed. Signed-off-by: Casey Schaufler <casey@schaufler-ca.com> Reviewed-by: John Johansen <john.johansen@canonical.com> [PM: subject tweak, selinux style fixes] Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-07-29lsm: infrastructure management of the dev_tun blobCasey Schaufler
Move management of the dev_tun security blob out of the individual security modules and into the LSM infrastructure. The security modules tell the infrastructure how much space they require at initialization. There are no longer any modules that require the dev_tun_free hook. The hook definition has been removed. Signed-off-by: Casey Schaufler <casey@schaufler-ca.com> Reviewed-by: John Johansen <john.johansen@canonical.com> [PM: subject tweak, selinux style fixes] Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-07-29lsm: add helper for blob allocationsCasey Schaufler
Create a helper function lsm_blob_alloc() for general use in the hook specific functions that allocate LSM blobs. Change the hook specific functions to use this helper. This reduces the code size by a small amount and will make adding new instances of infrastructure managed security blobs easier. Signed-off-by: Casey Schaufler <casey@schaufler-ca.com> Reviewed-by: John Johansen <john.johansen@canonical.com> [PM: subject tweak] Signed-off-by: Paul Moore <paul@paul-moore.com>