summaryrefslogtreecommitdiff
path: root/Documentation/filesystems
AgeCommit message (Collapse)Author
2023-08-28Merge tag 'v6.6-vfs.misc' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull misc vfs updates from Christian Brauner: "This contains the usual miscellaneous features, cleanups, and fixes for vfs and individual filesystems. Features: - Block mode changes on symlinks and rectify our broken semantics - Report file modifications via fsnotify() for splice - Allow specifying an explicit timeout for the "rootwait" kernel command line option. This allows to timeout and reboot instead of always waiting indefinitely for the root device to show up - Use synchronous fput for the close system call Cleanups: - Get rid of open-coded lockdep workarounds for async io submitters and replace it all with a single consolidated helper - Simplify epoll allocation helper - Convert simple_write_begin and simple_write_end to use a folio - Convert page_cache_pipe_buf_confirm() to use a folio - Simplify __range_close to avoid pointless locking - Disable per-cpu buffer head cache for isolated cpus - Port ecryptfs to kmap_local_page() api - Remove redundant initialization of pointer buf in pipe code - Unexport the d_genocide() function which is only used within core vfs - Replace printk(KERN_ERR) and WARN_ON() with WARN() Fixes: - Fix various kernel-doc issues - Fix refcount underflow for eventfds when used as EFD_SEMAPHORE - Fix a mainly theoretical issue in devpts - Check the return value of __getblk() in reiserfs - Fix a racy assert in i_readcount_dec - Fix integer conversion issues in various functions - Fix LSM security context handling during automounts that prevented NFS superblock sharing" * tag 'v6.6-vfs.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (39 commits) cachefiles: use kiocb_{start,end}_write() helpers ovl: use kiocb_{start,end}_write() helpers aio: use kiocb_{start,end}_write() helpers io_uring: use kiocb_{start,end}_write() helpers fs: create kiocb_{start,end}_write() helpers fs: add kerneldoc to file_{start,end}_write() helpers io_uring: rename kiocb_end_write() local helper splice: Convert page_cache_pipe_buf_confirm() to use a folio libfs: Convert simple_write_begin and simple_write_end to use a folio fs/dcache: Replace printk and WARN_ON by WARN fs/pipe: remove redundant initialization of pointer buf fs: Fix kernel-doc warnings devpts: Fix kernel-doc warnings doc: idmappings: fix an error and rephrase a paragraph init: Add support for rootwait timeout parameter vfs: fix up the assert in i_readcount_dec fs: Fix one kernel-doc comment docs: filesystems: idmappings: clarify from where idmappings are taken fs/buffer.c: disable per-CPU buffer_head cache for isolated CPUs vfs, security: Fix automount superblock LSM init problem, preventing NFS sb sharing ...
2023-08-28Merge tag 'v6.6-vfs.tmpfs' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull libfs and tmpfs updates from Christian Brauner: "This cycle saw a lot of work for tmpfs that required changes to the vfs layer. Andrew, Hugh, and I decided to take tmpfs through vfs this cycle. Things will go back to mm next cycle. Features ======== - By far the biggest work is the quota support for tmpfs. New tmpfs quota infrastructure is added to support it and a new QFMT_SHMEM uapi option is exposed. This offers user and group quotas to tmpfs (project quotas will be added later). Similar to other filesystems tmpfs quota are not supported within user namespaces yet. - Add support for user xattrs. While tmpfs already supports security xattrs (security.*) and POSIX ACLs for a long time it lacked support for user xattrs (user.*). With this pull request tmpfs will be able to support a limited number of user xattrs. This is accompanied by a fix (see below) to limit persistent simple xattr allocations. - Add support for stable directory offsets. Currently tmpfs relies on the libfs provided cursor-based mechanism for readdir. This causes issues when a tmpfs filesystem is exported via NFS. NFS clients do not open directories. Instead, each server-side readdir operation opens the directory, reads it, and then closes it. Since the cursor state for that directory is associated with the opened file it is discarded after each readdir operation. Such directory offsets are not just cached by NFS clients but also various userspace libraries based on these clients. As it stands there is no way to invalidate the caches when directory offsets have changed and the whole application depends on unchanging directory offsets. At LSFMM we discussed how to solve this problem and decided to support stable directory offsets. libfs now allows filesystems like tmpfs to use an xarrary to map a directory offset to a dentry. This mechanism is currently only used by tmpfs but can be supported by others as well. Fixes ===== - Change persistent simple xattrs allocations in libfs from GFP_KERNEL to GPF_KERNEL_ACCOUNT so they're subject to memory cgroup limits. Since this is a change to libfs it affects both tmpfs and kernfs. - Correctly verify {g,u}id mount options. A new filesystem context is created via fsopen() which records the namespace that becomes the owning namespace of the superblock when fsconfig(FSCONFIG_CMD_CREATE) is called for filesystems that are mountable in namespaces. However, fsconfig() calls can occur in a namespace different from the namespace where fsopen() has been called. Currently, when fsconfig() is called to set {g,u}id mount options the requested {g,u}id is mapped into a k{g,u}id according to the namespace where fsconfig() was called from. The resulting k{g,u}id is not guaranteed to be resolvable in the namespace of the filesystem (the one that fsopen() was called in). This means it's possible for an unprivileged user to create files owned by any group in a tmpfs mount since it's possible to set the setid bits on the tmpfs directory. The contract for {g,u}id mount options and {g,u}id values in general set from userspace has always been that they are translated according to the caller's idmapping. In so far, tmpfs has been doing the correct thing. But since tmpfs is mountable in unprivileged contexts it is also necessary to verify that the resulting {k,g}uid is representable in the namespace of the superblock to avoid such bugs. The new mount api's cross-namespace delegation abilities are already widely used. Having talked to a bunch of userspace this is the most faithful solution with minimal regression risks" * tag 'v6.6-vfs.tmpfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: tmpfs,xattr: GFP_KERNEL_ACCOUNT for simple xattrs mm: invalidation check mapping before folio_contains tmpfs: trivial support for direct IO tmpfs,xattr: enable limited user extended attributes tmpfs: track free_ispace instead of free_inodes xattr: simple_xattr_set() return old_xattr to be freed tmpfs: verify {g,u}id mount options correctly shmem: move spinlock into shmem_recalc_inode() to fix quota support libfs: Remove parent dentry locking in offset_iterate_dir() libfs: Add a lock class for the offset map's xa_lock shmem: stable directory offsets shmem: Refactor shmem_symlink() libfs: Add directory operations for stable offsets shmem: fix quota lock nesting in huge hole handling shmem: Add default quota limit mount options shmem: quota support shmem: prepare shmem quota infrastructure quota: Check presence of quota operation structures instead of ->quota_read and ->quota_write callbacks shmem: make shmem_get_inode() return ERR_PTR instead of NULL shmem: make shmem_inode_acct_block() return error
2023-08-24mm: allow ->huge_fault() to be called without the mmap_lock heldMatthew Wilcox (Oracle)
Remove the checks for the VMA lock being held, allowing the page fault path to call into the filesystem instead of retrying with the mmap_lock held. This will improve scalability for DAX page faults. Also update the documentation to match (and fix some other changes that have happened recently). Link: https://lkml.kernel.org/r/20230818202335.2739663-3-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24mm: convert do_set_pte() to set_pte_range()Yin Fengwei
set_pte_range() allows to setup page table entries for a specific range. It takes advantage of batched rmap update for large folio. It now takes care of calling update_mmu_cache_range(). Link: https://lkml.kernel.org/r/20230802151406.3735276-37-willy@infradead.org Signed-off-by: Yin Fengwei <fengwei.yin@intel.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24ceph: update documentation regarding snapshot naming limitationsLuís Henriques
Signed-off-by: Luís Henriques <lhenriques@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Milind Changire <mchangir@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2023-08-18Documentation: Fix typosBjorn Helgaas
Fix typos in Documentation. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20230814212822.193684-4-helgaas@kernel.org Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2023-08-18docs: vfs: clean up after the iterate() removalJonathan Corbet
Commit 3e3271549670 ("vfs: get rid of old '->iterate' directory operation") removed the iterate() file_operations member, but neglected to clean up the associated documentation. Get rid of the leftovers. Link: https://lore.kernel.org/r/874jl945bv.fsf@meer.lwn.net Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2023-08-16doc: idmappings: fix an error and rephrase a paragraphGONG, Ruiqi
If the modified paragraph is referring to the idmapping mentioned in the previous paragraph (i.e. `u0:k10000:r10000`), then it is `u0` that the upper idmapset starts with, not `u1000`. Fix this error and rephrase this paragraph a bit to make this reference more explicit. Reported-by: Wang Lei <wanglei249@huawei.com> Signed-off-by: GONG, Ruiqi <gongruiqi1@huawei.com> Message-Id: <20230816033210.914262-1-gongruiqi@huaweicloud.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-15docs: filesystems: idmappings: clarify from where idmappings are takenAlexander Mikhalitsyn
Let's clarify from where we take idmapping of each type: - caller - filesystem - mount Cc: Jonathan Corbet <corbet@lwn.net> Cc: Christian Brauner <brauner@kernel.org> Cc: linux-fsdevel@vger.kernel.org Cc: linux-doc@vger.kernel.org Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com> Message-Id: <20230625182047.26854-1-aleksandr.mikhalitsyn@canonical.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-12ovl: auto generate uuid for new overlay filesystemsAmir Goldstein
Add a new mount option uuid=auto, which is the default. If a persistent UUID xattr is found it is used. Otherwise, an existing ovelrayfs with copied up subdirs in upper dir that was never mounted with uuid=on retains the null UUID. A new overlayfs with no copied up subdirs, generates the persistent UUID on first mount. Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2023-08-12ovl: store persistent uuid/fsid with uuid=onAmir Goldstein
With uuid=on, store a persistent uuid in xattr on the upper dir to give the overlayfs instance a persistent identifier. This also makes f_fsid persistent and more reliable for reporting fid info in fanotify events. uuid=on is not supported on non-upper overlayfs or with upper fs that does not support xattrs. Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2023-08-12ovl: add support for unique fsid per instanceAmir Goldstein
The legacy behavior of ovl_statfs() reports the f_fsid filled by underlying upper fs. This fsid is not unique among overlayfs instances on the same upper fs. With mount option uuid=on, generate a non-persistent uuid per overlayfs instance and use it as the seed for f_fsid, similar to tmpfs. This is useful for reporting fanotify events with fid info from different instances of overlayfs over the same upper fs. The old behavior of null uuid and upper fs fsid is retained with the mount option uuid=null, which is the default. The mount option uuid=off that disables uuid checks in underlying layers also retains the legacy behavior. Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2023-08-12ovl: Add framework for verity supportAlexander Larsson
This adds the scaffolding (docs, config, mount options) for supporting the new digest field in the metacopy xattr. This contains a fs-verity digest that need to match the fs-verity digest of the lowerdata file. The mount option "verity" specifies how this xattr is handled. If you enable verity ("verity=on") all existing xattrs are validated before use, and during metacopy we generate verity xattr in the upper metacopy file (if the source file has verity enabled). This means later accesses can guarantee that the same data is used. Additionally you can use "verity=require". In this mode all metacopy files must have a valid verity xattr. For this to work metadata copy-up must be able to create a verity xattr (so that later accesses are validated). Therefore, in this mode, if the lower data file doesn't have fs-verity enabled we fall back to a full copy rather than a metacopy. Actual implementation follows in a separate commit. Signed-off-by: Alexander Larsson <alexl@redhat.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Acked-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2023-08-10docs: add maintainer entry profile for XFSDarrick J. Wong
Create a new document to list what I think are (within the scope of XFS) our shared goals and community roles. Since I will be stepping down shortly, I feel it's important to write down somewhere all the hats that I have been wearing for the past six years. Also, document important extra details about how to contribute to XFS. Cc: corbet@lwn.net Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
2023-08-10tmpfs,xattr: enable limited user extended attributesHugh Dickins
Enable "user." extended attributes on tmpfs, limiting them by tracking the space they occupy, and deducting that space from the limited ispace (unless tmpfs mounted with nr_inodes=0 to leave that ispace unlimited). tmpfs inodes and simple xattrs are both unswappable, and have to be in lowmem on a 32-bit highmem kernel: so the ispace limit is appropriate for xattrs, without any need for a further mount option. Add simple_xattr_space() to give approximate but deterministic estimate of the space taken up by each xattr: with simple_xattrs_free() outputting the space freed if required (but kernfs and even some tmpfs usages do not require that, so don't waste time on strlen'ing if not needed). Security and trusted xattrs were already supported: for consistency and simplicity, account them from the same pool; though there's a small risk that a tmpfs with enough space before would now be considered too small. When extended attributes are used, "df -i" does show more IUsed and less IFree than can be explained by the inodes: document that (manpage later). xfstests tests/generic which were not run on tmpfs before but now pass: 020 037 062 070 077 097 103 117 337 377 454 486 523 533 611 618 728 with no new failures. Signed-off-by: Hugh Dickins <hughd@google.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Message-Id: <2e63b26e-df46-5baa-c7d6-f9a8dd3282c5@google.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-09libfs: Add directory operations for stable offsetsChuck Lever
Create a vector of directory operations in fs/libfs.c that handles directory seeks and readdir via stable offsets instead of the current cursor-based mechanism. For the moment these are unused. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Message-Id: <168814732984.530310.11190772066786107220.stgit@manet.1015granger.net> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-09shmem: Add default quota limit mount optionsLukas Czerner
Allow system administrator to set default global quota limits at tmpfs mount time. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz> Message-Id: <20230725144510.253763-7-cem@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-09shmem: quota supportCarlos Maiolino
Now the basic infra-structure is in place, enable quota support for tmpfs. This offers user and group quotas to tmpfs (project quotas will be added later). Also, as other filesystems, the tmpfs quota is not supported within user namespaces yet, so idmapping is not translated. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz> Message-Id: <20230725144510.253763-6-cem@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-06vfs: get rid of old '->iterate' directory operationLinus Torvalds
All users now just use '->iterate_shared()', which only takes the directory inode lock for reading. Filesystems that never got convered to shared mode now instead use a wrapper that drops the lock, re-takes it in write mode, calls the old function, and then downgrades the lock back to read mode. This way the VFS layer and other callers no longer need to care about filesystems that never got converted to the modern era. The filesystems that use the new wrapper are ceph, coda, exfat, jfs, ntfs, ocfs2, overlayfs, and vboxsf. Honestly, several of them look like they really could just iterate their directories in shared mode and skip the wrapper entirely, but the point of this change is to not change semantics or fix filesystems that haven't been fixed in the last 7+ years, but to finally get rid of the dual iterators. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-07-27tmpfs: fix Documentation of noswap and huge mount optionsHugh Dickins
The noswap mount option is surely not one of the three options for sizing: move its description down. The huge= mount option does not accept numeric values: those are just in an internal enum. Delete those numbers, and follow the manpage text more closely (but there's not yet any fadvise() or fcntl() which applies here). /sys/kernel/mm/transparent_hugepage/shmem_enabled is hard to describe, and barely relevant to mounting a tmpfs: just refer to transhuge.rst (while still using the words deny and force, to help as informal reminders). [rdunlap@infradead.org: fixup Docs table for huge mount options] Link: https://lkml.kernel.org/r/20230725052333.26857-1-rdunlap@infradead.org Link: https://lkml.kernel.org/r/986cb0bf-9780-354-9bb-4bf57aadbab@google.com Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Fixes: d0f5a85442d1 ("shmem: update documentation") Fixes: 2c6efe9cf2d7 ("shmem: add support to ignore swap") Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Cc: Christian Brauner <brauner@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-07-24doc: Correct the description of ->release_folioMatthew Wilcox (Oracle)
The filesystem ->release_folio method is called under more circumstances now than when the documentation was written. The second sentence describing the interpretation of the return value is the wrong polarity (false indicates failure, not success). And the third sentence is also wrong (the kernel calls try_to_free_buffers() instead). So replace the entire paragraph with a detailed description of what the state of the folio may be, the meaning of the gfp parameter, why the method is being called and what the filesystem is expected to do. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2023-07-21afs: Documentation: correct reference to CONFIG_AFS_FSLukas Bulwahn
Commit 0795e7c031c4 ("[AFS]: Update the AFS fs documentation.") adds a new section listing the build configuration options that need to be enabled for the AFS file system. The documentation refers to CONFIG_AFS, but the option is called CONFIG_AFS_FS, since the beginning of Linux's git history. Refer to the config option with the correct name. Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Link: https://lore.kernel.org/r/20230720094301.9888-1-lukas.bulwahn@gmail.com
2023-07-17fs: distinguish between user initiated freeze and kernel initiated freezeDarrick J. Wong
Userspace can freeze a filesystem using the FIFREEZE ioctl or by suspending the block device; this state persists until userspace thaws the filesystem with the FITHAW ioctl or resuming the block device. Since commit 18e9e5104fcd ("Introduce freeze_super and thaw_super for the fsfreeze ioctl") we only allow the first freeze command to succeed. The kernel may decide that it is necessary to freeze a filesystem for its own internal purposes, such as suspends in progress, filesystem fsck activities, or quiescing a device prior to removal. Userspace thaw commands must never break a kernel freeze, and kernel thaw commands shouldn't undo userspace's freeze command. Introduce a couple of freeze holder flags and wire it into the sb_writers state. One kernel and one userspace freeze are allowed to coexist at the same time; the filesystem will not thaw until both are lifted. I wonder if the f2fs/gfs2 code should be using a kernel freeze here, but for now we'll use FREEZE_HOLDER_USERSPACE to preserve existing behaviors. Cc: mcgrof@kernel.org Cc: jack@suse.cz Cc: hch@infradead.org Cc: ruansy.fnst@fujitsu.com Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz>
2023-07-11fscrypt: improve the "Encryption modes and usage" sectionEric Biggers
As the number of supported encryption modes has grown, the part of the "Encryption modes and usage" section that describes the supported encryption modes has gotten a bit messy. It presents useful information, but it's a bit lacking in high-level context. Rework the section to hopefully be much more useful. Link: https://lore.kernel.org/r/20230630064811.22569-2-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2023-07-11mm: Introduce VM_SHADOW_STACK for shadow stack memoryYu-cheng Yu
New hardware extensions implement support for shadow stack memory, such as x86 Control-flow Enforcement Technology (CET). Add a new VM flag to identify these areas, for example, to be used to properly indicate shadow stack PTEs to the hardware. Shadow stack VMA creation will be tightly controlled and limited to anonymous memory to make the implementation simpler and since that is all that is required. The solution will rely on pte_mkwrite() to create the shadow stack PTEs, so it will not be required for vm_get_page_prot() to learn how to create shadow stack memory. For this reason document that VM_SHADOW_STACK should not be mixed with VM_SHARED. Co-developed-by: Rick Edgecombe <rick.p.edgecombe@intel.com> Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reviewed-by: Mark Brown <broonie@kernel.org> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Acked-by: David Hildenbrand <david@redhat.com> Tested-by: Mark Brown <broonie@kernel.org> Tested-by: Pengfei Xu <pengfei.xu@intel.com> Tested-by: John Allen <john.allen@amd.com> Tested-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/all/20230613001108.3040476-15-rick.p.edgecombe%40intel.com
2023-07-08Merge tag 'mm-hotfixes-stable-2023-07-08-10-43' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull hotfixes from Andrew Morton: "16 hotfixes. Six are cc:stable and the remainder address post-6.4 issues" The merge undoes the disabling of the CONFIG_PER_VMA_LOCK feature, since it was all hopefully fixed in mainline. * tag 'mm-hotfixes-stable-2023-07-08-10-43' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: lib: dhry: fix sleeping allocations inside non-preemptable section kasan, slub: fix HW_TAGS zeroing with slub_debug kasan: fix type cast in memory_is_poisoned_n mailmap: add entries for Heiko Stuebner mailmap: update manpage link bootmem: remove the vmemmap pages from kmemleak in free_bootmem_page MAINTAINERS: add linux-next info mailmap: add Markus Schneider-Pargmann writeback: account the number of pages written back mm: call arch_swap_restore() from do_swap_page() squashfs: fix cache race with migration mm/hugetlb.c: fix a bug within a BUG(): inconsistent pte comparison docs: update ocfs2-devel mailing list address MAINTAINERS: update ocfs2-devel mailing list address mm: disable CONFIG_PER_VMA_LOCK until its fixed fork: lock VMAs of the parent process when forking
2023-07-08docs: update ocfs2-devel mailing list addressAnthony Iliopoulos
The ocfs2-devel mailing list has been migrated to the kernel.org infrastructure, update all related documentation pointers to reflect the change. Link: https://lkml.kernel.org/r/20230628013437.47030-3-ailiop@suse.com Signed-off-by: Anthony Iliopoulos <ailiop@suse.com> Acked-by: Joseph Qi <jiangqi903@gmail.com> Acked-by: Joel Becker <jlbec@evilplan.org> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Mark Fasheh <mark@fasheh.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-07-05Merge tag 'f2fs-for-6.5-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "In this cycle, we've mainly investigated the zoned block device support along with patches such as correcting write pointers between f2fs and storage, adding asynchronous zone reset flow, and managing the number of open zones. Other than them, f2fs adds another mount option, "errors=x" to specify how to handle when it detects an unexpected behavior at runtime. Enhancements: - support 'errors=remount-ro|continue|panic' mount option - enforce some inode flag policies - allow .tmp compression given extensions - add some ioctls to manage the f2fs compression - improve looped node chain flow - avoid issuing small-sized discard commands during checkpoint - implement an asynchronous zone reset Bug fixes: - fix deadlock in xattr and inode page lock - fix and add sanity check in some error paths - fix to avoid NULL pointer dereference f2fs_write_end_io() along with put_super - set proper flags to quota files - fix potential deadlock due to unpaired node_write lock use - fix over-estimating free section during FG GC - fix the wrong condition to determine atomic context As usual, also there are a number of patches with code refactoring and minor clean-ups" * tag 'f2fs-for-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (46 commits) f2fs: fix to do sanity check on direct node in truncate_dnode() f2fs: only set release for file that has compressed data f2fs: fix compile warning in f2fs_destroy_node_manager() f2fs: fix error path handling in truncate_dnode() f2fs: fix deadlock in i_xattr_sem and inode page lock f2fs: remove unneeded page uptodate check/set f2fs: update mtime and ctime in move file range method f2fs: compress tmp files given extension f2fs: refactor struct f2fs_attr macro f2fs: convert to use sbi directly f2fs: remove redundant assignment to variable err f2fs: do not issue small discard commands during checkpoint f2fs: check zone write pointer points to the end of zone f2fs: add f2fs_ioc_get_compress_blocks f2fs: cleanup MIN_INLINE_XATTR_SIZE f2fs: add helper to check compression level f2fs: set FMODE_CAN_ODIRECT instead of a dummy direct_IO method f2fs: do more sanity check on inode f2fs: compress: fix to check validity of i_compress_flag field f2fs: add sanity compress level check for compressed file ...
2023-06-29Merge tag 'fsnotify_for_v6.5-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull fsnotify updates from Jan Kara: - Support for fanotify events returning file handles for filesystems not exportable via NFS - Improved error handling exportfs functions - Add missing FS_OPEN events when unusual open helpers are used * tag 'fsnotify_for_v6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: fsnotify: move fsnotify_open() hook into do_dentry_open() exportfs: check for error return value from exportfs_encode_*() fanotify: support reporting non-decodeable file handles exportfs: allow exporting non-decodeable file handles to userspace exportfs: add explicit flag to request non-decodeable file handles exportfs: change connectable argument to bit flags
2023-06-29Merge tag 'ext4_for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 updates from Ted Ts'o: "Various cleanups and bug fixes in ext4's extent status tree, journalling, and block allocator subsystems. Also improve performance for parallel DIO overwrites" * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (55 commits) ext4: avoid updating the superblock on a r/o mount if not needed jbd2: skip reading super block if it has been verified ext4: fix to check return value of freeze_bdev() in ext4_shutdown() ext4: refactoring to use the unified helper ext4_quotas_off() ext4: turn quotas off if mount failed after enabling quotas ext4: update doc about journal superblock description ext4: add journal cycled recording support jbd2: continue to record log between each mount jbd2: remove j_format_version jbd2: factor out journal initialization from journal_get_superblock() jbd2: switch to check format version in superblock directly jbd2: remove unused feature macros ext4: ext4_put_super: Remove redundant checking for 'sbi->s_journal_bdev' ext4: Fix reusing stale buffer heads from last failed mounting ext4: allow concurrent unaligned dio overwrites ext4: clean up mballoc criteria comments ext4: make ext4_zeroout_es() return void ext4: make ext4_es_insert_extent() return void ext4: make ext4_es_insert_delayed_block() return void ext4: make ext4_es_remove_extent() return void ...
2023-06-29Merge tag 'ovl-update-6.5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs Pull overlayfs update from Amir Goldstein: - fix two NULL pointer deref bugs (Zhihao Cheng) - add support for "data-only" lower layers destined to be used by composefs - port overlayfs to the new mount api (Christian Brauner) * tag 'ovl-update-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs: (26 commits) ovl: add Amir as co-maintainer ovl: reserve ability to reconfigure mount options with new mount api ovl: modify layer parameter parsing ovl: port to new mount api ovl: factor out ovl_parse_options() helper ovl: store enum redirect_mode in config instead of a string ovl: pass ovl_fs to xino helpers ovl: clarify ovl_get_root() semantics ovl: negate the ofs->share_whiteout boolean ovl: check type and offset of struct vfsmount in ovl_entry ovl: implement lazy lookup of lowerdata in data-only layers ovl: prepare for lazy lookup of lowerdata inode ovl: prepare to store lowerdata redirect for lazy lowerdata lookup ovl: implement lookup in data-only layers ovl: introduce data-only lower layers ovl: remove unneeded goto instructions ovl: deduplicate lowerdata and lowerstack[] ovl: deduplicate lowerpath and lowerstack[] ovl: move ovl_entry into ovl_inode ovl: factor out ovl_free_entry() and ovl_stack_*() helpers ...
2023-06-28Merge tag 'net-next-6.5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking changes from Jakub Kicinski: "WiFi 7 and sendpage changes are the biggest pieces of work for this release. The latter will definitely require fixes but I think that we got it to a reasonable point. Core: - Rework the sendpage & splice implementations Instead of feeding data into sockets page by page extend sendmsg handlers to support taking a reference on the data, controlled by a new flag called MSG_SPLICE_PAGES Rework the handling of unexpected-end-of-file to invoke an additional callback instead of trying to predict what the right combination of MORE/NOTLAST flags is Remove the MSG_SENDPAGE_NOTLAST flag completely - Implement SCM_PIDFD, a new type of CMSG type analogous to SCM_CREDENTIALS, but it contains pidfd instead of plain pid - Enable socket busy polling with CONFIG_RT - Improve reliability and efficiency of reporting for ref_tracker - Auto-generate a user space C library for various Netlink families Protocols: - Allow TCP to shrink the advertised window when necessary, prevent sk_rcvbuf auto-tuning from growing the window all the way up to tcp_rmem[2] - Use per-VMA locking for "page-flipping" TCP receive zerocopy - Prepare TCP for device-to-device data transfers, by making sure that payloads are always attached to skbs as page frags - Make the backoff time for the first N TCP SYN retransmissions linear. Exponential backoff is unnecessarily conservative - Create a new MPTCP getsockopt to retrieve all info (MPTCP_FULL_INFO) - Avoid waking up applications using TLS sockets until we have a full record - Allow using kernel memory for protocol ioctl callbacks, paving the way to issuing ioctls over io_uring - Add nolocalbypass option to VxLAN, forcing packets to be fully encapsulated even if they are destined for a local IP address - Make TCPv4 use consistent hash in TIME_WAIT and SYN_RECV. Ensure in-kernel ECMP implementation (e.g. Open vSwitch) select the same link for all packets. Support L4 symmetric hashing in Open vSwitch - PPPoE: make number of hash bits configurable - Allow DNS to be overwritten by DHCPACK in the in-kernel DHCP client (ipconfig) - Add layer 2 miss indication and filtering, allowing higher layers (e.g. ACL filters) to make forwarding decisions based on whether packet matched forwarding state in lower devices (bridge) - Support matching on Connectivity Fault Management (CFM) packets - Hide the "link becomes ready" IPv6 messages by demoting their printk level to debug - HSR: don't enable promiscuous mode if device offloads the proto - Support active scanning in IEEE 802.15.4 - Continue work on Multi-Link Operation for WiFi 7 BPF: - Add precision propagation for subprogs and callbacks. This allows maintaining verification efficiency when subprograms are used, or in fact passing the verifier at all for complex programs, especially those using open-coded iterators - Improve BPF's {g,s}setsockopt() length handling. Previously BPF assumed the length is always equal to the amount of written data. But some protos allow passing a NULL buffer to discover what the output buffer *should* be, without writing anything - Accept dynptr memory as memory arguments passed to helpers - Add routing table ID to bpf_fib_lookup BPF helper - Support O_PATH FDs in BPF_OBJ_PIN and BPF_OBJ_GET commands - Drop bpf_capable() check in BPF_MAP_FREEZE command (used to mark maps as read-only) - Show target_{obj,btf}_id in tracing link fdinfo - Addition of several new kfuncs (most of the names are self-explanatory): - Add a set of new dynptr kfuncs: bpf_dynptr_adjust(), bpf_dynptr_is_null(), bpf_dynptr_is_rdonly(), bpf_dynptr_size() and bpf_dynptr_clone(). - bpf_task_under_cgroup() - bpf_sock_destroy() - force closing sockets - bpf_cpumask_first_and(), rework bpf_cpumask_any*() kfuncs Netfilter: - Relax set/map validation checks in nf_tables. Allow checking presence of an entry in a map without using the value - Increase ip_vs_conn_tab_bits range for 64BIT builds - Allow updating size of a set - Improve NAT tuple selection when connection is closing Driver API: - Integrate netdev with LED subsystem, to allow configuring HW "offloaded" blinking of LEDs based on link state and activity (i.e. packets coming in and out) - Support configuring rate selection pins of SFP modules - Factor Clause 73 auto-negotiation code out of the drivers, provide common helper routines - Add more fool-proof helpers for managing lifetime of MDIO devices associated with the PCS layer - Allow drivers to report advanced statistics related to Time Aware scheduler offload (taprio) - Allow opting out of VF statistics in link dump, to allow more VFs to fit into the message - Split devlink instance and devlink port operations New hardware / drivers: - Ethernet: - Synopsys EMAC4 IP support (stmmac) - Marvell 88E6361 8 port (5x1GE + 3x2.5GE) switches - Marvell 88E6250 7 port switches - Microchip LAN8650/1 Rev.B0 PHYs - MediaTek MT7981/MT7988 built-in 1GE PHY driver - WiFi: - Realtek RTL8192FU, 2.4 GHz, b/g/n mode, 2T2R, 300 Mbps - Realtek RTL8723DS (SDIO variant) - Realtek RTL8851BE - CAN: - Fintek F81604 Drivers: - Ethernet NICs: - Intel (100G, ice): - support dynamic interrupt allocation - use meta data match instead of VF MAC addr on slow-path - nVidia/Mellanox: - extend link aggregation to handle 4, rather than just 2 ports - spawn sub-functions without any features by default - OcteonTX2: - support HTB (Tx scheduling/QoS) offload - make RSS hash generation configurable - support selecting Rx queue using TC filters - Wangxun (ngbe/txgbe): - add basic Tx/Rx packet offloads - add phylink support (SFP/PCS control) - Freescale/NXP (enetc): - report TAPRIO packet statistics - Solarflare/AMD: - support matching on IP ToS and UDP source port of outer header - VxLAN and GENEVE tunnel encapsulation over IPv4 or IPv6 - add devlink dev info support for EF10 - Virtual NICs: - Microsoft vNIC: - size the Rx indirection table based on requested configuration - support VLAN tagging - Amazon vNIC: - try to reuse Rx buffers if not fully consumed, useful for ARM servers running with 16kB pages - Google vNIC: - support TCP segmentation of >64kB frames - Ethernet embedded switches: - Marvell (mv88e6xxx): - enable USXGMII (88E6191X) - Microchip: - lan966x: add support for Egress Stage 0 ACL engine - lan966x: support mapping packet priority to internal switch priority (based on PCP or DSCP) - Ethernet PHYs: - Broadcom PHYs: - support for Wake-on-LAN for BCM54210E/B50212E - report LPI counter - Microsemi PHYs: support RGMII delay configuration (VSC85xx) - Micrel PHYs: receive timestamp in the frame (LAN8841) - Realtek PHYs: support optional external PHY clock - Altera TSE PCS: merge the driver into Lynx PCS which it is a variant of - CAN: Kvaser PCIEcan: - support packet timestamping - WiFi: - Intel (iwlwifi): - major update for new firmware and Multi-Link Operation (MLO) - configuration rework to drop test devices and split the different families - support for segmented PNVM images and power tables - new vendor entries for PPAG (platform antenna gain) feature - Qualcomm 802.11ax (ath11k): - Multiple Basic Service Set Identifier (MBSSID) and Enhanced MBSSID Advertisement (EMA) support in AP mode - support factory test mode - RealTek (rtw89): - add RSSI based antenna diversity - support U-NII-4 channels on 5 GHz band - RealTek (rtl8xxxu): - AP mode support for 8188f - support USB RX aggregation for the newer chips" * tag 'net-next-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1602 commits) net: scm: introduce and use scm_recv_unix helper af_unix: Skip SCM_PIDFD if scm->pid is NULL. net: lan743x: Simplify comparison netlink: Add __sock_i_ino() for __netlink_diag_dump(). net: dsa: avoid suspicious RCU usage for synced VLAN-aware MAC addresses Revert "af_unix: Call scm_recv() only after scm_set_cred()." phylink: ReST-ify the phylink_pcs_neg_mode() kdoc libceph: Partially revert changes to support MSG_SPLICE_PAGES net: phy: mscc: fix packet loss due to RGMII delays net: mana: use vmalloc_array and vcalloc net: enetc: use vmalloc_array and vcalloc ionic: use vmalloc_array and vcalloc pds_core: use vmalloc_array and vcalloc gve: use vmalloc_array and vcalloc octeon_ep: use vmalloc_array and vcalloc net: usb: qmi_wwan: add u-blox 0x1312 composition perf trace: fix MSG_SPLICE_PAGES build error ipvlan: Fix return value of ipvlan_queue_xmit() netfilter: nf_tables: fix underflow in chain reference counter netfilter: nf_tables: unbind non-anonymous set if rule construction fails ...
2023-06-27Merge tag 'hardening-v6.5-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening updates from Kees Cook: "There are three areas of note: A bunch of strlcpy()->strscpy() conversions ended up living in my tree since they were either Acked by maintainers for me to carry, or got ignored for multiple weeks (and were trivial changes). The compiler option '-fstrict-flex-arrays=3' has been enabled globally, and has been in -next for the entire devel cycle. This changes compiler diagnostics (though mainly just -Warray-bounds which is disabled) and potential UBSAN_BOUNDS and FORTIFY _warning_ coverage. In other words, there are no new restrictions, just potentially new warnings. Any new FORTIFY warnings we've seen have been fixed (usually in their respective subsystem trees). For more details, see commit df8fc4e934c12b. The under-development compiler attribute __counted_by has been added so that we can start annotating flexible array members with their associated structure member that tracks the count of flexible array elements at run-time. It is possible (likely?) that the exact syntax of the attribute will change before it is finalized, but GCC and Clang are working together to sort it out. Any changes can be made to the macro while we continue to add annotations. As an example of that last case, I have a treewide commit waiting with such annotations found via Coccinelle: https://git.kernel.org/linus/adc5b3cb48a049563dc673f348eab7b6beba8a9b Also see commit dd06e72e68bcb4 for more details. Summary: - Fix KMSAN vs FORTIFY in strlcpy/strlcat (Alexander Potapenko) - Convert strreplace() to return string start (Andy Shevchenko) - Flexible array conversions (Arnd Bergmann, Wyes Karny, Kees Cook) - Add missing function prototypes seen with W=1 (Arnd Bergmann) - Fix strscpy() kerndoc typo (Arne Welzel) - Replace strlcpy() with strscpy() across many subsystems which were either Acked by respective maintainers or were trivial changes that went ignored for multiple weeks (Azeem Shaikh) - Remove unneeded cc-option test for UBSAN_TRAP (Nick Desaulniers) - Add KUnit tests for strcat()-family - Enable KUnit tests of FORTIFY wrappers under UML - Add more complete FORTIFY protections for strlcat() - Add missed disabling of FORTIFY for all arch purgatories. - Enable -fstrict-flex-arrays=3 globally - Tightening UBSAN_BOUNDS when using GCC - Improve checkpatch to check for strcpy, strncpy, and fake flex arrays - Improve use of const variables in FORTIFY - Add requested struct_size_t() helper for types not pointers - Add __counted_by macro for annotating flexible array size members" * tag 'hardening-v6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (54 commits) netfilter: ipset: Replace strlcpy with strscpy uml: Replace strlcpy with strscpy um: Use HOST_DIR for mrproper kallsyms: Replace all non-returning strlcpy with strscpy sh: Replace all non-returning strlcpy with strscpy of/flattree: Replace all non-returning strlcpy with strscpy sparc64: Replace all non-returning strlcpy with strscpy Hexagon: Replace all non-returning strlcpy with strscpy kobject: Use return value of strreplace() lib/string_helpers: Change returned value of the strreplace() jbd2: Avoid printing outside the boundary of the buffer checkpatch: Check for 0-length and 1-element arrays riscv/purgatory: Do not use fortified string functions s390/purgatory: Do not use fortified string functions x86/purgatory: Do not use fortified string functions acpi: Replace struct acpi_table_slit 1-element array with flex-array clocksource: Replace all non-returning strlcpy with strscpy string: use __builtin_memcpy() in strlcpy/strlcat staging: most: Replace all non-returning strlcpy with strscpy drm/i2c: tda998x: Replace all non-returning strlcpy with strscpy ...
2023-06-26ext4: update doc about journal superblock descriptionZhang Yi
Update the description in doc about the s_head parameter in the journal superblock. Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Link: https://lore.kernel.org/r/20230322013353.1843306-4-yi.zhang@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-06-26Merge tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linuxLinus Torvalds
Pull fsverity updates from Eric Biggers: "Several updates for fs/verity/: - Do all hashing with the shash API instead of with the ahash API. This simplifies the code and reduces API overhead. It should also make things slightly easier for XFS's upcoming support for fsverity. It does drop fsverity's support for off-CPU hash accelerators, but that support was incomplete and not known to be used - Update and export fsverity_get_digest() so that it's ready for overlayfs's upcoming support for fsverity checking of lowerdata - Improve the documentation for builtin signature support - Fix a bug in the large folio support" * tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux: fsverity: improve documentation for builtin signature support fsverity: rework fsverity_get_digest() again fsverity: simplify error handling in verify_data_block() fsverity: don't use bio_first_page_all() in fsverity_verify_bio() fsverity: constify fsverity_hash_alg fsverity: use shash API instead of ahash API
2023-06-26Merge tag 'v6.5/vfs.rename.locking' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs rename locking updates from Christian Brauner: "This contains the work from Jan to fix problems with cross-directory renames originally reported in [1]. To quickly sum it up some filesystems (so far we know at least about ext4, udf, f2fs, ocfs2, likely also reiserfs, gfs2 and others) need to lock the directory when it is being renamed into another directory. This is because we need to update the parent pointer in the directory in that case and if that races with other operations on the directory, in particular a conversion from one directory format into another, bad things can happen. So far we've done the locking in the filesystem code but recently Darrick pointed out in [2] that the RENAME_EXCHANGE case was missing. That one is particularly nasty because RENAME_EXCHANGE can arbitrarily mix regular files and directories and proper lock ordering is not achievable in the filesystems alone. This patch set adds locking into vfs_rename() so that not only parent directories but also moved inodes, regardless of whether they are directories or not, are locked when calling into the filesystem. This means establishing a locking order for unrelated directories. New helpers are added for this purpose and our documentation is updated to cover this in detail. The locking is now actually easier to follow as we now always lock source and target. We've always locked the target independent of whether it was a directory or file and we've always locked source if it was a regular file. The exact details for why this came about can be found in [3] and [4]" Link: https://lore.kernel.org/all/20230117123735.un7wbamlbdihninm@quack3 [1] Link: https://lore.kernel.org/all/20230517045836.GA11594@frogsfrogsfrogs [2] Link: https://lore.kernel.org/all/20230526-schrebergarten-vortag-9cd89694517e@brauner [3] Link: https://lore.kernel.org/all/20230530-seenotrettung-allrad-44f4b00139d4@brauner [4] * tag 'v6.5/vfs.rename.locking' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: fs: Restrict lock_two_nondirectories() to non-directory inodes fs: Lock moved directories fs: Establish locking order for unrelated directories Revert "f2fs: fix potential corruption when moving a directory" Revert "udf: Protect rename against modification of moved directory" ext4: Remove ext4 locking of moved directory
2023-06-24sock: Remove ->sendpage*() in favour of sendmsg(MSG_SPLICE_PAGES)David Howells
Remove ->sendpage() and ->sendpage_locked(). sendmsg() with MSG_SPLICE_PAGES should be used instead. This allows multiple pages and multipage folios to be passed through. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> # for net/can cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> cc: linux-afs@lists.infradead.org cc: mptcp@lists.linux.dev cc: rds-devel@oss.oracle.com cc: tipc-discussion@lists.sourceforge.net cc: virtualization@lists.linux-foundation.org Link: https://lore.kernel.org/r/20230623225513.2732256-16-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-20fsverity: improve documentation for builtin signature supportEric Biggers
fsverity builtin signatures (CONFIG_FS_VERITY_BUILTIN_SIGNATURES) aren't the only way to do signatures with fsverity, and they have some major limitations. Yet, more users have tried to use them, e.g. recently by https://github.com/ostreedev/ostree/pull/2640. In most cases this seems to be because users aren't sufficiently familiar with the limitations of this feature and what the alternatives are. Therefore, make some updates to the documentation to try to clarify the properties of this feature and nudge users in the right direction. Note that the Integrity Policy Enforcement (IPE) LSM, which is not yet upstream, is planned to use the builtin signatures. (This differs from IMA, which uses its own signature mechanism.) For that reason, my earlier patch "fsverity: mark builtin signatures as deprecated" (https://lore.kernel.org/r/20221208033548.122704-1-ebiggers@kernel.org), which marked builtin signatures as "deprecated", was controversial. This patch therefore stops short of marking the feature as deprecated. I've also revised the language to focus on better explaining the feature and what its alternatives are. Link: https://lore.kernel.org/r/20230620041937.5809-1-ebiggers@kernel.org Reviewed-by: Colin Walters <walters@verbum.org> Reviewed-by: Luca Boccassi <bluca@debian.org> Signed-off-by: Eric Biggers <ebiggers@google.com>
2023-06-19ovl: store enum redirect_mode in config instead of a stringAmir Goldstein
Do all the logic to set the mode during mount options parsing and do not keep the option string around. Use a constant_table to translate from enum redirect mode to string in preperation for new mount api option parsing. The mount option "off" is translated to either "follow" or "nofollow", depending on the "redirect_always_follow" build/module config, so in effect, there are only three possible redirect modes. This results in a minor change to the string that is displayed in show_options() - when redirect_dir is enabled by default and the user mounts with the option "redirect_dir=off", instead of displaying the mode "redirect_dir=off" in show_options(), the displayed mode will be either "redirect_dir=follow" or "redirect_dir=nofollow", depending on the value of "redirect_always_follow" build/module config. The displayed mode reflects the effective mode, so mounting overlayfs again with the dispalyed redirect_dir option will result with the same effective and displayed mode. Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2023-06-19ovl: introduce data-only lower layersAmir Goldstein
Introduce the format lowerdir=lower1:lower2::lowerdata1::lowerdata2 where the lower layers on the right of the :: separators are not merged into the overlayfs merge dirs. Data-only lower layers are only allowed at the bottom of the stack. The files in those layers are only meant to be accessible via absolute redirect from metacopy files in lower layers. Following changes will implement lookup in the data layers. This feature was requested for composefs ostree use case, where the lower data layer should only be accessiable via absolute redirects from metacopy inodes. The lower data layers are not required to a have a unique uuid or any uuid at all, because they are never used to compose the overlayfs inode st_ino/st_dev. Reviewed-by: Alexander Larsson <alexl@redhat.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2023-06-02fs: Lock moved directoriesJan Kara
When a directory is moved to a different directory, some filesystems (udf, ext4, ocfs2, f2fs, and likely gfs2, reiserfs, and others) need to update their pointer to the parent and this must not race with other operations on the directory. Lock the directories when they are moved. Although not all filesystems need this locking, we perform it in vfs_rename() because getting the lock ordering right is really difficult and we don't want to expose these locking details to filesystems. CC: stable@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz> Message-Id: <20230601105830.13168-5-jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-05-30autofs: use flexible array in ioctl structureArnd Bergmann
Commit df8fc4e934c1 ("kbuild: Enable -fstrict-flex-arrays=3") introduced a warning for the autofs_dev_ioctl structure: In function 'check_name', inlined from 'validate_dev_ioctl' at fs/autofs/dev-ioctl.c:131:9, inlined from '_autofs_dev_ioctl' at fs/autofs/dev-ioctl.c:624:8: fs/autofs/dev-ioctl.c:33:14: error: 'strchr' reading 1 or more bytes from a region of size 0 [-Werror=stringop-overread] 33 | if (!strchr(name, '/')) | ^~~~~~~~~~~~~~~~~ In file included from include/linux/auto_dev-ioctl.h:10, from fs/autofs/autofs_i.h:10, from fs/autofs/dev-ioctl.c:14: include/uapi/linux/auto_dev-ioctl.h: In function '_autofs_dev_ioctl': include/uapi/linux/auto_dev-ioctl.h:112:14: note: source object 'path' of size 0 112 | char path[0]; | ^~~~ This is easily fixed by changing the gnu 0-length array into a c99 flexible array. Since this is a uapi structure, we have to be careful about possible regressions but this one should be fine as they are equivalent here. While it would break building with ancient gcc versions that predate c99, it helps building with --std=c99 and -Wpedantic builds in user space, as well as non-gnu compilers. This means we probably also want it fixed in stable kernels. Cc: stable@vger.kernel.org Cc: Kees Cook <keescook@chromium.org> Cc: Gustavo A. R. Silva" <gustavoars@kernel.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230523081944.581710-1-arnd@kernel.org
2023-05-24smb3: move Documentation/filesystems/cifs to Documentation/filesystems/smbSteve French
Documentation/filesystems/cifs contains both server and client information so its pathname is misleading. In addition, the directory fs/smb now contains both server and client, so move Documentation/filesystems/cifs to Documentation/filesystems/smb Suggested-by: Namjae Jeon <linkinjeon@kernel.org> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-05-24cifs: correct references in Documentation to old fs/cifs pathSteve French
The fs/cifs directory has moved to fs/smb/client, correct mentions of this in Documentation and comments. Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-05-22exportfs: add explicit flag to request non-decodeable file handlesAmir Goldstein
So far, all callers of exportfs_encode_inode_fh(), except for fsnotify's show_mark_fhandle(), check that filesystem can decode file handles, but we would like to add more callers that do not require a file handle that can be decoded. Introduce a flag to explicitly request a file handle that may not to be decoded later and a wrapper exportfs_encode_fid() that sets this flag and convert show_mark_fhandle() to use the new wrapper. This will be used to allow adding fanotify support to filesystems that do not support NFS export. Acked-by: Jeff Layton <jlayton@kernel.org> Acked-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Message-Id: <20230502124817.3070545-3-amir73il@gmail.com>
2023-05-16Documentation/filesystems: ramfs-rootfs-initramfs: use :Author:Randy Dunlap
Use the :Author: markup instead of making it a chapter heading. This cleans up the table of contents for this file. Fixes: 7f46a240b0a1 ("[PATCH] ramfs, rootfs, and initramfs docs") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Rob Landley <rob@landley.net> Reviewed-by: Christian Brauner <brauner@kernel.org> Link: https://lore.kernel.org/r/20230508055928.3548-1-rdunlap@infradead.org Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2023-05-16Documentation/filesystems: sharedsubtree: add section headingsRandy Dunlap
Several of the sections are missing underlines. This makes the generated contents have missing entries, so add the underlines. Fixes: 16c01b20ae05 ("doc/filesystems: more mount cleanups") Fixes: 9cfcceea8f7e ("[PATCH] Complete description of shared subtrees.") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Christian Brauner <brauner@kernel.org> Link: https://lore.kernel.org/r/20230508055938.6550-1-rdunlap@infradead.org Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2023-05-08f2fs: support errors=remount-ro|continue|panic mountoptionChao Yu
This patch supports errors=remount-ro|continue|panic mount option for f2fs. f2fs behaves as below in three different modes: mode continue remount-ro panic access ops normal noraml N/A syscall errors -EIO -EROFS N/A mount option rw ro N/A pending dir write keep keep N/A pending non-dir write drop keep N/A pending node write drop keep N/A pending meta write keep keep N/A By default it uses "continue" mode. [Yangtao helps to clean up function's name] Signed-off-by: Yangtao Li <frank.li@vivo.com> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-05-04Merge tag '9p-6.4-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs Pull 9p updates from Eric Van Hensbergen: "This includes a number of patches that didn't quite make the cut last merge window while we addressed some outstanding issues and review comments. It includes some new caching modes for those that only want readahead caches and reworks how we do writeback caching so we are not keeping extra references around which both causes performance problems and uses lots of additional resources on the server. It also includes a new flag to force disabling of xattrs which can also cause major performance issues, particularly if the underlying filesystem on the server doesn't support them. Finally it adds a couple of additional mount options to better support directio and enabling caches when the server doesn't support qid.version. There was one late-breaking bug report that has also been included as its own patch where I forgot to propagate an embarassing bit-logic fix to the various variations of open" * tag '9p-6.4-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: fs/9p: Fix bit operation logic error fs/9p: Rework cache modes and add new options to Documentation fs/9p: remove writeback fid and fix per-file modes fs/9p: Add new mount modes 9p: Add additional debug flags and open modes fs/9p: allow disable of xattr support on mount fs/9p: Remove unnecessary superblock flags fs/9p: Consolidate file operations and add readahead and writeback
2023-04-29Merge tag 'ntfs3_for_6.4' of ↵Linus Torvalds
https://github.com/Paragon-Software-Group/linux-ntfs3 Pull ntfs3 updates from Konstantin Komarov: "New code: - add missed "nocase" in ntfs_show_options - extend information on failures/errors - small optimizations Fixes: - some logic errors - some dead code was removed - code is refactored and reformatted according to the new version of clang-format Code removal: - 'noacsrules' option. Currently, this option does not work properly, and its use leads to unstable results. If we figure out how to implement it without errors, we will add it later - writepage" * tag 'ntfs3_for_6.4' of https://github.com/Paragon-Software-Group/linux-ntfs3: (30 commits) fs/ntfs3: Fix root inode checking fs/ntfs3: Print details about mount fails fs/ntfs3: Add missed "nocase" in ntfs_show_options fs/ntfs3: Code formatting and refactoring fs/ntfs3: Changed ntfs_get_acl() to use dentry fs/ntfs3: Remove field sbi->used.bitmap.set_tail fs/ntfs3: Undo critial modificatins to keep directory consistency fs/ntfs3: Undo endian changes fs/ntfs3: Optimization in ntfs_set_state() fs/ntfs3: Fix ntfs_create_inode() fs/ntfs3: Remove noacsrules fs/ntfs3: Use bh_read to simplify code fs/ntfs3: Fix a possible null-pointer dereference in ni_clear() fs/ntfs3: Refactoring of various minor issues fs/ntfs3: Restore overflow checking for attr size in mi_enum_attr fs/ntfs3: Check for extremely large size of $AttrDef fs/ntfs3: Improved checking of attribute's name length fs/ntfs3: Add null pointer checks fs/ntfs3: fix spelling mistake "attibute" -> "attribute" fs/ntfs3: Add length check in indx_get_root ...