Age | Commit message (Collapse) | Author |
|
xfs_filemap_huge_fault only ever serves DAX faults, so hard code the
call to xfs_dax_read_fault and open code __xfs_filemap_fault in the
only remaining caller.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
|
|
Only two of the callers of __xfs_filemap_fault every handle read faults.
Split the write_fault handling out of __xfs_filemap_fault so that all
callers call that directly either conditionally or unconditionally and
only leave the read fault handling in __xfs_filemap_fault.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
|
|
Split the xfs_filemap_fault trace event into separate ones for read and
write faults and move them into the applicable locations.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
|
|
It's just read in from the superblock and used without doing any
validity checks at all on the value.
Fixes: fb4f2b4e5a82 ("xfs: add sparse inode chunk alignment superblock field")
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
|
|
xfs_zero_extent does some really odd gymnstics to calculate the block
layer sectors numbers passed to blkdev_issue_zeroout. This is because it
used to call sb_issue_zeroout and the calculations in that helper got
open coded here in the rather misleadingly named commit 3dc29161070a
("dax: use sb_issue_zerout instead of calling dax_clear_sectors").
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
|
|
There are two invocations of xfs_alloc_log_agf in xfs_alloc_put_freelist.
The AGF does not change between the two calls. Although this does not pose
any practical problems, it seems like a small mistake. Therefore, fix it
by removing the first xfs_alloc_log_agf invocation.
Signed-off-by: Long Li <leo.lilong@huawei.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
|
|
The DLM_LKF_QUECVT flag needs to be set for "upward" lock conversions to
ensure fairness, but setting it for "downward" lock conversions will
lead to a failure. The flag is currently set based on the GLF_BLOCKING
flag and it's not immediately obvious why this is correct. Simplify
things by figuring out if a lock conversion is "upward" by looking at
the before and after locking modes instead of relying on the
GLF_BLOCKING flag.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
When function evict_should_delete() returns SHOULD_DEFER_EVICTION, gh is
never initialized, but that isn't obvious; if it did initialize gh and
then return SHOULD_DEFER_EVICTION, gfs2_evict_inode() would fail to
release it. To clarify the code, change gfs2_evict_inode() to always
check if gh needs to be released, no matter what evict_should_delete()
returns.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
Function gfs2_inode_refresh() is only used in fs/gfs2/glops.c.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
Use get_random_u32() instead of get_random_bytes() to remove the last
remaining call to get_random_bytes().
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
Randomize the delay of GLF_VERIFY_DELETE work. This avoids thundering
herd problems when multiple nodes schedule that kind of work in response
to an inode being unlinked remotely.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
In the unlikely case that we're trying to queue GLF_TRY_TO_EVICT work
for an inode that already has GLF_VERIFY_DELETE work queued, we want to
make sure that the GLF_TRY_TO_EVICT work gets scheduled immediately
instead of waiting for the delayed work timer to expire.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
Try to be a bit more clear and remove some duplications. We cannot
actually get rid of the verification step eventually, so remove the
comment saying so.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
Move calls to gfs2_queue_verify_delete() into gfs2_evict_inode().
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
Function delete_work_func() was previously assuming that the
GLF_TRY_TO_EVICT and GLF_VERIFY_DELETE flags won't both be set at the
same time, but there probably are races in which that can happen, so
handle that case correctly.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
Move those definitions into the the scope in which they are used.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
In case an iopen glock cannot be upgraded, function
gfs2_upgrade_iopen_glock() needs to communicate to gfs2_evict_inode()
whether deleting the inode should be deferred or skipped altogether.
Change the function to return the appropriate enum evict_behavior value
to indicate that.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
Rename enum dinode_demise to evict_behavior and its items
SHOULD_DELETE_DINODE to EVICT_SHOULD_DELETE,
SHOULD_NOT_DELETE_DINODE to EVICT_SHOULD_SKIP_DELETE, and
SHOULD_DEFER_EVICTION to EVICT_SHOULD_DEFER_DELETE.
In gfs2_evict_inode(), add a separate variable of type enum
evict_behavior instead of implicitly casting to int.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
The GIF_DEFERRED_DELETE flag indicates an action that gfs2_evict_inode()
should take, so rename the flag to GIF_DEFER_DELETE to clarify.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
Move function needs_demote() to glock.h and rename it to
glock_needs_demote(). In handle_callback(), wake up the glock when
setting the GLF_PENDING_DEMOTE flag as well. (Setting the GLF_DEMOTE
flag already triggered a wake-up.)
With that, check for glock_needs_demote() in gfs2_upgrade_iopen_glock()
to wake up when either of those flags is set for the inode glock: the
faster we can react to contention, the better.
The GLF_PENDING_DEMOTE flag is only used for inode glocks (see
gfs2_glock_cb()) so it's okay to only check for the GLF_DEMOTE flag in
gfs2_drop_inode(). Still, using glock_needs_demote() there as well
makes the code a little easier to read.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
Convert direct io requests to use folios instead of pages.
No functional changes.
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
|
|
Convert writeback requests to use folios instead of pages.
No functional changes.
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
|
|
Convert retrieve requests to use folios instead of pages.
No functional changes.
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
|
|
Convert ioctl requests to use folios instead of pages.
No functional changes.
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
|
|
Convert non-writeback write requests to use folios instead of pages.
No functional changes.
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
|
|
Convert read requests to use folios instead of pages.
No functional changes.
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
|
|
Convert readdir requests to use a folio instead of a page.
No functional changes.
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
|
|
Convert readlink requests to use a folio instead of a page.
No functional changes.
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
|
|
Convert cuse requests to use a folio instead of a page.
No functional changes.
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
|
|
Until all requests have been converted to use folios instead of pages,
virtio will need to support both types. Once all requests have been
converted, then virtio will support just folios.
No functional changes.
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
|
|
This adds support in struct fuse_args_pages and fuse_copy_pages() for
using folios instead of pages for transferring data. Both folios and
pages must be supported right now in struct fuse_args_pages and
fuse_copy_pages() until all request types have been converted to use
folios. Once all have been converted, then
struct fuse_args_pages and fuse_copy_pages() will only support folios.
Right now in fuse, all folios are one page (large folios are not yet
supported). As such, copying folio->page is sufficient for copying
the entire folio in fuse_copy_pages().
No functional changes.
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
|
|
If Client send simultaneous SMB operations to ksmbd, It exhausts too much
memory through the "ksmbd_work_cache”. It will cause OOM issue.
ksmbd has a credit mechanism but it can't handle this problem. This patch
add the check if it exceeds max credits to prevent this problem by assuming
that one smb request consumes at least one credit.
Cc: stable@vger.kernel.org # v5.15+
Reported-by: Norbert Szetei <norbert@doyensec.com>
Tested-by: Norbert Szetei <norbert@doyensec.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
ksmbd_user_session_put should be called under smb3_preauth_hash_rsp().
It will avoid freeing session before calling smb3_preauth_hash_rsp().
Cc: stable@vger.kernel.org # v5.15+
Reported-by: Norbert Szetei <norbert@doyensec.com>
Tested-by: Norbert Szetei <norbert@doyensec.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
There is a race condition between ksmbd_smb2_session_create and
ksmbd_expire_session. This patch add missing sessions_table_lock
while adding/deleting session from global session table.
Cc: stable@vger.kernel.org # v5.15+
Reported-by: Norbert Szetei <norbert@doyensec.com>
Tested-by: Norbert Szetei <norbert@doyensec.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Set FMODE_CAN_ATOMIC_WRITE flag if we can atomic write for that inode.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Tested-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> #On ppc64
|
|
Validate that an atomic write adheres to length/offset rules. Currently
we can only write a single FS block.
For an IOCB with IOCB_ATOMIC set to get as far as xfs_file_write_iter(),
FMODE_CAN_ATOMIC_WRITE will need to be set for the file; for this,
ATOMICWRITES flags would also need to be set for the inode.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
Support providing info on atomic write unit min and max for an inode.
For simplicity, currently we limit the min at the FS block size. As for
max, we limit also at FS block size, as there is no current method to
guarantee extent alignment or granularity for regular files.
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
Support direct I/O atomic writes by producing a single bio with REQ_ATOMIC
flag set.
Initially FSes (XFS) should only support writing a single FS block
atomically.
As with any atomic write, we should produce a single bio which covers the
complete write length.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
[djwong: clarify a couple of things in the docs]
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
The XFS code will need this.
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
Multi-threaded buffered reads to the same file exposed significant
inode spinlock contention in nfs_clear_invalid_mapping().
Eliminate this spinlock contention by checking flags without locking,
instead using smp_rmb and smp_load_acquire accordingly, but then take
spinlock and double-check these inode flags.
Also refactor nfs_set_cache_invalid() slightly to use
smp_store_release() to pair with nfs_clear_invalid_mapping()'s
smp_load_acquire().
While this fix is beneficial for all multi-threaded buffered reads
issued by an NFS client, this issue was identified in the context of
surprisingly low LOCALIO performance with 4K multi-threaded buffered
read IO. This fix dramatically speeds up LOCALIO performance:
before: read: IOPS=1583k, BW=6182MiB/s (6482MB/s)(121GiB/20002msec)
after: read: IOPS=3046k, BW=11.6GiB/s (12.5GB/s)(232GiB/20001msec)
Fixes: 17dfeb911339 ("NFS: Fix races in nfs_revalidate_mapping")
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
|
|
Fix the possibility of racing nfs_local_probe() resulting in:
list_add double add: new=ffff8b99707f9f58, prev=ffff8b99707f9f58, next=ffffffffc0f30000.
------------[ cut here ]------------
kernel BUG at lib/list_debug.c:35!
Add nfs_uuid_init() to properly initialize all nfs_uuid_t members
(particularly its list_head).
Switch to returning bool from nfs_uuid_begin(), returns false if
nfs_uuid_t is already in-use (its list_head is on a list). Update
nfs_local_probe() to return early if the nfs_client's cl_uuid
(nfs_uuid_t) is in-use.
Also, switch nfs_uuid_begin() from using list_add_tail_rcu() to
list_add_tail() -- rculist was used in an earlier version of the
localio code that had a lockless nfs_uuid_lookup interface.
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
|
|
When asked to set both an atime and an mtime to the current system time,
ensure that the setting is atomic by calling inode_update_timestamps()
only once with the appropriate flags.
Fixes: e12912d94137 ("NFSv4: Add support for delegated atime and mtime attributes")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
|
|
When the client does an exclusive create and the server decides to store
the verifier in the timestamps, a SETATTR is subsequently sent to fix up
those timestamps. When that is the case, suppress the exceptions for
attribute delegations in nfs4_bitmap_copy_adjust().
Fixes: 32215c1f893a ("NFSv4: Don't request atime/mtime/size if they are delegated to us")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
|
|
Fix the following KMSAN warning:
CPU: 1 UID: 0 PID: 7651 Comm: cp Tainted: G B
Tainted: [B]=BAD_PAGE
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009)
=====================================================
=====================================================
BUG: KMSAN: uninit-value in decode_getfattr_attrs+0x2d6d/0x2f90
decode_getfattr_attrs+0x2d6d/0x2f90
decode_getfattr_generic+0x806/0xb00
nfs4_xdr_dec_getattr+0x1de/0x240
rpcauth_unwrap_resp_decode+0xab/0x100
rpcauth_unwrap_resp+0x95/0xc0
call_decode+0x4ff/0xb50
__rpc_execute+0x57b/0x19d0
rpc_execute+0x368/0x5e0
rpc_run_task+0xcfe/0xee0
nfs4_proc_getattr+0x5b5/0x990
__nfs_revalidate_inode+0x477/0xd00
nfs_access_get_cached+0x1021/0x1cc0
nfs_do_access+0x9f/0xae0
nfs_permission+0x1e4/0x8c0
inode_permission+0x356/0x6c0
link_path_walk+0x958/0x1330
path_lookupat+0xce/0x6b0
filename_lookup+0x23e/0x770
vfs_statx+0xe7/0x970
vfs_fstatat+0x1f2/0x2c0
__se_sys_newfstatat+0x67/0x880
__x64_sys_newfstatat+0xbd/0x120
x64_sys_call+0x1826/0x3cf0
do_syscall_64+0xd0/0x1b0
entry_SYSCALL_64_after_hwframe+0x77/0x7f
The KMSAN warning is triggered in decode_getfattr_attrs(), when calling
decode_attr_mdsthreshold(). It appears that fattr->mdsthreshold is not
initialized.
Fix the issue by initializing fattr->mdsthreshold to NULL in
nfs_fattr_init().
Cc: stable@vger.kernel.org # v3.5.x
Fixes: 88034c3d88c2 ("NFSv4.1 mdsthreshold attribute xdr")
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
|
|
If a timeout is specified in the mount options, it currently applies to
both the NFS protocol and (with v3) the MOUNT protocol. This is
sensible when they both use the same underlying protocol, or those
protocols are compatible w.r.t timeouts as RDMA and TCP are.
However if, for example, NFS is using TCP and MOUNT is using UDP then
using the same timeout doesn't make much sense.
If you
mount -o vers=3,proto=tcp,mountproto=udp,timeo=600,retrans=5 \
server:/path /mountpoint
then the timeo=600 which was intended for the NFS/TCP request will
apply to the MOUNT/UDP requests with the result that there will only be
one request sent (because UDP has a maximum timeout of 60 seconds).
This is not what a reasonable person might expect.
This patch disables the sharing of timeout information in cases where
the underlying protocols are not compatible.
Fixes: c9301cb35b59 ("nfs: hornor timeo and retrans option when mounting NFSv3")
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
|
|
Recently, we got a customer report that CIFS triggers oops while
reconnecting to a server. [0]
The workload runs on Kubernetes, and some pods mount CIFS servers
in non-root network namespaces. The problem rarely happened, but
it was always while the pod was dying.
The root cause is wrong reference counting for network namespace.
CIFS uses kernel sockets, which do not hold refcnt of the netns that
the socket belongs to. That means CIFS must ensure the socket is
always freed before its netns; otherwise, use-after-free happens.
The repro steps are roughly:
1. mount CIFS in a non-root netns
2. drop packets from the netns
3. destroy the netns
4. unmount CIFS
We can reproduce the issue quickly with the script [1] below and see
the splat [2] if CONFIG_NET_NS_REFCNT_TRACKER is enabled.
When the socket is TCP, it is hard to guarantee the netns lifetime
without holding refcnt due to async timers.
Let's hold netns refcnt for each socket as done for SMC in commit
9744d2bf1976 ("smc: Fix use-after-free in tcp_write_timer_handler().").
Note that we need to move put_net() from cifs_put_tcp_session() to
clean_demultiplex_info(); otherwise, __sock_create() still could touch a
freed netns while cifsd tries to reconnect from cifs_demultiplex_thread().
Also, maybe_get_net() cannot be put just before __sock_create() because
the code is not under RCU and there is a small chance that the same
address happened to be reallocated to another netns.
[0]:
CIFS: VFS: \\XXXXXXXXXXX has not responded in 15 seconds. Reconnecting...
CIFS: Serverclose failed 4 times, giving up
Unable to handle kernel paging request at virtual address 14de99e461f84a07
Mem abort info:
ESR = 0x0000000096000004
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x04: level 0 translation fault
Data abort info:
ISV = 0, ISS = 0x00000004
CM = 0, WnR = 0
[14de99e461f84a07] address between user and kernel address ranges
Internal error: Oops: 0000000096000004 [#1] SMP
Modules linked in: cls_bpf sch_ingress nls_utf8 cifs cifs_arc4 cifs_md4 dns_resolver tcp_diag inet_diag veth xt_state xt_connmark nf_conntrack_netlink xt_nat xt_statistic xt_MASQUERADE xt_mark xt_addrtype ipt_REJECT nf_reject_ipv4 nft_chain_nat nf_nat xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_comment nft_compat nf_tables nfnetlink overlay nls_ascii nls_cp437 sunrpc vfat fat aes_ce_blk aes_ce_cipher ghash_ce sm4_ce_cipher sm4 sm3_ce sm3 sha3_ce sha512_ce sha512_arm64 sha1_ce ena button sch_fq_codel loop fuse configfs dmi_sysfs sha2_ce sha256_arm64 dm_mirror dm_region_hash dm_log dm_mod dax efivarfs
CPU: 5 PID: 2690970 Comm: cifsd Not tainted 6.1.103-109.184.amzn2023.aarch64 #1
Hardware name: Amazon EC2 r7g.4xlarge/, BIOS 1.0 11/1/2018
pstate: 00400005 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : fib_rules_lookup+0x44/0x238
lr : __fib_lookup+0x64/0xbc
sp : ffff8000265db790
x29: ffff8000265db790 x28: 0000000000000000 x27: 000000000000bd01
x26: 0000000000000000 x25: ffff000b4baf8000 x24: ffff00047b5e4580
x23: ffff8000265db7e0 x22: 0000000000000000 x21: ffff00047b5e4500
x20: ffff0010e3f694f8 x19: 14de99e461f849f7 x18: 0000000000000000
x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
x14: 0000000000000000 x13: 0000000000000000 x12: 3f92800abd010002
x11: 0000000000000001 x10: ffff0010e3f69420 x9 : ffff800008a6f294
x8 : 0000000000000000 x7 : 0000000000000006 x6 : 0000000000000000
x5 : 0000000000000001 x4 : ffff001924354280 x3 : ffff8000265db7e0
x2 : 0000000000000000 x1 : ffff0010e3f694f8 x0 : ffff00047b5e4500
Call trace:
fib_rules_lookup+0x44/0x238
__fib_lookup+0x64/0xbc
ip_route_output_key_hash_rcu+0x2c4/0x398
ip_route_output_key_hash+0x60/0x8c
tcp_v4_connect+0x290/0x488
__inet_stream_connect+0x108/0x3d0
inet_stream_connect+0x50/0x78
kernel_connect+0x6c/0xac
generic_ip_connect+0x10c/0x6c8 [cifs]
__reconnect_target_unlocked+0xa0/0x214 [cifs]
reconnect_dfs_server+0x144/0x460 [cifs]
cifs_reconnect+0x88/0x148 [cifs]
cifs_readv_from_socket+0x230/0x430 [cifs]
cifs_read_from_socket+0x74/0xa8 [cifs]
cifs_demultiplex_thread+0xf8/0x704 [cifs]
kthread+0xd0/0xd4
Code: aa0003f8 f8480f13 eb18027f 540006c0 (b9401264)
[1]:
CIFS_CRED="/root/cred.cifs"
CIFS_USER="Administrator"
CIFS_PASS="Password"
CIFS_IP="X.X.X.X"
CIFS_PATH="//${CIFS_IP}/Users/Administrator/Desktop/CIFS_TEST"
CIFS_MNT="/mnt/smb"
DEV="enp0s3"
cat <<EOF > ${CIFS_CRED}
username=${CIFS_USER}
password=${CIFS_PASS}
domain=EXAMPLE.COM
EOF
unshare -n bash -c "
mkdir -p ${CIFS_MNT}
ip netns attach root 1
ip link add eth0 type veth peer veth0 netns root
ip link set eth0 up
ip -n root link set veth0 up
ip addr add 192.168.0.2/24 dev eth0
ip -n root addr add 192.168.0.1/24 dev veth0
ip route add default via 192.168.0.1 dev eth0
ip netns exec root sysctl net.ipv4.ip_forward=1
ip netns exec root iptables -t nat -A POSTROUTING -s 192.168.0.2 -o ${DEV} -j MASQUERADE
mount -t cifs ${CIFS_PATH} ${CIFS_MNT} -o vers=3.0,sec=ntlmssp,credentials=${CIFS_CRED},rsize=65536,wsize=65536,cache=none,echo_interval=1
touch ${CIFS_MNT}/a.txt
ip netns exec root iptables -t nat -D POSTROUTING -s 192.168.0.2 -o ${DEV} -j MASQUERADE
"
umount ${CIFS_MNT}
[2]:
ref_tracker: net notrefcnt@000000004bbc008d has 1/1 users at
sk_alloc (./include/net/net_namespace.h:339 net/core/sock.c:2227)
inet_create (net/ipv4/af_inet.c:326 net/ipv4/af_inet.c:252)
__sock_create (net/socket.c:1576)
generic_ip_connect (fs/smb/client/connect.c:3075)
cifs_get_tcp_session.part.0 (fs/smb/client/connect.c:3160 fs/smb/client/connect.c:1798)
cifs_mount_get_session (fs/smb/client/trace.h:959 fs/smb/client/connect.c:3366)
dfs_mount_share (fs/smb/client/dfs.c:63 fs/smb/client/dfs.c:285)
cifs_mount (fs/smb/client/connect.c:3622)
cifs_smb3_do_mount (fs/smb/client/cifsfs.c:949)
smb3_get_tree (fs/smb/client/fs_context.c:784 fs/smb/client/fs_context.c:802 fs/smb/client/fs_context.c:794)
vfs_get_tree (fs/super.c:1800)
path_mount (fs/namespace.c:3508 fs/namespace.c:3834)
__x64_sys_mount (fs/namespace.c:3848 fs/namespace.c:4057 fs/namespace.c:4034 fs/namespace.c:4034)
do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
Fixes: 26abe14379f8 ("net: Modify sk_alloc to not reference count the netns of kernel sockets.")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Acked-by: Tom Talpey <tom@talpey.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"17 hotfixes. 9 are cc:stable. 13 are MM and 4 are non-MM.
The usual collection of singletons - please see the changelogs"
* tag 'mm-hotfixes-stable-2024-11-03-10-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
mm: multi-gen LRU: use {ptep,pmdp}_clear_young_notify()
mm: multi-gen LRU: remove MM_LEAF_OLD and MM_NONLEAF_TOTAL stats
mm, mmap: limit THP alignment of anonymous mappings to PMD-aligned sizes
mm: shrinker: avoid memleak in alloc_shrinker_info
.mailmap: update e-mail address for Eugen Hristev
vmscan,migrate: fix page count imbalance on node stats when demoting pages
mailmap: update Jarkko's email addresses
mm: allow set/clear page_type again
nilfs2: fix potential deadlock with newly created symlinks
Squashfs: fix variable overflow in squashfs_readpage_block
kasan: remove vmalloc_percpu test
tools/mm: -Werror fixes in page-types/slabinfo
mm, swap: avoid over reclaim of full clusters
mm: fix PSWPIN counter for large folios swap-in
mm: avoid VM_BUG_ON when try to map an anon large folio to zero page.
mm/codetag: fix null pointer check logic for ref and tag
mm/gup: stop leaking pinned pages in low memory conditions
|
|
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
lift setting ->revents into the caller, so that failure exits (including
the early one) would be plain returns.
We need the scope of our struct fd to end before the store to ->revents,
since that's shared with the failure exits prior to the point where we
can do fdget().
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
take the logics from fdget() to fdput() into an inlined helper - with existing
wait_key_set() subsumed into that.
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|