diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2022-08-02 13:37:55 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2022-08-02 13:37:55 -0700 |
commit | 42df1cbf6a4726934cc5dac12bf263aa73c49fa3 (patch) | |
tree | 129e32104dccc660c3c8fca03b8bba418e572d09 /net/ipv4/tcp.c | |
parent | 98e247464088a11ce2328a214fdb87d4c06f8db6 (diff) | |
parent | 14b146b688ad9593f5eee93d51a34d09a47e50b5 (diff) |
Merge tag 'for-5.20/io_uring-zerocopy-send-2022-07-29' of git://git.kernel.dk/linux-block
Pull io_uring zerocopy support from Jens Axboe:
"This adds support for efficient support for zerocopy sends through
io_uring. Both ipv4 and ipv6 is supported, as well as both TCP and
UDP.
The core network changes to support this is in a stable branch from
Jakub that both io_uring and net-next has pulled in, and the io_uring
changes are layered on top of that.
All of the work has been done by Pavel"
* tag 'for-5.20/io_uring-zerocopy-send-2022-07-29' of git://git.kernel.dk/linux-block: (34 commits)
io_uring: notification completion optimisation
io_uring: export req alloc from core
io_uring/net: use unsigned for flags
io_uring/net: make page accounting more consistent
io_uring/net: checks errors of zc mem accounting
io_uring/net: improve io_get_notif_slot types
selftests/io_uring: test zerocopy send
io_uring: enable managed frags with register buffers
io_uring: add zc notification flush requests
io_uring: rename IORING_OP_FILES_UPDATE
io_uring: flush notifiers after sendzc
io_uring: sendzc with fixed buffers
io_uring: allow to pass addr into sendzc
io_uring: account locked pages for non-fixed zc
io_uring: wire send zc request type
io_uring: add notification slot registration
io_uring: add rsrc referencing for notifiers
io_uring: complete notifiers in tw
io_uring: cache struct io_notif
io_uring: add zc notification infrastructure
...
Diffstat (limited to 'net/ipv4/tcp.c')
-rw-r--r-- | net/ipv4/tcp.c | 33 |
1 files changed, 22 insertions, 11 deletions
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 766881775abb..a628daa51113 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -1203,17 +1203,23 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size) flags = msg->msg_flags; - if (flags & MSG_ZEROCOPY && size && sock_flag(sk, SOCK_ZEROCOPY)) { + if ((flags & MSG_ZEROCOPY) && size) { skb = tcp_write_queue_tail(sk); - uarg = msg_zerocopy_realloc(sk, size, skb_zcopy(skb)); - if (!uarg) { - err = -ENOBUFS; - goto out_err; - } - zc = sk->sk_route_caps & NETIF_F_SG; - if (!zc) - uarg->zerocopy = 0; + if (msg->msg_ubuf) { + uarg = msg->msg_ubuf; + net_zcopy_get(uarg); + zc = sk->sk_route_caps & NETIF_F_SG; + } else if (sock_flag(sk, SOCK_ZEROCOPY)) { + uarg = msg_zerocopy_realloc(sk, size, skb_zcopy(skb)); + if (!uarg) { + err = -ENOBUFS; + goto out_err; + } + zc = sk->sk_route_caps & NETIF_F_SG; + if (!zc) + uarg->zerocopy = 0; + } } if (unlikely(flags & MSG_FASTOPEN || inet_sk(sk)->defer_connect) && @@ -1336,8 +1342,13 @@ new_segment: copy = min_t(int, copy, pfrag->size - pfrag->offset); - if (tcp_downgrade_zcopy_pure(sk, skb) || - !sk_wmem_schedule(sk, copy)) + if (unlikely(skb_zcopy_pure(skb) || skb_zcopy_managed(skb))) { + if (tcp_downgrade_zcopy_pure(sk, skb)) + goto wait_for_space; + skb_zcopy_downgrade_managed(skb); + } + + if (!sk_wmem_schedule(sk, copy)) goto wait_for_space; err = skb_copy_to_page_nocache(sk, &msg->msg_iter, skb, |