summaryrefslogtreecommitdiff
path: root/net
diff options
context:
space:
mode:
authorIlya Maximets <i.maximets@ovn.org>2023-07-03 19:53:29 +0200
committerDaniel Borkmann <daniel@iogearbox.net>2023-07-04 10:19:48 +0200
commitf7306acec9aae9893d15e745c8791124d42ab10a (patch)
treeb9b44039b1c0585cfd045109f069ed31b8fa6c1a /net
parent3de4d22cc9ac7c9f38e10edcf54f9a8891a9c2aa (diff)
xsk: Honor SO_BINDTODEVICE on bind
Initial creation of an AF_XDP socket requires CAP_NET_RAW capability. A privileged process might create the socket and pass it to a non-privileged process for later use. However, that process will be able to bind the socket to any network interface. Even though it will not be able to receive any traffic without modification of the BPF map, the situation is not ideal. Sockets already have a mechanism that can be used to restrict what interface they can be attached to. That is SO_BINDTODEVICE. To change the SO_BINDTODEVICE binding the process will need CAP_NET_RAW. Make xsk_bind() honor the SO_BINDTODEVICE in order to allow safer workflow when non-privileged process is using AF_XDP. The intended workflow is following: 1. First process creates a bare socket with socket(AF_XDP, ...). 2. First process loads the XSK program to the interface. 3. First process adds the socket fd to a BPF map. 4. First process ties socket fd to a particular interface using SO_BINDTODEVICE. 5. First process sends socket fd to a second process. 6. Second process allocates UMEM. 7. Second process binds socket to the interface with bind(...). 8. Second process sends/receives the traffic. All the steps above are possible today if the first process is privileged and the second one has sufficient RLIMIT_MEMLOCK and no capabilities. However, the second process will be able to bind the socket to any interface it wants on step 7 and send traffic from it. With the proposed change, the second process will be able to bind the socket only to a specific interface chosen by the first process at step 4. Fixes: 965a99098443 ("xsk: add support for bind for Rx") Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/bpf/20230703175329.3259672-1-i.maximets@ovn.org
Diffstat (limited to 'net')
-rw-r--r--net/xdp/xsk.c5
1 files changed, 5 insertions, 0 deletions
diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 5a8c0dd250af..31dca4ecb2c5 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -886,6 +886,7 @@ static int xsk_bind(struct socket *sock, struct sockaddr *addr, int addr_len)
struct sock *sk = sock->sk;
struct xdp_sock *xs = xdp_sk(sk);
struct net_device *dev;
+ int bound_dev_if;
u32 flags, qid;
int err = 0;
@@ -899,6 +900,10 @@ static int xsk_bind(struct socket *sock, struct sockaddr *addr, int addr_len)
XDP_USE_NEED_WAKEUP))
return -EINVAL;
+ bound_dev_if = READ_ONCE(sk->sk_bound_dev_if);
+ if (bound_dev_if && bound_dev_if != sxdp->sxdp_ifindex)
+ return -EINVAL;
+
rtnl_lock();
mutex_lock(&xs->mutex);
if (xs->state != XSK_READY) {