summaryrefslogtreecommitdiff
path: root/net/tipc/socket.c
AgeCommit message (Collapse)Author
2021-08-16tipc: call tipc_wait_for_connect only when dlen is not 0Xin Long
__tipc_sendmsg() is called to send SYN packet by either tipc_sendmsg() or tipc_connect(). The difference is in tipc_connect(), it will call tipc_wait_for_connect() after __tipc_sendmsg() to wait until connecting is done. So there's no need to wait in __tipc_sendmsg() for this case. This patch is to fix it by calling tipc_wait_for_connect() only when dlen is not 0 in __tipc_sendmsg(), which means it's called by tipc_connect(). Note this also fixes the failure in tipcutils/test/ptts/: # ./tipcTS & # ./tipcTC 9 (hang) Fixes: 36239dab6da7 ("tipc: fix implicit-connect for SYN+") Reported-by: Shuang Li <shuali@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-23tipc: fix sleeping in tipc accept routineHoang Le
The release_sock() is blocking function, it would change the state after sleeping. In order to evaluate the stated condition outside the socket lock context, switch to use wait_woken() instead. Fixes: 6398e23cdb1d8 ("tipc: standardize accept routine") Acked-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-23tipc: fix implicit-connect for SYN+Xin Long
For implicit-connect, when it's either SYN- or SYN+, an ACK should be sent back to the client immediately. It's not appropriate for the client to enter established state only after receiving data from the server. On client side, after the SYN is sent out, tipc_wait_for_connect() should be called to wait for the ACK if timeout is set. This patch also restricts __tipc_sendstream() to call __sendmsg() only when it's in TIPC_OPEN state, so that the client can program in a single loop doing both connecting and data sending like: for (...) sendmsg(dest, buf); This makes the implicit-connect more implicit. Fixes: b97bf3fd8f6a ("[TIPC] Initial merge") Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10tipc: socket.c: fix the use of copular verbgushengxian
Fix the use of copular verb. Signed-off-by: gushengxian <gushengxian@yulong.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-03tipc: simplify handling of lookup scope during multicast message receptionJon Maloy
We introduce a new macro TIPC_ANY_SCOPE to make the handling of the lookup scope value more comprehensible during multicast reception. The (unchanged) rules go as follows: 1) Multicast messages sent from own node are delivered to all matching sockets on the own node, irrespective of their binding scope. 2) Multicast messages sent from other nodes arrive here because they have found TIPC_CLUSTER_SCOPE bindings emanating from this node. Those messages should be delivered to exactly those sockets, but not to local sockets bound with TIPC_NODE_SCOPE, since the latter obviously were not meant to be visible for those senders. 3) Group multicast/broadcast messages are delivered to the sockets with a binding scope matching exactly the lookup scope indicated in the message header, and nobody else. Reviewed-by: Xin Long <lucien.xin@gmail.com> Tested-by: Hoang Le <hoang.h.le@dektech.com.au> Signed-off-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-03tipc: refactor function tipc_sk_anc_data_recv()Jon Maloy
We refactor tipc_sk_anc_data_recv() to make it slightly more comprehensible, but also to facilitate application of some additions to the code in a future commit. Reviewed-by: Xin Long <lucien.xin@gmail.com> Tested-by: Hoang Le <hoang.h.le@dektech.com.au> Signed-off-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-03tipc: eliminate redundant fields in struct tipc_sockJon Maloy
We eliminate the redundant fields conn_type and conn_instance in struct tipc_sock. On the connecting side, this information is already present in the unused (after the connection is established) part of the pre-allocated header, and on the accepting side, we put it there when the new socket is created. Reviewed-by: Xin Long <lucien.xin@gmail.com> Tested-by: Hoang Le <hoang.h.le@dektech.com.au> Signed-off-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-14Revert "net:tipc: Fix a double free in tipc_sk_mcast_rcv"Hoang Le
This reverts commit 6bf24dc0cc0cc43b29ba344b66d78590e687e046. Above fix is not correct and caused memory leak issue. Fixes: 6bf24dc0cc0c ("net:tipc: Fix a double free in tipc_sk_mcast_rcv") Acked-by: Jon Maloy <jmaloy@redhat.com> Acked-by: Tung Nguyen <tung.q.nguyen@dektech.com.au> Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Conflicts: MAINTAINERS - keep Chandrasekar drivers/net/ethernet/mellanox/mlx5/core/en_main.c - simple fix + trust the code re-added to param.c in -next is fine include/linux/bpf.h - trivial include/linux/ethtool.h - trivial, fix kdoc while at it include/linux/skmsg.h - move to relevant place in tcp.c, comment re-wrapped net/core/skmsg.c - add the sk = sk // sk = NULL around calls net/tipc/crypto.c - trivial Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-03-29tipc: fix htmldoc and smatch warningsJon Maloy
We fix a warning from the htmldoc tool and an indentation error reported by smatch. There are no functional changes in this commit. Signed-off-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-29net:tipc: Fix a double free in tipc_sk_mcast_rcvLv Yunlong
In the if(skb_peek(arrvq) == skb) branch, it calls __skb_dequeue(arrvq) to get the skb by skb = skb_peek(arrvq). Then __skb_dequeue() unlinks the skb from arrvq and returns the skb which equals to skb_peek(arrvq). After __skb_dequeue(arrvq) finished, the skb is freed by kfree_skb(__skb_dequeue(arrvq)) in the first time. Unfortunately, the same skb is freed in the second time by kfree_skb(skb) after the branch completed. My patch removes kfree_skb() in the if(skb_peek(arrvq) == skb) branch, because this skb will be freed by kfree_skb(skb) finally. Fixes: cb1b728096f54 ("tipc: eliminate race condition at multicast reception") Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-17tipc: simplify signature of tipc_find_service()Jon Maloy
We reduce the signature of tipc_find_service() and tipc_create_service(). The reason for doing this might not be obvious, but we plan to let struct tipc_uaddr contain information that is relevant for these functions in a later commit. Signed-off-by: Jon Maloy <jmaloy@redhat.com> Acked-by: Ying Xue <ying.xue@windriver.com> Acked-by: Hoang Le <hoang.h.le@dektech.com.au> Acked-by: Tung Nguyen <tung.q.nguyen@dektech.com.au> Acked-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-17tipc: simplify signature of tipc_nametbl_lookup_group()Jon Maloy
We reduce the signature of tipc_nametbl_lookup_group() by using a struct tipc_uaddr pointer. This entails a couple of minor changes in the functions tipc_send_group_mcast/anycast/unicast/bcast() in socket.c Signed-off-by: Jon Maloy <jmaloy@redhat.com> Acked-by: Ying Xue <ying.xue@windriver.com> Acked-by: Hoang Le <hoang.h.le@dektech.com.au> Acked-by: Tung Nguyen <tung.q.nguyen@dektech.com.au> Acked-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-17tipc: simplify signature of tipc_nametbl_lookup_mcast_nodes()Jon Maloy
We follow up the preceding commits by reducing the signature of the function tipc_nametbl_lookup_mcast_nodes(). Signed-off-by: Jon Maloy <jmaloy@redhat.com> Acked-by: Ying Xue <ying.xue@windriver.com> Acked-by: Hoang Le <hoang.h.le@dektech.com.au> Acked-by: Tung Nguyen <tung.q.nguyen@dektech.com.au> Acked-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-17tipc: simplify signature of tipc_namtbl_lookup_mcast_sockets()Jon Maloy
We reduce the signature of this function according to the same principle as the preceding commits. Signed-off-by: Jon Maloy <jmaloy@redhat.com> Acked-by: Ying Xue <ying.xue@windriver.com> Acked-by: Hoang Le <hoang.h.le@dektech.com.au> Acked-by: Tung Nguyen <tung.q.nguyen@dektech.com.au> Acked-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-17tipc: refactor tipc_sendmsg() and tipc_lookup_anycast()Jon Maloy
We simplify the signature if function tipc_nametbl_lookup_anycast(), using address structures instead of discrete integers. This also makes it possible to make some improvements to the functions __tipc_sendmsg() in socket.c and tipc_msg_lookup_dest() in msg.c. Signed-off-by: Jon Maloy <jmaloy@redhat.com> Acked-by: Ying Xue <ying.xue@windriver.com> Acked-by: Hoang Le <hoang.h.le@dektech.com.au> Acked-by: Tung Nguyen <tung.q.nguyen@dektech.com.au> Acked-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-17tipc: rename binding table lookup functionsJon Maloy
The binding table provides four different lookup functions, which purpose is not obvious neither by their names nor by the (lack of) descriptions. We now give these functions names that better match their purposes, and improve the comments that describe what they are doing. Signed-off-by: Jon Maloy <jmaloy@redhat.com> Acked-by: Ying Xue <ying.xue@windriver.com> Acked-by: Hoang Le <hoang.h.le@dektech.com.au> Acked-by: Tung Nguyen <tung.q.nguyen@dektech.com.au> Acked-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-17tipc: simplify signature of tipc_nametbl_withdraw() functionsJon Maloy
Following the principles of the preceding commits, we reduce the number of parameters passed along in tipc_sk_withdraw(), tipc_nametbl_withdraw() and associated functions. Signed-off-by: Jon Maloy <jmaloy@redhat.com> Acked-by: Ying Xue <ying.xue@windriver.com> Acked-by: Hoang Le <hoang.h.le@dektech.com.au> Acked-by: Tung Nguyen <tung.q.nguyen@dektech.com.au> Acked-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-17tipc: simplify signature of tipc_namtbl_publish()Jon Maloy
Using the new address structure tipc_uaddr, we simplify the signature of function tipc_sk_publish() and tipc_namtbl_publish() so that fewer parameters need to be passed around. Signed-off-by: Jon Maloy <jmaloy@redhat.com> Acked-by: Ying Xue <ying.xue@windriver.com> Acked-by: Hoang Le <hoang.h.le@dektech.com.au> Acked-by: Tung Nguyen <tung.q.nguyen@dektech.com.au> Acked-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-17tipc: re-organize members of struct publicationJon Maloy
In a future commit we will introduce more members to struct publication. In order to keep this structure comprehensible we now group some of its current fields into the sub-structures where they really belong, - A struct tipc_service_range for the functional address the publication is representing. - A struct tipc_socket_addr for the socket bound to that service range. We also rename the stack variable 'publ' to just 'p' in a few places. This is just as easy to understand in the given context, and keeps the number of wrapped code lines to a minimum. There are no functional changes in this commit. Signed-off-by: Jon Maloy <jmaloy@redhat.com> Acked-by: Ying Xue <ying.xue@windriver.com> Acked-by: Hoang Le <hoang.h.le@dektech.com.au> Acked-by: Tung Nguyen <tung.q.nguyen@dektech.com.au> Acked-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-01net/tipc: fix all function Return: notationRandy Dunlap
Fix Return: kernel-doc notation in all net/tipc/ source files. Also keep ReST list notation intact for output formatting. Fix a few typos in comments. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-01net/tipc: fix socket.c kernel-docRandy Dunlap
Fix socket.c kernel-doc warnings in preparation for adding to the networking docbook. Also, for rcvbuf_limit(), use bullet notation so that the lines do not run together. ../net/tipc/socket.c:130: warning: Function parameter or member 'cong_links' not described in 'tipc_sock' ../net/tipc/socket.c:130: warning: Function parameter or member 'probe_unacked' not described in 'tipc_sock' ../net/tipc/socket.c:130: warning: Function parameter or member 'snd_win' not described in 'tipc_sock' ../net/tipc/socket.c:130: warning: Function parameter or member 'peer_caps' not described in 'tipc_sock' ../net/tipc/socket.c:130: warning: Function parameter or member 'rcv_win' not described in 'tipc_sock' ../net/tipc/socket.c:130: warning: Function parameter or member 'group' not described in 'tipc_sock' ../net/tipc/socket.c:130: warning: Function parameter or member 'oneway' not described in 'tipc_sock' ../net/tipc/socket.c:130: warning: Function parameter or member 'nagle_start' not described in 'tipc_sock' ../net/tipc/socket.c:130: warning: Function parameter or member 'snd_backlog' not described in 'tipc_sock' ../net/tipc/socket.c:130: warning: Function parameter or member 'msg_acc' not described in 'tipc_sock' ../net/tipc/socket.c:130: warning: Function parameter or member 'pkt_cnt' not described in 'tipc_sock' ../net/tipc/socket.c:130: warning: Function parameter or member 'expect_ack' not described in 'tipc_sock' ../net/tipc/socket.c:130: warning: Function parameter or member 'nodelay' not described in 'tipc_sock' ../net/tipc/socket.c:130: warning: Function parameter or member 'group_is_open' not described in 'tipc_sock' ../net/tipc/socket.c:267: warning: Function parameter or member 'sk' not described in 'tsk_advance_rx_queue' ../net/tipc/socket.c:295: warning: Function parameter or member 'sk' not described in 'tsk_rej_rx_queue' ../net/tipc/socket.c:295: warning: Function parameter or member 'error' not described in 'tsk_rej_rx_queue' ../net/tipc/socket.c:894: warning: Function parameter or member 'tsk' not described in 'tipc_send_group_msg' ../net/tipc/socket.c:1187: warning: Function parameter or member 'net' not described in 'tipc_sk_mcast_rcv' ../net/tipc/socket.c:1323: warning: Function parameter or member 'inputq' not described in 'tipc_sk_conn_proto_rcv' ../net/tipc/socket.c:1323: warning: Function parameter or member 'xmitq' not described in 'tipc_sk_conn_proto_rcv' ../net/tipc/socket.c:1885: warning: Function parameter or member 'sock' not described in 'tipc_recvmsg' ../net/tipc/socket.c:1993: warning: Function parameter or member 'sock' not described in 'tipc_recvstream' ../net/tipc/socket.c:2313: warning: Function parameter or member 'xmitq' not described in 'tipc_sk_filter_rcv' ../net/tipc/socket.c:2404: warning: Function parameter or member 'xmitq' not described in 'tipc_sk_enqueue' ../net/tipc/socket.c:2456: warning: Function parameter or member 'net' not described in 'tipc_sk_rcv' ../net/tipc/socket.c:2693: warning: Function parameter or member 'kern' not described in 'tipc_accept' ../net/tipc/socket.c:3816: warning: Excess function parameter 'sysctl_tipc_sk_filter' description in 'tipc_sk_filtering' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-01net/tipc: fix various kernel-doc warningsRandy Dunlap
kernel-doc and Sphinx fixes to eliminate lots of warnings in preparation for adding to the networking docbook. ../net/tipc/crypto.c:57: warning: cannot understand function prototype: 'enum ' ../net/tipc/crypto.c:69: warning: cannot understand function prototype: 'enum ' ../net/tipc/crypto.c:130: warning: Function parameter or member 'tfm' not described in 'tipc_tfm' ../net/tipc/crypto.c:130: warning: Function parameter or member 'list' not described in 'tipc_tfm' ../net/tipc/crypto.c:172: warning: Function parameter or member 'stat' not described in 'tipc_crypto_stats' ../net/tipc/crypto.c:232: warning: Function parameter or member 'flags' not described in 'tipc_crypto' ../net/tipc/crypto.c:329: warning: Function parameter or member 'ukey' not described in 'tipc_aead_key_validate' ../net/tipc/crypto.c:329: warning: Function parameter or member 'info' not described in 'tipc_aead_key_validate' ../net/tipc/crypto.c:482: warning: Function parameter or member 'aead' not described in 'tipc_aead_tfm_next' ../net/tipc/trace.c:43: warning: cannot understand function prototype: 'unsigned long sysctl_tipc_sk_filter[5] __read_mostly = ' Documentation/networking/tipc:57: ../net/tipc/msg.c:584: WARNING: Unexpected indentation. Documentation/networking/tipc:63: ../net/tipc/name_table.c:536: WARNING: Unexpected indentation. Documentation/networking/tipc:63: ../net/tipc/name_table.c:537: WARNING: Block quote ends without a blank line; unexpected unindent. Documentation/networking/tipc:78: ../net/tipc/socket.c:3809: WARNING: Unexpected indentation. Documentation/networking/tipc:78: ../net/tipc/socket.c:3807: WARNING: Inline strong start-string without end-string. Documentation/networking/tipc:72: ../net/tipc/node.c:904: WARNING: Unexpected indentation. Documentation/networking/tipc:39: ../net/tipc/crypto.c:97: WARNING: Block quote ends without a blank line; unexpected unindent. Documentation/networking/tipc:39: ../net/tipc/crypto.c:98: WARNING: Block quote ends without a blank line; unexpected unindent. Documentation/networking/tipc:39: ../net/tipc/crypto.c:141: WARNING: Inline strong start-string without end-string. ../net/tipc/discover.c:82: warning: Function parameter or member 'skb' not described in 'tipc_disc_init_msg' ../net/tipc/msg.c:69: warning: Function parameter or member 'gfp' not described in 'tipc_buf_acquire' ../net/tipc/msg.c:382: warning: Function parameter or member 'offset' not described in 'tipc_msg_build' ../net/tipc/msg.c:708: warning: Function parameter or member 'net' not described in 'tipc_msg_lookup_dest' ../net/tipc/subscr.c:65: warning: Function parameter or member 'seq' not described in 'tipc_sub_check_overlap' ../net/tipc/subscr.c:65: warning: Function parameter or member 'found_lower' not described in 'tipc_sub_check_overlap' ../net/tipc/subscr.c:65: warning: Function parameter or member 'found_upper' not described in 'tipc_sub_check_overlap' ../net/tipc/udp_media.c:75: warning: Function parameter or member 'proto' not described in 'udp_media_addr' ../net/tipc/udp_media.c:75: warning: Function parameter or member 'port' not described in 'udp_media_addr' ../net/tipc/udp_media.c:75: warning: Function parameter or member 'ipv4' not described in 'udp_media_addr' ../net/tipc/udp_media.c:75: warning: Function parameter or member 'ipv6' not described in 'udp_media_addr' ../net/tipc/udp_media.c:98: warning: Function parameter or member 'rcast' not described in 'udp_bearer' Also fixed a typo of "duest" to "dest". Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-27tipc: update address terminology in codeJon Maloy
We update the terminology in the code so that deprecated structure names and macros are replaced with those currently recommended in the user API. struct tipc_portid -> struct tipc_socket_addr struct tipc_name -> struct tipc_service_addr struct tipc_name_seq -> struct tipc_service_range TIPC_ADDR_ID -> TIPC_SOCKET_ADDR TIPC_ADDR_NAME -> TIPC_SERVICE_ADDR TIPC_ADDR_NAMESEQ -> TIPC_SERVICE_RANGE TIPC_CFG_SRV -> TIPC_NODE_STATE Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-27tipc: refactor tipc_sk_bind() functionJon Maloy
We refactor the tipc_sk_bind() function, so that the lock handling is handled separately from the logics. We also move some sanity tests to earlier in the call chain, to the function tipc_bind(). Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-30tipc: add stricter control of reserved service typesJon Maloy
TIPC reserves 64 service types for current and future internal use. Therefore, the bind() function is meant to block regular user sockets from being bound to these values, while it should let through such bindings from internal users. However, since we at the design moment saw no way to distinguish between regular and internal users the filter function ended up with allowing all bindings of the reserved types which were really in use ([0,1]), and block all the rest ([2,63]). This is risky, since a regular user may bind to the service type representing the topology server (TIPC_TOP_SRV == 1) or the one used for indicating neighboring node status (TIPC_CFG_SRV == 0), and wreak havoc for users of those services, i.e., most users. The reality is however that TIPC_CFG_SRV never is bound through the bind() function, since it doesn't represent a regular socket, and TIPC_TOP_SRV can also be made to bypass the checks in tipc_bind() by introducing a different entry function, tipc_sk_bind(). It should be noted that although this is a change of the API semantics, there is no risk we will break any currently working applications by doing this. Any application trying to bind to the values in question would be badly broken from the outset, so there is no chance we would find any such applications in real-world production systems. v2: Added warning printout when a user is blocked from binding, as suggested by Jakub Kicinski Acked-by: Yung Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jmaloy@redhat.com> Link: https://lore.kernel.org/r/20201030012938.489557-1-jmaloy@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller
Two minor conflicts: 1) net/ipv4/route.c, adding a new local variable while moving another local variable and removing it's initial assignment. 2) drivers/net/dsa/microchip/ksz9477.c, overlapping changes. One pretty prints the port mode differently, whilst another changes the driver to try and obtain the port mode from the port node rather than the switch node. Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-18net: tipc: delete duplicated wordsRandy Dunlap
Drop repeated words in net/tipc/. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Jon Maloy <jmaloy@redhat.com> Cc: Ying Xue <ying.xue@windriver.com> Cc: tipc-discussion@lists.sourceforge.net Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-10tipc: fix shutdown() of connection oriented socketTetsuo Handa
I confirmed that the problem fixed by commit 2a63866c8b51a3f7 ("tipc: fix shutdown() of connectionless socket") also applies to stream socket. ---------- #include <sys/socket.h> #include <unistd.h> #include <sys/wait.h> int main(int argc, char *argv[]) { int fds[2] = { -1, -1 }; socketpair(PF_TIPC, SOCK_STREAM /* or SOCK_DGRAM */, 0, fds); if (fork() == 0) _exit(read(fds[0], NULL, 1)); shutdown(fds[0], SHUT_RDWR); /* This must make read() return. */ wait(NULL); /* To be woken up by _exit(). */ return 0; } ---------- Since shutdown(SHUT_RDWR) should affect all processes sharing that socket, unconditionally setting sk->sk_shutdown to SHUTDOWN_MASK will be the right behavior. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
We got slightly different patches removing a double word in a comment in net/ipv4/raw.c - picked the version from net. Simple conflict in drivers/net/ethernet/ibm/ibmvnic.c. Use cached values instead of VNIC login response buffer (following what commit 507ebe6444a4 ("ibmvnic: Fix use-after-free of VNIC login response buffer") did). Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds
Pull networking fixes from David Miller: 1) Use netif_rx_ni() when necessary in batman-adv stack, from Jussi Kivilinna. 2) Fix loss of RTT samples in rxrpc, from David Howells. 3) Memory leak in hns_nic_dev_probe(), from Dignhao Liu. 4) ravb module cannot be unloaded, fix from Yuusuke Ashizuka. 5) We disable BH for too lokng in sctp_get_port_local(), add a cond_resched() here as well, from Xin Long. 6) Fix memory leak in st95hf_in_send_cmd, from Dinghao Liu. 7) Out of bound access in bpf_raw_tp_link_fill_link_info(), from Yonghong Song. 8) Missing of_node_put() in mt7530 DSA driver, from Sumera Priyadarsini. 9) Fix crash in bnxt_fw_reset_task(), from Michael Chan. 10) Fix geneve tunnel checksumming bug in hns3, from Yi Li. 11) Memory leak in rxkad_verify_response, from Dinghao Liu. 12) In tipc, don't use smp_processor_id() in preemptible context. From Tuong Lien. 13) Fix signedness issue in mlx4 memory allocation, from Shung-Hsi Yu. 14) Missing clk_disable_prepare() in gemini driver, from Dan Carpenter. 15) Fix ABI mismatch between driver and firmware in nfp, from Louis Peens. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (110 commits) net/smc: fix sock refcounting in case of termination net/smc: reset sndbuf_desc if freed net/smc: set rx_off for SMCR explicitly net/smc: fix toleration of fake add_link messages tg3: Fix soft lockup when tg3_reset_task() fails. doc: net: dsa: Fix typo in config code sample net: dp83867: Fix WoL SecureOn password nfp: flower: fix ABI mismatch between driver and firmware tipc: fix shutdown() of connectionless socket ipv6: Fix sysctl max for fib_multipath_hash_policy drivers/net/wan/hdlc: Change the default of hard_header_len to 0 net: gemini: Fix another missing clk_disable_unprepare() in probe net: bcmgenet: fix mask check in bcmgenet_validate_flow() amd-xgbe: Add support for new port mode net: usb: dm9601: Add USB ID of Keenetic Plus DSL vhost: fix typo in error message net: ethernet: mlx4: Fix memory allocation in mlx4_buddy_init() pktgen: fix error message with wrong function name net: ethernet: ti: am65-cpsw: fix rmii 100Mbit link mode cxgb4: fix thermal zone device registration ...
2020-09-02tipc: fix shutdown() of connectionless socketTetsuo Handa
syzbot is reporting hung task at nbd_ioctl() [1], for there are two problems regarding TIPC's connectionless socket's shutdown() operation. ---------- #include <fcntl.h> #include <sys/socket.h> #include <sys/ioctl.h> #include <linux/nbd.h> #include <unistd.h> int main(int argc, char *argv[]) { const int fd = open("/dev/nbd0", 3); alarm(5); ioctl(fd, NBD_SET_SOCK, socket(PF_TIPC, SOCK_DGRAM, 0)); ioctl(fd, NBD_DO_IT, 0); /* To be interrupted by SIGALRM. */ return 0; } ---------- One problem is that wait_for_completion() from flush_workqueue() from nbd_start_device_ioctl() from nbd_ioctl() cannot be completed when nbd_start_device_ioctl() received a signal at wait_event_interruptible(), for tipc_shutdown() from kernel_sock_shutdown(SHUT_RDWR) from nbd_mark_nsock_dead() from sock_shutdown() from nbd_start_device_ioctl() is failing to wake up a WQ thread sleeping at wait_woken() from tipc_wait_for_rcvmsg() from sock_recvmsg() from sock_xmit() from nbd_read_stat() from recv_work() scheduled by nbd_start_device() from nbd_start_device_ioctl(). Fix this problem by always invoking sk->sk_state_change() (like inet_shutdown() does) when tipc_shutdown() is called. The other problem is that tipc_wait_for_rcvmsg() cannot return when tipc_shutdown() is called, for tipc_shutdown() sets sk->sk_shutdown to SEND_SHUTDOWN (despite "how" is SHUT_RDWR) while tipc_wait_for_rcvmsg() needs sk->sk_shutdown set to RCV_SHUTDOWN or SHUTDOWN_MASK. Fix this problem by setting sk->sk_shutdown to SHUTDOWN_MASK (like inet_shutdown() does) when the socket is connectionless. [1] https://syzkaller.appspot.com/bug?id=3fe51d307c1f0a845485cf1798aa059d12bf18b2 Reported-by: syzbot <syzbot+e36f41d207137b5d12f7@syzkaller.appspotmail.com> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-31tipc: Remove unused macro TIPC_FWD_MSGYueHaibing
There is no caller in tree any more. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-23treewide: Use fallthrough pseudo-keywordGustavo A. R. Silva
Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-08-18net: tipc: Convert to use the preferred fallthrough macroMiaohe Lin
Convert the uses of fallthrough comments to fallthrough macro. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-24net: pass a sockptr_t into ->setsockoptChristoph Hellwig
Rework the remaining setsockopt code to pass a sockptr_t instead of a plain user pointer. This removes the last remaining set_fs(KERNEL_DS) outside of architecture specific code. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Stefan Schmidt <stefan@datenfreihafen.org> [ieee802154] Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-13net: tipc: kerneldoc fixesAndrew Lunn
Simple fixes which require no deep knowledge of the code. Cc: Jon Maloy <jmaloy@redhat.com> Cc: Ying Xue <ying.xue@windriver.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-11tipc: fix kernel WARNING in tipc_msg_append()Tuong Lien
syzbot found the following issue: WARNING: CPU: 0 PID: 6808 at include/linux/thread_info.h:150 check_copy_size include/linux/thread_info.h:150 [inline] WARNING: CPU: 0 PID: 6808 at include/linux/thread_info.h:150 copy_from_iter include/linux/uio.h:144 [inline] WARNING: CPU: 0 PID: 6808 at include/linux/thread_info.h:150 tipc_msg_append+0x49a/0x5e0 net/tipc/msg.c:242 Kernel panic - not syncing: panic_on_warn set ... This happens after commit 5e9eeccc58f3 ("tipc: fix NULL pointer dereference in streaming") that tried to build at least one buffer even when the message data length is zero... However, it now exposes another bug that the 'mss' can be zero and the 'cpy' will be negative, thus the above kernel WARNING will appear! The zero value of 'mss' is never expected because it means Nagle is not enabled for the socket (actually the socket type was 'SOCK_SEQPACKET'), so the function 'tipc_msg_append()' must not be called at all. But that was in this particular case since the message data length was zero, and the 'send <= maxnagle' check became true. We resolve the issue by explicitly checking if Nagle is enabled for the socket, i.e. 'maxnagle != 0' before calling the 'tipc_msg_append()'. We also reinforce the function to against such a negative values if any. Reported-by: syzbot+75139a7d2605236b0b7f@syzkaller.appspotmail.com Fixes: c0bceb97db9e ("tipc: add smart nagle feature") Acked-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-01tipc: Fix NULL pointer dereference in __tipc_sendstream()YueHaibing
tipc_sendstream() may send zero length packet, then tipc_msg_append() do not alloc skb, skb_peek_tail() will get NULL, msg_set_ack_required will trigger NULL pointer dereference. Reported-by: syzbot+8eac6d030e7807c21d32@syzkaller.appspotmail.com Fixes: 0a3e060f340d ("tipc: add test for Nagle algorithm effectiveness") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28tipc: call tsk_set_importance from tipc_topsrv_create_listenerChristoph Hellwig
Avoid using kernel_setsockopt for the TIPC_IMPORTANCE option when we can just use the internal helper. The only change needed is to pass a struct sock instead of tipc_sock, which is private to socket.c Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26tipc: add test for Nagle algorithm effectivenessTuong Lien
When streaming in Nagle mode, we try to bundle small messages from user as many as possible if there is one outstanding buffer, i.e. not ACK-ed by the receiving side, which helps boost up the overall throughput. So, the algorithm's effectiveness really depends on when Nagle ACK comes or what the specific network latency (RTT) is, compared to the user's message sending rate. In a bad case, the user's sending rate is low or the network latency is small, there will not be many bundles, so making a Nagle ACK or waiting for it is not meaningful. For example: a user sends its messages every 100ms and the RTT is 50ms, then for each messages, we require one Nagle ACK but then there is only one user message sent without any bundles. In a better case, even if we have a few bundles (e.g. the RTT = 300ms), but now the user sends messages in medium size, then there will not be any difference at all, that says 3 x 1000-byte data messages if bundled will still result in 3 bundles with MTU = 1500. When Nagle is ineffective, the delay in user message sending is clearly wasted instead of sending directly. Besides, adding Nagle ACKs will consume some processor load on both the sending and receiving sides. This commit adds a test on the effectiveness of the Nagle algorithm for an individual connection in the network on which it actually runs. Particularly, upon receipt of a Nagle ACK we will compare the number of bundles in the backlog queue to the number of user messages which would be sent directly without Nagle. If the ratio is good (e.g. >= 2), Nagle mode will be kept for further message sending. Otherwise, we will leave Nagle and put a 'penalty' on the connection, so it will have to spend more 'one-way' messages before being able to re-enter Nagle. In addition, the 'ack-required' bit is only set when really needed that the number of Nagle ACKs will be reduced during Nagle mode. Testing with benchmark showed that with the patch, there was not much difference in throughput for small messages since the tool continuously sends messages without a break, so Nagle would still take in effect. Acked-by: Ying Xue <ying.xue@windriver.com> Acked-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-13tipc: fix large latency in smart Nagle streamingTuong Lien
Currently when a connection is in Nagle mode, we set the 'ack_required' bit in the last sending buffer and wait for the corresponding ACK prior to pushing more data. However, on the receiving side, the ACK is issued only when application really reads the whole data. Even if part of the last buffer is received, we will not do the ACK as required. This might cause an unnecessary delay since the receiver does not always fetch the message as fast as the sender, resulting in a large latency in the user message sending, which is: [one RTT + the receiver processing time]. The commit makes Nagle ACK as soon as possible i.e. when a message with the 'ack_required' arrives in the receiving side's stack even before it is processed or put in the socket receive queue... This way, we can limit the streaming latency to one RTT as committed in Nagle mode. Acked-by: Ying Xue <ying.xue@windriver.com> Acked-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-26tipc: Add a missing case of TIPC_DIRECT_MSG typeHoang Le
In the commit f73b12812a3d ("tipc: improve throughput between nodes in netns"), we're missing a check to handle TIPC_DIRECT_MSG type, it's still using old sending mechanism for this message type. So, throughput improvement is not significant as expected. Besides that, when sending a large message with that type, we're also handle wrong receiving queue, it should be enqueued in socket receiving instead of multicast messages. Fix this by adding the missing case for TIPC_DIRECT_MSG. Fixes: f73b12812a3d ("tipc: improve throughput between nodes in netns") Reported-by: Tuong Lien <tuong.t.lien@dektech.com.au> Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au> Acked-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-10tipc: fix successful connect() but timed outTuong Lien
In commit 9546a0b7ce00 ("tipc: fix wrong connect() return code"), we fixed the issue with the 'connect()' that returns zero even though the connecting has failed by waiting for the connection to be 'ESTABLISHED' really. However, the approach has one drawback in conjunction with our 'lightweight' connection setup mechanism that the following scenario can happen: (server) (client) +- accept()| | wait_for_conn() | | |connect() -------+ | |<-------[SYN]---------| > sleeping | | *CONNECTING | |--------->*ESTABLISHED | | |--------[ACK]-------->*ESTABLISHED > wakeup() send()|--------[DATA]------->|\ > wakeup() send()|--------[DATA]------->| | > wakeup() . . . . |-> recvq . . . . . | . send()|--------[DATA]------->|/ > wakeup() close()|--------[FIN]-------->*DISCONNECTING | *DISCONNECTING | | | ~~~~~~~~~~~~~~~~~~> schedule() | wait again . . | ETIMEDOUT Upon the receipt of the server 'ACK', the client becomes 'ESTABLISHED' and the 'wait_for_conn()' process is woken up but not run. Meanwhile, the server starts to send a number of data following by a 'close()' shortly without waiting any response from the client, which then forces the client socket to be 'DISCONNECTING' immediately. When the wait process is switched to be running, it continues to wait until the timer expires because of the unexpected socket state. The client 'connect()' will finally get ‘-ETIMEDOUT’ and force to release the socket whereas there remains the messages in its receive queue. Obviously the issue would not happen if the server had some delay prior to its 'close()' (or the number of 'DATA' messages is large enough), but any kind of delay would make the connection setup/shutdown "heavy". We solve this by simply allowing the 'connect()' returns zero in this particular case. The socket is already 'DISCONNECTING', so any further write will get '-EPIPE' but the socket is still able to read the messages existing in its receive queue. Note: This solution doesn't break the previous one as it deals with a different situation that the socket state is 'DISCONNECTING' but has no error (i.e. sk->sk_err = 0). Fixes: 9546a0b7ce00 ("tipc: fix wrong connect() return code") Acked-by: Ying Xue <ying.xue@windriver.com> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-08tipc: fix wrong connect() return codeTuong Lien
The current 'tipc_wait_for_connect()' function does a wait-loop for the condition 'sk->sk_state != TIPC_CONNECTING' to conclude if the socket connecting has done. However, when the condition is met, it returns '0' even in the case the connecting is actually failed, the socket state is set to 'TIPC_DISCONNECTING' (e.g. when the server socket has closed..). This results in a wrong return code for the 'connect()' call from user, making it believe that the connection is established and go ahead with building, sending a message, etc. but finally failed e.g. '-EPIPE'. This commit fixes the issue by changing the wait condition to the 'tipc_sk_connected(sk)', so the function will return '0' only when the connection is really established. Otherwise, either the socket 'sk_err' if any or '-ETIMEDOUT'/'-EINTR' will be returned correspondingly. Acked-by: Ying Xue <ying.xue@windriver.com> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-08tipc: fix link overflow issue at socket shutdownTuong Lien
When a socket is suddenly shutdown or released, it will reject all the unreceived messages in its receive queue. This applies to a connected socket too, whereas there is only one 'FIN' message required to be sent back to its peer in this case. In case there are many messages in the queue and/or some connections with such messages are shutdown at the same time, the link layer will easily get overflowed at the 'TIPC_SYSTEM_IMPORTANCE' backlog level because of the message rejections. As a result, the link will be taken down. Moreover, immediately when the link is re-established, the socket layer can continue to reject the messages and the same issue happens... The commit refactors the '__tipc_shutdown()' function to only send one 'FIN' in the situation mentioned above. For the connectionless case, it is unavoidable but usually there is no rejections for such socket messages because they are 'dest-droppable' by default. In addition, the new code makes the other socket states clear (e.g.'TIPC_LISTEN') and treats as a separate case to avoid misbehaving. Acked-by: Ying Xue <ying.xue@windriver.com> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-10tipc: fix retrans failure due to wrong destinationTuong Lien
When a user message is sent, TIPC will check if the socket has faced a congestion at link layer. If that happens, it will make a sleep to wait for the congestion to disappear. This leaves a gap for other users to take over the socket (e.g. multi threads) since the socket is released as well. Also, in case of connectionless (e.g. SOCK_RDM), user is free to send messages to various destinations (e.g. via 'sendto()'), then the socket's preformatted header has to be updated correspondingly prior to the actual payload message building. Unfortunately, the latter action is done before the first action which causes a condition issue that the destination of a certain message can be modified incorrectly in the middle, leading to wrong destination when that message is built. Consequently, when the message is sent to the link layer, it gets stuck there forever because the peer node will simply reject it. After a number of retransmission attempts, the link is eventually taken down and the retransmission failure is reported. This commit fixes the problem by rearranging the order of actions to prevent the race condition from occurring, so the message building is 'atomic' and its header will not be modified by anyone. Fixes: 365ad353c256 ("tipc: reduce risk of user starvation during link congestion") Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-28tipc: fix duplicate SYN messages under link congestionTung Nguyen
Scenario: 1. A client socket initiates a SYN message to a listening socket. 2. The send link is congested, the SYN message is put in the send link and a wakeup message is put in wakeup queue. 3. The congestion situation is abated, the wakeup message is pulled out of the wakeup queue. Function tipc_sk_push_backlog() is called to send out delayed messages by Nagle. However, the client socket is still in CONNECTING state. So, it sends the SYN message in the socket write queue to the listening socket again. 4. The listening socket receives the first SYN message and creates first server socket. The client socket receives ACK- and establishes a connection to the first server socket. The client socket closes its connection with the first server socket. 5. The listening socket receives the second SYN message and creates second server socket. The second server socket sends ACK- to the client socket, but it has been closed. It results in connection reset error when reading from the server socket in user space. Solution: return from function tipc_sk_push_backlog() immediately if there is pending SYN message in the socket write queue. Fixes: c0bceb97db9e ("tipc: add smart nagle feature") Signed-off-by: Tung Nguyen <tung.q.nguyen@dektech.com.au> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-28tipc: fix wrong timeout input for tipc_wait_for_cond()Tung Nguyen
In function __tipc_shutdown(), the timeout value passed to tipc_wait_for_cond() is not jiffies. This commit fixes it by converting that value from milliseconds to jiffies. Fixes: 365ad353c256 ("tipc: reduce risk of user starvation during link congestion") Signed-off-by: Tung Nguyen <tung.q.nguyen@dektech.com.au> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-28tipc: fix wrong socket reference counter after tipc_sk_timeout() returnsTung Nguyen
When tipc_sk_timeout() is executed but user space is grabbing ownership, this function rearms itself and returns. However, the socket reference counter is not reduced. This causes potential unexpected behavior. This commit fixes it by calling sock_put() before tipc_sk_timeout() returns in the above-mentioned case. Fixes: afe8792fec69 ("tipc: refactor function tipc_sk_timeout()") Signed-off-by: Tung Nguyen <tung.q.nguyen@dektech.com.au> Acked-by: Ying Xue <ying.xue@windriver.com> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>