summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2014-09-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: arch/mips/net/bpf_jit.c drivers/net/can/flexcan.c Both the flexcan and MIPS bpf_jit conflicts were cases of simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) If the user gives us a msg_namelen of 0, don't try to interpret anything pointed to by msg_name. From Ani Sinha. 2) Fix some bnx2i/bnx2fc randconfig compilation errors. The gist of the issue is that we firstly have drivers that span both SCSI and networking. And at the top of that chain of dependencies we have things like SCSI_FC_ATTRS and SCSI_NETLINK which are selected. But since select is a sledgehammer and ignores dependencies, everything to select's SCSI_FC_ATTRS and/or SCSI_NETLINK has to also explicitly select their dependencies and so on and so forth. Generally speaking 'select' is supposed to only be used for child nodes, those which have no dependencies of their own. And this whole chain of dependencies in the scsi layer violates that rather strongly. So just make SCSI_NETLINK depend upon it's dependencies, and so on and so forth for the things selecting it (either directly or indirectly). From Anish Bhatt and Randy Dunlap. 3) Fix generation of blackhole routes in IPSEC, from Steffen Klassert. 4) Actually notice netdev feature changes in rtl_open() code, from Hayes Wang. 5) Fix divide by zero in bond enslaving, from Nikolay Aleksandrov. 6) Missing memory barrier in sunvnet driver, from David Stevens. 7) Don't leave anycast addresses around when ipv6 interface is destroyed, from Sabrina Dubroca. 8) Don't call efx_{arch}_filter_sync_rx_mode before addr_list_lock is initialized in SFC driver, from Edward Cree. 9) Fix missing DMA error checking in 3c59x, from Neal Horman. 10) Openvswitch doesn't emit OVS_FLOW_CMD_NEW notifications accidently, fix from Samuel Gauthier. 11) pch_gbe needs to select NET_PTP_CLASSIFY otherwise we can get a build error. 12) Fix macvlan regression wherein we stopped emitting broadcast/multicast frames over software devices. From Nicolas Dichtel. 13) Fix infiniband bug due to unintended overflow of skb->cb[], from Eric Dumazet. And add an assertion so this doesn't happen again. 14) dm9000_parse_dt() should return error pointers, not NULL. From Tobias Klauser. 15) IP tunneling code uses this_cpu_ptr() in preemptible contexts, fix from Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (87 commits) net: bcmgenet: call bcmgenet_dma_teardown in bcmgenet_fini_dma net: bcmgenet: fix TX reclaim accounting for fragments ipv4: do not use this_cpu_ptr() in preemptible context dm9000: Return an ERR_PTR() in all error conditions of dm9000_parse_dt() r8169: fix an if condition r8152: disable ALDPS ipoib: validate struct ipoib_cb size net: sched: shrink struct qdisc_skb_cb to 28 bytes tg3: Work around HW/FW limitations with vlan encapsulated frames macvlan: allow to enqueue broadcast pkt on virtual device pch_gbe: 'select' NET_PTP_CLASSIFY. scsi: Use 'depends' with LIBFC instead of 'select'. openvswitch: restore OVS_FLOW_CMD_NEW notifications genetlink: add function genl_has_listeners() lib: rhashtable: remove second linux/log2.h inclusion net: allow macvlans to move to net namespace 3c59x: Fix bad offset spec in skb_frag_dma_map 3c59x: Add dma error checking and recovery sparc: bpf_jit: fix support for ldx/stx mem and SKF_AD_VLAN_TAG can: at91_can: add missing prepare and unprepare of the clock ...
2014-09-22ipv4: do not use this_cpu_ptr() in preemptible contextEric Dumazet
this_cpu_ptr() in preemptible context is generally bad Sep 22 05:05:55 br kernel: [ 94.608310] BUG: using smp_processor_id() in preemptible [00000000] code: ip/2261 Sep 22 05:05:55 br kernel: [ 94.608316] caller is tunnel_dst_set.isra.28+0x20/0x60 [ip_tunnel] Sep 22 05:05:55 br kernel: [ 94.608319] CPU: 3 PID: 2261 Comm: ip Not tainted 3.17.0-rc5 #82 We can simply use raw_cpu_ptr(), as preemption is safe in these contexts. Should fix https://bugzilla.kernel.org/show_bug.cgi?id=84991 Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Joe <joe9mail@gmail.com> Fixes: 9a4aa9af447f ("ipv4: Use percpu Cache route in IP tunnels") Acked-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-22net: sched: fix compile warning in cls_u32Eric Dumazet
$ grep CONFIG_CLS_U32_MARK .config # CONFIG_CLS_U32_MARK is not set net/sched/cls_u32.c: In function 'u32_change': net/sched/cls_u32.c:852:1: warning: label 'errout' defined but not used [-Wunused-label] Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-22Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2014-09-22 We generate a blackhole or queueing route if a packet matches an IPsec policy but a state can't be resolved. Here we assume that dst_output() is called to kill these packets. Unfortunately this assumption is not true in all cases, so it is possible that these packets leave the system without the necessary transformations. This pull request contains two patches to fix this issue: 1) Fix for blackhole routed packets. 2) Fix for queue routed packets. Both patches are serious stable candidates. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-22tcp: avoid possible arithmetic overflowsEric Dumazet
icsk_rto is a 32bit field, and icsk_backoff can reach 15 by default, or more if some sysctl (eg tcp_retries2) are changed. Better use 64bit to perform icsk_rto << icsk_backoff operations As Joe Perches suggested, add a helper for this. Yuchung spotted the tcp_v4_err() case. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-22ipv6: mld: answer mldv2 queries with mldv1 reports in mldv1 fallbackDaniel Borkmann
RFC2710 (MLDv1), section 3.7. says: The length of a received MLD message is computed by taking the IPv6 Payload Length value and subtracting the length of any IPv6 extension headers present between the IPv6 header and the MLD message. If that length is greater than 24 octets, that indicates that there are other fields present *beyond* the fields described above, perhaps belonging to a *future backwards-compatible* version of MLD. An implementation of the version of MLD specified in this document *MUST NOT* send an MLD message longer than 24 octets and MUST ignore anything past the first 24 octets of a received MLD message. RFC3810 (MLDv2), section 8.2.1. states for *listeners* regarding presence of MLDv1 routers: In order to be compatible with MLDv1 routers, MLDv2 hosts MUST operate in version 1 compatibility mode. [...] When Host Compatibility Mode is MLDv2, a host acts using the MLDv2 protocol on that interface. When Host Compatibility Mode is MLDv1, a host acts in MLDv1 compatibility mode, using *only* the MLDv1 protocol, on that interface. [...] While section 8.3.1. specifies *router* behaviour regarding presence of MLDv1 routers: MLDv2 routers may be placed on a network where there is at least one MLDv1 router. The following requirements apply: If an MLDv1 router is present on the link, the Querier MUST use the *lowest* version of MLD present on the network. This must be administratively assured. Routers that desire to be compatible with MLDv1 MUST have a configuration option to act in MLDv1 mode; if an MLDv1 router is present on the link, the system administrator must explicitly configure all MLDv2 routers to act in MLDv1 mode. When in MLDv1 mode, the Querier MUST send periodic General Queries truncated at the Multicast Address field (i.e., 24 bytes long), and SHOULD also warn about receiving an MLDv2 Query (such warnings must be rate-limited). The Querier MUST also fill in the Maximum Response Delay in the Maximum Response Code field, i.e., the exponential algorithm described in section 5.1.3. is not used. [...] That means that we should not get queries from different versions of MLD. When there's a MLDv1 router present, MLDv2 enforces truncation and MRC == MRD (both fields are overlapping within the 24 octet range). Section 8.3.2. specifies behaviour in the presence of MLDv1 multicast address *listeners*: MLDv2 routers may be placed on a network where there are hosts that have not yet been upgraded to MLDv2. In order to be compatible with MLDv1 hosts, MLDv2 routers MUST operate in version 1 compatibility mode. MLDv2 routers keep a compatibility mode per multicast address record. The compatibility mode of a multicast address is determined from the Multicast Address Compatibility Mode variable, which can be in one of the two following states: MLDv1 or MLDv2. The Multicast Address Compatibility Mode of a multicast address record is set to MLDv1 whenever an MLDv1 Multicast Listener Report is *received* for that multicast address. At the same time, the Older Version Host Present timer for the multicast address is set to Older Version Host Present Timeout seconds. The timer is re-set whenever a new MLDv1 Report is received for that multicast address. If the Older Version Host Present timer expires, the router switches back to Multicast Address Compatibility Mode of MLDv2 for that multicast address. [...] That means, what can happen is the following scenario, that hosts can act in MLDv1 compatibility mode when they previously have received an MLDv1 query (or, simply operate in MLDv1 mode-only); and at the same time, an MLDv2 router could start up and transmits MLDv2 startup query messages while being unaware of the current operational mode. Given RFC2710, section 3.7 we would need to answer to that with an MLDv1 listener report, so that the router according to RFC3810, section 8.3.2. would receive that and internally switch to MLDv1 compatibility as well. Right now, I believe since the initial implementation of MLDv2, Linux hosts would just silently drop such MLDv2 queries instead of replying with an MLDv1 listener report, which would prevent a MLDv2 router going into fallback mode (until it receives other MLDv1 queries). Since the mapping of MRC to MRD in exactly such cases can make use of the exponential algorithm from 5.1.3, we cannot [strictly speaking] be aware in MLDv1 of the encoding in MRC, it seems also not mentioned by the RFC. Since encodings are the same up to 32767, assume in such a situation this value as a hard upper limit we would clamp. We have asked one of the RFC authors on that regard, and he mentioned that there seem not to be any implementations that make use of that exponential algorithm on startup messages. In any case, this patch fixes this MLD interoperability issue. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-22net: sched: cls_u32 changes to knode must appear atomic to readersJohn Fastabend
Changes to the cls_u32 classifier must appear atomic to the readers. Before this patch if a change is requested for both the exts and ifindex, first the ifindex is updated then the exts with tcf_exts_change(). This opens a small window where a reader can have a exts chain with an incorrect ifindex. This violates the the RCU semantics. Here we resolve this by always passing u32_set_parms() a copy of the tc_u_knode to work on and then inserting it into the hash table after the updates have been successfully applied. Tested with the following short script: #tc filter add dev p3p2 parent 8001:0 protocol ip prio 99 handle 1: \ u32 divisor 256 #tc filter add dev p3p2 parent 8001:0 protocol ip prio 99 \ u32 link 1: hashkey mask ffffff00 at 12 \ match ip src 192.168.8.0/2 #tc filter add dev p3p2 parent 8001:0 protocol ip prio 102 \ handle 1::10 u32 classid 1:2 ht 1: \ match ip src 192.168.8.0/8 match ip tos 0x0a 1e #tc filter change dev p3p2 parent 8001:0 protocol ip prio 102 \ handle 1::10 u32 classid 1:2 ht 1: \ match ip src 1.1.0.0/8 match ip tos 0x0b 1e CC: Eric Dumazet <edumazet@google.com> CC: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-22net: cls_u32: fix missed pcpu_success free_percpuJohn Fastabend
This fixes a missed free_percpu in the unwind code path and when keys are destroyed. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-22udp: Need to make ip6_udp_tunnel.c have GPL licenseTom Herbert
Unable to load various tunneling modules without this: [ 80.679049] fou: Unknown symbol udp_sock_create6 (err 0) [ 91.439939] ip6_udp_tunnel: Unknown symbol ip6_local_out (err 0) [ 91.439954] ip6_udp_tunnel: Unknown symbol __put_net (err 0) [ 91.457792] vxlan: Unknown symbol udp_sock_create6 (err 0) [ 91.457831] vxlan: Unknown symbol udp_tunnel6_xmit_skb (err 0) Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-22net: keep original skb which only needs header checking during software GSOJason Wang
Commit ce93718fb7cdbc064c3000ff59e4d3200bdfa744 ("net: Don't keep around original SKB when we software segment GSO frames") frees the original skb after software GSO even for dodgy gso skbs. This breaks the stream throughput from untrusted sources, since only header checking was done during software GSO instead of a true segmentation. This patch fixes this by freeing the original gso skb only when it was really segmented by software. Fixes ce93718fb7cdbc064c3000ff59e4d3200bdfa744 ("net: Don't keep around original SKB when we software segment GSO frames.") Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-22net: dsa: add {get, set}_wol callbacks to slave devicesFlorian Fainelli
Allow switch drivers to implement per-port Wake-on-LAN getter and setters. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-22net: dsa: allow switch drivers to implement suspend/resume hooksFlorian Fainelli
Add an abstraction layer to suspend/resume switch devices, doing the following split: - suspend/resume the slave network devices and their corresponding PHY devices - suspend/resume the switch hardware using switch driver callbacks Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-22net: sched: shrink struct qdisc_skb_cb to 28 bytesEric Dumazet
We cannot make struct qdisc_skb_cb bigger without impacting IPoIB, or increasing skb->cb[] size. Commit e0f31d849867 ("flow_keys: Record IP layer protocol in skb_flow_dissect()") broke IPoIB. Only current offender is sch_choke, and this one do not need an absolutely precise flow key. If we store 17 bytes of flow key, its more than enough. (Its the actual size of flow_keys if it was a packed structure, but we might add new fields at the end of it later) Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: e0f31d849867 ("flow_keys: Record IP layer protocol in skb_flow_dissect()") Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-19udp_tunnel: Only build ip6_udp_tunnel.c when IPV6 is selectedAndy Zhou
Functions supplied in ip6_udp_tunnel.c are only needed when IPV6 is selected. When IPV6 is not selected, those functions are stubbed out in udp_tunnel.h. ================================================================== net/ipv6/ip6_udp_tunnel.c:15:5: error: redefinition of 'udp_sock_create6' int udp_sock_create6(struct net *net, struct udp_port_cfg *cfg, In file included from net/ipv6/ip6_udp_tunnel.c:9:0: include/net/udp_tunnel.h:36:19: note: previous definition of 'udp_sock_create6' was here static inline int udp_sock_create6(struct net *net, struct udp_port_cfg *cfg, ================================================================== Fixes: fd384412e udp_tunnel: Seperate ipv6 functions into its own file Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Andy Zhou <azhou@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-19openvswitch: restore OVS_FLOW_CMD_NEW notificationsSamuel Gauthier
Since commit fb5d1e9e127a ("openvswitch: Build flow cmd netlink reply only if needed."), the new flows are not notified to the listeners of OVS_FLOW_MCGROUP. This commit fixes the problem by using the genl function, ie genl_has_listerners() instead of netlink_has_listeners(). Signed-off-by: Samuel Gauthier <samuel.gauthier@6wind.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-19gre: Setup and TX path for gre/UDP foo-over-udp encapsulationTom Herbert
Added netlink attrs to configure FOU encapsulation for GRE, netlink handling of these flags, and properly adjust MTU for encapsulation. ip_tunnel_encap is called from ip_tunnel_xmit to actually perform FOU encapsulation. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-19ipip: Setup and TX path for ipip/UDP foo-over-udp encapsulationTom Herbert
Add netlink handling for IP tunnel encapsulation parameters and and adjustment of MTU for encapsulation. ip_tunnel_encap is called from ip_tunnel_xmit to actually perform FOU encapsulation. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-19sit: Setup and TX path for sit/UDP foo-over-udp encapsulationTom Herbert
Added netlink handling of IP tunnel encapulation paramters, properly adjust MTU for encapsulation. Added ip_tunnel_encap call to ipip6_tunnel_xmit to actually perform FOU encapsulation. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-19net: Changes to ip_tunnel to support foo-over-udp encapsulationTom Herbert
This patch changes IP tunnel to support (secondary) encapsulation, Foo-over-UDP. Changes include: 1) Adding tun_hlen as the tunnel header length, encap_hlen as the encapsulation header length, and hlen becomes the grand total of these. 2) Added common netlink define to support FOU encapsulation. 3) Routines to perform FOU encapsulation. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-19fou: Add GRO supportTom Herbert
Implement fou_gro_receive and fou_gro_complete, and populate these in the correponsing udp_offloads for the socket. Added ipproto to udp_offloads and pass this from UDP to the fou GRO routine in proto field of napi_gro_cb structure. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-19fou: Support for foo-over-udp RX pathTom Herbert
This patch provides a receive path for foo-over-udp. This allows direct encapsulation of IP protocols over UDP. The bound destination port is used to map to an IP protocol, and the XFRM framework (udp_encap_rcv) is used to receive encapsulated packets. Upon reception, the encapsulation header is logically removed (pointer to transport header is advanced) and the packet is reinjected into the receive path with the IP protocol indicated by the mapping. Netlink is used to configure FOU ports. The configuration information includes the port number to bind to and the IP protocol corresponding to that port. This should support GRE/UDP (http://tools.ietf.org/html/draft-yong-tsvwg-gre-in-udp-encap-02), as will as the other IP tunneling protocols (IPIP, SIT). Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-19net: Export inet_offloads and inet6_offloadsTom Herbert
Want to be able to use these in foo-over-udp offloads, etc. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-19net: sched: cls_u32: rcu can not be last nodeJohn Fastabend
tc_u32_sel 'sel' in tc_u_knode expects to be the last element in the structure and pads the structure with tc_u32_key fields for each key. kzalloc(sizeof(*n) + s->nkeys*sizeof(struct tc_u32_key), GFP_KERNEL) CC: Eric Dumazet <edumazet@google.com> Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-19Merge tag 'master-2014-09-16' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless John W. Linville says: ==================== pull request: wireless 2014-09-17 Please pull this batch of fixes intended for the 3.17 stream... Arend van Spriel sends a trio of minor brcmfmac fixes, including a fix for a Kconfig/build issue, a fix for a crash (null reference), and a regression fix related to event handling on a P2P interface. Hante Meuleman follows-up with a brcmfmac fix for a memory leak. Johannes Stezenbach brings an ath9k_htc fix for a regression related to hardware decryption offload. Marcel Holtmann delivers a one-liner to properly mark a device ID table in rfkill-gpio. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-19net: sched: use __skb_queue_head_init() where applicableEric Dumazet
pfifo_fast and htb use skb lists, without needing their spinlocks. (They instead use the standard qdisc lock) We can use __skb_queue_head_init() instead of skb_queue_head_init() to be consistent. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-19net: dsa: allow switch drivers to specify phy_device::dev_flagsFlorian Fainelli
Some switch drivers (e.g: bcm_sf2) may have to communicate specific workarounds or flags towards the PHY device driver. Allow switches driver to be delegated that task by introducing a get_phy_flags() callback which will do just that. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-19net: add alloc_skb_with_frags() helperEric Dumazet
Extract from sock_alloc_send_pskb() code building skb with frags, so that we can reuse this in other contexts. Intent is to use it from tcp_send_rcvq(), tcp_collapse(), ... We also want to replace some skb_linearize() calls to a more reliable strategy in pathological cases where we need to reduce number of frags. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-19tcp: do not fake tcp headers in tcp_send_rcvq()Eric Dumazet
Now we no longer rely on having tcp headers for skbs in receive queue, tcp repair do not need to build fake ones. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-19l2tp: Refactor l2tp core driver to make use of the common UDP tunnel functionsAndy Zhou
Simplify l2tp implementation using common UDP tunnel APIs. Signed-off-by: Andy Zhou <azhou@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-19udp-tunnel: Add a few more UDP tunnel APIsAndy Zhou
Added a few more UDP tunnel APIs that can be shared by UDP based tunnel protocol implementation. The main ones are highlighted below. setup_udp_tunnel_sock() configures UDP listener socket for receiving UDP encapsulated packets. udp_tunnel_xmit_skb() and upd_tunnel6_xmit_skb() transmit skb using UDP encapsulation. udp_tunnel_sock_release() closes the UDP tunnel listener socket. Signed-off-by: Andy Zhou <azhou@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-19udp_tunnel: Seperate ipv6 functions into its own file.Andy Zhou
Add ip6_udp_tunnel.c for ipv6 UDP tunnel functions to avoid ifdefs in udp_tunnel.c Signed-off-by: Andy Zhou <azhou@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-16net: sched: cls_cgroup need tcf_exts_init in all casesJohn Fastabend
This ensures the tcf_exts_init() is called for all cases. Fixes: 952313bd62589cae216a57 ("net: sched: cls_cgroup use RCU") Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-16Merge branch 'net_next_ovs' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/pshelar/openvswitch Pravin B Shelar says: ==================== Open vSwitch Following patches adds recirculation and hash action to OVS. First patch removes pointer to stack object. Next three patches does code restructuring which is required for last patch. Recirculation implementation is changed, according to comments from David Miller, to avoid using recursive calls in OVS. It is using queue to record recirc action and deferred recirc is executed at the end of current actions execution. v1-v2: Changed subsystem name in subject to openvswitch v2-v3: Added patch to remove pkt_key pointer from skb->cb. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-16net: rfkill: gpio: Enable module auto-loading for ACPI based switchesMarcel Holtmann
For the ACPI based switches the MODULE_DEVICE_TABLE is missing to export the entries for module auto-loading. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2014-09-16net: sched: cls_fw: add missing tcf_exts_init call in fw_change()John Fastabend
When allocating a new structure we also need to call tcf_exts_init to initialize exts. A follow up patch might be in order to remove some of this code and do tcf_exts_assign(). With this we could remove the tcf_exts_init/tcf_exts_change pattern for some of the classifiers. As part of the future tcf_actions RCU series this will need to be done. For now fix the call here. Fixes e35a8ee5993ba81fd6c0 ("net: sched: fw use RCU") Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-16net: sched: cls_cgroup fix possible memory leak of 'new'John Fastabend
tree: git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git master head: 54996b529ab70ca1d6f40677cd2698c4f7127e87 commit: c7953ef23042b7c4fc2be5ecdd216aacff6df5eb [625/646] net: sched: cls_cgroup use RCU net/sched/cls_cgroup.c:130 cls_cgroup_change() warn: possible memory leak of 'new' net/sched/cls_cgroup.c:135 cls_cgroup_change() warn: possible memory leak of 'new' net/sched/cls_cgroup.c:139 cls_cgroup_change() warn: possible memory leak of 'new' Fixes: c7953ef23042b7c4fc2be5ecdd216aac ("net: sched: cls_cgroup use RCU") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-16net: sched: cls_u32 add missing rcu_assign_pointer and annotationJohn Fastabend
Add missing rcu_assign_pointer and missing annotation for ht_up in cls_u32.c Caught by kbuild bot, >> net/sched/cls_u32.c:378:36: sparse: incorrect type in initializer (different address spaces) net/sched/cls_u32.c:378:36: expected struct tc_u_hnode *ht net/sched/cls_u32.c:378:36: got struct tc_u_hnode [noderef] <asn:4>*ht_up >> net/sched/cls_u32.c:610:54: sparse: incorrect type in argument 4 (different address spaces) net/sched/cls_u32.c:610:54: expected struct tc_u_hnode *ht net/sched/cls_u32.c:610:54: got struct tc_u_hnode [noderef] <asn:4>*ht_up >> net/sched/cls_u32.c:684:18: sparse: incorrect type in assignment (different address spaces) net/sched/cls_u32.c:684:18: expected struct tc_u_hnode [noderef] <asn:4>*ht_up net/sched/cls_u32.c:684:18: got struct tc_u_hnode *[assigned] ht >> net/sched/cls_u32.c:359:18: sparse: dereference of noderef expression Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-16net: sched: fix unsued cpu variableJohn Fastabend
kbuild test robot reported an unused variable cpu in cls_u32.c after the patch below. This happens when PERF and MARK config variables are disabled Fix this is to use separate variables for perf and mark and define the cpu variable inside the ifdef logic. Fixes: 459d5f626da7 ("net: sched: make cls_u32 per cpu")' Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-16net_sched: fix a null pointer dereference in tcindex_set_parms()WANG Cong
This patch fixes the following crash: [ 42.199159] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 [ 42.200027] IP: [<ffffffff817e3fc4>] tcindex_set_parms+0x45c/0x526 [ 42.200027] PGD d2319067 PUD d4ffe067 PMD 0 [ 42.200027] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC [ 42.200027] CPU: 0 PID: 541 Comm: tc Not tainted 3.17.0-rc4+ #603 [ 42.200027] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 42.200027] task: ffff8800d22d2670 ti: ffff8800ce790000 task.ti: ffff8800ce790000 [ 42.200027] RIP: 0010:[<ffffffff817e3fc4>] [<ffffffff817e3fc4>] tcindex_set_parms+0x45c/0x526 [ 42.200027] RSP: 0018:ffff8800ce793898 EFLAGS: 00010202 [ 42.200027] RAX: 0000000000000001 RBX: ffff8800d1786498 RCX: 0000000000000000 [ 42.200027] RDX: ffffffff82114ec8 RSI: ffffffff82114ec8 RDI: ffffffff82114ec8 [ 42.200027] RBP: ffff8800ce793958 R08: 00000000000080d0 R09: 0000000000000001 [ 42.200027] R10: ffff8800ce7939a0 R11: 0000000000000246 R12: ffff8800d017d238 [ 42.200027] R13: 0000000000000018 R14: ffff8800d017c6a0 R15: ffff8800d1786620 [ 42.200027] FS: 00007f4e24539740(0000) GS:ffff88011a600000(0000) knlGS:0000000000000000 [ 42.200027] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 42.200027] CR2: 0000000000000018 CR3: 00000000cff38000 CR4: 00000000000006f0 [ 42.200027] Stack: [ 42.200027] ffff8800ce0949f0 0000000000000000 0000000200000003 ffff880000000000 [ 42.200027] ffff8800ce7938b8 ffff8800ce7938b8 0000000600000007 0000000000000000 [ 42.200027] ffff8800ce7938d8 ffff8800ce7938d8 0000000600000007 ffff8800ce0949f0 [ 42.200027] Call Trace: [ 42.200027] [<ffffffff817e4169>] tcindex_change+0xdb/0xee [ 42.200027] [<ffffffff817c16ca>] tc_ctl_tfilter+0x44d/0x63f [ 42.200027] [<ffffffff8179d161>] rtnetlink_rcv_msg+0x181/0x194 [ 42.200027] [<ffffffff8179cf9d>] ? rtnl_lock+0x17/0x19 [ 42.200027] [<ffffffff8179cfe0>] ? __rtnl_unlock+0x17/0x17 [ 42.200027] [<ffffffff817ee296>] netlink_rcv_skb+0x49/0x8b [ 43.462494] [<ffffffff8179cfc2>] rtnetlink_rcv+0x23/0x2a [ 43.462494] [<ffffffff817ec8df>] netlink_unicast+0xc7/0x148 [ 43.462494] [<ffffffff817ed413>] netlink_sendmsg+0x5cb/0x63d [ 43.462494] [<ffffffff810ad781>] ? mark_lock+0x2e/0x224 [ 43.462494] [<ffffffff817757b8>] __sock_sendmsg_nosec+0x25/0x27 [ 43.462494] [<ffffffff81778165>] sock_sendmsg+0x57/0x71 [ 43.462494] [<ffffffff81152bbd>] ? might_fault+0x57/0xa4 [ 43.462494] [<ffffffff81152c06>] ? might_fault+0xa0/0xa4 [ 43.462494] [<ffffffff81152bbd>] ? might_fault+0x57/0xa4 [ 43.462494] [<ffffffff817838fd>] ? verify_iovec+0x69/0xb7 [ 43.462494] [<ffffffff817784f8>] ___sys_sendmsg+0x21d/0x2bb [ 43.462494] [<ffffffff81009db3>] ? native_sched_clock+0x35/0x37 [ 43.462494] [<ffffffff8109ab53>] ? sched_clock_local+0x12/0x72 [ 43.462494] [<ffffffff810ad781>] ? mark_lock+0x2e/0x224 [ 43.462494] [<ffffffff8109ada4>] ? sched_clock_cpu+0xa0/0xb9 [ 43.462494] [<ffffffff810aee37>] ? __lock_acquire+0x5fe/0xde4 [ 43.462494] [<ffffffff8119f570>] ? rcu_read_lock_held+0x36/0x38 [ 43.462494] [<ffffffff8119f75a>] ? __fcheck_files.isra.7+0x4b/0x57 [ 43.462494] [<ffffffff8119fbf2>] ? __fget_light+0x30/0x54 [ 43.462494] [<ffffffff81779012>] __sys_sendmsg+0x42/0x60 [ 43.462494] [<ffffffff81779042>] SyS_sendmsg+0x12/0x1c [ 43.462494] [<ffffffff819d24d2>] system_call_fastpath+0x16/0x1b 'p->h' could be NULL while 'cp->h' is always update to date. Fixes: commit 331b72922c5f58d48fd ("net: sched: RCU cls_tcindex") Cc: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-By: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-16net_sched: fix memory leak in cls_tcindexWANG Cong
Fixes: commit 331b72922c5f58d48fd ("net: sched: RCU cls_tcindex") Cc: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-By: John Fastabend <john.r.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-16xfrm: Generate queueing routes only from route lookup functionsSteffen Klassert
Currently we genarate a queueing route if we have matching policies but can not resolve the states and the sysctl xfrm_larval_drop is disabled. Here we assume that dst_output() is called to kill the queued packets. Unfortunately this assumption is not true in all cases, so it is possible that these packets leave the system unwanted. We fix this by generating queueing routes only from the route lookup functions, here we can guarantee a call to dst_output() afterwards. Fixes: a0073fe18e71 ("xfrm: Add a state resolution packet queue") Reported-by: Konstantinos Kolelis <k.kolelis@sirrix.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-09-16xfrm: Generate blackhole routes only from route lookup functionsSteffen Klassert
Currently we genarate a blackhole route route whenever we have matching policies but can not resolve the states. Here we assume that dst_output() is called to kill the balckholed packets. Unfortunately this assumption is not true in all cases, so it is possible that these packets leave the system unwanted. We fix this by generating blackhole routes only from the route lookup functions, here we can guarantee a call to dst_output() afterwards. Fixes: 2774c131b1d ("xfrm: Handle blackhole route creation via afinfo.") Reported-by: Konstantinos Kolelis <k.kolelis@sirrix.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-09-15openvswitch: Add recirc and hash action.Andy Zhou
Recirc action allows a packet to reenter openvswitch processing. currently openvswitch lookup flow for packet received and execute set of actions on that packet, with help of recirc action we can process/modify the packet and recirculate it back in openvswitch for another pass. OVS hash action calculates 5-tupple hash and set hash in flow-key hash. This can be used along with recirculation for distributing packets among different ports for bond devices. For example: OVS bonding can use following actions: Match on: bond flow; Action: hash, recirc(id) Match on: recirc-id == id and hash lower bits == a; Action: output port_bond_a Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
2014-09-15openvswitch: simplify sample action implementationAndy Zhou
The current sample() function implementation is more complicated than necessary in handling single user space action optimization and skb reference counting. There is no functional changes. Signed-off-by: Andy Zhou <azhou@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
2014-09-15openvswitch: Use tun_key only for egress tunnel path.Pravin B Shelar
Currently tun_key is used for passing tunnel information on ingress and egress path, this cause confusion. Following patch removes its use on ingress path make it egress only parameter. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com>
2014-09-15openvswitch: refactor ovs flow extract API.Pravin B Shelar
OVS flow extract is called on packet receive or packet execute code path. Following patch defines separate API for extracting flow-key in packet execute code path. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com>
2014-09-15openvswitch: Remove pkt_key from OVS_CBPravin B Shelar
OVS keeps pointer to packet key in skb->cb, but the packet key is store on stack. This could make code bit tricky. So it is better to get rid of the pointer. Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
2014-09-15net: dsa: fix mii_bus to host_dev replacementFlorian Fainelli
dsa_of_probe() still used cd->mii_bus instead of cd->host_dev when building with CONFIG_OF=y. Fix this by making the replacement here as well. Fixes: b4d2394d01b ("dsa: Replace mii_bus with a generic host device") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-15net_sched: use tcindex_filter_result_init()WANG Cong
Fixes: commit 331b72922c5f58d48fd ("net: sched: RCU cls_tcindex") Cc: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>