author | Vasudev Kamath <vasudev@copyninja.info> | 2021-11-16 12:51:48 +0530 |
---|---|---|
committer | David S. Miller <davem@davemloft.net> | 2021-11-17 13:59:49 +0000 |
commit | 738baea4970b36580cb1dd4f9b3fd5247aa1c7f5 (patch) | |
tree | ba6e1d2bec80b2c3683bd280ac7a9ed12e0bd31c | |
parent | 2b425ef8c16ce523709e1d7c0a4c8c2e02eb441e (diff) |
Documentation: networking: net_failover: Fix documentation
Update net_failover documentation with missing and incomplete
details to get a proper working setup.
Signed-off-by: Vasudev Kamath <vasudev@copyninja.info>
Reviewed-by: Krishna Kumar <krikku@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-rw-r--r-- | Documentation/networking/net_failover.rst | 111 |
1 files changed, 88 insertions, 23 deletions
diff --git a/Documentation/networking/net_failover.rst b/Documentation/networking/net_failover.rst
index e143ab79a960..3a662f2b4d6e 100644
--- a/Documentation/networking/net_failover.rst
+++ b/Documentation/networking/net_failover.rst
@@ -35,7 +35,7 @@ To support this, the hypervisor needs to enable VIRTIO_NET_F_STANDBY
 feature on the virtio-net interface and assign the same MAC address to both
 virtio-net and VF interfaces.
 
-Here is an example XML snippet that shows such configuration.
+Here is an example libvirt XML snippet that shows such configuration:
 ::
 
   <interface type='network'>
@@ -45,18 +45,32 @@ Here is an example XML snippet that shows such configuration.
     <model type='virtio'/>
     <driver name='vhost' queues='4'/>
     <link state='down'/>
-    <address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/>
+    <teaming type='persistent'/>
+    <alias name='ua-backup0'/>
   </interface>
   <interface type='hostdev' managed='yes'>
     <mac address='52:54:00:00:12:53'/>
     <source>
       <address type='pci' domain='0x0000' bus='0x42' slot='0x02' function='0x5'/>
     </source>
-    <address type='pci' domain='0x0000' bus='0x00' slot='0x0b' function='0x0'/>
+    <teaming type='transient' persistent='ua-backup0'/>
   </interface>
 
+In this configuration, the first device definition is for the virtio-net
+interface and this acts as the 'persistent' device indicating that this
+interface will always be plugged in. This is specified by the 'teaming' tag with
+required attribute type having value 'persistent'. The link state for the
+virtio-net device is set to 'down' to ensure that the 'failover' netdev prefers
+the VF passthrough device for normal communication. The virtio-net device will
+be brought UP during live migration to allow uninterrupted communication.
+
+The second device definition is for the VF passthrough interface. Here the
+'teaming' tag is provided with type 'transient' indicating that this device may
+periodically be unplugged. A second attribute - 'persistent' is provided and
+points to the alias name declared for the virtio-net device.
+
 Booting a VM with the above configuration will result in the following 3
-netdevs created in the VM.
+interfaces created in the VM:
 ::
 
   4: ens10: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
@@ -65,13 +79,36 @@ netdevs created in the VM.
          valid_lft 42482sec preferred_lft 42482sec
       inet6 fe80::97d8:db2:8c10:b6d6/64 scope link
          valid_lft forever preferred_lft forever
-  5: ens10nsby: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master ens10 state UP group default qlen 1000
+  5: ens10nsby: <BROADCAST,MULTICAST> mtu 1500 qdisc fq_codel master ens10 state DOWN group default qlen 1000
       link/ether 52:54:00:00:12:53 brd ff:ff:ff:ff:ff:ff
   7: ens11: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ens10 state UP group default qlen 1000
       link/ether 52:54:00:00:12:53 brd ff:ff:ff:ff:ff:ff
 
-ens10 is the 'failover' master netdev, ens10nsby and ens11 are the slave
-'standby' and 'primary' netdevs respectively.
+Here, ens10 is the 'failover' master interface, ens10nsby is the slave 'standby'
+virtio-net interface, and ens11 is the slave 'primary' VF passthrough interface.
+
+One point to note here is that some user space network configuration daemons
+like systemd-networkd, ifupdown, etc, do not understand the 'net_failover'
+device; and on the first boot, the VM might end up with both 'failover' device
+and VF accquiring IP addresses (either same or different) from the DHCP server.
+This will result in lack of connectivity to the VM. So some tweaks might be
+needed to these network configuration daemons to make sure that an IP is
+received only on the 'failover' device.
+
+Below is the patch snippet used with 'cloud-ifupdown-helper' script found on
+Debian cloud images:
+
+::
+  @@ -27,6 +27,8 @@ do_setup() {
+       local working="$cfgdir/.$INTERFACE"
+       local final="$cfgdir/$INTERFACE"
+
+  +    if [ -d "/sys/class/net/${INTERFACE}/master" ]; then exit 0; fi
+  +
+       if ifup --no-act "$INTERFACE" > /dev/null 2>&1; then
+           # interface is already known to ifupdown, no need to generate cfg
+           log "Skipping configuration generation for $INTERFACE"
+
 
 Live Migration of a VM with SR-IOV VF & virtio-net in STANDBY mode
 ==================================================================
@@ -80,40 +117,68 @@ net_failover also enables hypervisor controlled live migration to be supported
 with VMs that have direct attached SR-IOV VF devices by automatic failover to
 the paravirtual datapath when the VF is unplugged.
 
-Here is a sample script that shows the steps to initiate live migration on
-the source hypervisor.
+Here is a sample script that shows the steps to initiate live migration from
+the source hypervisor. Note: It is assumed that the VM is connected to a
+software bridge 'br0' which has a single VF attached to it along with the vnet
+device to the VM. This is not the VF that was passthrough'd to the VM (seen in
+the vf.xml file).
 ::
 
-  # cat vf_xml
+  # cat vf.xml
   <interface type='hostdev' managed='yes'>
     <mac address='52:54:00:00:12:53'/>
     <source>
       <address type='pci' domain='0x0000' bus='0x42' slot='0x02' function='0x5'/>
     </source>
-    <address type='pci' domain='0x0000' bus='0x00' slot='0x0b' function='0x0'/>
+    <teaming type='transient' persistent='ua-backup0'/>
   </interface>
 
-  # Source Hypervisor
+  # Source Hypervisor migrate.sh
   #!/bin/bash
 
-  DOMAIN=fedora27-tap01
-  PF=enp66s0f0
-  VF_NUM=5
-  TAP_IF=tap01
-  VF_XML=
+  DOMAIN=vm-01
+  PF=ens6np0
+  VF=ens6v1             # VF attached to the bridge.
+  VF_NUM=1
+  TAP_IF=vmtap01        # virtio-net interface in the VM.
+  VF_XML=vf.xml
 
   MAC=52:54:00:00:12:53
   ZERO_MAC=00:00:00:00:00:00
 
+  # Set the virtio-net interface up.
   virsh domif-setlink $DOMAIN $TAP_IF up
-  bridge fdb del $MAC dev $PF master
-  virsh detach-device $DOMAIN $VF_XML
+
+  # Remove the VF that was passthrough'd to the VM.
+  virsh detach-device --live --config $DOMAIN $VF_XML
+
   ip link set $PF vf $VF_NUM mac $ZERO_MAC
 
-  virsh migrate --live $DOMAIN qemu+ssh://$REMOTE_HOST/system
+  # Add FDB entry for traffic to continue going to the VM via
+  # the VF -> br0 -> vnet interface path.
+  bridge fdb add $MAC dev $VF
+  bridge fdb add $MAC dev $TAP_IF master
+
+  # Migrate the VM
+  virsh migrate --live --persistent $DOMAIN qemu+ssh://$REMOTE_HOST/system
+
+  # Clean up FDB entries after migration completes.
+  bridge fdb del $MAC dev $VF
+  bridge fdb del $MAC dev $TAP_IF master
 
-  # Destination Hypervisor
+On the destination hypervisor, a shared bridge 'br0' is created before migration
+starts, and a VF from the destination PF is added to the bridge. Similarly an
+appropriate FDB entry is added.
+
+The following script is executed on the destination hypervisor once migration
+completes, and it reattaches the VF to the VM and brings down the virtio-net
+interface.
+
+::
+  # reattach-vf.sh
   #!/bin/bash
 
-  virsh attach-device $DOMAIN $VF_XML
-  virsh domif-setlink $DOMAIN $TAP_IF down
+  bridge fdb del 52:54:00:00:12:53 dev ens36v0
+  bridge fdb del 52:54:00:00:12:53 dev vmtap01 master
+  virsh attach-device --config --live vm01 vf.xml
+  virsh domif-setlink vm01 vmtap01 down
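
The 'cloud-ifupdown-helper' change in the patch skips configuration for any interface that has a 'master' entry in sysfs, so in the example guest the slaves ens10nsby and ens11 would be skipped and only ens10 would be configured. The same test can be sketched as a small standalone shell helper. This is only an illustration, not part of the patch: the function name is_enslaved and its optional second argument (an alternative sysfs directory, useful for trying it out without real devices) are hypothetical.

```shell
#!/bin/sh
# Sketch only: is_enslaved and its optional sysfs-root argument are
# hypothetical names introduced for illustration; they are not part of
# cloud-ifupdown-helper or the kernel documentation.

# Return 0 if interface $1 has a master device (for example, it is a slave
# of the 'failover' netdev), and 1 otherwise.
# $2 optionally overrides the sysfs network class directory.
is_enslaved() {
    iface=$1
    net_root=${2:-/sys/class/net}
    # Same test as in the patch: the 'master' entry resolves to the
    # master's device directory when the interface is enslaved.
    [ -d "$net_root/$iface/master" ]
}

# Possible use inside a DHCP/ifup hook, mirroring the patch:
#   if is_enslaved "$INTERFACE"; then exit 0; fi
```

Keeping the check in a function with an overridable root makes it possible to exercise the logic against a scratch directory before relying on it in a boot-time hook.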