Virtualization: Deploying and Debugging OVS-DPDK on CentOS 7

OVS-DPDK Deployment

Environment: CentOS 7.9 x86_64 physical machine, Mellanox ConnectX-5 NIC
Initialize the system environment

yum makecache
yum -y update
yum install -y epel-release
yum install -y net-tools tcpdump telnet wget zip unzip vim
yum install -y gcc gcc-c++ kernel-devel kernel-headers kernel.x86_64 net-tools
yum install -y numactl-devel.x86_64 numactl-libs.x86_64
yum install -y libpcap.x86_64 libpcap-devel.x86_64 libcap-ng-devel
yum install -y pciutils
yum install -y autoconf automake libtool

# reboot (the kernel may have been upgraded, so a reboot is recommended)

Install the Mellanox OFED driver (skip this step for non-Mellanox NICs)

yum install tcl tk
/mnt/mlnxofedinstall --dpdk --upstream-libs    # run from the mounted MLNX_OFED ISO
cat /proc/cpuinfo | grep pdpe1gb               # check for 1 GB hugepage support
reboot
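
To confirm the OFED install took effect, a quick sanity check (ofed_info and ibv_devinfo ship with MLNX_OFED):
ofed_info -s      # prints the installed MLNX_OFED version
ibv_devinfo       # the ConnectX-5 ports should report state PORT_ACTIVE when the link is up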

Install DPDK from source

wget http://fast.dpdk.org/rel/dpdk-19.11.7.tar.xz
tar -xvf dpdk-19.11.7.tar.xz
cd dpdk-stable-19.11.7/
mkdir -p /usr/src/dpdk
make config T=x86_64-native-linuxapp-gcc

# Enable the Mellanox ConnectX-4/ConnectX-5 (mlx5) PMD
sed -i 's/\(CONFIG_RTE_LIBRTE_MLX5_PMD=\)n/\1y/g' config/common_base
sed -i 's/\(CONFIG_RTE_LIBRTE_MLX5_DLOPEN_DEPS=\)n/\1y/g' config/common_base
make install T=x86_64-native-linuxapp-gcc DESTDIR=/usr/src/dpdk
make install T=x86_64-native-linuxapp-gcc DESTDIR=/usr
# Copy the mlx5 glue shared library to /usr/lib64
~~cp x86_64-native-linuxapp-gcc/lib/librte_pmd_mlx5_glue* /usr/lib64/~~
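
A quick way to confirm the mlx5 PMD was actually built (a sanity check, not part of the original steps):
ls x86_64-native-linuxapp-gcc/lib | grep -i mlx5    # should list librte_pmd_mlx5 and the glue library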

Install OVS from source

wget https://www.openvswitch.org/releases/openvswitch-2.13.1.tar.gz
yum -y install python3 python3-devel python36-six
# Extract and build
tar -zxvf openvswitch-2.13.1.tar.gz
cd openvswitch-2.13.1/
./boot.sh
./configure \
--with-dpdk=/usr/src/dpdk \
--prefix=/usr \
--exec-prefix=/usr \
--sysconfdir=/etc \
--localstatedir=/var
make
make install
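
Before continuing, it is worth confirming the binaries landed in /usr and were linked against DPDK:
ovs-vsctl --version
ovs-vswitchd --version    # a DPDK-enabled build should also report the DPDK version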

NIC binding

  1. System settings
    Enable VT-d in the system BIOS, and pass the iommu and intel_iommu kernel parameters so the VFIO driver can be used. Edit /boot/grub2/grub.cfg (or edit /etc/default/grub and regenerate with grub2-mkconfig -o /boot/grub2/grub.cfg), find the kernel line used for booting, and append:
    iommu=pt intel_iommu=on, for example:
    linux16 /vmlinuz-3.10.0-327.36.2.el7.x86_64 root=/dev/mapper/centos_dell-root ro crashkernel=auto rd.lvm.lv=centos_dell/root rd.lvm.lv=centos_dell/swap nomodeset rhgb quiet iommu=pt intel_iommu=on
    After the system boots, verify with:
    cat /proc/cmdline
  2. Load the DPDK driver
    modprobe vfio-pci
  3. Bind the NIC to DPDK, then check the result with dpdk-devbind --status:
    ifdown ens33
    dpdk-devbind --bind=vfio-pci ens33
    dpdk-devbind --status


    Network devices using DPDK-compatible driver
    ============================================
    0000:02:01.0 '82545EM Gigabit Ethernet Controller (Copper) 100f' drv=vfio-pci unused=e1000

    Network devices using kernel driver
    ===================================
    0000:02:02.0 '82545EM Gigabit Ethernet Controller (Copper) 100f' if=ens34 drv=e1000 unused=vfio-pci *Active*
    0000:02:03.0 '82545EM Gigabit Ethernet Controller (Copper) 100f' if=ens35 drv=e1000 unused=vfio-pci

    No 'Baseband' devices detected
    ==============================

    No 'Crypto' devices detected
    ============================

    No 'Eventdev' devices detected
    ==============================

    No 'Mempool' devices detected
    =============================

    No 'Compress' devices detected
    ==============================

    No 'Misc (rawdev)' devices detected
    ===================================

    Configure hugepages

  4. Check the current hugepage configuration
    grep HugePages_ /proc/meminfo
  5. Set the number of hugepages to 1024
    echo 1024 > /proc/sys/vm/nr_hugepages                       # takes effect immediately
    echo 'vm.nr_hugepages=1024' > /etc/sysctl.d/hugepages.conf  # persists across reboots
    mount -t hugetlbfs none /dev/hugepages
  6. Configure these steps to run at boot
    chmod 755 /etc/rc.d/rc.local
    echo '/usr/sbin/modprobe vfio-pci' >> /etc/rc.d/rc.local
    echo 'mount -t hugetlbfs none /dev/hugepages' >> /etc/rc.d/rc.local
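
    Alternatively, the hugetlbfs mount can be made persistent through /etc/fstab instead of rc.local:
    echo 'nodev /dev/hugepages hugetlbfs defaults 0 0' >> /etc/fstab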

    Start ovsdb-server

    mkdir -p /etc/openvswitch
    mkdir -p /var/run/openvswitch
    mkdir -p /var/log/openvswitch
    ovsdb-tool create /etc/openvswitch/conf.db /usr/share/openvswitch/vswitch.ovsschema
    ovsdb-server --remote=punix:/var/run/openvswitch/db.sock --remote=db:Open_vSwitch,Open_vSwitch,manager_options --pidfile --detach --monitor

    # Add DPDK-related configuration parameters
    ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true
    ovs-vsctl --no-wait set Open_vSwitch . other_config:vhost-iommu-support=true    # vhost-user IOMMU support (for vhostuserclient ports)
    ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem="1024,1024"
    ovs-vsctl --no-wait set Open_vSwitch . other_config:pmd-cpu-mask=0xe

    Start ovs-vswitchd

    export DB_SOCK=/var/run/openvswitch/db.sock
    /usr/share/openvswitch/scripts/ovs-ctl --no-ovsdb-server --db-sock="$DB_SOCK" start
    ps axu|grep ovs
    ovs-vsctl list open_vswitch    # check whether DPDK was initialized
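
    A more targeted check (OVS 2.13 exposes these columns on the Open_vSwitch record):
    ovs-vsctl get Open_vSwitch . dpdk_initialized    # expect: true
    ovs-vsctl get Open_vSwitch . dpdk_version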

    Create OVS bridges and ports

    ovs-vsctl add-br br-int -- set bridge br-int datapath_type=netdev    # OVS-DPDK bridges must set datapath_type=netdev
    ovs-vsctl add-br br1 -- set bridge br1 datapath_type=netdev
    ovs-vsctl add-port br-int dpdkvhostuserclient0 -- set Interface dpdkvhostuserclient0 type=dpdkvhostuserclient options:vhost-server-path=/tmp/vhostuserclient0.sock
    ovs-vsctl add-port br1 dpdkvhostuserclient1 -- set Interface dpdkvhostuserclient1 type=dpdkvhostuserclient options:vhost-server-path=/tmp/vhostuserclient1.sock
    ovs-vsctl add-port br-int dpdk0 -- set Interface dpdk0 type=dpdk options:dpdk-devargs=0000:02:01.0
    ovs-vsctl add-port br1 dpdk1 -- set Interface dpdk1 type=dpdk options:dpdk-devargs=0000:02:02.0    # each dpdk port needs its own DPDK-bound PCI device
    ovs-vsctl set Interface dpdkvhostuserclient0 options:n_rxq=2
    ovs-vsctl set Interface dpdkvhostuserclient1 options:n_rxq=2
    ovs-vsctl set Interface dpdk0 options:n_rxq=2
    ovs-vsctl set Interface dpdk1 options:n_rxq=2


    # A fuller single-command example: queue counts and descriptor sizes are separate Interface options, and pmd-rxq-affinity belongs in other_config
    ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk options:dpdk-devargs=0000:07:00.0 options:n_rxq=1 options:n_rxq_desc=1024 options:n_txq_desc=1024 other_config:pmd-rxq-affinity="0:1" ofport_request=1
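
    To verify how rx queues were distributed across PMD threads after setting pmd-rxq-affinity:
    ovs-appctl dpif-netdev/pmd-rxq-show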


    server201 configuration

    ovs-vsctl add-br br-tun -- set bridge br-tun datapath_type=netdev
    ovs-vsctl add-port br-tun vxlan-1 -- set interface vxlan-1 type=vxlan ofport_request=100 options:remote_ip=192.168.100.202
    ovs-vsctl add-br br-int -- set bridge br-int datapath_type=netdev
    ovs-vsctl add-port br-int dpdkvhostuserclient0 -- set Interface dpdkvhostuserclient0 type=dpdkvhostuserclient options:vhost-server-path=/tmp/vhostuserclient0.sock
    ovs-vsctl set Interface dpdkvhostuserclient0 options:n_rxq=2

    server202 configuration

    ovs-vsctl add-br br-tun -- set bridge br-tun datapath_type=netdev
    ovs-vsctl add-port br-tun vxlan-1 -- set interface vxlan-1 type=vxlan ofport_request=100 options:remote_ip=192.168.100.201
    ovs-vsctl add-br br-int -- set bridge br-int datapath_type=netdev
    ovs-vsctl add-port br-int dpdkvhostuserclient0 -- set Interface dpdkvhostuserclient0 type=dpdkvhostuserclient options:vhost-server-path=/tmp/vhostuserclient0.sock
    ovs-vsctl set Interface dpdkvhostuserclient0 options:n_rxq=2
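
    With both servers configured, the VXLAN setup can be sanity-checked from either side (assuming 192.168.100.201/202 are the underlay addresses used above):
    ovs-vsctl show           # both bridges and the vxlan-1 port should be listed
    ovs-appctl dpif/show     # shows the netdev datapath and its ports
    ping 192.168.100.202     # underlay reachability, run from server201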

Troubleshooting

# dpdk-devbind --bind=vfio-pci ens33 fails with:
Warning: routing table indicates that interface 0000:02:01.0 is active. Not modifying
Fix: bring the interface down first, then bind again:
ifdown ens33

Virtual machine test

qemu-system-x86_64 -enable-kvm -m 1024 -smp 2 \
    -chardev socket,id=char0,path=/var/run/openvswitch/vhost-user-1 \
    -netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce \
    -device virtio-net-pci,netdev=mynet1,mac=52:54:00:02:d9:05 \
    -object memory-backend-file,id=mem,size=1024M,mem-path=/dev/hugepages,share=on \
    -numa node,memdev=mem -mem-prealloc \
    -net user -net nic \
    /home/CentOS7.qcow2 -vnc 0.0.0.0:30
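
The socket path above matches an OVS port of type dpdkvhostuser. For the dpdkvhostuserclient ports created earlier, OVS is the socket client, so QEMU must create the socket in server mode instead; a sketch of the two options that change (the path comes from the vhost-server-path set earlier):
-chardev socket,id=char0,path=/tmp/vhostuserclient0.sock,server,nowait \
-netdev type=vhost-user,id=mynet1,chardev=char0 \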

References:

https://docs.nvidia.com/networking/pages/releaseview.action?pageId=15053908