Notes on Using KVM

Virtualization Basics

Memory addresses

  • Virtual address
  • Physical address

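Time-division multiplexing: CPU time is cut into slices. If each slice is 5 ms, then after a program has run for 5 ms the CPU switches to the next program whether or not the first one has finished; an unfinished program simply continues the next time the scheduler comes around to it.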
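
The time-slicing idea can be sketched as a small round-robin simulation. This is only an illustration, not how a real kernel scheduler is written: the function name `round_robin`, the task names, and the task durations are made up, and the 5 ms slice is the one used in the example above.

```shell
# Simulate round-robin time slicing.
# Usage: round_robin SLICE_MS name:remaining_ms [name:remaining_ms ...]
# Runs in a subshell, so the variables do not leak into the caller.
round_robin() (
    slice=$1; shift
    clock=0
    while [ "$#" -gt 0 ]; do
        entry=$1; shift                 # take the task at the head of the queue
        name=${entry%%:*}
        left=${entry##*:}
        if [ "$left" -lt "$slice" ]; then run=$left; else run=$slice; fi
        clock=$((clock + run))
        left=$((left - run))
        if [ "$left" -gt 0 ]; then
            echo "t=${clock}ms ${name} preempted, ${left}ms left"
            set -- "$@" "${name}:${left}"   # unfinished: back of the queue
        else
            echo "t=${clock}ms ${name} finished"
        fi
    done
)

# Example: round_robin 5 A:12 B:4
```

With the example call, task A is preempted twice before it finishes, while the shorter task B completes within its first slice.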

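Space-division multiplexing: physical memory is cut into slices, and the N slices that the current process needs are allocated to it.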
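
As a rough illustration of the "N slices", the number of slices a process needs is its memory size divided by the slice size, rounded up. The helper name `pages_needed` and the 4096-byte default slice size are assumptions chosen for the example, not something these notes specify.

```shell
# Number of fixed-size slices (pages) needed to hold BYTES of memory.
# Usage: pages_needed BYTES [PAGE_SIZE]   (PAGE_SIZE defaults to 4096, an assumption)
pages_needed() {
    bytes=$1
    page_size=${2:-4096}
    # Integer ceiling division: round up so a partial page still gets a slice.
    echo $(( (bytes + page_size - 1) / page_size ))
}

# Example: pages_needed 10000   # a 10000-byte request needs 3 pages of 4096 bytes
```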

TCP/IP protocol stack

Using ip netns to create a new namespace that acts as a router

The ip netns command in detail

[root@node01 ~]#ip netns help
Usage: ip netns list #list the current network namespaces
ip netns add NAME #create a network namespace
ip netns set NAME NETNSID
ip [-all] netns delete [NAME]
ip netns identify [PID]
ip netns pids NAME
ip [-all] netns exec [NAME] cmd ... #run a shell command inside the namespace
ip netns monitor
ip netns list-id

Using a network namespace to simulate a router

[root@node01 ~]#ip netns list
[root@node01 ~]#ip netns add router1
[root@node01 ~]#ip netns list
router1
[root@node01 ~]#ip netns exec router1 ifconfig -a
lo: flags=8<LOOPBACK> mtu 65536
loop txqueuelen 1000 (Local Loopback)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
[root@node01 ~]#ip netns exec router1 ifconfig lo 127.0.0.1/8 up
[root@node01 ~]#ip netns exec router1 ifconfig
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10<host>
loop txqueuelen 1000 (Local Loopback)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
[root@node01 ~]#cat /proc/sys/net/ipv4/ip_forward
0
[root@node01 ~]#ip netns exec router1 cat /proc/sys/net/ipv4/ip_forward
0
[root@node01 ~]#echo 1 > /proc/sys/net/ipv4/ip_forward
[root@node01 ~]#cat /proc/sys/net/ipv4/ip_forward
1
[root@node01 ~]#ip netns exec router1 cat /proc/sys/net/ipv4/ip_forward
0

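The output above shows that router1 behaves like a router whose network stack is fully isolated from the host's namespace: enabling ip_forward on the host does not change the value inside router1. In other words, ip netns can provide the routing needed for virtual machines on the same host to communicate with each other.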
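
Because ip_forward is a per-namespace setting, a namespace that should actually forward packets needs it enabled inside that namespace. The sketch below shows one way to do this. The helper names `run` and `make_router_ns` and the `DRY_RUN` variable are hypothetical; by default the script only prints the commands, and running them for real requires root.

```shell
# Print each command (dry run, the default) or execute it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Create a namespace, bring up its loopback, and enable forwarding inside it.
make_router_ns() (
    ns=$1
    run ip netns add "$ns"
    run ip netns exec "$ns" ip link set lo up
    # Set inside the namespace: the host's ip_forward value does not apply here.
    run ip netns exec "$ns" sysctl -w net.ipv4.ip_forward=1
)

# Example (prints the three commands without changing anything):
#   make_router_ns router1
# To apply them as root:
#   DRY_RUN=0 make_router_ns router1
```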

Managing software switches with brctl

The brctl command in detail

[root@node01 ~]#brctl -h
Usage: brctl [commands]
commands:
addbr <bridge> add bridge #create a software switch
delbr <bridge> delete bridge #delete a software switch
addif <bridge> <device> add interface to bridge
delif <bridge> <device> delete interface from bridge
hairpin <bridge> <port> {on|off} turn hairpin on/off
setageing <bridge> <time> set ageing time
setbridgeprio <bridge> <prio> set bridge priority
setfd <bridge> <time> set bridge forward delay
sethello <bridge> <time> set hello time
setmaxage <bridge> <time> set max message age
setpathcost <bridge> <port> <cost> set path cost
setportprio <bridge> <port> <prio> set port priority
show [ <bridge> ] show a list of bridges #list the software switches, i.e. bridges
showmacs <bridge> show a list of mac addrs
showstp <bridge> show bridge stp info
stp <bridge> {on|off} turn stp on/off

brctl usage demo

[root@node01 ~]#brctl show
bridge name bridge id STP enabled interfaces
[root@node01 ~]#brctl addbr br0
[root@node01 ~]#brctl show
bridge name bridge id STP enabled interfaces
br0 8000.000000000000 no
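
On systems where brctl (from bridge-utils) is not installed, the same basic operations can be expressed with ip link. The mapping below is a sketch covering only the few subcommands used in these notes; the function name `brctl_to_ip` is made up, and it only prints the equivalent command rather than running it.

```shell
# Print an ip-link command equivalent to a basic brctl subcommand.
# Usage: brctl_to_ip addbr|delbr|addif|delif|show [bridge] [device]
brctl_to_ip() {
    case "$1" in
        addbr) echo "ip link add name $2 type bridge" ;;
        delbr) echo "ip link delete $2 type bridge" ;;
        addif) echo "ip link set dev $3 master $2" ;;
        delif) echo "ip link set dev $3 nomaster" ;;
        show)  echo "ip link show type bridge" ;;
        *)     echo "no mapping for: $1" >&2; return 1 ;;
    esac
}

# Example: brctl_to_ip addbr br0
```

Note that ip link show type bridge lists only the bridges themselves; the ports attached to one bridge can be listed with ip link show master br0.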

Creating virtual NICs with ip link
The ip link command in detail

[root@node01 ~]#ip link help
Usage: ip link add [link DEV] [ name ] NAME
[ txqueuelen PACKETS ]
[ address LLADDR ]
[ broadcast LLADDR ]
[ mtu MTU ] [index IDX ]
[ numtxqueues QUEUE_COUNT ]
[ numrxqueues QUEUE_COUNT ]
type TYPE [ ARGS ]

ip link delete { DEVICE | dev DEVICE | group DEVGROUP } type TYPE [ ARGS ]

ip link set { DEVICE | dev DEVICE | group DEVGROUP }
[ { up | down } ]
[ type TYPE ARGS ]
[ arp { on | off } ]
[ dynamic { on | off } ]
[ multicast { on | off } ]
[ allmulticast { on | off } ]
[ promisc { on | off } ]
[ trailers { on | off } ]
[ carrier { on | off } ]
[ txqueuelen PACKETS ]
[ name NEWNAME ]
[ address LLADDR ]
[ broadcast LLADDR ]
[ mtu MTU ]
[ netns { PID | NAME } ]
[ link-netnsid ID ]
[ alias NAME ]
[ vf NUM [ mac LLADDR ]
[ vlan VLANID [ qos VLAN-QOS ] [ proto VLAN-PROTO ] ]
[ rate TXRATE ]
[ max_tx_rate TXRATE ]
[ min_tx_rate TXRATE ]
[ spoofchk { on | off} ]
[ query_rss { on | off} ]
[ state { auto | enable | disable} ] ]
[ trust { on | off} ] ]
[ node_guid { eui64 } ]
[ port_guid { eui64 } ]
[ xdp { off |
object FILE [ section NAME ] [ verbose ] |
pinned FILE } ]
[ master DEVICE ][ vrf NAME ]
[ nomaster ]
[ addrgenmode { eui64 | none | stable_secret | random } ]
[ protodown { on | off } ]

ip link show [ DEVICE | group GROUP ] [up] [master DEV] [vrf NAME] [type TYPE]

ip link xstats type TYPE [ ARGS ]

ip link afstats [ dev DEVICE ]

ip link help [ TYPE ]

TYPE := { vlan | veth | vcan | dummy | ifb | macvlan | macvtap |
bridge | bond | team | ipoib | ip6tnl | ipip | sit | vxlan |
gre | gretap | ip6gre | ip6gretap | vti | nlmon | team_slave |
bond_slave | ipvlan | geneve | bridge_slave | vrf | macsec }

Creating a pair of NICs with ip link

[root@node01 ~]#ip link add veth1.1 type veth peer name veth1.2
[root@node01 ~]#ip link l
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
link/ether 00:0c:29:03:24:1c brd ff:ff:ff:ff:ff:ff
3: ens37: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
link/ether 00:0c:29:03:24:26 brd ff:ff:ff:ff:ff:ff
4: br0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 22:17:fd:32:b4:81 brd ff:ff:ff:ff:ff:ff
5: veth1.2@veth1.1: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 3e:87:a2:0e:40:eb brd ff:ff:ff:ff:ff:ff
6: veth1.1@veth1.2: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 9e:f2:a6:29:6a:c9 brd ff:ff:ff:ff:ff:ff

Attach one end of the newly created pair to the virtual switch br0

[root@node01 ~]#brctl addif br0 veth1.1
[root@node01 ~]#brctl show
bridge name bridge id STP enabled interfaces
br0 8000.9ef2a6296ac9 no veth1.1

Attach the other end of the pair to router1

[root@node01 ~]#ip link set veth1.2 netns router1

Running ip link l now shows only veth1.1 and not veth1.2: ip link l lists only the NICs in the host's namespace, and veth1.2 has been moved into router1.

[root@node01 ~]#ip link l
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
link/ether 00:0c:29:03:24:1c brd ff:ff:ff:ff:ff:ff
3: ens37: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
link/ether 00:0c:29:03:24:26 brd ff:ff:ff:ff:ff:ff
4: br0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 9e:f2:a6:29:6a:c9 brd ff:ff:ff:ff:ff:ff
6: veth1.1@if5: <BROADCAST,MULTICAST> mtu 1500 qdisc noop master br0 state DOWN mode DEFAULT group default qlen 1000
link/ether 9e:f2:a6:29:6a:c9 brd ff:ff:ff:ff:ff:ff link-netnsid 0
[root@node01 ~]#ip netns exec router1 ifconfig -a
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10<host>
loop txqueuelen 1000 (Local Loopback)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

veth1.2: flags=4098<BROADCAST,MULTICAST> mtu 1500
ether 3e:87:a2:0e:40:eb txqueuelen 1000 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

Rename veth1.2 to eth0

[root@node01 ~]#ip netns exec router1 ip link set veth1.2 name eth0

Assign an IP address to the eth0 NIC in router1

[root@node01 ~]#ip netns exec router1 ifconfig eth0 10.0.0.1/24 up

Assign an IP address to the veth1.1 NIC

[root@node01 ~]#ifconfig veth1.1 10.0.0.2/24 up

Detach veth1.1 from the software switch br0

[root@node01 ~]#brctl delif br0 veth1.1

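Now one end of the pair is in the host's namespace and the other is in the newly created router1 namespace. Test whether the network is reachable. (Note that each ping in this test targets the address of the pinging side's own interface; pinging the peer's address is what actually exercises the link.)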

[root@node01 ~]#ping 10.0.0.2
PING 10.0.0.2 (10.0.0.2) 56(84) bytes of data.
64 bytes from 10.0.0.2: icmp_seq=1 ttl=64 time=0.053 ms
^C
--- 10.0.0.2 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.053/0.053/0.053/0.000 ms
[root@node01 ~]#ip netns exec router1 ping 10.0.0.1
PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.
64 bytes from 10.0.0.1: icmp_seq=1 ttl=64 time=0.022 ms
64 bytes from 10.0.0.1: icmp_seq=2 ttl=64 time=0.048 ms
^C
--- 10.0.0.1 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.022/0.035/0.048/0.013 ms

Attach the veth1.1 NIC to the virtual switch br0 again

[root@node01 ~]#brctl addif br0 veth1.1

Bring up the software switch br0

[root@node01 ~]#ifconfig br0 up

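At this point the two NICs can no longer communicate. The reason is that br0 is a layer-2 device: once veth1.1 is a port of the bridge, frames arriving on it are handed to the bridge, so the IP address configured on the port itself no longer works.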
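
One way to restore connectivity while keeping veth1.1 attached, in line with the host-only description later in these notes, is to move the address from the port onto the bridge itself. The following is a sketch under that assumption. The helper names `run` and `move_ip_to_bridge` and the `DRY_RUN` variable are hypothetical; the commands are only printed unless DRY_RUN=0 is set, and applying them requires root.

```shell
# Print each command (dry run, the default) or execute it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Move an address from a bridge port to the bridge and bring the bridge up.
# Usage: move_ip_to_bridge PORT BRIDGE ADDRESS/PREFIX
move_ip_to_bridge() (
    port=$1; bridge=$2; cidr=$3
    run ip addr del "$cidr" dev "$port"     # the port keeps forwarding frames, without an address
    run ip addr add "$cidr" dev "$bridge"   # the bridge now acts as the host's NIC on this subnet
    run ip link set "$bridge" up
)

# Example: move_ip_to_bridge veth1.1 br0 10.0.0.2/24
```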

To show that NICs in different namespaces can communicate, add another router.

[root@node01 ~]#ip netns add router2

Move the virtual NIC veth1.1 into the router router2

[root@node01 ~]#ip link set veth1.1 netns router2

This is equivalent to creating two virtual routers on the host plus a pair of virtual NICs, with one NIC serving as a port on each router.

Because veth1.1 has been moved into router2, its address has to be configured again

[root@node01 ~]#ip netns exec router2 ifconfig veth1.1 10.0.0.2/24 up

Now test whether the two NICs of the pair can communicate

[root@node01 ~]#ip netns exec router2 ping 10.0.0.1
PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.
64 bytes from 10.0.0.1: icmp_seq=1 ttl=64 time=0.039 ms
64 bytes from 10.0.0.1: icmp_seq=2 ttl=64 time=0.028 ms
^C
--- 10.0.0.1 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.028/0.033/0.039/0.008 ms
[root@node01 ~]#ip netns exec router1 ping 10.0.0.2
PING 10.0.0.2 (10.0.0.2) 56(84) bytes of data.
64 bytes from 10.0.0.2: icmp_seq=1 ttl=64 time=0.032 ms
64 bytes from 10.0.0.2: icmp_seq=2 ttl=64 time=0.055 ms
^C
--- 10.0.0.2 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 0.032/0.043/0.055/0.013 ms
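
The whole two-router experiment can be condensed into one script. This is a sketch rather than a replay of the exact session above: it uses ip addr instead of ifconfig, the interface names `veth-a` and `veth-b` and the helper names `run` and `link_namespaces` are hypothetical, and the 10.0.0.0/24 addresses are the ones used in the walkthrough. It prints the commands by default; set DRY_RUN=0 and run as root to apply them.

```shell
# Print each command (dry run, the default) or execute it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Connect two new namespaces with a veth pair and ping across it.
# Usage: link_namespaces NS_A NS_B
link_namespaces() (
    ns_a=$1; ns_b=$2
    run ip netns add "$ns_a"
    run ip netns add "$ns_b"
    run ip link add veth-a type veth peer name veth-b
    run ip link set veth-a netns "$ns_a"     # one end into each namespace
    run ip link set veth-b netns "$ns_b"
    run ip netns exec "$ns_a" ip addr add 10.0.0.1/24 dev veth-a
    run ip netns exec "$ns_b" ip addr add 10.0.0.2/24 dev veth-b
    run ip netns exec "$ns_a" ip link set veth-a up
    run ip netns exec "$ns_b" ip link set veth-b up
    run ip netns exec "$ns_a" ping -c 2 10.0.0.2   # ping the peer, not the local address
)

# Example: link_namespaces router1 router2
```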

A brief look at host-only mode and NAT mode

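Host-only mode means creating a software switch on the host and attaching one or more virtual machines to it. The virtual machines on that switch can communicate with each other normally, but they form an island: they can talk only to one another, not even to the host. If the software switch is then treated as a NIC of the host and given an IP address, the host and the virtual machines can communicate, but the virtual machines still cannot reach any machine outside the host. This setup is host-only mode.
Building on that, suppose the host has another NIC that can reach the Internet. Enable kernel IP forwarding on the host and add an SNAT rule that rewrites the source address of every packet that comes from the virtual machines' subnet and is destined outside that subnet to the IP address of the Internet-facing NIC. The result is NAT mode.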
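
The two host-side steps for NAT mode, enabling forwarding and adding the SNAT rule, might look like the sketch below. The helper names `run` and `nat_rules` and the `DRY_RUN` variable are hypothetical, and the subnet and public address in the example are placeholders (192.0.2.10 is a documentation address). The commands are printed by default; applying them needs root and iptables.

```shell
# Print each command (dry run, the default) or execute it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Turn a host-only setup into NAT mode.
# Usage: nat_rules VM_SUBNET PUBLIC_IP
nat_rules() (
    subnet=$1; public_ip=$2
    run sysctl -w net.ipv4.ip_forward=1
    # Rewrite the source of packets leaving the VM subnet for any other destination.
    run iptables -t nat -A POSTROUTING -s "$subnet" ! -d "$subnet" -j SNAT --to-source "$public_ip"
)

# Example: nat_rules 10.0.0.0/24 192.0.2.10
```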

Types of virtualization

Host-level virtualization
Container-level virtualization

CloudOS(IaaS)

  • OpenStack
  • CloudStack

CloudOS(PaaS)

  • Kubernetes
  • docker compose, swarm, machine
  • Mesos, Marathon

KVM

KVM stands for Kernel-based Virtual Machine.
KVM features

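  • Supports only the x86_64 architecture in the setup described here (upstream KVM also runs on other architectures such as ARM)
  • Requires a CPU with hardware virtualization support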

Installing and using KVM with the graphical tool virt-manager
Check whether the CPU supports hardware virtualization:

grep -i -E '(vmx|svm|lm)' /proc/cpuinfo
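
In that grep, lm is the long-mode (64-bit) flag, and because the match is case-insensitive and not anchored to whole words, the pattern also hits other flags that merely contain "lm". Output from the command alone therefore does not prove hardware virtualization support; the flags to look for are vmx (Intel) or svm (AMD). A stricter check could be sketched as follows, where the function name `virt_support` is made up.

```shell
# Report which hardware virtualization flag a cpuinfo file contains.
# Usage: virt_support [CPUINFO_FILE]   (defaults to /proc/cpuinfo)
virt_support() {
    cpuinfo=${1:-/proc/cpuinfo}
    # Match the flag only as a whole word, delimited by whitespace or line ends.
    if grep -q -E '(^|[[:space:]])vmx([[:space:]]|$)' "$cpuinfo"; then
        echo "Intel VT-x (vmx)"
    elif grep -q -E '(^|[[:space:]])svm([[:space:]]|$)' "$cpuinfo"; then
        echo "AMD-V (svm)"
    else
        echo "none"
    fi
}

# Example: virt_support
```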

Load the kvm module

modprobe kvm
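
A quick way to confirm that KVM is usable after loading the module is to look for the /dev/kvm device node. This sketch assumes that the presence of that character device is a reasonable indicator; on x86 hosts the vendor-specific module (kvm_intel or kvm_amd) normally has to be loaded as well before the node appears. The function name `kvm_device_status` is made up.

```shell
# Report whether the KVM device node exists as a character device.
# Usage: kvm_device_status [DEVICE_PATH]   (defaults to /dev/kvm)
kvm_device_status() {
    dev=${1:-/dev/kvm}
    if [ -c "$dev" ]; then
        echo "present"
    else
        echo "missing"
    fi
}

# Example: kvm_device_status
```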

Install the qemu-kvm package

yum -y install qemu-kvm

Managing KVM with virt-manager

yum -y install qemu-kvm libvirt-daemon-kvm libvirt virt-manager
modprobe kvm
systemctl start libvirtd.service
virt-manager &

In addition, the system needs a desktop environment.

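At this point, running the virt-manager command on the KVM host lets you create virtual machines. The configuration of each virtual machine you create is saved as a file under /etc/libvirt/qemu/.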

Full lifecycle management of KVM with the command-line tool virsh
Overview of virsh commands:

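virsh create file  Create a domain from an XML file and start it
virsh define file  Define a domain from an XML file without starting it
virsh list --all  List all virtual machines that have been created
virsh dumpxml domain  Show the details of a domain as XML
virsh console domain  Connect to the virtual machine's serial console
virsh suspend domain  Suspend a domain, i.e. pause it
virsh resume domain  Resume a domain from the suspended state
virsh save domain file  Save the state of a domain to a file; this may leave data inconsistent, so suspending first and then saving is safer
virsh restore file  Restore a domain from a saved state file; --running restores it in the running state, --paused in the paused state
virsh reboot domain  Reboot a domain (warm reboot)
virsh reset domain  Reset the target domain as if the power reset button had been pressed
virsh shutdown domain  Run a shutdown inside the target domain, i.e. shut the virtual machine down
virsh destroy domain  Destroy the virtual machine: its files are not deleted, only the virtual machine's process is killed
virsh undefine domain  Remove the virtual machine's definition, i.e. delete its configuration file
virsh autostart domain  Mark the virtual machine to start automatically
virsh domrename domain new-name  Rename a virtual machine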
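
The advice to suspend before saving can be wrapped in a pair of small helpers. This is a sketch: the function names `safe_save` and `safe_restore`, the helper `run`, and the `DRY_RUN` variable are hypothetical, and the domain name and file path in the example are placeholders. Commands are printed by default; set DRY_RUN=0 on a host with libvirt to execute them.

```shell
# Print each command (dry run, the default) or execute it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Suspend a domain first, then save its state to a file.
# Usage: safe_save DOMAIN FILE
safe_save() (
    domain=$1; file=$2
    run virsh suspend "$domain"
    run virsh save "$domain" "$file"
)

# Restore a domain from a state file and leave it running.
# --running is passed because the state was saved while the domain was paused.
# Usage: safe_restore FILE
safe_restore() (
    run virsh restore "$1" --running
)

# Example:
#   safe_save vm1 /tmp/vm1.state
#   safe_restore /tmp/vm1.state
```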

KVM supports hot-plugging of hardware devices