openstack-一次虚拟机启动异常排查

环境 : openstack stein版本,本地盘存储模式

故障:虚机启动异常,排查原因

查看虚机日志

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
2021-02-08 13:58:08.352 7 DEBUG nova.virt.images [req-dd576230-3c75-41ca-b9f3-08b9c957eac5 1c935a2c05c140e18d93f3e189fd0ad7 3a228708dc8742d1adf519b52b613acf - b0a8dae41bc44b3296ae1a75d3e0c370 b0a8dae41bc44b3296ae1a75d3e0c370] 5516a827-a2fb-463b-9b03-168e56faa79f was qcow2, converting to raw fetch_to_raw /var/lib/kolla/venv/lib/python2.7/site-packages/nova/virt/images.py:172
2021-02-08 13:58:08.353 7 DEBUG nova.privsep.utils [req-dd576230-3c75-41ca-b9f3-08b9c957eac5 1c935a2c05c140e18d93f3e189fd0ad7 3a228708dc8742d1adf519b52b613acf - b0a8dae41bc44b3296ae1a75d3e0c370 b0a8dae41bc44b3296ae1a75d3e0c370] Path '/var/lib/nova/instances' supports direct I/O supports_direct_io /var/lib/kolla/venv/lib/python2.7/site-packages/nova/privsep/utils.py:66
2021-02-08 13:58:08.353 7 DEBUG oslo_concurrency.processutils [req-dd576230-3c75-41ca-b9f3-08b9c957eac5 1c935a2c05c140e18d93f3e189fd0ad7 3a228708dc8742d1adf519b52b613acf - b0a8dae41bc44b3296ae1a75d3e0c370 b0a8dae41bc44b3296ae1a75d3e0c370] Running cmd (subprocess): qemu-img convert -t none -O raw -f qcow2 /var/lib/nova/instances/_base/344d45fde44ae9163acfa37ab64364ffe2fd0b9c.part /var/lib/nova/instances/_base/344d45fde44ae9163acfa37ab64364ffe2fd0b9c.converted execute /var/lib/kolla/venv/lib/python2.7/site-packages/oslo_concurrency/processutils.py:372
2021-02-08 13:58:10.109 7 DEBUG oslo_concurrency.processutils [req-dd576230-3c75-41ca-b9f3-08b9c957eac5 1c935a2c05c140e18d93f3e189fd0ad7 3a228708dc8742d1adf519b52b613acf - b0a8dae41bc44b3296ae1a75d3e0c370 b0a8dae41bc44b3296ae1a75d3e0c370] CMD "qemu-img convert -t none -O raw -f qcow2 /var/lib/nova/instances/_base/344d45fde44ae9163acfa37ab64364ffe2fd0b9c.part /var/lib/nova/instances/_base/344d45fde44ae9163acfa37ab64364ffe2fd0b9c.converted" returned: 0 in 1.755s execute /var/lib/kolla/venv/lib/python2.7/site-packages/oslo_concurrency/processutils.py:409
2021-02-08 13:58:10.375 7 DEBUG oslo_concurrency.processutils [req-dd576230-3c75-41ca-b9f3-08b9c957eac5 1c935a2c05c140e18d93f3e189fd0ad7 3a228708dc8742d1adf519b52b613acf - b0a8dae41bc44b3296ae1a75d3e0c370 b0a8dae41bc44b3296ae1a75d3e0c370] Running cmd (subprocess): /var/lib/kolla/venv/bin/python -m oslo_concurrency.prlimit --as=1073741824 --cpu=30 -- env LC_ALL=C LANG=C qemu-img info /var/lib/nova/instances/_base/344d45fde44ae9163acfa37ab64364ffe2fd0b9c.converted --force-share execute /var/lib/kolla/venv/lib/python2.7/site-packages/oslo_concurrency/processutils.py:372
2021-02-08 13:58:10.813 7 DEBUG oslo_concurrency.processutils [req-dd576230-3c75-41ca-b9f3-08b9c957eac5 1c935a2c05c140e18d93f3e189fd0ad7 3a228708dc8742d1adf519b52b613acf - b0a8dae41bc44b3296ae1a75d3e0c370 b0a8dae41bc44b3296ae1a75d3e0c370] CMD "/var/lib/kolla/venv/bin/python -m oslo_concurrency.prlimit --as=1073741824 --cpu=30 -- env LC_ALL=C LANG=C qemu-img info /var/lib/nova/instances/_base/344d45fde44ae9163acfa37ab64364ffe2fd0b9c.converted --force-share" returned: 0 in 0.438s execute /var/lib/kolla/venv/lib/python2.7/site-packages/oslo_concurrency/processutils.py:409
2021-02-08 13:58:10.819 7 DEBUG oslo_concurrency.lockutils [req-dd576230-3c75-41ca-b9f3-08b9c957eac5 1c935a2c05c140e18d93f3e189fd0ad7 3a228708dc8742d1adf519b52b613acf - b0a8dae41bc44b3296ae1a75d3e0c370 b0a8dae41bc44b3296ae1a75d3e0c370] Lock "344d45fde44ae9163acfa37ab64364ffe2fd0b9c" released by "nova.virt.libvirt.imagebackend.fetch_func_sync" :: held 46.934s inner /var/lib/kolla/venv/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:339
2021-02-08 13:58:10.820 7 ERROR nova.compute.manager [req-dd576230-3c75-41ca-b9f3-08b9c957eac5 1c935a2c05c140e18d93f3e189fd0ad7 3a228708dc8742d1adf519b52b613acf - b0a8dae41bc44b3296ae1a75d3e0c370 b0a8dae41bc44b3296ae1a75d3e0c370] [instance: bc2b194c-d28c-4bd1-960f-42950e5ab757] Instance failed to spawn: ImageUnacceptable: Image 5516a827-a2fb-463b-9b03-168e56faa79f is unacceptable: Converted to raw, but format is now qcow2
2021-02-08 13:58:10.820 7 ERROR nova.compute.manager [instance: bc2b194c-d28c-4bd1-960f-42950e5ab757] Traceback (most recent call last):
2021-02-08 13:58:10.820 7 ERROR nova.compute.manager [instance: bc2b194c-d28c-4bd1-960f-42950e5ab757] File "/var/lib/kolla/venv/lib/python2.7/site-packages/nova/compute/manager.py", line 2613, in _build_resources
2021-02-08 13:58:10.820 7 ERROR nova.compute.manager [instance: bc2b194c-d28c-4bd1-960f-42950e5ab757] yield resources
2021-02-08 13:58:10.820 7 ERROR nova.compute.manager [instance: bc2b194c-d28c-4bd1-960f-42950e5ab757] File "/var/lib/kolla/venv/lib/python2.7/site-packages/nova/compute/manager.py", line 2372, in _build_and_run_instance
2021-02-08 13:58:10.820 7 ERROR nova.compute.manager [instance: bc2b194c-d28c-4bd1-960f-42950e5ab757] block_device_info=block_device_info)
2021-02-08 13:58:10.820 7 ERROR nova.compute.manager [instance: bc2b194c-d28c-4bd1-960f-42950e5ab757] File "/var/lib/kolla/venv/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 3358, in spawn
2021-02-08 13:58:10.820 7 ERROR nova.compute.manager [instance: bc2b194c-d28c-4bd1-960f-42950e5ab757] block_device_info=block_device_info)
2021-02-08 13:58:10.820 7 ERROR nova.compute.manager [instance: bc2b194c-d28c-4bd1-960f-42950e5ab757] File "/var/lib/kolla/venv/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 3754, in _create_image
2021-02-08 13:58:10.820 7 ERROR nova.compute.manager [instance: bc2b194c-d28c-4bd1-960f-42950e5ab757] fallback_from_host)
2021-02-08 13:58:10.820 7 ERROR nova.compute.manager [instance: bc2b194c-d28c-4bd1-960f-42950e5ab757] File "/var/lib/kolla/venv/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 3855, in _create_and_inject_local_root
2021-02-08 13:58:10.820 7 ERROR nova.compute.manager [instance: bc2b194c-d28c-4bd1-960f-42950e5ab757] instance, size, fallback_from_host)
2021-02-08 13:58:10.820 7 ERROR nova.compute.manager [instance: bc2b194c-d28c-4bd1-960f-42950e5ab757] File "/var/lib/kolla/venv/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 8578, in _try_fetch_image_cache
2021-02-08 13:58:10.820 7 ERROR nova.compute.manager [instance: bc2b194c-d28c-4bd1-960f-42950e5ab757] trusted_certs=instance.trusted_certs)
2021-02-08 13:58:10.820 7 ERROR nova.compute.manager [instance: bc2b194c-d28c-4bd1-960f-42950e5ab757] File "/var/lib/kolla/venv/lib/python2.7/site-packages/nova/virt/libvirt/imagebackend.py", line 276, in cache
2021-02-08 13:58:10.820 7 ERROR nova.compute.manager [instance: bc2b194c-d28c-4bd1-960f-42950e5ab757] *args, **kwargs)
2021-02-08 13:58:10.820 7 ERROR nova.compute.manager [instance: bc2b194c-d28c-4bd1-960f-42950e5ab757] File "/var/lib/kolla/venv/lib/python2.7/site-packages/nova/virt/libvirt/imagebackend.py", line 668, in create_image
2021-02-08 13:58:10.820 7 ERROR nova.compute.manager [instance: bc2b194c-d28c-4bd1-960f-42950e5ab757] prepare_template(target=base, *args, **kwargs)
2021-02-08 13:58:10.820 7 ERROR nova.compute.manager [instance: bc2b194c-d28c-4bd1-960f-42950e5ab757] File "/var/lib/kolla/venv/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 328, in inner
2021-02-08 13:58:10.820 7 ERROR nova.compute.manager [instance: bc2b194c-d28c-4bd1-960f-42950e5ab757] return f(*args, **kwargs)
2021-02-08 13:58:10.820 7 ERROR nova.compute.manager [instance: bc2b194c-d28c-4bd1-960f-42950e5ab757] File "/var/lib/kolla/venv/lib/python2.7/site-packages/nova/virt/libvirt/imagebackend.py", line 272, in fetch_func_sync
2021-02-08 13:58:10.820 7 ERROR nova.compute.manager [instance: bc2b194c-d28c-4bd1-960f-42950e5ab757] fetch_func(target=target, *args, **kwargs)
2021-02-08 13:58:10.820 7 ERROR nova.compute.manager [instance: bc2b194c-d28c-4bd1-960f-42950e5ab757] File "/var/lib/kolla/venv/lib/python2.7/site-packages/nova/virt/libvirt/utils.py", line 421, in fetch_image
2021-02-08 13:58:10.820 7 ERROR nova.compute.manager [instance: bc2b194c-d28c-4bd1-960f-42950e5ab757] images.fetch_to_raw(context, image_id, target, trusted_certs)
2021-02-08 13:58:10.820 7 ERROR nova.compute.manager [instance: bc2b194c-d28c-4bd1-960f-42950e5ab757] File "/var/lib/kolla/venv/lib/python2.7/site-packages/nova/virt/images.py", line 188, in fetch_to_raw
2021-02-08 13:58:10.820 7 ERROR nova.compute.manager [instance: bc2b194c-d28c-4bd1-960f-42950e5ab757] data.file_format)
2021-02-08 13:58:10.820 7 ERROR nova.compute.manager [instance: bc2b194c-d28c-4bd1-960f-42950e5ab757] ImageUnacceptable: Image 5516a827-a2fb-463b-9b03-168e56faa79f is unacceptable: Converted to raw, but format is now qcow2
2021-02-08 13:58:10.820 7 ERROR nova.compute.manager [instance: bc2b194c-d28c-4bd1-960f-42950e5ab757]
2021-02-08 13:58:10.821 7 INFO nova.compute.manager [req-dd576230-3c75-41ca-b9f3-08b9c957eac5 1c935a2c05c140e18d93f3e189fd0ad7 3a228708dc8742d1adf519b52b613acf - b0a8dae41bc44b3296ae1a75d3e0c370 b0a8dae41bc44b3296ae1a75d3e0c370] [instance: bc2b194c-d28c-4bd1-960f-42950e5ab757] Terminating instance
2021-02-08 13:58:10.823 7 DEBUG nova.compute.manager [req-dd576230-3c75-41ca-b9f3-08b9c957eac5 1c935a2c05c140e18d93f3e189fd0ad7 3a228708dc8742d1adf519b52b613acf - b0a8dae41bc44b3296ae1a75d3e0c370 b0a8dae41bc44b3296ae1a75d3e0c370] [instance: bc2b194c-d28c-4bd1-960f-42950e5ab757] Start destroying the instance on the hypervisor. _shutdown_instance /var/lib/kolla/venv/lib/python2.7/site-packages/nova/compute/manager.py:2751

pycharm 查看nova代码

nova/virt/images.py, 查看基本逻辑

1
2
3
4
5
6
7
8
9
10
11
12
13
14
def fetch_to_raw(context, image_href, path, trusted_certs=None)
...
fetch(context, image_href, path_tmp, trusted_certs) //先从glance api下载镜像到本地_base目录
...
if fmt != "raw" and CONF.force_raw_images:
...
convert_image(path_tmp, staged, fmt, 'raw') //调用qemu convert command转换格式
...
data = qemu_img_info(staged) //调用qemu info查看转换后得镜像格式
if data.file_format != "raw":
raise exception.ImageUnacceptable(image_id=image_href, //抛出异常
reason=_("Converted to raw, but format is now %s") %
data.file_format)

总体代码逻辑如上,所以问题基本定位是 qemu convert后镜像依旧是qcow2呢, 那为什么会呢 ?

  1. nova-compute 日志中
1
2021-02-08 13:58:10.109 7 DEBUG oslo_concurrency.processutils [req-dd576230-3c75-41ca-b9f3-08b9c957eac5 1c935a2c05c140e18d93f3e189fd0ad7 3a228708dc8742d1adf519b52b613acf - b0a8dae41bc44b3296ae1a75d3e0c370 b0a8dae41bc44b3296ae1a75d3e0c370] CMD "qemu-img convert -t none -O raw -f qcow2 /var/lib/nova/instances/_base/344d45fde44ae9163acfa37ab64364ffe2fd0b9c.part /var/lib/nova/instances/_base/344d45fde44ae9163acfa37ab64364ffe2fd0b9c.converted" returned: 0 in 1.755s execute /var/lib/kolla/venv/lib/python2.7/site-packages/oslo_concurrency/processutils.py:409

说明调用了qemu 发生了convert,WHY ? 那让我们手工模拟一下代码流程吧

1
2
3
4
5
#glance image-download 5516a827-a2fb-463b-9b03-168e56faa79f --file test.qcow2 --progress
#qemu-img info test.qcow2
#qemu-img convert -t none -O raw -f qcow2 test.qcow2 test.qcow2.convert
#qemu-img info test.raw.convert

如上图,发现,确实qemu convert 转换失败了,转完依旧是qcow2 , 这什么鬼?

定位相关同事,他的操作记录

  1. glance download 镜像 kali-self-2021.01.19.raw (原始glance镜像是qcow2,他却保存成了*.raw)
  2. qemu-img convert -p -f raw -O qcow2 kali-self-2021.01.19.raw kali-self-2021.01.19.qcow2
  3. glance image-create 上传

定位到是他 qemu convert 转错了,多转了一遍,把 qcow2 文件当成raw convert to qcow2 了,所以导致这个故障,Done~