openstack - bare metal Ironic provisioning source code analysis

Source code analysis of the bare metal provisioning flow in OpenStack Ironic.

Based on Ironic Stein.

ironic-api receives the request to set provision_state and immediately returns 202 Accepted (the operation is asynchronous). The request is dispatched to the provision method of ironic.api.controllers.v1.node.NodeStatesController, which then calls the _do_provision_action method.
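The request/response contract just described can be sketched as a small helper. This is only an illustration of the API shape: the endpoint value and the microversion header are assumptions, not values taken from a live deployment.

```python
import json

IRONIC_API = "http://127.0.0.1:6385"  # assumed endpoint, adjust as needed

def build_provision_request(node_uuid, target):
    """Build the HTTP request that NodeStatesController.provision handles.

    Ironic answers 202 Accepted and performs the state change
    asynchronously on the conductor (via _do_provision_action -> RPC).
    """
    return {
        "method": "PUT",
        "url": f"{IRONIC_API}/v1/nodes/{node_uuid}/states/provision",
        # Assumed microversion for a Stein deployment
        "headers": {"X-OpenStack-Ironic-API-Version": "1.56"},
        "body": json.dumps({"target": target}),
    }
```

Sending this with any HTTP client should yield 202 when the transition is legal; an illegal target is rejected server-side, as shown below in do_node_deploy.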

@METRICS.timer('NodeStatesController.provision')
@expose.expose(None, types.uuid_or_name, wtypes.text,
               types.jsontype, types.jsontype, wtypes.text,
               status_code=http_client.ACCEPTED)
def provision(self, node_ident, target, configdrive=None,
              clean_steps=None, rescue_password=None):
    # ... (elided)
    self._do_provision_action(rpc_node, target, configdrive, clean_steps,
                              rescue_password)

def _do_provision_action(self, rpc_node, target, configdrive=None,
                         clean_steps=None, rescue_password=None):
    # ... (elided)
    if target in (ir_states.ACTIVE, ir_states.REBUILD):
        rebuild = (target == ir_states.REBUILD)
        pecan.request.rpcapi.do_node_deploy(context=pecan.request.context,
                                            node_id=rpc_node.uuid,
                                            rebuild=rebuild,
                                            configdrive=configdrive,
                                            topic=topic)

The RPC call lands in ironic.conductor.manager.ConductorManager.do_node_deploy, which eventually spawns a worker task that runs the module-level do_node_deploy function. If a configdrive was supplied it is uploaded to Swift, and then task.driver.deploy.prepare(task) is invoked. The behavior differs per driver: the pxe_* drivers use iscsi_deploy.ISCSIDeploy.prepare, while local-disk deploys go through task.driver.deploy.deploy.

@METRICS.timer('ConductorManager.do_node_deploy')
@messaging.expected_exceptions(exception.NoFreeConductorWorker,
                               exception.NodeLocked,
                               exception.NodeInMaintenance,
                               exception.InstanceDeployFailure,
                               exception.InvalidStateRequested,
                               exception.NodeProtected)
def do_node_deploy(self, context, node_id, rebuild=False,
                   configdrive=None):
    # ... (elided)
    # State-related checks: make sure the node's parameters and power are OK
    try:
        task.driver.power.validate(task)
        task.driver.deploy.validate(task)
        utils.validate_instance_info_traits(task.node)
        conductor_steps.validate_deploy_templates(task)
    except exception.InvalidParameterValue as e:
        raise exception.InstanceDeployFailure(
            _("Failed to validate deploy or power info for node "
              "%(node_uuid)s. Error: %(msg)s") %
            {'node_uuid': node.uuid, 'msg': e}, code=e.code)

    LOG.debug("do_node_deploy Calling event: %(event)s for node: "
              "%(node)s", {'event': event, 'node': node.uuid})
    try:
        task.process_event(
            event,
            callback=self._spawn_worker,
            call_args=(do_node_deploy, task, self.conductor.id,
                       configdrive),
            err_handler=utils.provisioning_error_handler)
    except exception.InvalidState:
        raise exception.InvalidStateRequested(
            action=event, node=task.node.uuid,
            state=task.node.provision_state)
@METRICS.timer('do_node_deploy')
@task_manager.require_exclusive_lock
def do_node_deploy(task, conductor_id=None, configdrive=None):
    """Prepare the environment and deploy a node."""
    node = task.node

    try:
        if configdrive:
            if isinstance(configdrive, dict):
                configdrive = utils.build_configdrive(node, configdrive)
            _store_configdrive(node, configdrive)
        # ... (elided)
        try:
            task.driver.deploy.prepare(task)

For iSCSI boot, task.driver.deploy.prepare(task) ends up in iscsi_deploy.py's prepare(task), which on the first pass calls task.driver.boot.prepare_instance(task): it configures DHCP, prepares the PXE config, caches the images, and sets the boot device in the BIOS via IPMI.

# ... (elided)
pxe_utils.cache_ramdisk_kernel(task, instance_image_info,
                               ipxe_enabled=CONF.pxe.ipxe_enabled)
# ... (elided)
pxe_utils.prepare_instance_pxe_config(
    task, instance_image_info,
    iscsi_boot=deploy_utils.is_iscsi_boot(task),
    ramdisk_boot=(boot_option == "ramdisk"),
    ipxe_enabled=CONF.pxe.ipxe_enabled)
# ... (elided)
# If it's going to PXE boot we need to update the DHCP server
dhcp_opts = pxe_utils.dhcp_options_for_instance(
    task, ipxe_enabled)
provider = dhcp_factory.DHCPFactory()
provider.update_dhcp(task, dhcp_opts)
# ... (elided)
manager_utils.node_set_boot_device(task, boot_device,
                                   persistent=persistent)

With the default local boot, ironic.drivers.modules.pxe.PXERamdiskDeploy.prepare first checks that boot_option is valid, then calls populate_storage_driver_internal_info; if iSCSI boot is used, the boot volume information is saved into internal_info. Finally it calls ironic.drivers.modules.pxe.PXEBoot.prepare_instance to start the deployment.

@METRICS.timer('RamdiskDeploy.prepare')
@task_manager.require_exclusive_lock
def prepare(self, task):
    # ... (elided)
    deploy_utils.populate_storage_driver_internal_info(task)
    # ... (elided)
    if node.provision_state in (states.ACTIVE, states.UNRESCUING):
        # In the event of takeover or unrescue.
        task.driver.boot.prepare_instance(task)

ironic.drivers.modules.pxe.PXEBoot.prepare_instance then configures DHCP, prepares the PXE config, caches the images, and sets the boot device in the BIOS via IPMI.

# ... (elided)
# Cache the kernel/ramdisk images locally
pxe_utils.cache_ramdisk_kernel(task, instance_image_info,
                               ipxe_enabled=CONF.pxe.ipxe_enabled)
# ... (elided)
# Prepare a dedicated PXE config for this node
pxe_utils.prepare_instance_pxe_config(
    task, instance_image_info,
    iscsi_boot=deploy_utils.is_iscsi_boot(task),
    ramdisk_boot=(boot_option == "ramdisk"),
    ipxe_enabled=CONF.pxe.ipxe_enabled)
boot_device = boot_devices.PXE
# ... (elided)
# For PXE boot, Ironic's DHCP config must be updated with the PXE/bootfile
# options; for local boot, the PXE files are cleaned up and the boot
# device is switched to disk
# If it's going to PXE boot we need to update the DHCP server
dhcp_opts = pxe_utils.dhcp_options_for_instance(
    task, ipxe_enabled)
provider = dhcp_factory.DHCPFactory()
provider.update_dhcp(task, dhcp_opts)
# ... (elided)
# Set the boot device on the node
manager_utils.node_set_boot_device(task, boot_device,
                                   persistent=persistent)
# ... (elided)
# Start the deployment steps
_do_next_deploy_step(task, 0, conductor_id)

Next, the deploy step itself is invoked:

# ... (elided)
result = interface.execute_deploy_step(task, step)

For iSCSI boot this calls iscsi_deploy.ISCSIDeploy.deploy directly. Since the root disk is a Cinder volume there is no image to cache, so it is simply: power off, switch to the tenant network, power on.

# ... (elided)
manager_utils.node_power_action(task, states.POWER_OFF)
power_state_to_restore = (
    manager_utils.power_on_node_if_needed(task))
task.driver.network.remove_provisioning_network(task)
task.driver.network.configure_tenant_networks(task)
manager_utils.restore_power_state_if_needed(
    task, power_state_to_restore)
task.driver.boot.prepare_instance(task)
manager_utils.node_power_action(task, states.POWER_ON)

For a local disk, the pxe.PXEBoot deploy path powers the node off, configures the tenant networks, and powers it back on.

# ... (elided)
manager_utils.node_power_action(task, states.POWER_OFF)
# ... (elided)
task.driver.network.configure_tenant_networks(task)
# ... (elided)
manager_utils.node_power_action(task, states.POWER_ON)

At this point the deploy is done: with a local disk the node first boots into the ramdisk (IPA) to write the image, while iSCSI boot skips the ramdisk and boots straight into the OS.
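The provision-state transitions exercised by task.process_event in the flow above can be mirrored by a toy state machine. This is an illustration only: the state and event names follow ironic.common.states, but this is not Ironic's actual FSM implementation.

```python
# Toy mirror of the deploy-related provision state transitions.
TRANSITIONS = {
    ("available", "deploy"): "deploying",
    ("active", "rebuild"): "deploying",       # rebuild re-enters the deploy flow
    ("deploying", "wait"): "wait call-back",  # waiting for the ramdisk/agent
    ("wait call-back", "resume"): "deploying",
    ("deploying", "done"): "active",
}

def process_event(state, event):
    """Return the next state, or raise on an illegal transition
    (Ironic raises InvalidState, translated to InvalidStateRequested)."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event '{event}' not allowed in state '{state}'")
```

This mirrors why the API layer returns an error when, for example, "deploy" is requested on a node that is already active without using rebuild.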

openstack - bare metal Ironic provisioning and scheduling - 3

Cinder volume (boot-from-volume) support

When a user starts a bare metal instance with a Cinder volume, Nova orchestrates the communication with Cinder and Ironic. The workflow of the boot process is as follows:

  1. (Preparation) Administrator registers a node with initiator information.
  2. User asks Cinder to create a boot volume.
  3. User asks Nova to boot a node from the Cinder volume.
  4. Nova calls Ironic to collect iSCSI/FC initiator information. Ironic collects initiator information and returns it to Nova.
  5. Nova calls Cinder to attach the volume to the node. Cinder attaches the volume to the node and returns connection information which includes target information.
  6. Nova passes the target information for the node to Ironic.
  7. Nova calls Ironic to spawn the instance. Ironic prepares the bare metal node to boot from the remote volume which is identified by target information and powers on the bare metal node.

In the workflow above, Nova calls Ironic to get/set initiator/target information (steps 4 and 6), and the administrator calls Ironic to set initiator information (step 1), but previously Ironic had neither this information nor APIs for it.
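The seven steps above can be sketched as a single orchestration function. All object and method names here are illustrative stand-ins, not the real Nova/Cinder/Ironic client APIs:

```python
def boot_from_volume(ironic, cinder, node, image):
    """Illustrative Nova-side orchestration of boot-from-volume (steps 2-7)."""
    volume = cinder.create_volume(image)        # 2. create the boot volume
    initiator = ironic.volume_connector(node)   # 4. collect iSCSI/FC initiator info
    target = cinder.attach(volume, initiator)   # 5. attach; returns target info
    ironic.create_volume_target(node, target)   # 6. pass target info to Ironic
    ironic.deploy(node)                         # 7. boot node from the remote volume
    return target
```

The volume connector and volume target objects this sketch passes around correspond to the `openstack baremetal volume connector` and `volume target` resources used in the commands below.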

#pip install rbd-iscsi-client==0.1.8 -i https://pypi.tuna.tsinghua.edu.cn/simple
#cinder type-create rbd_iscsi
#cinder type-key rbd_iscsi set volume_backend_name=rbd_iscsi
#cinder create 100 --display-name iscsi_boot1 --volume-type rbd_iscsi --image-id 3ec7c300-78bc-4d4b-bb1d-aaab4b4c35d7
#cinder create 100 --display-name iscsi_boot2 --volume-type rbd_iscsi
#openstack baremetal node set --storage-interface cinder 5e10c067-c0c6-48be-ac6e-749a70045b97
#openstack baremetal node set --property capabilities=iscsi_boot:True 5e10c067-c0c6-48be-ac6e-749a70045b97

This step can be understood as registering the host (the host information in the target); the connector-id name is arbitrary, up to 64 bytes:
#openstack baremetal volume connector create --node 5e10c067-c0c6-48be-ac6e-749a70045b97 --type iqn --connector-id iqn.2021-04.net.*****.sh.cluster.5e10c067-c0c6-48be-ac6e-749a70045b97
Assign volumes to the bare metal node:
#openstack baremetal volume target create --node 5e10c067-c0c6-48be-ac6e-749a70045b97 --type iscsi --boot-index 0 --volume cfa8c986-7a79-48fd-88a0-aa997d579fbf --property auth_method="CHAP" --property auth_username="admin" --property auth_password="admin"
#openstack baremetal volume target create --node 5e10c067-c0c6-48be-ac6e-749a70045b97 --type iscsi --boot-index 1 --volume 4e4e41e3-fbfb-4737-bd93-863f48470244 --property auth_method="CHAP" --property auth_username="admin" --property auth_password="admin"

#openstack server create --image 3ec7c300-78bc-4d4b-bb1d-aaab4b4c35d7 --flavor bms.large --nic net-id=cb1356ca-e79b-4d7b-ad29-3267e4c2bc41 bm01

#cinder create 100 --display-name centos7-grub2-iscsi-pwd  --image-id ea4266e4-456e-4fe0-91c9-a8c15dd2d2d6 --volume-type rbd_iscsi
#gwcli
>cd /disks/
>create iscsi-data image=volume-98ebbe5c-a32f-45ee-be04-61c295bcabe4 size=100g
>goto hosts
>create iqn.2021-01.net.*****.sh***:server01
>disk add iscsi-data/volume-98ebbe5c-a32f-45ee-be04-61c295bcabe4 size=100g

Common issues:

  1. Check whether the volume status is available.
#cat /data/docker/volumes/ironic_ipxe/_data/78\:ac\:44\:04\:f1\:a9.conf 
#!ipxe

set attempts:int32 10
set i:int32 0

goto deploy

:deploy
imgfree
kernel selinux=0 troubleshoot=0 text nofb nomodeset vga=normal console=tty0 console=ttyS0,115200n8 ipa-debug=1 BOOTIF=${mac} ipa-api-url=http://10.218.105.100:6385 initrd=deploy_ramdisk coreos.configdrive=0 || goto retry

initrd || goto retry
boot

:retry
iseq ${i} ${attempts} && goto fail ||
inc i
echo No response, retrying in {i} seconds.
sleep ${i}
goto deploy

:fail
echo Failed to get a response after ${attempts} attempts
echo Powering off in 30 seconds.
sleep 30
poweroff

:boot_partition
imgfree
kernel no_kernel root={{ ROOT }} ro text nofb nomodeset vga=normal console=tty0 console=ttyS0,115200n8 ipa-debug=1 initrd=ramdisk || goto boot_partition
initrd no_ramdisk || goto boot_partition
boot

:boot_ramdisk
imgfree
kernel no_kernel root=/dev/ram0 text nofb nomodeset vga=normal console=tty0 console=ttyS0,115200n8 ipa-debug=1 initrd=ramdisk || goto boot_ramdisk
initrd no_ramdisk || goto boot_ramdisk
boot

:boot_iscsi
imgfree
set username iqn.2021-04.net.*****.5e10c067-c0c6-48be-ac6e-749a70045b97
set password Cc6BK5uTbCsH6hCF
set initiator-iqn iqn.2021-04.net.*****.5e10c067-c0c6-48be-ac6e-749a70045b97
sanhook --drive 0x80 iscsi:10.206.79.34::3260:1:iqn.2021-04.net.*****.iscsi-gw:ceph-igw || goto fail_iscsi_retry
set username admin
set password admin
sanhook --drive 0x81 iscsi::::: || goto fail_iscsi_retry
set username iqn.2021-04.net.*****.5e10c067-c0c6-48be-ac6e-749a70045b97
set password Cc6BK5uTbCsH6hCF
sanboot --no-describe || goto fail_iscsi_retry

:fail_iscsi_retry
echo Failed to attach iSCSI volume(s), retrying in 10 seconds.
sleep 10
goto boot_iscsi

:boot_whole_disk

fio benchmark comparison, cloud disk vs. physical disk.

bs=16k, iodepth=1

Throughput

| Disk | Seq. read | Seq. write | Rand. read | Rand. write | Mixed rand. R/W |
|---|---|---|---|---|---|
| Cloud data disk (SATA) | 246MB | 41.8MB | 291MB | 10.1MB | 7799kB/7894kB |
| Physical disk (SAS×2, RAID1) | 584MB | 1418MB/s | 127MB | 65.4MB | 40.2MB/40.1MB |

IOPS

| Disk | Seq. read | Seq. write | Rand. read | Rand. write | Mixed rand. R/W |
|---|---|---|---|---|---|
| Cloud data disk (SATA) | 15.0k | 2550 | 17.8k | 618 | 476/481 |
| Physical disk (SAS×2, RAID1) | 35.7k | 86.6k | 7747 | 3990 | 2451/2450 |

bs=4k, iodepth=1

Throughput

| Disk | Seq. read | Seq. write | Rand. read | Rand. write | Mixed rand. R/W |
|---|---|---|---|---|---|
| Cloud data disk (SATA) | 112MB | 14.1MB | 93.1MB | 2490kB | 2212kB/2238kB |
| Physical disk (SAS×2, RAID1) | 367MB | 372MB | 21.4MB | 15.0MB | 7252kB/7263kB |

IOPS

| Disk | Seq. read | Seq. write | Rand. read | Rand. write | Mixed rand. R/W |
|---|---|---|---|---|---|
| Cloud data disk (SATA) | 27.3k | 3431 | 22.7k | 607 | 559/565 |
| Physical disk (SAS×2, RAID1) | 89.5k | 90.8k | 5223 | 3904 | 1770/1773 |

Throughput, bs=1024k, iodepth=128, size=50G, ioengine=psync

| Disk | Seq. read | Seq. write | Rand. read | Rand. write | Mixed rand. R/W |
|---|---|---|---|---|---|
| iSCSI cloud disk (SATA) 600G | 840MB | 85MB | 830MB | 140MB | 135MB/140MB |
| iSCSI cloud disk (SATA) 4T | 840MB | 97MB | 870MB | 140MB | 135MB/140MB |
| Physical disk (SAS×2, RAID1) 600G | 520MB | 2650MB | 290MB | 150MB | 95MB/95MB |
| Physical disk (SATA×1, RAID0) 4T | 340MB | 140MB | 100MB | 120MB | 55MB/60MB |
| Physical disk (SSD×1, RAID0) 800G | 280MB | 230MB | 420MB | 350MB | 175MB/180MB |

IOPS, bs=4k, iodepth=128, size=50G, ioengine=psync

| Disk | Seq. read | Seq. write | Rand. read | Rand. write | Mixed rand. R/W |
|---|---|---|---|---|---|
| iSCSI cloud disk (SATA) 600G | 25.9k | 2663 | 20k | 799 | 602/609 |
| iSCSI cloud disk (SATA) 4T | 26k | 3167 | 25k | 828 | 600/600 |
| Physical disk (SAS×2, RAID1) 600G | 82.0k | 92.3k | 901 | 1232 | 685/694 |
| Physical disk (SATA×1, RAID0) 4T | 5.8k | 7k | 327 | 276 | 142/148 |
| Physical disk (SSD×1, RAID0) 800G | 47.9k | 6k | 46.4k | 39.6k | 20.2k/20.2k |
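As a rough consistency check, throughput and IOPS in the tables agree with throughput ≈ IOPS × block size. A small sketch verifying two of the cloud-disk sequential-read rows:

```python
def implied_throughput_mb(iops, bs_bytes):
    """Throughput (decimal MB/s) implied by an IOPS figure at a block size."""
    return iops * bs_bytes / 1e6

# bs=4k:  27.3k IOPS * 4 KiB  -> ~112 MB/s (table shows 112MB)
# bs=16k: 15.0k IOPS * 16 KiB -> ~246 MB/s (table shows 246MB)
print(round(implied_throughput_mb(27_300, 4096)),
      round(implied_throughput_mb(15_000, 16384)))  # 112 246
```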

openstack - bare metal Ironic provisioning and scheduling - 2

Creating a flavor for bare metal instances

Official documentation: https://docs.openstack.org/ironic/latest/install/configure-nova-flavors.html

In the Queens release, the Ironic project added the Trait API: a node's traits can be registered with the compute service's Placement API and used for scheduling when creating instances. With the Trait API, bare metal nodes enrolled in Ironic can also be registered in the Placement inventory, ultimately enabling scheduled bare metal deployment.

Here we use Placement to schedule bare metal nodes: a Resource Class identifies an ironic node's resource type, Resource Traits describe its characteristics, and resources:VCPU=0, resources:MEMORY_MB=0 and resources:DISK_GB=0 disable scheduling on the standard resources.

  • Create a Flavor
openstack flavor create --ram 262144 --vcpus 40 --disk 600 bms.test 
openstack flavor set --property resources:CUSTOM_BAREMETAL=1 bms.test
openstack flavor set --property resources:VCPU=0 bms.test
openstack flavor set --property resources:MEMORY_MB=0 bms.test
openstack flavor set --property resources:DISK_GB=0 bms.test
  • Verify: list Placement allocation candidates
[root@controller ~]# openstack allocation candidate list --resource VCPU=40 --resource DISK_GB=600 --resource MEMORY_MB=262144 --resource CUSTOM_BAREMETAL=1
+---+-----------------------------------------------------------+--------------------------------------+-------------------------------------------------------------------+------------------------------+
| # | allocation | resource provider | inventory used/capacity | traits |
+---+-----------------------------------------------------------+--------------------------------------+-------------------------------------------------------------------+------------------------------+
| 1 | MEMORY_MB=8192,VCPU=2,DISK_GB=100,CUSTOM_BAREMETAL_TEST=1 | e322f49a-ad50-468d-a031-29bde068c290 | VCPU=0/2,MEMORY_MB=0/8192,DISK_GB=0/100,CUSTOM_BAREMETAL_TEST=0/1 | HW_CPU_X86_VMX,CUSTOM_TRAIT1 |
+---+-----------------------------------------------------------+--------------------------------------+--------
[root@17471e0f8de8 sh1]# openstack allocation candidate list   --resource CUSTOM_BAREMETAL=1                                                             
+---+--------------------+--------------------------------------+-------------------------+---------------------------------------------+
| # | allocation | resource provider | inventory used/capacity | traits |
+---+--------------------+--------------------------------------+-------------------------+---------------------------------------------+
| 1 | CUSTOM_BAREMETAL=1 | 5e10c067-c0c6-48be-ac6e-749a70045b97 | CUSTOM_BAREMETAL=0/1 | HW_CPU_X86_VMX,COMPUTE_NET_ATTACH_INTERFACE |
+---+--------------------+--------------------------------------+-------------------------+---------------------------------------------+

openstack - bare metal Ironic provisioning and scheduling - 1

Enrolling a bare metal node

#openstack baremetal node create --driver ipmi --name BM01
+------------------------+--------------------------------------+
| Field | Value |
+------------------------+--------------------------------------+
| allocation_uuid | None |
| automated_clean | None |
| bios_interface | no-bios |
| boot_interface | pxe |
| chassis_uuid | None |
| clean_step | {} |
| conductor | openstack07.control |
| conductor_group | |
| console_enabled | False |
| console_interface | ipmitool-socat |
| created_at | 2021-04-25T06:27:44+00:00 |
| deploy_interface | iscsi |
| deploy_step | {} |
| description | None |
| driver | ipmi |
| driver_info | {} |
| driver_internal_info | {} |
| extra | {} |
| fault | None |
| inspect_interface | no-inspect |
| inspection_finished_at | None |
| inspection_started_at | None |
| instance_info | {} |
| instance_uuid | None |
| last_error | None |
| maintenance | False |
| maintenance_reason | None |
| management_interface | ipmitool |
| name | BM01 |
| network_interface | neutron |
| owner | None |
| power_interface | ipmitool |
| power_state | None |
| properties | {} |
| protected | False |
| protected_reason | None |
| provision_state | enroll |
| provision_updated_at | None |
| raid_config | {} |
| raid_interface | no-raid |
| rescue_interface | no-rescue |
| reservation | None |
| resource_class | None |
| storage_interface | noop |
| target_power_state | None |
| target_provision_state | None |
| target_raid_config | {} |
| traits | [] |
| updated_at | None |
| uuid | 5e10c067-c0c6-48be-ac6e-749a70045b97 |
| vendor_interface | ipmitool |
+------------------------+--------------------------------------+

At this point many of the ironic node fields are still None; they will be filled in below.

  • Set the deploy interface type. iSCSI, Direct, Ansible and other types are supported, each with a different behavior model; choose according to your environment. Here we choose direct rather than the iscsi type, which consumes Provisioning Network bandwidth.
openstack baremetal  node set 5e10c067-c0c6-48be-ac6e-749a70045b97 \
--deploy-interface direct
  • Set driver_info, i.e. the IPMI info: mainly the IPMI login credentials
openstack baremetal node set 5e10c067-c0c6-48be-ac6e-749a70045b97 \
--driver-info ipmi_username=admin \
--driver-info ipmi_password=admin \
--driver-info ipmi_address=172.18.22.106 \
--driver-info ipmi_port=623

NOTE: see the official IPMI driver documentation.

  • Set the deploy images, booted as a ramdisk
openstack image list

openstack baremetal node set 5e10c067-c0c6-48be-ac6e-749a70045b97 \
--driver-info deploy_kernel=07852c3c-f4b4-43ab-b6b3-aeafe12c07d4 \
--driver-info deploy_ramdisk=e3429cf3-8fed-4ff8-afb0-50056aacc8f0
  • 设置 Provisioning/Cleaning Network
neutron net-create ironic_deploy --shared  --provider:network_type flat --provider:physical_network physnet1

neutron subnet-create --name flat-subnet --gateway 192.168.99.1 --dns-nameserver 8.8.8.8 --allocation-pool start=192.168.99.120,end=192.168.99.250 ironic_deploy 192.168.99.0/24

openstack baremetal node set 5e10c067-c0c6-48be-ac6e-749a70045b97 \
--driver-info cleaning_network=7fa970fd-60c7-4f7e-83a2-cd611470dfc6 \
--driver-info provisioning_network=7fa970fd-60c7-4f7e-83a2-cd611470dfc6
  • Set the PXE NIC MAC address of the ironic node; this MAC is used to assign it an IP address on the Provisioning Network
openstack baremetal port create 78:ac:44:04:f1:a9 --node 5e10c067-c0c6-48be-ac6e-749a70045b97

NOTE: after the bare metal instance is deployed successfully, the PXE NIC's MAC address is bound to the corresponding Tenant Network port.

  • Set the Resource Class that Placement uses to filter candidates; nova-compute for Ironic automatically creates a Placement Resource Provider for the node
$ openstack baremetal node set 5e10c067-c0c6-48be-ac6e-749a70045b97 --resource-class BAREMETAL

$ openstack resource provider list
+--------------------------------------+--------------------------------------+------------+--------------------------------------+----------------------+
| uuid | name | generation | root_provider_uuid | parent_provider_uuid |
+--------------------------------------+--------------------------------------+------------+--------------------------------------+----------------------+
| f00d6063-1253-4103-8f4d-899ea700936f | openstack08.control | 5 | f00d6063-1253-4103-8f4d-899ea700936f | None |
| 0cbb35fe-44b7-4fe7-a93f-41ffacaf6a0a | openstack07.control | 5 | 0cbb35fe-44b7-4fe7-a93f-41ffacaf6a0a | None |
| 9ee63716-854f-4554-8211-9de6516e3aa3 | 9ee63716-854f-4554-8211-9de6516e3aa3 | 3 | 9ee63716-854f-4554-8211-9de6516e3aa3 | None |
| 4a22be4a-fd35-43d8-842e-6ac7db30bd4e | 4a22be4a-fd35-43d8-842e-6ac7db30bd4e | 0 | 4a22be4a-fd35-43d8-842e-6ac7db30bd4e | None |
+--------------------------------------+--------------------------------------+------------+--------------------------------------+----------------------+

$ openstack resource provider inventory list 5e10c067-c0c6-48be-ac6e-749a70045b97
+------------------+------------------+----------+----------+----------+-----------+-------+
| resource_class | allocation_ratio | min_unit | max_unit | reserved | step_size | total |
+------------------+------------------+----------+----------+----------+-----------+-------+
| CUSTOM_BAREMETAL | 1.0 | 1 | 1 | 1 | 1 | 1 |
+------------------+------------------+----------+----------+----------+-----------+-------+
  • Add the Resource Traits that Placement uses to filter candidates

openstack baremetal node add trait 5e10c067-c0c6-48be-ac6e-749a70045b97 \
HW_CPU_X86_VMX

[root@controller ~]# openstack resource provider trait list e322f49a-ad50-468d-a031-29bde068c290
+----------------+
| name |
+----------------+
| HW_CPU_X86_VMX |
| CUSTOM_TRAIT1 |
+----------------+

NOTE: this operation requires a recent Placement API version (>= 1.17).

  • Set the ironic node's basic resource info, used as additional factors when Placement filters candidates
openstack baremetal node set 5e10c067-c0c6-48be-ac6e-749a70045b97 \
--property cpus=40 \
--property memory_mb=262144 \
--property local_gb=600
  • If the bare metal server is configured for UEFI, set the ironic node's boot mode. (Ignore this for legacy BIOS!)
openstack baremetal node set 5e10c067-c0c6-48be-ac6e-749a70045b97 --property capabilities='boot_mode:uefi'
  • Validate that the ironic node info entered above is acceptable
$ openstack baremetal node validate e322f49a-ad50-468d-a031-29bde068c290
+------------+--------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Interface | Result | Reason |
+------------+--------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| bios | False | Driver ipmi does not support bios (disabled or not implemented). |
| boot | False | Cannot validate image information for node e322f49a-ad50-468d-a031-29bde068c290 because one or more parameters are missing from its instance_info and insufficent information is present to boot from a remote volume. Missing are: ['ramdisk', 'kernel', 'image_source'] |
| console | False | Missing 'ipmi_terminal_port' parameter in node's driver_info. |
| deploy | False | Cannot validate image information for node e322f49a-ad50-468d-a031-29bde068c290 because one or more parameters are missing from its instance_info and insufficent information is present to boot from a remote volume. Missing are: ['ramdisk', 'kernel', 'image_source'] |
| inspect | True | |
| management | True | |
| network | True | |
| power | True | |
| raid | True | |
| rescue | False | Driver ipmi does not support rescue (disabled or not implemented). |
| storage | True | |
+------------+--------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

NOTE: there are 4 False results here, but none are fatal. bios and console fail because we did not provide the corresponding drivers, and both are optional; boot and deploy cannot pass validation in a Nova-driven Ironic environment, since Nova fills in instance_info only at boot time.

  • Verify that the ironic node can be brought under management
# To move a node from enroll to manageable provision state
$ openstack baremetal node manage 5e10c067-c0c6-48be-ac6e-749a70045b97
$ openstack baremetal node show 5e10c067-c0c6-48be-ac6e-749a70045b97 | grep provision_state
| provision_state | manageable

# To move a node from manageable to available provision state
$ openstack baremetal node provide 5e10c067-c0c6-48be-ac6e-749a70045b97
$ openstack baremetal node show 5e10c067-c0c6-48be-ac6e-749a70045b97 | grep provision_state
| provision_state | available
  • View the current state of the ironic node
[root@controller ~]# openstack baremetal node show 5e10c067-c0c6-48be-ac6e-749a70045b97
+------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Field | Value |
+------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| allocation_uuid | None |
| automated_clean | None |
| bios_interface | no-bios |
| boot_interface | pxe |
| chassis_uuid | None |
| clean_step | {} |
| conductor | openstack08.control |
| conductor_group | |
| console_enabled | False |
| console_interface | ipmitool-socat |
| created_at | 2021-04-25T10:13:09+00:00 |
| deploy_interface | direct |
| deploy_step | {} |
| description | None |
| driver | ipmi |
| driver_info | {'ipmi_port': 623, 'ipmi_username': 'admin', 'deploy_kernel': '07852c3c-f4b4-43ab-b6b3-aeafe12c07d4', 'ipmi_address': '172.18.22.106', 'deploy_ramdisk': 'e3429cf3-8fed-4ff8-afb0-50056aacc8f0', 'ipmi_password': '******', 'provisioning_network': '7fa970fd-60c7-4f7e-83a2-cd611470dfc6', 'cleaning_network': '7fa970fd-60c7-4f7e-83a2-cd611470dfc6'} |
| driver_internal_info | {} |
| extra | {} |
| fault | None |
| inspect_interface | no-inspect |
| inspection_finished_at | None |
| inspection_started_at | None |
| instance_info | {} |
| instance_uuid | None |
| last_error | None |
| maintenance | False |
| maintenance_reason | None |
| management_interface | ipmitool |
| name | BM01 |
| network_interface | neutron |
| owner | None |
| power_interface | ipmitool |
| power_state | power on |
| properties | {} |
| protected | False |
| protected_reason | None |
| provision_state | available |
| provision_updated_at | 2021-04-25T10:30:35+00:00 |
| raid_config | {} |
| raid_interface | no-raid |
| rescue_interface | no-rescue |
| reservation | None |
| resource_class | BAREMETAL |
| storage_interface | noop |
| target_power_state | None |
| target_provision_state | None |
| target_raid_config | {} |
| traits | ['HW_CPU_X86_VMX'] |
| updated_at | 2021-04-25T10:30:35+00:00 |
| uuid | 5e10c067-c0c6-48be-ac6e-749a70045b97 |
| vendor_interface | ipmitool |
+------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  • Verify that an Ironic Neutron Agent was added automatically
[root@baremetal ~]# openstack network agent list
2ecbcf17-b11b-4e6a-87b5-349dde0fcee3 | Baremetal Node | 5e10c067-c0c6-48be-ac6e-749a70045b97 | None | :-) | UP | ironic-neutron-agent

Reference: https://docs.openstack.org/ironic/stein/admin/boot-from-volume.html#storage-interface

openstack - bare metal Ironic notes - 7

PXE overview

PXE is an industry standard describing the client-server interaction of network boot software, using the DHCP and TFTP protocols. This guide describes one way to install Clear Linux OS using a PXE environment.

The PXE extension known as iPXE adds support for protocols such as HTTP, iSCSI, AoE and FCoE. iPXE makes network boot possible on machines without built-in PXE support.

To install Clear Linux OS via iPXE, a PXE client must be created. Figure 1 shows the information flow between the PXE server and the PXE client.

PXE information flow

Now look at the Ironic dnsmasq configuration:

#cat /etc/kolla/ironic-dnsmasq/dnsmasq.conf

port=0
interface=ironic_port
dhcp-range=172.66.0.200,172.66.0.250
dhcp-option=option:tftp-server,172.66.0.235
dhcp-option=option:server-ip-address,172.66.0.235
bind-interfaces
dhcp-sequential-ip
dhcp-option=210,/tftpboot/
dhcp-match=ipxe,175
dhcp-match=set:efi,option:client-arch,7
dhcp-match=set:efi,option:client-arch,9
# Client is already running iPXE; move to next stage of chainloading
dhcp-option=tag:ipxe,option:bootfile-name,http://10.145.69.8:8089/inspector.ipxe
# Client is PXE booting over EFI without iPXE ROM,
# send EFI version of iPXE chainloader
dhcp-option=tag:efi,tag:!ipxe,option:bootfile-name,ipxe.efi
dhcp-option=option:bootfile-name,pxelinux.0
dhcp-hostsdir=/etc/dnsmasq/dhcp-hostsdir


openstack - bare metal Ironic notes - 6

Direct Deploy UML

With the direct deploy interface, the deploy ramdisk fetches the image from an HTTP location. It can be an object storage (Swift or RadosGW) temporary URL or a user-provided HTTP URL. The deploy ramdisk then copies the image to the target disk.

openstack baremetal node create --driver ipmi --deploy-interface direct
openstack baremetal node set <NODE> --deploy-interface direct


  1. Join the Provision Network.
  2. Reboot the bare metal node.
  3. IPA looks up its ironic_node_uuid from the Ironic conductor.
  4. IPA talks to Ironic via heartbeats. The deploy flow makes ironic-conductor send a prepare_image command to IPA, which downloads the user image directly and writes it to the local disk. Writing the image is time-consuming; a 300GB image typically takes around ten minutes.
  5. When ironic-conductor receives a heartbeat (an RPC message relayed from ironic-api), it reacts according to the current state. The first heartbeat triggers ironic-conductor to send prepare_image to IPA; subsequent heartbeats make ironic-conductor query IPA's command status, to monitor whether prepare_image (or another command) is still running. Once the command status shows completion, deployment continues: set the server to boot from disk, power off, switch networks, power on, until the deploy flow finishes.
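The heartbeat-driven control loop in steps 4 and 5 can be sketched as follows. FakeAgent stands in for IPA, and all names here are illustrative, not Ironic's real classes:

```python
class FakeAgent:
    """Stand-in for IPA: accepts prepare_image and reports command status."""
    def __init__(self, ticks_to_finish=3):
        self._remaining = None          # None until prepare_image is issued
        self._ticks = ticks_to_finish

    def prepare_image(self, image_url):
        self._remaining = self._ticks   # start "writing" the image

    def command_status(self):
        if self._remaining is None:
            return "NO_COMMAND"
        if self._remaining > 0:
            self._remaining -= 1
            return "RUNNING"
        return "SUCCEEDED"

def on_heartbeat(agent, node):
    """Conductor side: the first heartbeat issues prepare_image, later ones
    poll command status; on success the deploy continues (boot from disk,
    power off, switch to tenant network, power on)."""
    status = agent.command_status()
    if status == "NO_COMMAND":
        agent.prepare_image(node["image_url"])
        return "image_write_started"
    if status == "RUNNING":
        return "waiting"
    return "continue_deploy"
```

The key point mirrored here is that the conductor never blocks on the long image write; it only reacts to each heartbeat, which is why a slow prepare_image simply shows up as a series of "still running" polls.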

mysql - common troubleshooting commands

A collection of commonly used MySQL troubleshooting commands.

mysql 8.0

## All currently running transactions
mysql> select * from information_schema.innodb_trx\G
## Check row-lock usage
mysql> show status like 'innodb_row_lock_%';
+-------------------------------+--------+
| Variable_name | Value |
+-------------------------------+--------+
| Innodb_row_lock_current_waits | 1 |
| Innodb_row_lock_time | 479764 |
| Innodb_row_lock_time_avg | 39980 |
| Innodb_row_lock_time_max | 51021 |
| Innodb_row_lock_waits | 12 |
+-------------------------------+--------+
5 rows in set (0.00 sec)

Explanation:
Innodb_row_lock_current_waits : number of lock waits currently in progress
Innodb_row_lock_time : total time (ms) spent waiting for row locks since startup
Innodb_row_lock_time_avg : average wait time per lock
Innodb_row_lock_time_max : longest single lock wait
Innodb_row_lock_waits : total number of lock waits since startup

# Check whether any table is locked
mysql> show OPEN TABLES where In_use > 0;
+----------+-------+--------+-------------+
| Database | Table | In_use | Name_locked |
+----------+-------+--------+-------------+
| test | tx1 | 1 | 0 |
+----------+-------+--------+-------------+
1 row in set (0.00 sec)
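The averages in the lock counters above can be cross-checked directly, since Innodb_row_lock_time_avg ≈ Innodb_row_lock_time / Innodb_row_lock_waits:

```python
# Values from the SHOW STATUS output above (times are in milliseconds).
row_lock_time = 479764
row_lock_waits = 12

avg_ms = row_lock_time // row_lock_waits
print(avg_ms)  # 39980, matching Innodb_row_lock_time_avg
```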

go - maps must be initialized before assignment: assignment to entry in nil map

Note this nested map form: make only initializes the outer map[string]T part (where T is map[int]int), so the assignment below fails:

Wrong:

test := make(map[string]map[int]int)
test["go"][0] = 0 // panic: assignment to entry in nil map

Correct:


test := make(map[string]map[int]int)
test["go"] = make(map[int]int)
test["go"][0] = 0

Idiomatic pattern:

test := make(map[string]map[int]int)
if test["go"] == nil {
    test["go"] = make(map[int]int)
}
test["go"][0] = 0