QEMU Source Code Analysis, Part 1

Based on QEMU 9.0.0

Introduction

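QEMU is an open-source machine emulator and virtualizer. It can emulate a wide range of hardware devices and supports several accelerators, such as TCG, Xen, and KVM.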

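TCG (Tiny Code Generator) is QEMU's dynamic binary translator. It translates guest machine instructions into an intermediate representation (TCG ops) and then into host machine code, which lets a guest built for one architecture run on a host of another. Translated code generally runs slower than code executed with hardware-assisted virtualization. Besides translating, TCG also applies simple optimizations to the intermediate code, such as constant folding and dead code elimination.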

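Xen is an open-source hypervisor. It is not part of the Linux kernel: it runs directly on the hardware, and a privileged guest (dom0, typically Linux) manages the other guests, with QEMU providing device emulation for them. Because Xen controls the hardware directly, it can deliver high-performance virtualization, although it can be complex to configure and manage. Xen also supports paravirtualization, in which the guest operating system is modified to cooperate with the hypervisor instead of relying on fully emulated hardware, which improves performance.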

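KVM is the accelerator most commonly used with QEMU. It relies on the virtualization support built into the Linux kernel, and its advantages are good performance and broad guest operating system support. Note that, because KVM depends on the Linux kernel, it is only available on Linux hosts.

In QEMU, bringing up a KVM guest involves the following main steps: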

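**Load the hypervisor module:** First, the KVM kernel module must be loaded so that virtualization is enabled in the kernel. This usually happens at system boot.

**Create the virtual machine:** Next, a new virtual machine instance is created with the QEMU command line or API. The configuration, such as memory size and number of CPUs, is specified at this point.

**Allocate resources:** Once the virtual machine exists, it is given the resources it needs: CPUs, memory, and devices. These are backed by the physical hardware and mapped into the virtual machine by the virtualization layer.

**Start the virtual machine:** When the resources are in place, the virtual machine can be started. From here on KVM takes over execution of the guest and isolates it from the physical hardware.

**Run the guest operating system:** The guest operating system now runs inside the virtual machine as if it were running directly on physical hardware.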

The related source code is in:

target/$(arch)/kvm.c(tcg/)

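QEMU can emulate several hundred devices:

All machine types supported by QEMU

Devices that QEMU can emulate

QEMU's device emulation separates the front end from the back end:

        Front end:
                QEMU virtual machine manager: manages virtual machine instances and provides the user interface.
                ARM Virtualization Extensions (VE): provide virtualization support on ARM processors.
        Back end:
                ARM CPU model: emulates an ARM processor, including its instruction set, registers, and memory management unit (MMU).
                ARM virtual I/O device models: emulate the generic virtual I/O devices of the ARM architecture, for example:
                        virtio-blk: virtual block device
                        virtio-net: virtual network interface
                        virtio-serial: virtual serial port

Note that QEMU's own documentation uses these terms more narrowly: the front end is the device as it is presented to the guest (virtio-blk, virtio-net, and so on), and the back end is the host-side resource that handles the device's data, such as a disk image, a network interface, or a character device.

How to check which back ends are supported (character and network):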

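Analysis of the QEMU Initialization Process

select_machine function (selects the machine type)

/system/vl.c

This function selects the machine type to run. It takes the machine type from the command-line options, or falls back to the default machine, and returns the MachineClass of the selected machine.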

static MachineClass *select_machine(QDict *qdict, Error **errp)
{
    const char *machine_type = qdict_get_try_str(qdict, "type");
    GSList *machines = object_class_get_list(TYPE_MACHINE, false);
    MachineClass *machine_class;
    Error *local_err = NULL;

    if (machine_type) {
        machine_class = find_machine(machine_type, machines);
        qdict_del(qdict, "type");
        if (!machine_class) {
            error_setg(&local_err, "unsupported machine type");
        }
    } else {
        machine_class = find_default_machine(machines);
        if (!machine_class) {
            error_setg(&local_err, "No machine specified, and there is no default");
        }
    }

    g_slist_free(machines);
    if (local_err) {
        error_append_hint(&local_err, "Use -machine help to list supported machines\n");
        error_propagate(errp, local_err);
    }
    return machine_class;
}
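The selection logic above can be modelled outside QEMU. The following is only a minimal sketch: `MachineSketch` and `select_machine_sketch` are hypothetical names, the machine list is a plain array instead of a GSList of QOM classes, and only exact name matches are considered.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for MachineClass; only what this sketch needs. */
typedef struct {
    const char *name;
    int is_default;
} MachineSketch;

/*
 * Pick a machine by name, or the one flagged as default when no name
 * is requested. Returns NULL and sets *err when nothing matches.
 */
const MachineSketch *select_machine_sketch(const MachineSketch *machines,
                                           size_t count,
                                           const char *requested,
                                           const char **err)
{
    assert(machines != NULL || count == 0);
    *err = NULL;
    for (size_t i = 0; i < count; i++) {
        int match = requested ? strcmp(machines[i].name, requested) == 0
                              : machines[i].is_default;
        if (match) {
            return &machines[i];
        }
    }
    *err = requested ? "unsupported machine type"
                     : "No machine specified, and there is no default";
    return NULL;
}
```

Calling it with a requested name of NULL corresponds to starting QEMU without a machine type, in which case the default entry is returned.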

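cpu_exec_init_all (initializes the execution engine for all CPUs)

io_mem_init function

This function initializes the I/O memory region io_mem_unassigned. As the code shows, it does so by calling memory_region_init_io().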

static void io_mem_init(void)
{
    memory_region_init_io(&io_mem_unassigned, NULL, &unassigned_mem_ops, NULL,
                          NULL, UINT64_MAX);
}

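memory_region_init_io function

  1. Calls memory_region_init to initialize the common part of the memory region.
  2. Sets the region's operation table. If no table is given, the default unassigned_mem_ops is used.
  3. Sets the region's opaque data pointer, which is passed back to the callbacks.
  4. Marks the region as terminating. A terminating region is a leaf: address resolution stops at it and accesses are handled by the region itself, rather than being passed on to subregions as in a pure container.

/system/memory.c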

void memory_region_init_io(MemoryRegion *mr,
                           Object *owner,
                           const MemoryRegionOps *ops,
                           void *opaque,
                           const char *name,
                           uint64_t size)
{
    memory_region_init(mr, owner, name, size);
    mr->ops = ops ? ops : &unassigned_mem_ops;
    mr->opaque = opaque;
    mr->terminates = true;
}
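To illustrate the "ops or default" pattern, here is a small self-contained model. It is a sketch under assumptions: `RegionSketch`, `RegionOpsSketch`, and the toy register device are hypothetical, and the fallback handlers simply read as 0 and ignore writes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, much reduced version of MemoryRegionOps. */
typedef struct {
    uint64_t (*read)(void *opaque, uint64_t addr);
    void (*write)(void *opaque, uint64_t addr, uint64_t val);
} RegionOpsSketch;

/* Hypothetical, much reduced version of MemoryRegion. */
typedef struct {
    const RegionOpsSketch *ops;
    void *opaque;
    uint64_t size;
    int terminates;
} RegionSketch;

/* Fallback handlers for regions without a device behind them. */
uint64_t unassigned_read_sketch(void *opaque, uint64_t addr)
{
    (void)opaque; (void)addr;
    return 0;
}

void unassigned_write_sketch(void *opaque, uint64_t addr, uint64_t val)
{
    (void)opaque; (void)addr; (void)val;
}

const RegionOpsSketch unassigned_ops_sketch = {
    unassigned_read_sketch, unassigned_write_sketch
};

void region_init_io_sketch(RegionSketch *mr, const RegionOpsSketch *ops,
                           void *opaque, uint64_t size)
{
    assert(mr != NULL);
    mr->ops = ops ? ops : &unassigned_ops_sketch; /* default when NULL */
    mr->opaque = opaque;                          /* handed back to callbacks */
    mr->size = size;
    mr->terminates = 1;                           /* leaf: handles accesses itself */
}

/* Toy device: a single 64-bit register stored behind the opaque pointer. */
uint64_t reg_read_sketch(void *opaque, uint64_t addr)
{
    (void)addr;
    return *(uint64_t *)opaque;
}

void reg_write_sketch(void *opaque, uint64_t addr, uint64_t val)
{
    (void)addr;
    *(uint64_t *)opaque = val;
}

const RegionOpsSketch reg_ops_sketch = { reg_read_sketch, reg_write_sketch };
```

A device passes its own state as the opaque pointer, so one set of callbacks can serve any number of device instances.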

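memory_region_init function

Initializes a MemoryRegion structure by calling object_initialize and then memory_region_do_init.

object_initialize initializes an object in memory that the caller has already allocated (here, the MemoryRegion passed in as mr). It does the following:

  1. Clears the caller-provided memory. It does not allocate memory itself; object_new is the allocating variant.
  2. Sets the object's type (class).
  3. Sets up the object's properties.
  4. Calls the instance init functions of the type and of its parent types, if they exist.

/system/memory.c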

void memory_region_init(MemoryRegion *mr,
                        Object *owner,
                        const char *name,
                        uint64_t size)
{
    object_initialize(mr, sizeof(*mr), TYPE_MEMORY_REGION);
    memory_region_do_init(mr, owner, name, size);
}

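memory_region_do_init function

It does the following:

  1. Sets the size of the memory region. The size is stored as a 128-bit integer; if the caller passes UINT64_MAX, the size is set to 2^64 (int128_2_64()), meaning the whole 64-bit address space.
  2. Sets the name of the memory region.
  3. Sets the owner object of the memory region.
  4. Sets the device state pointer of the memory region (non-NULL only if the owner is a device).
  5. Sets the region's RAM block to NULL.
  6. If the region has a name, adds it as a child property named "name[*]" of its owner, or of the /unattached container when there is no owner, and then drops the initial reference so that the parent holds the only one.

static void memory_region_do_init(MemoryRegion *mr,
                                  Object *owner,
                                  const char *name,
                                  uint64_t size)
{
    mr->size = int128_make64(size);
    if (size == UINT64_MAX) {
        mr->size = int128_2_64();
    }
    mr->name = g_strdup(name);
    mr->owner = owner;
    mr->dev = (DeviceState *) object_dynamic_cast(mr->owner, TYPE_DEVICE);
    mr->ram_block = NULL;

    if (name) {
        char *escaped_name = memory_region_escape_name(name);
        char *name_array = g_strdup_printf("%s[*]", escaped_name);

        if (!owner) {
            owner = container_get(qdev_get_machine(), "/unattached");
        }

        object_property_add_child(owner, name_array, OBJECT(mr));
        object_unref(OBJECT(mr));
        g_free(name_array);
        g_free(escaped_name);
    }
}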
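The reason for the special case is that 2^64 does not fit in a uint64_t, so UINT64_MAX is used as a sentinel and the real size is kept in a wider type. A minimal sketch of that conversion follows; `Size128Sketch` is a hypothetical two-word stand-in for QEMU's Int128, not its actual definition.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 128-bit value: hi holds bits 64..127, lo holds bits 0..63. */
typedef struct {
    uint64_t lo;
    int64_t hi;
} Size128Sketch;

/* Convert the uint64_t size argument into the stored 128-bit size. */
Size128Sketch region_size_sketch(uint64_t size)
{
    Size128Sketch s = { size, 0 };

    if (size == UINT64_MAX) {
        /* Sentinel for "the whole 64-bit address space": exactly 2^64. */
        s.lo = 0;
        s.hi = 1;
    }
    /* Result is either an ordinary 64-bit size or exactly 2^64. */
    assert(s.hi == 0 || s.lo == 0);
    return s;
}
```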

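Additional notes:

MemoryRegion is the abstract data structure QEMU uses to represent a region of memory. It provides a uniform interface for accessing and manipulating different kinds of memory, such as physical RAM, I/O memory, and device memory. You can think of a MemoryRegion as a block of memory in a computer: it has a name, a size, and an address. Data in the block is read and written through the MemoryRegion interface, and callbacks can be installed to handle accesses to it.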

/include/exec/memory.h

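memory_map_init function

  1. Allocate memory:

    Allocates the MemoryRegion structures for system memory and for I/O space.
  2. Initialize the memory regions:

    Initializes the system memory region with memory_region_init and the I/O space region with memory_region_init_io.
  3. Initialize the address spaces:

    Uses address_space_init to initialize the address spaces through which system memory and I/O space are accessed.

/system/physmem.c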

static void memory_map_init(void)
{
    system_memory = g_malloc(sizeof(*system_memory));

    memory_region_init(system_memory, NULL, "system", UINT64_MAX);
    address_space_init(&address_space_memory, system_memory, "memory");

    system_io = g_malloc(sizeof(*system_io));
    memory_region_init_io(system_io, NULL, &unassigned_io_ops, NULL, "io",
                          65536);
    address_space_init(&address_space_io, system_io, "I/O");
}

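Put simply: if QEMU is a city, the memory map is the city's map. memory_map_init draws this map, defining where the different districts (memory and I/O space) are and how large they are. system_memory and system_io are two containers that stand for the city's residential district (memory) and commercial district (I/O space). address_space_memory and address_space_io are two maps that show how to reach the residential district and the commercial district respectively.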
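The two address spaces differ mainly in how much they cover: the system memory root spans the whole 64-bit range, while the I/O root is 65536 bytes. The following sketch only models that range check; `AddressSpaceSketch` and `address_in_space_sketch` are hypothetical and ignore subregions entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical address space: a name plus the last valid address. */
typedef struct {
    const char *name;
    uint64_t last_addr;
} AddressSpaceSketch;

/* Roots sized like the ones set up in memory_map_init. */
const AddressSpaceSketch memory_space_sketch = { "memory", UINT64_MAX };
const AddressSpaceSketch io_space_sketch = { "I/O", 65536 - 1 };

/* Returns nonzero when addr falls inside the address space. */
int address_in_space_sketch(const AddressSpaceSketch *as, uint64_t addr)
{
    assert(as != NULL);
    return addr <= as->last_addr;
}
```

The same numeric address can therefore be valid in one space and out of range in the other, which is why every access names the address space it goes through.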

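page_size_init (initializes the page size)

configure_accelerators (configures the accelerators)

accel_init_machine function

  • Associates the accelerator with the virtual machine
  • Calls the accelerator's init_machine function to perform accelerator-specific initialization
  • On failure, undoes the association and releases the accelerator; on success, applies the accelerator's compat properties

/accel/accel-system.c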

int accel_init_machine(AccelState *accel, MachineState *ms)
{
    AccelClass *acc = ACCEL_GET_CLASS(accel);
    int ret;

    ms->accelerator = accel;
    *(acc->allowed) = true;
    ret = acc->init_machine(ms);
    if (ret < 0) {
        ms->accelerator = NULL;
        *(acc->allowed) = false;
        object_unref(OBJECT(accel));
    } else {
        object_set_accelerator_compat_props(acc->compat_props);
    }
    return ret;
}
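The function follows a "commit first, roll back on failure" pattern. Below is a reduced model of it, assuming hypothetical `AccelSketch` and `MachineStateSketch` types and leaving out reference counting and compat properties.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical accelerator: an "allowed" flag and an init hook. */
typedef struct {
    const char *name;
    bool allowed;
    int (*init_machine)(void);
} AccelSketch;

/* Hypothetical machine state: just the accelerator link. */
typedef struct {
    AccelSketch *accelerator;
} MachineStateSketch;

int accel_init_sketch(AccelSketch *accel, MachineStateSketch *ms)
{
    int ret;

    assert(accel != NULL && ms != NULL);
    ms->accelerator = accel;      /* tentatively attach */
    accel->allowed = true;
    ret = accel->init_machine();
    if (ret < 0) {
        ms->accelerator = NULL;   /* roll back on failure */
        accel->allowed = false;
    }
    return ret;
}

/* Two toy init hooks used to exercise both paths. */
int init_ok_sketch(void) { return 0; }
int init_fail_sketch(void) { return -1; }
```

Because a failed attempt leaves the machine state untouched, the caller is free to try another accelerator afterwards.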

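machine_run_board_init function (initializes the machine)

machine_run_board_init is responsible for initializing the virtual machine's hardware platform.

  • Checks that the virtual machine's memory size is valid
  • Creates the default memory backend (if needed)
  • Completes the NUMA configuration
  • Creates the virtual machine's RAM
  • Checks that the CPU type is supported
  • Initializes the accelerator interfaces
  • Calls the machine class's init function
  • Advances the machine lifecycle to the PHASE_MACHINE_INITIALIZED phase

/hw/core/machine.c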

void machine_run_board_init(MachineState *machine, const char *mem_path, Error **errp)
{
    ERRP_GUARD();
    MachineClass *machine_class = MACHINE_GET_CLASS(machine);

    /* This checkpoint is required by replay to separate prior clock
       reading from the other reads, because timer polling functions query
       clock values from the log. */
    replay_checkpoint(CHECKPOINT_INIT);

    if (!xen_enabled()) {
        /* On 32-bit hosts, QEMU is limited by virtual address space */
        if (machine->ram_size > (2047 << 20) && HOST_LONG_BITS == 32) {
            error_setg(errp, "at most 2047 MB RAM can be simulated");
            return;
        }
    }

    if (machine->memdev) {
        ram_addr_t backend_size = object_property_get_uint(OBJECT(machine->memdev),
                                                           "size",  &error_abort);
        if (backend_size != machine->ram_size) {
            error_setg(errp, "Machine memory size does not match the size of the memory backend");
            return;
        }
    } else if (machine_class->default_ram_id && machine->ram_size &&
               numa_uses_legacy_mem()) {
        if (object_property_find(object_get_objects_root(),
                                 machine_class->default_ram_id)) {
            error_setg(errp, "object's id '%s' is reserved for the default"
                " RAM backend, it can't be used for any other purposes",
                machine_class->default_ram_id);
            error_append_hint(errp,
                "Change the object's 'id' to something else or disable"
                " automatic creation of the default RAM backend by setting"
                " 'memory-backend=%s' with '-machine'.\n",
                machine_class->default_ram_id);
            return;
        }
        if (!create_default_memdev(current_machine, mem_path, errp)) {
            return;
        }
    }

    if (machine->numa_state) {
        numa_complete_configuration(machine);
        if (machine->numa_state->num_nodes) {
            machine_numa_finish_cpu_init(machine);
            if (machine_class->cpu_cluster_has_numa_boundary) {
                validate_cpu_cluster_to_numa_boundary(machine);
            }
        }
    }

    if (!machine->ram && machine->memdev) {
        machine->ram = machine_consume_memdev(machine, machine->memdev);
    }

    /* Check if the CPU type is supported */
    if (machine->cpu_type && !is_cpu_type_supported(machine, errp)) {
        return;
    }

    if (machine->cgs) {
        /*
         * With confidential guests, the host can't see the real
         * contents of RAM, so there's no point in it trying to merge
         * areas.
         */
        machine_set_mem_merge(OBJECT(machine), false, &error_abort);

        /*
         * Virtio devices can't count on directly accessing guest
         * memory, so they need iommu_platform=on to use normal DMA
         * mechanisms.  That requires also disabling legacy virtio
         * support for those virtio pci devices which allow it.
         */
        object_register_sugar_prop(TYPE_VIRTIO_PCI, "disable-legacy",
                                   "on", true);
        object_register_sugar_prop(TYPE_VIRTIO_DEVICE, "iommu_platform",
                                   "on", false);
    }

    accel_init_interfaces(ACCEL_GET_CLASS(machine->accelerator));
    machine_class->init(machine);
    phase_advance(PHASE_MACHINE_INITIALIZED);
}

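pc_init1 function

This function performs the PC-specific setup, including creating the CPUs and the memory.

  1. Memory allocation and ROM/BIOS loading

    • Allocates RAM and loads the firmware from the ROM/BIOS image.
    • If Xen is enabled, uses the Xen-specific memory setup instead.
  2. PCI bus initialization (if PCI is enabled)

    • Creates the PCI host bridge and links it to system memory, I/O space, and PCI memory.
    • Sets the sizes of the memory below and above 4G and reads back the size of the 64-bit PCI hole.
    • Maps PCI devices to interrupt requests (IRQs).
  3. ISA bus initialization (if PCI is not enabled)

    • Creates the ISA bus and connects it to system memory and I/O space.
    • Registers the ISA bus input IRQs.
  4. Basic device initialization

    • Initializes the basic PC hardware, including:
      • Real-time clock (RTC)
      • Programmable interrupt controller (PIC)
      • Serial and parallel ports
      • Super I/O devices
  5. Network device initialization

    • Initializes the network devices according to the machine type.
  6. IDE device initialization (ISA IDE, when PCI is not enabled)

    • Initializes the IDE controllers and devices.
  7. ACPI initialization (if enabled)

    • Sets up the ACPI power management device and connects it to the SMBus and the SMI interrupt.
  8. NVDIMM initialization (if enabled)

    • Initializes the NVDIMM ACPI state so that it can interact with system I/O and the firmware configuration interface (fw_cfg).
  9. Other device initialization

    • Initializes the VGA controller.
    • Sets up the VMware port (vmport) according to the configuration.

/hw/i386/pc_piix.c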

/* PC hardware initialisation */
static void pc_init1(MachineState *machine, const char *pci_type)
{
    PCMachineState *pcms = PC_MACHINE(machine);
    PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
    X86MachineState *x86ms = X86_MACHINE(machine);
    MemoryRegion *system_memory = get_system_memory();
    MemoryRegion *system_io = get_system_io();
    Object *phb = NULL;
    ISABus *isa_bus;
    Object *piix4_pm = NULL;
    qemu_irq smi_irq;
    GSIState *gsi_state;
    MemoryRegion *ram_memory;
    MemoryRegion *pci_memory = NULL;
    MemoryRegion *rom_memory = system_memory;
    ram_addr_t lowmem;
    uint64_t hole64_size = 0;

    /*
     * Calculate ram split, for memory below and above 4G.  It's a bit
     * complicated for backward compatibility reasons ...
     *
     *  - Traditional split is 3.5G (lowmem = 0xe0000000).  This is the
     *    default value for max_ram_below_4g now.
     *
     *  - Then, to gigabyte align the memory, we move the split to 3G
     *    (lowmem = 0xc0000000).  But only in case we have to split in
     *    the first place, i.e. ram_size is larger than (traditional)
     *    lowmem.  And for new machine types (gigabyte_align = true)
     *    only, for live migration compatibility reasons.
     *
     *  - Next the max-ram-below-4g option was added, which allowed to
     *    reduce lowmem to a smaller value, to allow a larger PCI I/O
     *    window below 4G.  qemu doesn't enforce gigabyte alignment here,
     *    but prints a warning.
     *
     *  - Finally max-ram-below-4g got updated to also allow raising lowmem,
     *    so legacy non-PAE guests can get as much memory as possible in
     *    the 32bit address space below 4G.
     *
     *  - Note that Xen has its own ram setup code in xen_ram_init(),
     *    called via xen_hvm_init_pc().
     *
     * Examples:
     *    qemu -M pc-1.7 -m 4G    (old default)    -> 3584M low,  512M high
     *    qemu -M pc -m 4G        (new default)    -> 3072M low, 1024M high
     *    qemu -M pc,max-ram-below-4g=2G -m 4G     -> 2048M low, 2048M high
     *    qemu -M pc,max-ram-below-4g=4G -m 3968M  -> 3968M low (=4G-128M)
     */
    if (xen_enabled()) {
        xen_hvm_init_pc(pcms, &ram_memory);
    } else {
        ram_memory = machine->ram;
        if (!pcms->max_ram_below_4g) {
            pcms->max_ram_below_4g = 0xe0000000; /* default: 3.5G */
        }
        lowmem = pcms->max_ram_below_4g;
        if (machine->ram_size >= pcms->max_ram_below_4g) {
            if (pcmc->gigabyte_align) {
                if (lowmem > 0xc0000000) {
                    lowmem = 0xc0000000;
                }
                if (lowmem & (1 * GiB - 1)) {
                    warn_report("Large machine and max_ram_below_4g "
                                "(%" PRIu64 ") not a multiple of 1G; "
                                "possible bad performance.",
                                pcms->max_ram_below_4g);
                }
            }
        }

        if (machine->ram_size >= lowmem) {
            x86ms->above_4g_mem_size = machine->ram_size - lowmem;
            x86ms->below_4g_mem_size = lowmem;
        } else {
            x86ms->above_4g_mem_size = 0;
            x86ms->below_4g_mem_size = machine->ram_size;
        }
    }

    pc_machine_init_sgx_epc(pcms);
    x86_cpus_init(x86ms, pcmc->default_cpu_version);

    if (kvm_enabled()) {
        kvmclock_create(pcmc->kvmclock_create_always);
    }

    if (pcmc->pci_enabled) {
        pci_memory = g_new(MemoryRegion, 1);
        memory_region_init(pci_memory, NULL, "pci", UINT64_MAX);
        rom_memory = pci_memory;

        phb = OBJECT(qdev_new(TYPE_I440FX_PCI_HOST_BRIDGE));
        object_property_add_child(OBJECT(machine), "i440fx", phb);
        object_property_set_link(phb, PCI_HOST_PROP_RAM_MEM,
                                 OBJECT(ram_memory), &error_fatal);
        object_property_set_link(phb, PCI_HOST_PROP_PCI_MEM,
                                 OBJECT(pci_memory), &error_fatal);
        object_property_set_link(phb, PCI_HOST_PROP_SYSTEM_MEM,
                                 OBJECT(system_memory), &error_fatal);
        object_property_set_link(phb, PCI_HOST_PROP_IO_MEM,
                                 OBJECT(system_io), &error_fatal);
        object_property_set_uint(phb, PCI_HOST_BELOW_4G_MEM_SIZE,
                                 x86ms->below_4g_mem_size, &error_fatal);
        object_property_set_uint(phb, PCI_HOST_ABOVE_4G_MEM_SIZE,
                                 x86ms->above_4g_mem_size, &error_fatal);
        object_property_set_str(phb, I440FX_HOST_PROP_PCI_TYPE, pci_type,
                                &error_fatal);
        sysbus_realize_and_unref(SYS_BUS_DEVICE(phb), &error_fatal);

        pcms->pcibus = PCI_BUS(qdev_get_child_bus(DEVICE(phb), "pci.0"));
        pci_bus_map_irqs(pcms->pcibus,
                         xen_enabled() ? xen_pci_slot_get_pirq
                                       : pc_pci_slot_get_pirq);

        hole64_size = object_property_get_uint(phb,
                                               PCI_HOST_PROP_PCI_HOLE64_SIZE,
                                               &error_abort);
    }

    /* allocate ram and load rom/bios */
    if (!xen_enabled()) {
        pc_memory_init(pcms, system_memory, rom_memory, hole64_size);
    } else {
        assert(machine->ram_size == x86ms->below_4g_mem_size +
                                    x86ms->above_4g_mem_size);

        pc_system_flash_cleanup_unused(pcms);
        if (machine->kernel_filename != NULL) {
            /* For xen HVM direct kernel boot, load linux here */
            xen_load_linux(pcms);
        }
    }

    gsi_state = pc_gsi_create(&x86ms->gsi, pcmc->pci_enabled);

    if (pcmc->pci_enabled) {
        PCIDevice *pci_dev;
        DeviceState *dev;
        size_t i;

        pci_dev = pci_new_multifunction(-1, pcms->south_bridge);
        object_property_set_bool(OBJECT(pci_dev), "has-usb",
                                 machine_usb(machine), &error_abort);
        object_property_set_bool(OBJECT(pci_dev), "has-acpi",
                                 x86_machine_is_acpi_enabled(x86ms),
                                 &error_abort);
        object_property_set_bool(OBJECT(pci_dev), "has-pic", false,
                                 &error_abort);
        object_property_set_bool(OBJECT(pci_dev), "has-pit", false,
                                 &error_abort);
        qdev_prop_set_uint32(DEVICE(pci_dev), "smb_io_base", 0xb100);
        object_property_set_bool(OBJECT(pci_dev), "smm-enabled",
                                 x86_machine_is_smm_enabled(x86ms),
                                 &error_abort);
        dev = DEVICE(pci_dev);
        for (i = 0; i < ISA_NUM_IRQS; i++) {
            qdev_connect_gpio_out_named(dev, "isa-irqs", i, x86ms->gsi[i]);
        }
        pci_realize_and_unref(pci_dev, pcms->pcibus, &error_fatal);

        if (xen_enabled()) {
            pci_device_set_intx_routing_notifier(
                        pci_dev, piix_intx_routing_notifier_xen);

            /*
             * Xen supports additional interrupt routes from the PCI devices to
             * the IOAPIC: the four pins of each PCI device on the bus are also
             * connected to the IOAPIC directly.
             * These additional routes can be discovered through ACPI.
             */
            pci_bus_irqs(pcms->pcibus, xen_intx_set_irq, pci_dev,
                         XEN_IOAPIC_NUM_PIRQS);
        }

        isa_bus = ISA_BUS(qdev_get_child_bus(DEVICE(pci_dev), "isa.0"));
        x86ms->rtc = ISA_DEVICE(object_resolve_path_component(OBJECT(pci_dev),
                                                              "rtc"));
        piix4_pm = object_resolve_path_component(OBJECT(pci_dev), "pm");
        dev = DEVICE(object_resolve_path_component(OBJECT(pci_dev), "ide"));
        pci_ide_create_devs(PCI_DEVICE(dev));
        pcms->idebus[0] = qdev_get_child_bus(dev, "ide.0");
        pcms->idebus[1] = qdev_get_child_bus(dev, "ide.1");
    } else {
        isa_bus = isa_bus_new(NULL, system_memory, system_io,
                              &error_abort);
        isa_bus_register_input_irqs(isa_bus, x86ms->gsi);

        x86ms->rtc = isa_new(TYPE_MC146818_RTC);
        qdev_prop_set_int32(DEVICE(x86ms->rtc), "base_year", 2000);
        isa_realize_and_unref(x86ms->rtc, isa_bus, &error_fatal);

        i8257_dma_init(OBJECT(machine), isa_bus, 0);
        pcms->hpet_enabled = false;
    }

    if (x86ms->pic == ON_OFF_AUTO_ON || x86ms->pic == ON_OFF_AUTO_AUTO) {
        pc_i8259_create(isa_bus, gsi_state->i8259_irq);
    }

    if (phb) {
        ioapic_init_gsi(gsi_state, phb);
    }

    if (tcg_enabled()) {
        x86_register_ferr_irq(x86ms->gsi[13]);
    }

    pc_vga_init(isa_bus, pcmc->pci_enabled ? pcms->pcibus : NULL);

    assert(pcms->vmport != ON_OFF_AUTO__MAX);
    if (pcms->vmport == ON_OFF_AUTO_AUTO) {
        pcms->vmport = xen_enabled() ? ON_OFF_AUTO_OFF : ON_OFF_AUTO_ON;
    }

    /* init basic PC hardware */
    pc_basic_device_init(pcms, isa_bus, x86ms->gsi, x86ms->rtc, true,
                         0x4);

    pc_nic_init(pcmc, isa_bus, pcms->pcibus);

#ifdef CONFIG_IDE_ISA
    if (!pcmc->pci_enabled) {
        DriveInfo *hd[MAX_IDE_BUS * MAX_IDE_DEVS];
        int i;

        ide_drive_get(hd, ARRAY_SIZE(hd));
        for (i = 0; i < MAX_IDE_BUS; i++) {
            ISADevice *dev;
            char busname[] = "ide.0";
            dev = isa_ide_init(isa_bus, ide_iobase[i], ide_iobase2[i],
                               ide_irq[i],
                               hd[MAX_IDE_DEVS * i], hd[MAX_IDE_DEVS * i + 1]);
            /*
             * The ide bus name is ide.0 for the first bus and ide.1 for the
             * second one.
             */
            busname[4] = '0' + i;
            pcms->idebus[i] = qdev_get_child_bus(DEVICE(dev), busname);
        }
    }
#endif

    if (piix4_pm) {
        smi_irq = qemu_allocate_irq(pc_acpi_smi_interrupt, first_cpu, 0);

        qdev_connect_gpio_out_named(DEVICE(piix4_pm), "smi-irq", 0, smi_irq);
        pcms->smbus = I2C_BUS(qdev_get_child_bus(DEVICE(piix4_pm), "i2c"));
        /* TODO: Populate SPD eeprom data.  */
        smbus_eeprom_init(pcms->smbus, 8, NULL, 0);

        object_property_add_link(OBJECT(machine), PC_MACHINE_ACPI_DEVICE_PROP,
                                 TYPE_HOTPLUG_HANDLER,
                                 (Object **)&x86ms->acpi_dev,
                                 object_property_allow_set_link,
                                 OBJ_PROP_LINK_STRONG);
        object_property_set_link(OBJECT(machine), PC_MACHINE_ACPI_DEVICE_PROP,
                                 piix4_pm, &error_abort);
    }

    if (machine->nvdimms_state->is_enabled) {
        nvdimm_init_acpi_state(machine->nvdimms_state, system_io,
                               x86_nvdimm_acpi_dsmio,
                               x86ms->fw_cfg, OBJECT(pcms));
    }
}
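The RAM split at the top of pc_init1 is easier to follow as a standalone calculation. The sketch below reproduces only the non-Xen branch of that arithmetic; `RamSplitSketch` and `pc_ram_split_sketch` are hypothetical names, and the alignment warning is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type: RAM placed below and above the 4G boundary. */
typedef struct {
    uint64_t below_4g;
    uint64_t above_4g;
} RamSplitSketch;

/*
 * ram_size and max_ram_below_4g are in bytes; max_ram_below_4g == 0
 * means "not set", as in pc_init1.
 */
RamSplitSketch pc_ram_split_sketch(uint64_t ram_size,
                                   uint64_t max_ram_below_4g,
                                   bool gigabyte_align)
{
    RamSplitSketch split;
    uint64_t lowmem;

    if (max_ram_below_4g == 0) {
        max_ram_below_4g = 0xe0000000ULL;   /* default: 3.5G */
    }
    lowmem = max_ram_below_4g;

    /* Only newer machine types pull the split down to 3G. */
    if (ram_size >= max_ram_below_4g && gigabyte_align &&
        lowmem > 0xc0000000ULL) {
        lowmem = 0xc0000000ULL;
    }

    if (ram_size >= lowmem) {
        split.below_4g = lowmem;
        split.above_4g = ram_size - lowmem;
    } else {
        split.below_4g = ram_size;
        split.above_4g = 0;
    }
    assert(split.below_4g + split.above_4g == ram_size);
    return split;
}
```

Feeding it the four configurations listed in the source comment gives the same low/high values as the comment states, for example 3072M low and 1024M high for a 4G guest on a machine type with gigabyte_align set.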
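
Creating and initializing the CPUs
  1. pc_init1: performs the PC-specific setup, including creating the CPUs and the memory.
  2. x86_cpus_init: creates and initializes the CPUs according to the configuration.
  3. x86_cpu_new: creates a new X86CPU device.
  4. qdev_realize: sets the "realized" property through QOM's object property mechanism, which ends up calling device_set_realized.
  5. device_set_realized: calls the device's realize function and marks the device as realized.
  6. x86_cpu_realizefn: the realize function of the X86CPU device; it initializes the CPU's features, APIC, and execution state.

x86_cpus_init function
  1. Set the default CPU version.

  2. Compute the APIC ID limit (a bound chosen so that every CPU's APIC ID is below it).

  3. Check whether APIC IDs of 255 or higher can be supported (with KVM, an APIC ID limit above 255 requires both the in-kernel lapic and the X2APIC userspace API; otherwise QEMU exits with an error).

  4. Set KVM's maximum APIC ID (if KVM is enabled).

  5. Set the APIC's maximum APIC ID (if the irqchip is not in the kernel).

  6. Get the list of possible CPU architecture IDs from the machine class.

  7. Create the CPUs (a new CPU is created and initialized for each configured CPU).

/hw/i386/x86.c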

void x86_cpus_init(X86MachineState *x86ms, int default_cpu_version)
{
    int i;
    const CPUArchIdList *possible_cpus;
    MachineState *ms = MACHINE(x86ms);
    MachineClass *mc = MACHINE_GET_CLASS(x86ms);

    x86_cpu_set_default_version(default_cpu_version);

    /*
     * Calculates the limit to CPU APIC ID values
     *
     * Limit for the APIC ID value, so that all
     * CPU APIC IDs are < x86ms->apic_id_limit.
     *
     * This is used for FW_CFG_MAX_CPUS. See comments on fw_cfg_arch_create().
     */
    x86ms->apic_id_limit = x86_cpu_apic_id_from_index(x86ms,
                                                      ms->smp.max_cpus - 1) + 1;

    /*
     * Can we support APIC ID 255 or higher?  With KVM, that requires
     * both in-kernel lapic and X2APIC userspace API.
     *
     * kvm_enabled() must go first to ensure that kvm_* references are
     * not emitted for the linker to consume (kvm_enabled() is
     * a literal `0` in configurations where kvm_* aren't defined)
     */
    if (kvm_enabled() && x86ms->apic_id_limit > 255 &&
        kvm_irqchip_in_kernel() && !kvm_enable_x2apic()) {
        error_report("current -smp configuration requires kernel "
                     "irqchip and X2APIC API support.");
        exit(EXIT_FAILURE);
    }

    if (kvm_enabled()) {
        kvm_set_max_apic_id(x86ms->apic_id_limit);
    }

    if (!kvm_irqchip_in_kernel()) {
        apic_set_max_apic_id(x86ms->apic_id_limit);
    }

    possible_cpus = mc->possible_cpu_arch_ids(ms);
    for (i = 0; i < ms->smp.cpus; i++) {
        x86_cpu_new(x86ms, possible_cpus->cpus[i].arch_id, &error_fatal);
    }
}

x86_cpu_new function
  1. Create the CPU object.

  2. Set its APIC ID.

  3. Realize the CPU.

  4. Clean up (drop the creator's reference to the CPU object).

/hw/i386/x86.c

void x86_cpu_new(X86MachineState *x86ms, int64_t apic_id, Error **errp)
{
    Object *cpu = object_new(MACHINE(x86ms)->cpu_type);

    if (!object_property_set_uint(cpu, "apic-id", apic_id, errp)) {
        goto out;
    }
    qdev_realize(DEVICE(cpu), NULL, errp);

out:
    object_unref(cpu);
}
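The unconditional object_unref at the end works on both paths because of reference counting: after a successful realize something else holds a reference, and after a failure nothing does. The sketch below models that idea with a hypothetical `ObjSketch`; it assumes, as a simplification, that a successful realize is the step at which the parent takes its reference.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical reference-counted object. */
typedef struct {
    int refcount;
    bool freed;
} ObjSketch;

/* A new object starts with one reference, owned by its creator. */
void obj_new_sketch(ObjSketch *o)
{
    o->refcount = 1;
    o->freed = false;
}

void obj_unref_sketch(ObjSketch *o)
{
    assert(o->refcount > 0);
    if (--o->refcount == 0) {
        o->freed = true;    /* stands in for destroying the object */
    }
}

/* On success the parent container takes its own reference. */
bool realize_sketch(ObjSketch *o, bool should_fail)
{
    if (should_fail) {
        return false;
    }
    o->refcount++;
    return true;
}

/* Same shape as x86_cpu_new: create, realize, always drop our reference. */
void cpu_new_sketch(ObjSketch *cpu, bool should_fail)
{
    obj_new_sketch(cpu);
    realize_sketch(cpu, should_fail);
    obj_unref_sketch(cpu);
}
```

After a successful call the object survives with exactly one reference, held by its parent; after a failed call it has already been released.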
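
qdev_realize function

This function realizes a device: it optionally attaches the device to a bus and then sets its "realized" property to true.

/hw/core/qdev.c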

bool qdev_realize(DeviceState *dev, BusState *bus, Error **errp)
{
    assert(!dev->realized && !dev->parent_bus);

    if (bus) {
        if (!qdev_set_parent_bus(dev, bus, errp)) {
            return false;
        }
    } else {
        assert(!DEVICE_GET_CLASS(dev)->bus_type);
    }

    return object_property_set_bool(OBJECT(dev), "realized", true, errp);
}
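
device_set_realized function

When the property is set to true (realize):
  • Attaches the device to the /unattached container if it has no parent
  • Calls the hotplug handler's pre-plug hook (if any)
  • Calls the device class's realize function (if any)
  • Calls the device listeners' realize callbacks
  • Sets the device's canonical path
  • Registers the device's VM state (if any)
  • Realizes the device's child buses
  • If the device was hotplugged, resets the device and its subtree
  • Clears the device's pending-deleted-event flag
  • Calls the hotplug handler's plug hook (if any)
  • Sets the device's realized flag

When the property is set to false (unrealize):
  • Sets the device's realized flag to false
  • Unrealizes the device's child buses
  • Unregisters the device's VM state (if any)
  • Calls the device class's unrealize function (if any)
  • Sets the device's pending-deleted-event flag
  • Calls the device listeners' unrealize callbacks

When realize fails partway:
  • Unrealizes the device's child buses and unregisters the VM state (if that stage was reached)
  • Frees the canonical path and sets it to NULL
  • Calls the device class's unrealize function (if any)
  • Propagates the error and detaches the device from /unattached if it was attached there

/hw/core/qdev.c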

static void device_set_realized(Object *obj, bool value, Error **errp)
{
    DeviceState *dev = DEVICE(obj);
    DeviceClass *dc = DEVICE_GET_CLASS(dev);
    HotplugHandler *hotplug_ctrl;
    BusState *bus;
    NamedClockList *ncl;
    Error *local_err = NULL;
    bool unattached_parent = false;
    static int unattached_count;

    if (dev->hotplugged && !dc->hotpluggable) {
        error_setg(errp, QERR_DEVICE_NO_HOTPLUG, object_get_typename(obj));
        return;
    }

    if (value && !dev->realized) {
        if (!check_only_migratable(obj, errp)) {
            goto fail;
        }

        if (!obj->parent) {
            gchar *name = g_strdup_printf("device[%d]", unattached_count++);

            object_property_add_child(container_get(qdev_get_machine(),
                                                    "/unattached"),
                                      name, obj);
            unattached_parent = true;
            g_free(name);
        }

        hotplug_ctrl = qdev_get_hotplug_handler(dev);
        if (hotplug_ctrl) {
            hotplug_handler_pre_plug(hotplug_ctrl, dev, &local_err);
            if (local_err != NULL) {
                goto fail;
            }
        }

        if (dc->realize) {
            dc->realize(dev, &local_err);
            if (local_err != NULL) {
                goto fail;
            }
        }

        DEVICE_LISTENER_CALL(realize, Forward, dev);

        /*
         * always free/re-initialize here since the value cannot be cleaned up
         * in device_unrealize due to its usage later on in the unplug path
         */
        g_free(dev->canonical_path);
        dev->canonical_path = object_get_canonical_path(OBJECT(dev));
        QLIST_FOREACH(ncl, &dev->clocks, node) {
            if (ncl->alias) {
                continue;
            } else {
                clock_setup_canonical_path(ncl->clock);
            }
        }

        if (qdev_get_vmsd(dev)) {
            if (vmstate_register_with_alias_id(VMSTATE_IF(dev),
                                               VMSTATE_INSTANCE_ID_ANY,
                                               qdev_get_vmsd(dev), dev,
                                               dev->instance_id_alias,
                                               dev->alias_required_for_version,
                                               &local_err) < 0) {
                goto post_realize_fail;
            }
        }

        /*
         * Clear the reset state, in case the object was previously unrealized
         * with a dirty state.
         */
        resettable_state_clear(&dev->reset);

        QLIST_FOREACH(bus, &dev->child_bus, sibling) {
            if (!qbus_realize(bus, errp)) {
                goto child_realize_fail;
            }
        }
        if (dev->hotplugged) {
            /*
             * Reset the device, as well as its subtree which, at this point,
             * should be realized too.
             */
            resettable_assert_reset(OBJECT(dev), RESET_TYPE_COLD);
            resettable_change_parent(OBJECT(dev), OBJECT(dev->parent_bus),
                                     NULL);
            resettable_release_reset(OBJECT(dev), RESET_TYPE_COLD);
        }
        dev->pending_deleted_event = false;

        if (hotplug_ctrl) {
            hotplug_handler_plug(hotplug_ctrl, dev, &local_err);
            if (local_err != NULL) {
                goto child_realize_fail;
            }
        }

        qatomic_store_release(&dev->realized, value);

    } else if (!value && dev->realized) {

        /*
         * Change the value so that any concurrent users are aware
         * that the device is going to be unrealized
         *
         * TODO: change .realized property to enum that states
         * each phase of the device realization/unrealization
         */

        qatomic_set(&dev->realized, value);
        /*
         * Ensure that concurrent users see this update prior to
         * any other changes done by unrealize.
         */
        smp_wmb();

        QLIST_FOREACH(bus, &dev->child_bus, sibling) {
            qbus_unrealize(bus);
        }
        if (qdev_get_vmsd(dev)) {
            vmstate_unregister(VMSTATE_IF(dev), qdev_get_vmsd(dev), dev);
        }
        if (dc->unrealize) {
            dc->unrealize(dev);
        }
        dev->pending_deleted_event = true;
        DEVICE_LISTENER_CALL(unrealize, Reverse, dev);
    }

    assert(local_err == NULL);
    return;

child_realize_fail:
    QLIST_FOREACH(bus, &dev->child_bus, sibling) {
        qbus_unrealize(bus);
    }

    if (qdev_get_vmsd(dev)) {
        vmstate_unregister(VMSTATE_IF(dev), qdev_get_vmsd(dev), dev);
    }

post_realize_fail:
    g_free(dev->canonical_path);
    dev->canonical_path = NULL;
    if (dc->unrealize) {
        dc->unrealize(dev);
    }

fail:
    error_propagate(errp, local_err);
    if (unattached_parent) {
        /*
         * Beware, this doesn't just revert
         * object_property_add_child(), it also runs bus_remove()!
         */
        object_unparent(OBJECT(dev));
        unattached_count--;
    }
}
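The control flow of device_set_realized, staged setup with goto labels that undo only what has already been done, can be reduced to a few lines. This is a sketch only: `DevSketch` and its failure switches are hypothetical, and the real stages (hotplug handlers, child buses, listeners) are collapsed into two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device with counters so the call order can be observed. */
typedef struct {
    bool realized;
    int realize_calls;
    int unrealize_calls;
    bool fail_realize;   /* make the class realize hook fail */
    bool fail_vmstate;   /* make the later registration stage fail */
} DevSketch;

/* Setter for a boolean "realized" property; returns false on error. */
bool set_realized_sketch(DevSketch *dev, bool value)
{
    assert(dev != NULL);
    if (value && !dev->realized) {
        dev->realize_calls++;
        if (dev->fail_realize) {
            goto fail;                /* nothing to undo yet */
        }
        if (dev->fail_vmstate) {
            goto post_realize_fail;   /* realize ran, so undo it */
        }
        dev->realized = true;         /* flag is published last */
    } else if (!value && dev->realized) {
        dev->realized = false;        /* flag is cleared first */
        dev->unrealize_calls++;
    }
    return true;

post_realize_fail:
    dev->unrealize_calls++;
fail:
    return false;
}
```

Setting the property to a value it already has is a no-op, which is why the hooks run at most once per transition.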
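
x86_cpu_realizefn function

This function realizes an x86 CPU. Its main tasks are:

* Checking the APIC ID and preparing the CPU state, including Hyper-V enlightenments and CPU feature expansion.
* Filtering the features against what the accelerator or host supports, and failing if enforced features are missing.
* Calling the common framework function cpu_exec_realizefn, which performs the accelerator-specific initialization.
* Setting CPU parameters such as the microcode revision, MWAIT extended information, and the number of physical address bits.
* Initializing the cache information.
* Creating the APIC (system emulation only).
* Initializing machine check exceptions (MCE).
* Initializing the VCPU.
* Warning about hyperthreading problems (if applicable).
* Realizing the APIC (system emulation only).
* Resetting the CPU.
* Calling the parent class's realize function.
* Propagating any error to the caller.

/target/i386/cpu.c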

static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
{
    CPUState *cs = CPU(dev);
    X86CPU *cpu = X86_CPU(dev);
    X86CPUClass *xcc = X86_CPU_GET_CLASS(dev);
    CPUX86State *env = &cpu->env;
    Error *local_err = NULL;
    static bool ht_warned;
    unsigned requested_lbr_fmt;

#if defined(CONFIG_TCG) && !defined(CONFIG_USER_ONLY)
    /* Use pc-relative instructions in system-mode */
    cs->tcg_cflags |= CF_PCREL;
#endif

    if (cpu->apic_id == UNASSIGNED_APIC_ID) {
        error_setg(errp, "apic-id property was not initialized properly");
        return;
    }

    /*
     * Process Hyper-V enlightenments.
     * Note: this currently has to happen before the expansion of CPU features.
     */
    x86_cpu_hyperv_realize(cpu);

    x86_cpu_expand_features(cpu, &local_err);
    if (local_err) {
        goto out;
    }

    /*
     * Override env->features[FEAT_PERF_CAPABILITIES].LBR_FMT
     * with user-provided setting.
     */
    if (cpu->lbr_fmt != ~PERF_CAP_LBR_FMT) {
        if ((cpu->lbr_fmt & PERF_CAP_LBR_FMT) != cpu->lbr_fmt) {
            error_setg(errp, "invalid lbr-fmt");
            return;
        }
        env->features[FEAT_PERF_CAPABILITIES] &= ~PERF_CAP_LBR_FMT;
        env->features[FEAT_PERF_CAPABILITIES] |= cpu->lbr_fmt;
    }

    /*
     * vPMU LBR is supported when 1) KVM is enabled 2) Option pmu=on and
     * 3)vPMU LBR format matches that of host setting.
     */
    requested_lbr_fmt =
        env->features[FEAT_PERF_CAPABILITIES] & PERF_CAP_LBR_FMT;
    if (requested_lbr_fmt && kvm_enabled()) {
        uint64_t host_perf_cap =
            x86_cpu_get_supported_feature_word(FEAT_PERF_CAPABILITIES, false);
        unsigned host_lbr_fmt = host_perf_cap & PERF_CAP_LBR_FMT;

        if (!cpu->enable_pmu) {
            error_setg(errp, "vPMU: LBR is unsupported without pmu=on");
            return;
        }
        if (requested_lbr_fmt != host_lbr_fmt) {
            error_setg(errp, "vPMU: the lbr-fmt value (0x%x) does not match "
                        "the host value (0x%x).",
                        requested_lbr_fmt, host_lbr_fmt);
            return;
        }
    }

    x86_cpu_filter_features(cpu, cpu->check_cpuid || cpu->enforce_cpuid);

    if (cpu->enforce_cpuid && x86_cpu_have_filtered_features(cpu)) {
        error_setg(&local_err,
                   accel_uses_host_cpuid() ?
                       "Host doesn't support requested features" :
                       "TCG doesn't support requested features");
        goto out;
    }

    /* On AMD CPUs, some CPUID[8000_0001].EDX bits must match the bits on
     * CPUID[1].EDX.
     */
    if (IS_AMD_CPU(env)) {
        env->features[FEAT_8000_0001_EDX] &= ~CPUID_EXT2_AMD_ALIASES;
        env->features[FEAT_8000_0001_EDX] |= (env->features[FEAT_1_EDX]
           & CPUID_EXT2_AMD_ALIASES);
    }

    x86_cpu_set_sgxlepubkeyhash(env);

    /*
     * note: the call to the framework needs to happen after feature expansion,
     * but before the checks/modifications to ucode_rev, mwait, phys_bits.
     * These may be set by the accel-specific code,
     * and the results are subsequently checked / assumed in this function.
     */
    cpu_exec_realizefn(cs, &local_err);
    if (local_err != NULL) {
        error_propagate(errp, local_err);
        return;
    }

    if (xcc->host_cpuid_required && !accel_uses_host_cpuid()) {
        g_autofree char *name = x86_cpu_class_get_model_name(xcc);
        error_setg(&local_err, "CPU model '%s' requires KVM or HVF", name);
        goto out;
    }

    if (cpu->ucode_rev == 0) {
        /*
         * The default is the same as KVM's. Note that this check
         * needs to happen after the evenual setting of ucode_rev in
         * accel-specific code in cpu_exec_realizefn.
         */
        if (IS_AMD_CPU(env)) {
            cpu->ucode_rev = 0x01000065;
        } else {
            cpu->ucode_rev = 0x100000000ULL;
        }
    }

    /*
     * mwait extended info: needed for Core compatibility
     * We always wake on interrupt even if host does not have the capability.
     *
     * requires the accel-specific code in cpu_exec_realizefn to
     * have already acquired the CPUID data into cpu->mwait.
     */
    cpu->mwait.ecx |= CPUID_MWAIT_EMX | CPUID_MWAIT_IBE;

    /* For 64bit systems think about the number of physical bits to present.
     * ideally this should be the same as the host; anything other than matching
     * the host can cause incorrect guest behaviour.
     * QEMU used to pick the magic value of 40 bits that corresponds to
     * consumer AMD devices but nothing else.
     *
     * Note that this code assumes features expansion has already been done
     * (as it checks for CPUID_EXT2_LM), and also assumes that potential
     * phys_bits adjustments to match the host have been already done in
     * accel-specific code in cpu_exec_realizefn.
     */
    if (env->features[FEAT_8000_0001_EDX] & CPUID_EXT2_LM) {
        if (cpu->phys_bits &&
            (cpu->phys_bits > TARGET_PHYS_ADDR_SPACE_BITS ||
            cpu->phys_bits < 32)) {
            error_setg(errp, "phys-bits should be between 32 and %u "
                             " (but is %u)",
                             TARGET_PHYS_ADDR_SPACE_BITS, cpu->phys_bits);
            return;
        }
        /*
         * 0 means it was not explicitly set by the user (or by machine
         * compat_props or by the host code in host-cpu.c).
         * In this case, the default is the value used by TCG (40).
         */
        if (cpu->phys_bits == 0) {
            cpu->phys_bits = TCG_PHYS_ADDR_BITS;
        }
    } else {
        /* For 32 bit systems don't use the user set value, but keep
         * phys_bits consistent with what we tell the guest.
         */
        if (cpu->phys_bits != 0) {
            error_setg(errp, "phys-bits is not user-configurable in 32 bit");
            return;
        }

        if (env->features[FEAT_1_EDX] & (CPUID_PSE36 | CPUID_PAE)) {
            cpu->phys_bits = 36;
        } else {
            cpu->phys_bits = 32;
        }
    }

    /* Cache information initialization */
    if (!cpu->legacy_cache) {
        const CPUCaches *cache_info =
            x86_cpu_get_versioned_cache_info(cpu, xcc->model);

        if (!xcc->model || !cache_info) {
            g_autofree char *name = x86_cpu_class_get_model_name(xcc);
            error_setg(errp,
                       "CPU model '%s' doesn't support legacy-cache=off", name);
            return;
        }
        env->cache_info_cpuid2 = env->cache_info_cpuid4 = env->cache_info_amd =
            *cache_info;
    } else {
        /* Build legacy cache information */
        env->cache_info_cpuid2.l1d_cache = &legacy_l1d_cache;
        env->cache_info_cpuid2.l1i_cache = &legacy_l1i_cache;
        env->cache_info_cpuid2.l2_cache = &legacy_l2_cache_cpuid2;
        env->cache_info_cpuid2.l3_cache = &legacy_l3_cache;

        env->cache_info_cpuid4.l1d_cache = &legacy_l1d_cache;
        env->cache_info_cpuid4.l1i_cache = &legacy_l1i_cache;
        env->cache_info_cpuid4.l2_cache = &legacy_l2_cache;
        env->cache_info_cpuid4.l3_cache = &legacy_l3_cache;

        env->cache_info_amd.l1d_cache = &legacy_l1d_cache_amd;
        env->cache_info_amd.l1i_cache = &legacy_l1i_cache_amd;
        env->cache_info_amd.l2_cache = &legacy_l2_cache_amd;
        env->cache_info_amd.l3_cache = &legacy_l3_cache;
    }

#ifndef CONFIG_USER_ONLY
    MachineState *ms = MACHINE(qdev_get_machine());
    qemu_register_reset(x86_cpu_machine_reset_cb, cpu);

    if (cpu->env.features[FEAT_1_EDX] & CPUID_APIC || ms->smp.cpus > 1) {
        x86_cpu_apic_create(cpu, &local_err);
        if (local_err != NULL) {
            goto out;
        }
    }
#endif

    mce_init(cpu);

    qemu_init_vcpu(cs);

    /*
     * Most Intel and certain AMD CPUs support hyperthreading. Even though QEMU
     * fixes this issue by adjusting CPUID_0000_0001_EBX and CPUID_8000_0008_ECX
     * based on inputs (sockets,cores,threads), it is still better to give
     * users a warning.
     *
     * NOTE: the following code has to follow qemu_init_vcpu(). Otherwise
     * cs->nr_threads hasn't be populated yet and the checking is incorrect.
     */
    if (IS_AMD_CPU(env) &&
        !(env->features[FEAT_8000_0001_ECX] & CPUID_EXT3_TOPOEXT) &&
        cs->nr_threads > 1 && !ht_warned) {
            warn_report("This family of AMD CPU doesn't support "
                        "hyperthreading(%d)",
                        cs->nr_threads);
            error_printf("Please configure -smp options properly"
                         " or try enabling topoext feature.\n");
            ht_warned = true;
    }

#ifndef CONFIG_USER_ONLY
    x86_cpu_apic_realize(cpu, &local_err);
    if (local_err != NULL) {
        goto out;
    }
#endif /* !CONFIG_USER_ONLY */
    cpu_reset(cs);

    xcc->parent_realize(dev, &local_err);

out:
    if (local_err != NULL) {
        error_propagate(errp, local_err);
        return;
    }
}
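Among the many steps in this function, the choice of physical address bits is a compact decision table. The sketch below restates it as a pure function. The names are hypothetical, the default of 40 comes from the source comment, and the upper bound of 52 is an assumed stand-in for TARGET_PHYS_ADDR_SPACE_BITS, whose value is not shown above.

```c
#include <assert.h>
#include <stdbool.h>

enum {
    TCG_PHYS_BITS_SKETCH = 40,  /* default mentioned in the source comment */
    MAX_PHYS_BITS_SKETCH = 52   /* assumed value of the upper bound */
};

/*
 * Returns the phys-bits value to use, or -1 for the cases the real
 * function rejects with an error. requested == 0 means "not set".
 */
int phys_bits_sketch(bool long_mode, bool pse36_or_pae, int requested)
{
    assert(requested >= 0);
    if (long_mode) {
        if (requested != 0 &&
            (requested > MAX_PHYS_BITS_SKETCH || requested < 32)) {
            return -1;                  /* outside the accepted range */
        }
        return requested ? requested : TCG_PHYS_BITS_SKETCH;
    }
    if (requested != 0) {
        return -1;                      /* not configurable on 32-bit CPUs */
    }
    return pse36_or_pae ? 36 : 32;
}
```

A 64-bit CPU model with no explicit setting therefore ends up with 40 bits in this sketch, while a 32-bit model gets 36 or 32 depending on whether PSE36 or PAE is present.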
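
Initializing the PC's memory and firmware
  1. Initialize the memory and add it to the system.
  2. Load the BIOS image.
  3. Add the BIOS image to the ROM list.
  4. Insert the ROM list into the system.
  5. Map the last 128KB of the BIOS into ISA space.
  6. Map the whole BIOS at the top of memory.
  7. Create the option ROM region.
  8. Create the FWCfgState and initialize its parameters.
  9. Use the FWCfgState to initialize the global fw_cfg.
  10. If a kernel was specified, load the kernel.
  11. Add the ROM images.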
