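Summary of the KVM System Virtualization Performance Testing Process

Building buildroot

Why use buildroot
  1. Broad support: cross-compilation toolchain, root filesystem generation, kernel image compilation, and bootloader compilation.
  2. Easy to use: it offers menuconfig, gconfig, and xconfig configuration interfaces similar to the kernel's, so building a basic system with buildroot is straightforward.
  3. Many packages: many benchmarks, as well as qemu, kvmtool, and others, are already integrated.
Basic introduction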

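Directory structure

  • configs: configuration files
  • dl: downloaded source packages
  • output: build output
  • package: package versions and build configuration

Configuration interface:

Main items of interest:

  • Target options: selects the features and parameters of the build target
  • Toolchain: configures the toolchain and compiler features
  • System configuration: configures the configuration files and boot behavior of the generated filesystem
  • Target packages: selects and configures the required packages and software environment
  • Filesystem images: configures the image format of the filesystem that buildroot produces

Settings required for the virtual machine root filesystem:

Saving the configuration

The configuration is stored as .config in the top-level buildroot source directory. It is a complete configuration file that contains the value of every option. A defconfig stores only the options set to non-default values, which makes it easier to read and modify and suitable for automated builds. For the default buildroot configuration the defconfig is empty, because everything is at its default.

The configs/ directory contains many ready-made *_defconfig files, from which the .config file can be generated.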

make *_defconfig

Then run:

make menuconfig

This overwrites the current .config file. To save the configuration as a defconfig, use:

make savedefconfig
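
The three make targets above can be chained in a small shell helper. This is a minimal sketch, assuming it is run from the top-level buildroot directory; the function name br_configure and the defconfig name in the usage note are made up for illustration.

```shell
# br_configure: hypothetical helper, run from the top-level buildroot directory.
# Loads a board defconfig, opens menuconfig, then saves the result as a defconfig.
br_configure() {
    defconfig_name="$1"
    # Refuse to run if the requested defconfig is not present in configs/.
    if [ ! -f "configs/$defconfig_name" ]; then
        echo "configs/$defconfig_name not found" >&2
        return 1
    fi
    make "$defconfig_name" &&   # generate .config from the defconfig
        make menuconfig &&      # adjust options interactively
        make savedefconfig      # save only the non-default options
}
```

For example, br_configure myboard_defconfig (a made-up name). Running make list-defconfigs prints the names that actually exist in the tree.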
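Upgrading qemu

The default qemu does not support cortex-a55, but the latest qemu 8.2.0 does.

Download the latest qemu-8.2.0.tar.xz and put it in the dl directory.

Edit qemu.mk in the package/qemu directory: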

QEMU_VERSION = 8.2.0
QEMU_SOURCE = qemu-$(QEMU_VERSION).tar.xz
QEMU_SITE = http://download.qemu.org
QEMU_LICENSE = GPL-2.0, LGPL-2.1, MIT, BSD-3-Clause, BSD-2-Clause, Others/BSD-1c
QEMU_LICENSE_FILES = COPYING COPYING.LIB

In qemu's configure step (search for the QEMU_CONFIGURE_CMDS macro), also add the configure option:

--disable-hexagon-idef-parser

Building the kernel

Configuration
make ARCH=arm64 CROSS_COMPILE=/home/yue/beauty/proj/rk356x_linux_release_v1.3.0b_20221213/prebuilts/gcc/linux-x86/aarch64/gcc-arm-10.3-2021.07-x86_64-aarch64-none-linux-gnu/bin/aarch64-none-linux-gnu- menuconfig

Select KVM:

Configure the console:

Associate the root filesystem:

Build
make ARCH=arm64 CROSS_COMPILE=/home/yue/beauty/proj/rk356x_linux_release_v1.3.0b_20221213/prebuilts/gcc/linux-x86/aarch64/gcc-arm-10.3-2021.07-x86_64-aarch64-none-linux-gnu/bin/aarch64-none-linux-gnu- -j8
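
Because the same long toolchain path appears in both the configuration and the build command, it can help to keep it in variables. A minimal sketch that reuses the path from the commands above; the names TOOLCHAIN_BIN and kmake are made up for this example.

```shell
# Toolchain location copied from the commands above; adjust to the local SDK path.
TOOLCHAIN_BIN=/home/yue/beauty/proj/rk356x_linux_release_v1.3.0b_20221213/prebuilts/gcc/linux-x86/aarch64/gcc-arm-10.3-2021.07-x86_64-aarch64-none-linux-gnu/bin
ARCH=arm64
CROSS_COMPILE="$TOOLCHAIN_BIN/aarch64-none-linux-gnu-"

# kmake: hypothetical wrapper that runs make in the kernel tree with the
# cross-compilation settings filled in; extra arguments are passed through.
kmake() {
    make ARCH="$ARCH" CROSS_COMPILE="$CROSS_COMPILE" "$@"
}
```

With this in place, the two steps become kmake menuconfig followed by kmake -j8.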
Output
  • arch/arm64/boot/Image
  • vmlinux

Running file on them shows:

  • Image: Linux kernel ARM64 boot executable Image, little-endian, 4K pages
  • vmlinux: ELF 64-bit LSB executable, ARM aarch64, version 1 (SYSV), statically linked, BuildID[sha1]=dad4c3147e36034c9cb13a786bd8e238f740504b, with debug_info, not stripped

Build log:

  SORTEX  vmlinux
  SYSMAP  System.map
  OBJCOPY arch/arm64/boot/Image
  Building modules, stage 2.
  MODPOST 418 modules

Build errors

While building the dtc library in buildroot:

error: multiple definition of `yylloc'

Comment out the YYLTYPE yylloc definition around line 1205 of dtc-parse.tab.c, or change it to extern YYLTYPE yylloc:

/* The semantic value of the lookahead symbol.  */
YYSTYPE yylval;
/* Location data for the lookahead symbol.  */
/* Commented out, including its initializer, to avoid the duplicate definition. */
//YYLTYPE yylloc
//# if defined YYLTYPE_IS_TRIVIAL && YYLTYPE_IS_TRIVIAL
//  = { 1, 1, 1, 1 }
//# endif
//;

While building ctest (part of host-cmake) in buildroot:

buildroot/output/build/host-cmake-3.8.2/Source/cmServerProtocol.cxx:626:39: error: ‘numeric_limits’ is not a member of ‘std’

The fix is to add the missing header includes:

#include <stdexcept>
#include <limits>
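
One way to apply this fix from the shell is to prepend the include to the failing source file. A minimal sketch: the helper name add_include is made up, and the file to pass in would be the cmServerProtocol.cxx reported in the error.

```shell
# add_include: hypothetical helper that prepends "#include <HEADER>" to FILE
# unless the file already contains exactly that line.
add_include() {
    header="$1"
    file="$2"
    line="#include <$header>"
    # -F: fixed string, -x: whole-line match, -q: no output.
    if grep -Fxq "$line" "$file"; then
        return 0
    fi
    # Write the new first line followed by the old content, then swap files.
    { printf '%s\n' "$line"; cat "$file"; } > "$file.tmp" && mv "$file.tmp" "$file"
}
```

For example: add_include limits buildroot/output/build/host-cmake-3.8.2/Source/cmServerProtocol.cxx. Running it twice leaves a single include line.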

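While building qemu:

FAILED: target/hexagon/idef-parserlink meson-generated_idef-parser.tab.c.o libglib-2.0.so: error adding symbols: file in wrong format

Running file on the two files shows that one is x86 and the other is aarch64. Adding the following qemu configure option skips building this parser: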

--disable-hexagon-idef-parser

Checking system information

Kernel version

cat /proc/version
Linux version 4.19.232 (root@yue-yi-machine) ((HEAD: f7165816db073abb32bfe4f754a317d687c7bbcf) (sdk version: rk356x_linux_release_20230710_v1.3.2f.xml) (gcc version 10.3.1 20210621 
root@RK356X:/#

Distribution

cat /etc/issue
Welcome to RK356X Buildroot

CPU information:

root@RK356X:/# lscpu
Architecture:            aarch64
  CPU op-mode(s):        32-bit, 64-bit
  Byte Order:            Little Endian
CPU(s):                  4
  On-line CPU(s) list:   0-3
Vendor ID:               ARM
  Model name:            Cortex-A55
    Model:               0
    Thread(s) per core:  1
    Core(s) per cluster: 4
    Socket(s):           -
    Cluster(s):          1
    Stepping:            r2p0
    CPU max MHz:         1992.0000
    CPU min MHz:         408.0000
    BogoMIPS:            48.00

In addition, cat /proc/cpuinfo shows the details of each CPU.

Memory information, in MB:

root@RK356X:/# free -m
               total        used        free      shared  buff/cache   available
Mem:            3837         106        3567           1         163        3690
Swap:              0           0           0

Partition information

me@ubuntu:~$ df -h
Filesystem      Size  Used Avail Use% Mounted on
udev            875M     0  875M   0% /dev
tmpfs           185M  3.0M  182M   2% /run
/dev/mmcblk0p2   29G  4.0G   24G  15% /
tmpfs           924M     0  924M   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs           924M     0  924M   0% /sys/fs/cgroup
/dev/loop0       58M   58M     0 100% /snap/core20/1614
/dev/loop2       92M   92M     0 100% /snap/lxd/24065
/dev/loop1       62M   62M     0 100% /snap/lxd/22761
/dev/loop3       36M   36M     0 100% /snap/snapd/20674
/dev/mmcblk0p1  253M  121M  132M  48% /boot/firmware
tmpfs           185M     0  185M   0% /run/user/1000

Checking whether KVM is enabled:

Check the boot log:

[    0.011336] CPU: All CPU(s) started at EL2
[    0.106179] kvm [1]: IPA Size Limit: 44 bits
[    0.107506] kvm [1]: vgic interrupt IRQ9
[    0.107646] kvm [1]: Hyp mode initialized successfully
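
The same check can be scripted. A minimal sketch, assuming the kernel log contains the "Hyp mode initialized successfully" line shown above (other kernel versions may word it differently); the function name kvm_boot_ok is made up, and the log text is read from standard input.

```shell
# kvm_boot_ok: hypothetical helper; reads kernel log text on stdin and
# succeeds only if KVM reported that Hyp mode was initialized.
kvm_boot_ok() {
    grep -q 'Hyp mode initialized successfully'
}
```

On the board this could be used as dmesg | kvm_boot_ok && echo "KVM ready". Testing for the device node with [ -c /dev/kvm ] is a useful second check.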

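Confirm at build time:

For a buildroot build, check whether the configuration file in use under kernel/arch/arm64/configs defines CONFIG_VIRTUALIZATION; if it does not, add it yourself.

Check with a command:

zcat /proc/config.gz | grep "CONFIG_VIRTUALIZATION"

The shell prints: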

CONFIG_VIRTUALIZATION=y
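
To check more than one option at a time, the grep can be wrapped in a function. A minimal sketch: the helper names kconfig_enabled and check_kvm_config are made up, and CONFIG_KVM is listed on the assumption that it should be enabled alongside CONFIG_VIRTUALIZATION.

```shell
# kconfig_enabled: hypothetical helper; succeeds if OPTION is set to y
# in the kernel config text read from stdin.
kconfig_enabled() {
    grep -q "^$1=y\$"
}

# check_kvm_config: hypothetical helper; reads a config file (plain text, or
# gzip-compressed when the name ends in .gz, such as /proc/config.gz) and
# reports the state of each KVM-related option.
check_kvm_config() {
    config_file="$1"
    case "$config_file" in
        *.gz) reader=zcat ;;
        *)    reader=cat ;;
    esac
    for option in CONFIG_VIRTUALIZATION CONFIG_KVM; do
        if "$reader" "$config_file" | kconfig_enabled "$option"; then
            echo "$option=y"
        else
            echo "$option is not set"
        fi
    done
}
```

On the target, check_kvm_config /proc/config.gz covers the running kernel; passing a defconfig path checks a build configuration instead.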

Setting up the firefly-rk3568 environment

Installing the SDK

Download the SDK: Firefly | Making technology simpler and life smarter (t-firefly.com)

  1. Unpack the SDK
chmod +x ./sdk_tools.sh

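Create a directory to hold the SDK. For example, this is the 3588 SDK, and I want to unpack it into a folder one level up to avoid polluting the current directory: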

mkdir ../firefly_rk3588_SDK
./sdk_tools.sh --unpack -C ../firefly_rk3588_SDK
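  2. Restore the working directory

Select the directory that was just unpacked:

./sdk_tools.sh --sync -C ../firefly_rk3588_SDK

Either use the script above or run the commands manually; for the manual route, enter the directory that was just unpacked: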

cd ../firefly_rk3588_SDK
.repo/repo/repo sync -l
.repo/repo/repo start firefly --all
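  3. Update the SDK

The first two steps are only needed the first time the SDK is unpacked. For later updates, just enter the SDK directory and run step 3 to update over the network: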

.repo/repo/repo sync -c --no-tags
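Build

Run ./build.sh and select rk3568-firefly.

Flashing

Tool used: RKDevTool_Release

Loading the virtual machine image

Use tftp to download the image from the host to the development board. The tftp here is the BusyBox applet, a small tftp tool for embedded systems, so its usage differs from that of a regular tftp client: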

root@RK356X:/# tftp ?
BusyBox v1.34.1 (2023-12-23 16:25:21 CST) multi-call binary.
Usage: tftp [OPTIONS] HOST [PORT]

Transfer a file from/to tftp server

        -l FILE Local FILE
        -r FILE Remote FILE
        -g      Get file
        -p      Put file
        -b SIZE Transfer blocks in bytes

tftp -g -l Image -r Image 192.168.0.102

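Setting up the Raspberry Pi 4B environment

Flashing

Flashing the Raspberry Pi requires an SD card formatted as FAT32. Use the Raspberry Pi Imager tool to write the Linux image to the SD card.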

https://www.raspberrypi.com/software/

Configuration changes

Edit cmdline.txt to disable silent boot:

console=serial0,115200 console=tty1 root=PARTUUID=686c0ceb-02 rootfstype=ext4 fsck.repair=yes rootwait splash plymouth.ignore-serial-consoles

Edit config.txt to enable the UART:

enable_uart=1

To boot through u-boot, also add:

kernel=u-boot.bin

Then put u-boot.bin, uImage, and urootfs.cpio in the root directory of the SD card.

Booting the kernel from u-boot:

setenv bootargs "8250.nr_uarts=1 console=ttyS0,115200"
fatload mmc 0:1 0x80000 uImage; fatload mmc 0:1 0x3800000 bcm2711-rpi-4-b.dtb; fatload mmc 0:1 0x5800000 urootfs.cpio; bootm 0x80000 0x5800000 0x3800000
Running

Install qemu. Since this is a Debian system, apt can be used directly:

sudo apt install qemu-system

A warning is reported:

hwmon1: Undervoltage detected

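The supply voltage was too low; I was powering the board over USB from a PC. Switching to another power cable helped a little.

Later I powered it from a phone fast charger with its matching cable, and the annoying warning ⚠ never appeared again. The power supply must be able to deliver enough current.

Testing

Benchmark programs
  1. Dhrystone is a simple benchmark that measures processor integer performance
  2. Cachebench evaluates the memory performance of a computer system
  3. Memory bandwidth is recognized as a factor that affects system performance
  4. Hackbench measures scheduler performance by timing how long it takes to schedule a given number of tasks
What unixbench can measure
Test results
  • Linux kernel: 4.19.232
  • unixbench version: BYTE UNIX Benchmarks (Version 5.1.3)
  • qemu version: 8.2.0 (latest)
  • Hardware: 4-core A55, 4GB memory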

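Results:

On the rk3568, qemu was started and a Linux virtual machine was booted with and without KVM enabled.

  • Without KVM, performance is very poor: in unixbench, Dhrystone reaches only 2.04% of the bare-metal score
  • With KVM enabled, performance improves dramatically: in unixbench, both single-core and multi-core Dhrystone reach 98.4%
  • With kvmtool in place of qemu, single-core and multi-core Dhrystone in unixbench still reach 98.35%, almost the same as qemu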
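
These percentages are ratios of the guest Dhrystone score to the bare-metal score. A minimal sketch of that arithmetic, using 4-copy Dhrystone figures taken from the unixbench logs in this document; the function name pct is made up.

```shell
# pct: hypothetical helper; prints GUEST as a percentage of HOST with two decimals.
pct() {
    awk -v guest="$1" -v host="$2" 'BEGIN { printf "%.2f\n", guest / host * 100 }'
}

# qemu without KVM vs. bare metal (tmpfs run), 4 parallel copies
pct 958472.9 46934003.7      # prints 2.04
# kvmtool vs. bare metal (sda run), 4 parallel copies
pct 46001918.1 46773415.3    # prints 98.35
```

The result depends on which bare-metal run is used as the denominator, so it is worth stating the baseline next to each percentage.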
  • Guest Linux kernel: 4.19.232
  • Host Linux kernel: 6.1.0
  • unixbench version: BYTE UNIX Benchmarks (Version 5.1.3)
  • Hardware: 4-core A72, 2GB memory

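Results:

KVM was enabled on the Raspberry Pi and a Linux virtual machine was booted.

  • The virtual machine scored higher than the physical machine in the Dhrystone test, reaching 100.7%
  • The difference probably comes from the kernel versions and configurations, which were not identical
rk3568 (A55, 4 cores, 4GB)
Running Linux directly

First run, with unixbench placed on tmpfs: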

========================================================================
BYTE UNIX Benchmarks (Version 5.1.3)

System: RK356X: GNU/Linux
OS: GNU/Linux -- 4.19.232 -- #1 SMP Fri Dec 22 15:28:34 CST 2023
Machine: aarch64 (unknown)
Language:  (charmap=, collate=)
CPU 0:  (48.0 bogomips)
CPU 1:  (48.0 bogomips)
CPU 2:  (48.0 bogomips)
CPU 3:  (48.0 bogomips)
01:02:26 up 35 min,  0 users,  load average: 0.09, 0.04, 0.01; runlevel

------------------------------------------------------------------------
Benchmark Run: Tue Jan 02 2024 01:02:26 - 01:30:47
4 CPUs in system; running 1 parallel copy of tests

Dhrystone 2 using register variables       12048408.6 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                     2729.1 MWIPS (10.1 s, 7 samples)
Execl Throughput                                702.0 lps   (29.9 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        286853.2 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks           82849.8 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks        687970.7 KBps  (30.0 s, 2 samples)
Pipe Throughput                              490535.2 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                  34078.8 lps   (10.0 s, 7 samples)
Process Creation                               2246.3 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   1476.0 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                    460.7 lpm   (60.1 s, 2 samples)
System Call Overhead                         746013.0 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   12048408.6   1032.4
Double-Precision Whetstone                       55.0       2729.1    496.2
Execl Throughput                                 43.0        702.0    163.3
File Copy 1024 bufsize 2000 maxblocks          3960.0     286853.2    724.4
File Copy 256 bufsize 500 maxblocks            1655.0      82849.8    500.6
File Copy 4096 bufsize 8000 maxblocks          5800.0     687970.7   1186.2
Pipe Throughput                               12440.0     490535.2    394.3
Pipe-based Context Switching                   4000.0      34078.8     85.2
Process Creation                                126.0       2246.3    178.3
Shell Scripts (1 concurrent)                     42.4       1476.0    348.1
Shell Scripts (8 concurrent)                      6.0        460.7    767.8
System Call Overhead                          15000.0     746013.0    497.3
                                                                   ========
System Benchmarks Index Score                                         418.2

------------------------------------------------------------------------
Benchmark Run: Tue Jan 02 2024 01:30:47 - 01:59:11
4 CPUs in system; running 4 parallel copies of tests

Dhrystone 2 using register variables       46934003.7 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                    10671.0 MWIPS (10.1 s, 7 samples)
Execl Throughput                               2514.5 lps   (29.9 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        544406.0 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks          151998.7 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks       1428537.7 KBps  (30.0 s, 2 samples)
Pipe Throughput                             1915090.3 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                 196978.4 lps   (10.0 s, 7 samples)
Process Creation                               7348.1 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   4310.9 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                    575.1 lpm   (60.2 s, 2 samples)
System Call Overhead                        2696112.3 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   46934003.7   4021.8
Double-Precision Whetstone                       55.0      10671.0   1940.2
Execl Throughput                                 43.0       2514.5    584.8
File Copy 1024 bufsize 2000 maxblocks          3960.0     544406.0   1374.8
File Copy 256 bufsize 500 maxblocks            1655.0     151998.7    918.4
File Copy 4096 bufsize 8000 maxblocks          5800.0    1428537.7   2463.0
Pipe Throughput                               12440.0    1915090.3   1539.5
Pipe-based Context Switching                   4000.0     196978.4    492.4
Process Creation                                126.0       7348.1    583.2
Shell Scripts (1 concurrent)                     42.4       4310.9   1016.7
Shell Scripts (8 concurrent)                      6.0        575.1    958.5
System Call Overhead                          15000.0    2696112.3   1797.4
                                                                   ========
System Benchmarks Index Score                                        1221.1
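
In these tables the INDEX column is RESULT / BASELINE × 10, and the System Benchmarks Index Score is the geometric mean of the individual index values. A minimal sketch of both calculations; the function names ub_index and ub_score are made up.

```shell
# ub_index: hypothetical helper; index = result / baseline * 10, one decimal.
ub_index() {
    awk -v result="$1" -v baseline="$2" 'BEGIN { printf "%.1f\n", result / baseline * 10 }'
}

# ub_score: hypothetical helper; geometric mean of the index values given as
# arguments, computed as exp(mean(log(index))).
ub_score() {
    printf '%s\n' "$@" | awk '{ sum += log($1); n++ } END { printf "%.1f\n", exp(sum / n) }'
}

# Dhrystone row of the single-copy run: 12048408.6 lps against a baseline of 116700.0
ub_index 12048408.6 116700.0    # prints 1032.4
```

Passing the twelve single-copy index values (1032.4 496.2 163.3 ... 497.3) to ub_score should land close to the reported 418.2; a small difference is possible because the log shows the indices rounded to one decimal.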

Second run, with unixbench placed on the sda disk; file copy throughput drops significantly:

========================================================================
BYTE UNIX Benchmarks (Version 5.1.3)

System: RK356X: GNU/Linux
OS: GNU/Linux -- 4.19.232 -- #1 SMP Fri Dec 22 15:28:34 CST 2023
Machine: aarch64 (unknown)
Language:  (charmap=, collate=)
CPU 0:  (48.0 bogomips)
CPU 1:  (48.0 bogomips)
CPU 2:  (48.0 bogomips)
CPU 3:  (48.0 bogomips)
08:12:29 up 0 min,  0 users,  load average: 0.26, 0.10, 0.03; runlevel

------------------------------------------------------------------------
Benchmark Run: Fri Dec 29 2023 08:12:29 - 08:40:51
4 CPUs in system; running 1 parallel copy of tests

Dhrystone 2 using register variables       11976868.7 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                     2716.7 MWIPS (10.1 s, 7 samples)
Execl Throughput                                689.3 lps   (29.9 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks         88528.7 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks           25141.9 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks        293936.5 KBps  (30.0 s, 2 samples)
Pipe Throughput                              488819.7 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                  33583.5 lps   (10.0 s, 7 samples)
Process Creation                               2216.5 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   1384.7 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                    435.5 lpm   (60.1 s, 2 samples)
System Call Overhead                         742257.1 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   11976868.7   1026.3
Double-Precision Whetstone                       55.0       2716.7    493.9
Execl Throughput                                 43.0        689.3    160.3
File Copy 1024 bufsize 2000 maxblocks          3960.0      88528.7    223.6
File Copy 256 bufsize 500 maxblocks            1655.0      25141.9    151.9
File Copy 4096 bufsize 8000 maxblocks          5800.0     293936.5    506.8
Pipe Throughput                               12440.0     488819.7    392.9
Pipe-based Context Switching                   4000.0      33583.5     84.0
Process Creation                                126.0       2216.5    175.9
Shell Scripts (1 concurrent)                     42.4       1384.7    326.6
Shell Scripts (8 concurrent)                      6.0        435.5    725.8
System Call Overhead                          15000.0     742257.1    494.8
                                                                   ========
System Benchmarks Index Score                                         314.9

------------------------------------------------------------------------
Benchmark Run: Fri Dec 29 2023 08:40:51 - 09:09:15
4 CPUs in system; running 4 parallel copies of tests

Dhrystone 2 using register variables       46773415.3 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                    10638.8 MWIPS (10.1 s, 7 samples)
Execl Throughput                               2500.9 lps   (29.9 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        110127.0 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks           29874.4 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks        397006.1 KBps  (30.0 s, 2 samples)
Pipe Throughput                             1909959.0 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                 196549.5 lps   (10.0 s, 7 samples)
Process Creation                               6751.0 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   4099.2 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                    544.8 lpm   (60.2 s, 2 samples)
System Call Overhead                        2690986.9 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   46773415.3   4008.0
Double-Precision Whetstone                       55.0      10638.8   1934.3
Execl Throughput                                 43.0       2500.9    581.6
File Copy 1024 bufsize 2000 maxblocks          3960.0     110127.0    278.1
File Copy 256 bufsize 500 maxblocks            1655.0      29874.4    180.5
File Copy 4096 bufsize 8000 maxblocks          5800.0     397006.1    684.5
Pipe Throughput                               12440.0    1909959.0   1535.3
Pipe-based Context Switching                   4000.0     196549.5    491.4
Process Creation                                126.0       6751.0    535.8
Shell Scripts (1 concurrent)                     42.4       4099.2    966.8
Shell Scripts (8 concurrent)                      6.0        544.8    908.0
System Call Overhead                          15000.0    2690986.9   1794.0
                                                                   ========
System Benchmarks Index Score                                         824.5
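
To compare runs without reading the tables by eye, the raw scores can be pulled out of a saved log. A minimal sketch with awk, assuming the unixbench output was saved to a file; the function name ub_raw and the log file name in the usage note are made up.

```shell
# ub_raw: hypothetical helper; prints the raw score of every result line in
# LOG_FILE whose text contains NAME, taking the number in front of UNIT.
ub_raw() {
    name="$1"
    unit="$2"
    log_file="$3"
    awk -v name="$name" -v unit="$unit" '
        index($0, name) {
            # Walk the fields and print the one just before the unit column.
            # Index-table rows have no unit field, so they print nothing.
            for (i = 2; i <= NF; i++)
                if ($i == unit) print $(i - 1)
        }' "$log_file"
}
```

For example, ub_raw "Dhrystone 2 using register variables" lps unixbench.log prints one value per run, first the 1-copy score and then the 4-copy score.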
qemu (a55, 4 cores, 2GB)

On the rk3568 Linux host, run the emulation without KVM:

qemu-system-aarch64 -M virt,virtualization=true -cpu cortex-a55 -nographic -smp 4 -m 2048 -kernel Image --append "console=ttyAMA0"

Results:

========================================================================
BYTE UNIX Benchmarks (Version 5.1.3)

System: RK3568_qemu: GNU/Linux
OS: GNU/Linux -- 4.19.232 -- #2 SMP PREEMPT Fri Dec 29 09:46:36 CST 2023
Machine: aarch64 (unknown)
Language:  (charmap=, collate=)
CPU 0:  (125.0 bogomips)
CPU 1:  (125.0 bogomips)
CPU 2:  (125.0 bogomips)
CPU 3:  (125.0 bogomips)
02:50:14 up 2 min,  1 user,  load average: 0.61, 0.30, 0.11; runlevel

------------------------------------------------------------------------
Benchmark Run: Fri Dec 29 2023 02:50:15 - 03:20:33
4 CPUs in system; running 1 parallel copy of tests

Dhrystone 2 using register variables         274451.3 lps   (10.3 s, 7 samples)
Double-Precision Whetstone                       81.9 MWIPS (10.0 s, 7 samples)
Execl Throughput                                 20.0 lps   (29.6 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks          2864.5 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks             775.0 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks          9293.7 KBps  (30.0 s, 2 samples)
Pipe Throughput                                4334.4 lps   (10.3 s, 7 samples)
Pipe-based Context Switching                    464.6 lps   (10.3 s, 7 samples)
Process Creation                                 37.3 lps   (30.3 s, 2 samples)
Shell Scripts (1 concurrent)                     43.6 lpm   (60.5 s, 2 samples)
Shell Scripts (8 concurrent)                      7.3 lpm   (65.9 s, 2 samples)
System Call Overhead                           5166.9 lps   (10.3 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0     274451.3     23.5
Double-Precision Whetstone                       55.0         81.9     14.9
Execl Throughput                                 43.0         20.0      4.6
File Copy 1024 bufsize 2000 maxblocks          3960.0       2864.5      7.2
File Copy 256 bufsize 500 maxblocks            1655.0        775.0      4.7
File Copy 4096 bufsize 8000 maxblocks          5800.0       9293.7     16.0
Pipe Throughput                               12440.0       4334.4      3.5
Pipe-based Context Switching                   4000.0        464.6      1.2
Process Creation                                126.0         37.3      3.0
Shell Scripts (1 concurrent)                     42.4         43.6     10.3
Shell Scripts (8 concurrent)                      6.0          7.3     12.1
System Call Overhead                          15000.0       5166.9      3.4
                                                                   ========
System Benchmarks Index Score                                           6.4

------------------------------------------------------------------------
Benchmark Run: Fri Dec 29 2023 03:20:33 - 03:54:14
4 CPUs in system; running 4 parallel copies of tests

Dhrystone 2 using register variables         958472.9 lps   (10.8 s, 7 samples)
Double-Precision Whetstone                      305.9 MWIPS (10.0 s, 7 samples)
Execl Throughput                                 42.9 lps   (30.9 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks         11160.0 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks            2090.4 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks         40914.0 KBps  (30.0 s, 2 samples)
Pipe Throughput                                9138.3 lps   (11.2 s, 7 samples)
Pipe-based Context Switching                    766.3 lps   (11.3 s, 7 samples)
Process Creation                                 74.5 lps   (31.5 s, 2 samples)
Shell Scripts (1 concurrent)                     59.5 lpm   (63.5 s, 2 samples)
Shell Scripts (8 concurrent)                      3.1 lpm   (78.5 s, 2 samples)
System Call Overhead                           5585.6 lps   (11.3 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0     958472.9     82.1
Double-Precision Whetstone                       55.0        305.9     55.6
Execl Throughput                                 43.0         42.9     10.0
File Copy 1024 bufsize 2000 maxblocks          3960.0      11160.0     28.2
File Copy 256 bufsize 500 maxblocks            1655.0       2090.4     12.6
File Copy 4096 bufsize 8000 maxblocks          5800.0      40914.0     70.5
Pipe Throughput                               12440.0       9138.3      7.3
Pipe-based Context Switching                   4000.0        766.3      1.9
Process Creation                                126.0         74.5      5.9
Shell Scripts (1 concurrent)                     42.4         59.5     14.0
Shell Scripts (8 concurrent)                      6.0          3.1      5.1
System Call Overhead                          15000.0       5585.6      3.7
                                                                   ========
System Benchmarks Index Score                                          13.1
qemu (host, 4 cores, 2GB)

On the rk3568 Linux host, run:

qemu-system-aarch64 -cpu host -m 2048 -enable-kvm -nographic -machine virt -smp 4 -kernel Image -append "console=ttyAMA0"
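
The two qemu invocations in this document differ mainly in how the CPU is provided: emulated (cortex-a55) or passed through with KVM (host). A minimal sketch that picks between those two option sets depending on whether the KVM device node is usable; the function name qemu_accel_args is made up, and /dev/kvm is the standard KVM device path.

```shell
# qemu_accel_args: hypothetical helper; prints the CPU/machine options to use.
# The optional argument overrides the KVM device path (default /dev/kvm).
qemu_accel_args() {
    kvm_dev="${1:-/dev/kvm}"
    if [ -c "$kvm_dev" ] && [ -r "$kvm_dev" ] && [ -w "$kvm_dev" ]; then
        # KVM is usable: run the guest on the host CPU.
        echo "-enable-kvm -cpu host -machine virt"
    else
        # No KVM: fall back to full emulation of a Cortex-A55.
        echo "-machine virt,virtualization=true -cpu cortex-a55"
    fi
}
```

It could be used as qemu-system-aarch64 $(qemu_accel_args) -nographic -smp 4 -m 2048 -kernel Image -append "console=ttyAMA0", with the command substitution left unquoted so the options are split into separate arguments.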

Results:

========================================================================
BYTE UNIX Benchmarks (Version 5.1.3)

System: RK3568_qemu: GNU/Linux
OS: GNU/Linux -- 4.19.232 -- #2 SMP PREEMPT Fri Dec 29 09:46:36 CST 2023
Machine: aarch64 (unknown)
Language:  (charmap=, collate=)
CPU 0:  (48.0 bogomips)
CPU 1:  (48.0 bogomips)
CPU 2:  (48.0 bogomips)
CPU 3:  (48.0 bogomips)
07:02:45 up 0 min,  1 user,  load average: 0.03, 0.01, 0.00; runlevel

------------------------------------------------------------------------
Benchmark Run: Fri Dec 29 2023 07:02:45 - 07:31:04
4 CPUs in system; running 1 parallel copy of tests

Dhrystone 2 using register variables       11819460.8 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                     2945.1 MWIPS (10.0 s, 7 samples)
Execl Throughput                               1025.8 lps   (29.9 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        301514.3 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks           89249.3 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks        715154.7 KBps  (30.0 s, 2 samples)
Pipe Throughput                              577421.2 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                  31832.7 lps   (10.0 s, 7 samples)
Process Creation                               1459.0 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   1259.9 lpm   (60.1 s, 2 samples)
Shell Scripts (8 concurrent)                    526.5 lpm   (60.1 s, 2 samples)
System Call Overhead                         844864.8 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   11819460.8   1012.8
Double-Precision Whetstone                       55.0       2945.1    535.5
Execl Throughput                                 43.0       1025.8    238.6
File Copy 1024 bufsize 2000 maxblocks          3960.0     301514.3    761.4
File Copy 256 bufsize 500 maxblocks            1655.0      89249.3    539.3
File Copy 4096 bufsize 8000 maxblocks          5800.0     715154.7   1233.0
Pipe Throughput                               12440.0     577421.2    464.2
Pipe-based Context Switching                   4000.0      31832.7     79.6
Process Creation                                126.0       1459.0    115.8
Shell Scripts (1 concurrent)                     42.4       1259.9    297.1
Shell Scripts (8 concurrent)                      6.0        526.5    877.5
System Call Overhead                          15000.0     844864.8    563.2
                                                                   ========
System Benchmarks Index Score                                         431.1

------------------------------------------------------------------------
Benchmark Run: Fri Dec 29 2023 07:31:04 - 07:59:26
4 CPUs in system; running 4 parallel copies of tests

Dhrystone 2 using register variables       46075093.1 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                    11500.5 MWIPS (10.0 s, 7 samples)
Execl Throughput                               2586.5 lps   (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        617301.3 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks          183004.5 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks       1446823.8 KBps  (30.0 s, 2 samples)
Pipe Throughput                             2253111.7 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                 311796.3 lps   (10.0 s, 7 samples)
Process Creation                               4987.0 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   4130.5 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                    564.8 lpm   (60.2 s, 2 samples)
System Call Overhead                        3029954.4 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   46075093.1   3948.2
Double-Precision Whetstone                       55.0      11500.5   2091.0
Execl Throughput                                 43.0       2586.5    601.5
File Copy 1024 bufsize 2000 maxblocks          3960.0     617301.3   1558.8
File Copy 256 bufsize 500 maxblocks            1655.0     183004.5   1105.8
File Copy 4096 bufsize 8000 maxblocks          5800.0    1446823.8   2494.5
Pipe Throughput                               12440.0    2253111.7   1811.2
Pipe-based Context Switching                   4000.0     311796.3    779.5
Process Creation                                126.0       4987.0    395.8
Shell Scripts (1 concurrent)                     42.4       4130.5    974.2
Shell Scripts (8 concurrent)                      6.0        564.8    941.4
System Call Overhead                          15000.0    3029954.4   2020.0
                                                                   ========
System Benchmarks Index Score                                        1294.3
kvmtool (host, 4 cores, 2GB)
root@RK356X:/root# lkvm run --kernel Image -m 2048
  # lkvm run -k Image -m 2048 -c 4 --name guest-1568

The default kernel configuration is sufficient, and the same goes for the rootfs; there is no need to change ttyAMA0.

========================================================================
BYTE UNIX Benchmarks (Version 5.1.3)

System: RK3568_qemu: GNU/Linux
OS: GNU/Linux -- 5.16.12 -- #3 SMP PREEMPT Wed Jan 10 16:17:21 CST 2024
Machine: aarch64 (unknown)
Language:  (charmap=, collate=)
CPU 0:  (48.0 bogomips)
CPU 1:  (48.0 bogomips)
CPU 2:  (48.0 bogomips)
CPU 3:  (48.0 bogomips)
00:00:21 up 0 min,  0 users,  load average: 0.00, 0.00, 0.00; runlevel

------------------------------------------------------------------------
Benchmark Run: Thu Jan 01 1970 00:00:21 - 00:28:42
4 CPUs in system; running 1 parallel copy of tests

Dhrystone 2 using register variables       11791200.9 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                     2942.2 MWIPS (10.0 s, 7 samples)
Execl Throughput                                417.8 lps   (29.9 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        245581.3 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks           75954.9 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks        597586.0 KBps  (30.0 s, 2 samples)
Pipe Throughput                              514077.4 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                  26346.4 lps   (10.0 s, 7 samples)
Process Creation                                251.0 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   1071.0 lpm   (60.1 s, 2 samples)
Shell Scripts (8 concurrent)                    394.0 lpm   (60.2 s, 2 samples)
System Call Overhead                         715970.1 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   11791200.9   1010.4
Double-Precision Whetstone                       55.0       2942.2    534.9
Execl Throughput                                 43.0        417.8     97.2
File Copy 1024 bufsize 2000 maxblocks          3960.0     245581.3    620.2
File Copy 256 bufsize 500 maxblocks            1655.0      75954.9    458.9
File Copy 4096 bufsize 8000 maxblocks          5800.0     597586.0   1030.3
Pipe Throughput                               12440.0     514077.4    413.2
Pipe-based Context Switching                   4000.0      26346.4     65.9
Process Creation                                126.0        251.0     19.9
Shell Scripts (1 concurrent)                     42.4       1071.0    252.6
Shell Scripts (8 concurrent)                      6.0        394.0    656.7
System Call Overhead                          15000.0     715970.1    477.3
                                                                   ========
System Benchmarks Index Score                                         305.5

------------------------------------------------------------------------
Benchmark Run: Thu Jan 01 1970 00:28:42 - 00:57:05
4 CPUs in system; running 4 parallel copies of tests

Dhrystone 2 using register variables       46001918.1 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                    11480.7 MWIPS (10.0 s, 7 samples)
Execl Throughput                               1861.1 lps   (29.9 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        566906.0 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks          171430.7 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks       1294907.7 KBps  (30.0 s, 2 samples)
Pipe Throughput                             2005545.6 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                 215499.0 lps   (10.0 s, 7 samples)
Process Creation                               4414.2 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   3166.2 lpm   (60.1 s, 2 samples)
Shell Scripts (8 concurrent)                    422.4 lpm   (60.3 s, 2 samples)
System Call Overhead                        2611391.3 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   46001918.1   3941.9
Double-Precision Whetstone                       55.0      11480.7   2087.4
Execl Throughput                                 43.0       1861.1    432.8
File Copy 1024 bufsize 2000 maxblocks          3960.0     566906.0   1431.6
File Copy 256 bufsize 500 maxblocks            1655.0     171430.7   1035.8
File Copy 4096 bufsize 8000 maxblocks          5800.0    1294907.7   2232.6
Pipe Throughput                               12440.0    2005545.6   1612.2
Pipe-based Context Switching                   4000.0     215499.0    538.7
Process Creation                                126.0       4414.2    350.3
Shell Scripts (1 concurrent)                     42.4       3166.2    746.7
Shell Scripts (8 concurrent)                      6.0        422.4    704.0
System Call Overhead                          15000.0    2611391.3   1740.9
                                                                   ========
System Benchmarks Index Score                                        1104.2
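
For reference, the INDEX column and the final score in these tables can be reproduced by hand. Each index appears to be RESULT / BASELINE x 10, and the System Benchmarks Index Score matches the geometric mean of the twelve indices. The Python sketch below checks this against the 4-copy table directly above; the function names are illustrative and are not part of UnixBench.

```python
import math

# (BASELINE, RESULT) pairs copied from the 4-copy table directly above.
ROWS = {
    "Dhrystone 2 using register variables": (116700.0, 46001918.1),
    "Double-Precision Whetstone": (55.0, 11480.7),
    "Execl Throughput": (43.0, 1861.1),
    "File Copy 1024 bufsize 2000 maxblocks": (3960.0, 566906.0),
    "File Copy 256 bufsize 500 maxblocks": (1655.0, 171430.7),
    "File Copy 4096 bufsize 8000 maxblocks": (5800.0, 1294907.7),
    "Pipe Throughput": (12440.0, 2005545.6),
    "Pipe-based Context Switching": (4000.0, 215499.0),
    "Process Creation": (126.0, 4414.2),
    "Shell Scripts (1 concurrent)": (42.4, 3166.2),
    "Shell Scripts (8 concurrent)": (6.0, 422.4),
    "System Call Overhead": (15000.0, 2611391.3),
}


def index_value(baseline, result):
    """Index of one test: the result relative to the baseline, scaled by 10."""
    return result / baseline * 10.0


def index_score(rows):
    """Overall score: geometric mean of the per-test index values."""
    logs = [math.log(index_value(b, r)) for b, r in rows.values()]
    return math.exp(sum(logs) / len(logs))


if __name__ == "__main__":
    for name, (baseline, result) in ROWS.items():
        print(f"{name:40s} {index_value(baseline, result):8.1f}")
    print(f"{'System Benchmarks Index Score':40s} {index_score(ROWS):8.1f}")
```

Because the score is a geometric mean, one very slow test (for example Pipe-based Context Switching) pulls the total down more than it would in an arithmetic average.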
Raspberry Pi 4B (A72, 4 cores, 2 GB)
Running Linux natively

Run UnixBench directly from tmpfs:

========================================================================
   BYTE UNIX Benchmarks (Version 5.1.3)

   System: raspberrypi: GNU/Linux
   OS: GNU/Linux -- 6.1.0-rpi7-rpi-v8 -- #1 SMP PREEMPT Debian 1:6.1.63-1+rpt1 (2023-11-24)
   Machine: aarch64 (unknown)
   Language: en_US.utf8 (charmap="ANSI_X3.4-1968", collate="ANSI_X3.4-1968")
   CPU 0:  (108.0 bogomips)
   CPU 1:  (108.0 bogomips)
   CPU 2:  (108.0 bogomips)
   CPU 3:  (108.0 bogomips)
   04:53:13 up 10 min,  3 users,  load average: 0.69, 0.54, 0.31; runlevel Jan

------------------------------------------------------------------------
Benchmark Run: Wed Jan 03 2024 04:53:13 - 05:21:26
4 CPUs in system; running 1 parallel copy of tests

Dhrystone 2 using register variables       19203197.4 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                     3225.8 MWIPS (9.9 s, 7 samples)
Execl Throughput                               1132.7 lps   (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        190043.9 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks           57100.4 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks        520541.2 KBps  (30.0 s, 2 samples)
Pipe Throughput                              182851.4 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                  37719.8 lps   (10.0 s, 7 samples)
Process Creation                               1996.6 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   3365.8 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                    455.4 lpm   (60.1 s, 2 samples)
System Call Overhead                         128006.6 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   19203197.4   1645.5
Double-Precision Whetstone                       55.0       3225.8    586.5
Execl Throughput                                 43.0       1132.7    263.4
File Copy 1024 bufsize 2000 maxblocks          3960.0     190043.9    479.9
File Copy 256 bufsize 500 maxblocks            1655.0      57100.4    345.0
File Copy 4096 bufsize 8000 maxblocks          5800.0     520541.2    897.5
Pipe Throughput                               12440.0     182851.4    147.0
Pipe-based Context Switching                   4000.0      37719.8     94.3
Process Creation                                126.0       1996.6    158.5
Shell Scripts (1 concurrent)                     42.4       3365.8    793.8
Shell Scripts (8 concurrent)                      6.0        455.4    759.0
System Call Overhead                          15000.0     128006.6     85.3
                                                                   ========
System Benchmarks Index Score                                         356.9

------------------------------------------------------------------------
Benchmark Run: Wed Jan 03 2024 05:21:26 - 05:49:02
4 CPUs in system; running 4 parallel copies of tests

Dhrystone 2 using register variables       27031362.4 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                     5970.9 MWIPS (7.1 s, 7 samples)
Execl Throughput                               1539.6 lps   (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        285477.7 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks           80235.0 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks        798634.9 KBps  (30.0 s, 2 samples)
Pipe Throughput                              257923.5 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                  45554.8 lps   (10.0 s, 7 samples)
Process Creation                               4925.6 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   3740.2 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                    499.8 lpm   (60.1 s, 2 samples)
System Call Overhead                         178635.9 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   27031362.4   2316.3
Double-Precision Whetstone                       55.0       5970.9   1085.6
Execl Throughput                                 43.0       1539.6    358.0
File Copy 1024 bufsize 2000 maxblocks          3960.0     285477.7    720.9
File Copy 256 bufsize 500 maxblocks            1655.0      80235.0    484.8
File Copy 4096 bufsize 8000 maxblocks          5800.0     798634.9   1377.0
Pipe Throughput                               12440.0     257923.5    207.3
Pipe-based Context Switching                   4000.0      45554.8    113.9
Process Creation                                126.0       4925.6    390.9
Shell Scripts (1 concurrent)                     42.4       3740.2    882.1
Shell Scripts (8 concurrent)                      6.0        499.8    832.9
System Call Overhead                          15000.0     178635.9    119.1
                                                                   ========
System Benchmarks Index Score                                         515.2

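Powering the board directly from a 5 V / 3 A phone charger improved performance significantly (the 4-copy index score rose from 515.2 to 1149.4), so the board must be given an adequate power supply: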

========================================================================
   BYTE UNIX Benchmarks (Version 5.1.3)

   System: raspberrypi: GNU/Linux
   OS: GNU/Linux -- 6.1.0-rpi7-rpi-v8 -- #1 SMP PREEMPT Debian 1:6.1.63-1+rpt1 (2023-11-24)
   Machine: aarch64 (unknown)
   Language: en_US.utf8 (charmap="ANSI_X3.4-1968", collate="ANSI_X3.4-1968")
   CPU 0:  (108.0 bogomips)
   CPU 1:  (108.0 bogomips)
   CPU 2:  (108.0 bogomips)
   CPU 3:  (108.0 bogomips)
   08:27:14 up 0 min,  3 users,  load average: 1.07, 0.39, 0.14; runlevel Jan

------------------------------------------------------------------------
Benchmark Run: Wed Jan 03 2024 08:27:14 - 08:55:28
4 CPUs in system; running 1 parallel copy of tests

Dhrystone 2 using register variables       19259097.7 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                     3225.7 MWIPS (9.9 s, 7 samples)
Execl Throughput                               1109.5 lps   (29.8 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        172250.3 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks           50151.7 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks        479185.6 KBps  (30.0 s, 2 samples)
Pipe Throughput                              183515.7 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                  37330.6 lps   (10.0 s, 7 samples)
Process Creation                               1973.5 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   3498.7 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                    962.3 lpm   (60.0 s, 2 samples)
System Call Overhead                         128098.5 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   19259097.7   1650.3
Double-Precision Whetstone                       55.0       3225.7    586.5
Execl Throughput                                 43.0       1109.5    258.0
File Copy 1024 bufsize 2000 maxblocks          3960.0     172250.3    435.0
File Copy 256 bufsize 500 maxblocks            1655.0      50151.7    303.0
File Copy 4096 bufsize 8000 maxblocks          5800.0     479185.6    826.2
Pipe Throughput                               12440.0     183515.7    147.5
Pipe-based Context Switching                   4000.0      37330.6     93.3
Process Creation                                126.0       1973.5    156.6
Shell Scripts (1 concurrent)                     42.4       3498.7    825.2
Shell Scripts (8 concurrent)                      6.0        962.3   1603.9
System Call Overhead                          15000.0     128098.5     85.4
                                                                   ========
System Benchmarks Index Score                                         370.2

------------------------------------------------------------------------
Benchmark Run: Wed Jan 03 2024 08:55:28 - 09:23:44
4 CPUs in system; running 4 parallel copies of tests

Dhrystone 2 using register variables       74018347.3 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                    12887.5 MWIPS (9.9 s, 7 samples)
Execl Throughput                               3300.0 lps   (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        650225.3 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks          193757.5 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks       1355042.7 KBps  (30.0 s, 2 samples)
Pipe Throughput                              705913.3 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                 127388.2 lps   (10.0 s, 7 samples)
Process Creation                               7114.5 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   7725.8 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                   1020.3 lpm   (60.1 s, 2 samples)
System Call Overhead                         492634.0 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   74018347.3   6342.6
Double-Precision Whetstone                       55.0      12887.5   2343.2
Execl Throughput                                 43.0       3300.0    767.5
File Copy 1024 bufsize 2000 maxblocks          3960.0     650225.3   1642.0
File Copy 256 bufsize 500 maxblocks            1655.0     193757.5   1170.7
File Copy 4096 bufsize 8000 maxblocks          5800.0    1355042.7   2336.3
Pipe Throughput                               12440.0     705913.3    567.5
Pipe-based Context Switching                   4000.0     127388.2    318.5
Process Creation                                126.0       7114.5    564.6
Shell Scripts (1 concurrent)                     42.4       7725.8   1822.1
Shell Scripts (8 concurrent)                      6.0       1020.3   1700.4
System Call Overhead                          15000.0     492634.0    328.4
                                                                   ========
System Benchmarks Index Score                                        1149.4
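
The measurements above do not record why the weaker supply was slower. One common cause on Raspberry Pi boards is under-voltage throttling, which the firmware reports through `vcgencmd get_throttled`. The sketch below shows how that bit mask could be decoded. It is an assumption about a useful check, not something that was run for this test, and the sample value `0x50005` is made up for illustration.

```python
# Bit meanings of the value printed by `vcgencmd get_throttled`.
THROTTLE_FLAGS = {
    0: "under-voltage detected",
    1: "ARM frequency capped",
    2: "currently throttled",
    3: "soft temperature limit active",
    16: "under-voltage has occurred",
    17: "ARM frequency capping has occurred",
    18: "throttling has occurred",
    19: "soft temperature limit has occurred",
}


def decode_throttled(output):
    """Turn a line such as 'throttled=0x50005' into a list of active flags."""
    value = int(output.strip().split("=", 1)[1], 16)
    return [text for bit, text in THROTTLE_FLAGS.items() if value & (1 << bit)]


if __name__ == "__main__":
    # On the board, the input line would come from: vcgencmd get_throttled
    sample = "throttled=0x50005"  # hypothetical value, not measured here
    for flag in decode_throttled(sample):
        print(flag)
```

A result of `throttled=0x0` before and after a benchmark run would suggest that the supply was not limiting the CPU during that run.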

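Rerunning the test on Ubuntu 20.04.5 gave lower scores (325.8 and 871.8, versus 370.2 and 1149.4 on the Debian-based system above):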

========================================================================
   BYTE UNIX Benchmarks (Version 5.1.3)

   System: ubuntu: GNU/Linux
   OS: GNU/Linux -- 5.4.0-1100-raspi -- #112-Ubuntu SMP PREEMPT Fri Nov 24 15:35:17 UTC 2023
   Machine: aarch64 (aarch64)
   Language: en_US.utf8 (charmap="UTF-8", collate="UTF-8")
   CPU 0:  (108.0 bogomips)
   CPU 1:  (108.0 bogomips)
   CPU 2:  (108.0 bogomips)
   CPU 3:  (108.0 bogomips)
   04:38:47 up 6 min,  1 user,  load average: 0.24, 0.23, 0.10; runlevel 2024-01-04

------------------------------------------------------------------------
Benchmark Run: Thu Jan 04 2024 04:38:47 - 05:07:02
4 CPUs in system; running 1 parallel copy of tests

Dhrystone 2 using register variables       18051001.2 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                     3218.4 MWIPS (9.9 s, 7 samples)
Execl Throughput                               1029.2 lps   (29.9 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        105458.2 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks           30120.0 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks        294282.6 KBps  (30.0 s, 2 samples)
Pipe Throughput                              159132.6 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                  33518.0 lps   (10.0 s, 7 samples)
Process Creation                               2875.3 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   2477.9 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                    834.8 lpm   (60.0 s, 2 samples)
System Call Overhead                         202649.0 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   18051001.2   1546.8
Double-Precision Whetstone                       55.0       3218.4    585.2
Execl Throughput                                 43.0       1029.2    239.4
File Copy 1024 bufsize 2000 maxblocks          3960.0     105458.2    266.3
File Copy 256 bufsize 500 maxblocks            1655.0      30120.0    182.0
File Copy 4096 bufsize 8000 maxblocks          5800.0     294282.6    507.4
Pipe Throughput                               12440.0     159132.6    127.9
Pipe-based Context Switching                   4000.0      33518.0     83.8
Process Creation                                126.0       2875.3    228.2
Shell Scripts (1 concurrent)                     42.4       2477.9    584.4
Shell Scripts (8 concurrent)                      6.0        834.8   1391.4
System Call Overhead                          15000.0     202649.0    135.1
                                                                   ========
System Benchmarks Index Score                                         325.8

------------------------------------------------------------------------
Benchmark Run: Thu Jan 04 2024 05:07:02 - 05:35:18
4 CPUs in system; running 4 parallel copies of tests

Dhrystone 2 using register variables       70864726.5 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                    12882.1 MWIPS (9.9 s, 7 samples)
Execl Throughput                               2917.2 lps   (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        205595.6 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks           55555.4 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks        492145.4 KBps  (30.0 s, 2 samples)
Pipe Throughput                              635438.3 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                 148162.1 lps   (10.0 s, 7 samples)
Process Creation                               6958.9 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   6744.2 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                    922.1 lpm   (60.1 s, 2 samples)
System Call Overhead                         793322.3 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   70864726.5   6072.4
Double-Precision Whetstone                       55.0      12882.1   2342.2
Execl Throughput                                 43.0       2917.2    678.4
File Copy 1024 bufsize 2000 maxblocks          3960.0     205595.6    519.2
File Copy 256 bufsize 500 maxblocks            1655.0      55555.4    335.7
File Copy 4096 bufsize 8000 maxblocks          5800.0     492145.4    848.5
Pipe Throughput                               12440.0     635438.3    510.8
Pipe-based Context Switching                   4000.0     148162.1    370.4
Process Creation                                126.0       6958.9    552.3
Shell Scripts (1 concurrent)                     42.4       6744.2   1590.6
Shell Scripts (8 concurrent)                      6.0        922.1   1536.9
System Call Overhead                          15000.0     793322.3    528.9
                                                                   ========
System Benchmarks Index Score                                         871.8

After switching to the root user:

   System: ubuntu: GNU/Linux
   OS: GNU/Linux -- 5.4.0-1100-raspi -- #112-Ubuntu SMP PREEMPT Fri Nov 24 15:35:17 UTC 2023
   Machine: aarch64 (aarch64)
   Language: en_US.utf8 (charmap="UTF-8", collate="UTF-8")
   CPU 0:  (108.0 bogomips)
   CPU 1:  (108.0 bogomips)
   CPU 2:  (108.0 bogomips)
   CPU 3:  (108.0 bogomips)
   07:55:25 up 1 min,  1 user,  load average: 0.14, 0.11, 0.04; runlevel 2024-01-04

------------------------------------------------------------------------
Benchmark Run: Thu Jan 04 2024 07:55:25 - 08:23:40
4 CPUs in system; running 1 parallel copy of tests

Dhrystone 2 using register variables       18111605.8 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                     3219.5 MWIPS (9.9 s, 7 samples)
Execl Throughput                                997.6 lps   (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        107925.9 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks           29536.4 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks        301787.4 KBps  (30.0 s, 2 samples)
Pipe Throughput                              164363.4 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                  33460.7 lps   (10.0 s, 7 samples)
Process Creation                               2821.3 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   2483.8 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                    839.2 lpm   (60.1 s, 2 samples)
System Call Overhead                         203130.0 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   18111605.8   1552.0
Double-Precision Whetstone                       55.0       3219.5    585.4
Execl Throughput                                 43.0        997.6    232.0
File Copy 1024 bufsize 2000 maxblocks          3960.0     107925.9    272.5
File Copy 256 bufsize 500 maxblocks            1655.0      29536.4    178.5
File Copy 4096 bufsize 8000 maxblocks          5800.0     301787.4    520.3
Pipe Throughput                               12440.0     164363.4    132.1
Pipe-based Context Switching                   4000.0      33460.7     83.7
Process Creation                                126.0       2821.3    223.9
Shell Scripts (1 concurrent)                     42.4       2483.8    585.8
Shell Scripts (8 concurrent)                      6.0        839.2   1398.7
System Call Overhead                          15000.0     203130.0    135.4
                                                                   ========
System Benchmarks Index Score                                         326.4

------------------------------------------------------------------------
Benchmark Run: Thu Jan 04 2024 08:23:40 - 08:51:58
4 CPUs in system; running 4 parallel copies of tests

Dhrystone 2 using register variables       69996456.5 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                    12829.6 MWIPS (10.0 s, 7 samples)
Execl Throughput                               2890.7 lps   (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        202244.3 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks           54757.0 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks        491815.7 KBps  (30.0 s, 2 samples)
Pipe Throughput                              629127.6 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                 146257.3 lps   (10.0 s, 7 samples)
Process Creation                               7002.8 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   6768.1 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                    920.9 lpm   (60.1 s, 2 samples)
System Call Overhead                         793749.6 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   69996456.5   5998.0
Double-Precision Whetstone                       55.0      12829.6   2332.6
Execl Throughput                                 43.0       2890.7    672.3
File Copy 1024 bufsize 2000 maxblocks          3960.0     202244.3    510.7
File Copy 256 bufsize 500 maxblocks            1655.0      54757.0    330.9
File Copy 4096 bufsize 8000 maxblocks          5800.0     491815.7    848.0
Pipe Throughput                               12440.0     629127.6    505.7
Pipe-based Context Switching                   4000.0     146257.3    365.6
Process Creation                                126.0       7002.8    555.8
Shell Scripts (1 concurrent)                     42.4       6768.1   1596.2
Shell Scripts (8 concurrent)                      6.0        920.9   1534.8
System Call Overhead                          15000.0     793749.6    529.2
                                                                   ========
System Benchmarks Index Score                                         866.7
qemu (host CPU model, 4 cores, 1 GB)
qemu-system-aarch64 -cpu host -m 1024 -enable-kvm -nographic -machine virt -smp 4 -kernel Image -append "console=ttyAMA0"

Results inside the running guest:

========================================================================
   BYTE UNIX Benchmarks (Version 5.1.3)

   System: RK3568_qemu: GNU/Linux
   OS: GNU/Linux -- 4.19.232 -- #3 SMP PREEMPT Wed Jan 3 10:24:38 CST 2024
   Machine: aarch64 (unknown)
   Language:  (charmap=, collate=)
   CPU 0:  (108.0 bogomips)
   CPU 1:  (108.0 bogomips)
   CPU 2:  (108.0 bogomips)
   CPU 3:  (108.0 bogomips)
   09:27:56 up 0 min,  1 user,  load average: 0.00, 0.00, 0.00; runlevel

------------------------------------------------------------------------
Benchmark Run: Wed Jan 03 2024 09:27:56 - 09:56:06
4 CPUs in system; running 1 parallel copy of tests

Dhrystone 2 using register variables       19394971.2 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                     3269.5 MWIPS (9.9 s, 7 samples)
Execl Throughput                               2316.3 lps   (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        436850.9 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks          157413.2 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks        825354.5 KBps  (30.0 s, 2 samples)
Pipe Throughput                              749421.9 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                  40439.3 lps   (10.0 s, 7 samples)
Process Creation                               3964.6 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   3347.3 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                    847.3 lpm   (60.0 s, 2 samples)
System Call Overhead                         638685.4 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   19394971.2   1662.0
Double-Precision Whetstone                       55.0       3269.5    594.5
Execl Throughput                                 43.0       2316.3    538.7
File Copy 1024 bufsize 2000 maxblocks          3960.0     436850.9   1103.2
File Copy 256 bufsize 500 maxblocks            1655.0     157413.2    951.1
File Copy 4096 bufsize 8000 maxblocks          5800.0     825354.5   1423.0
Pipe Throughput                               12440.0     749421.9    602.4
Pipe-based Context Switching                   4000.0      40439.3    101.1
Process Creation                                126.0       3964.6    314.6
Shell Scripts (1 concurrent)                     42.4       3347.3    789.5
Shell Scripts (8 concurrent)                      6.0        847.3   1412.2
System Call Overhead                          15000.0     638685.4    425.8
                                                                   ========
System Benchmarks Index Score                                         663.1

------------------------------------------------------------------------
Benchmark Run: Wed Jan 03 2024 09:56:06 - 10:24:18
4 CPUs in system; running 4 parallel copies of tests

Dhrystone 2 using register variables       74545237.5 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                    13074.2 MWIPS (9.9 s, 7 samples)
Execl Throughput                               4403.0 lps   (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks       1136175.9 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks          361147.3 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks       1916223.8 KBps  (30.0 s, 2 samples)
Pipe Throughput                             2880998.3 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                 475862.9 lps   (10.0 s, 7 samples)
Process Creation                               7806.2 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   6233.5 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                    898.2 lpm   (60.2 s, 2 samples)
System Call Overhead                        2400486.5 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   74545237.5   6387.8
Double-Precision Whetstone                       55.0      13074.2   2377.1
Execl Throughput                                 43.0       4403.0   1023.9
File Copy 1024 bufsize 2000 maxblocks          3960.0    1136175.9   2869.1
File Copy 256 bufsize 500 maxblocks            1655.0     361147.3   2182.2
File Copy 4096 bufsize 8000 maxblocks          5800.0    1916223.8   3303.8
Pipe Throughput                               12440.0    2880998.3   2315.9
Pipe-based Context Switching                   4000.0     475862.9   1189.7
Process Creation                                126.0       7806.2    619.5
Shell Scripts (1 concurrent)                     42.4       6233.5   1470.2
Shell Scripts (8 concurrent)                      6.0        898.2   1497.0
System Call Overhead                          15000.0    2400486.5   1600.3
                                                                   ========
System Benchmarks Index Score                                        1878.7

Running the virtual machine on the Ubuntu host:

========================================================================
   BYTE UNIX Benchmarks (Version 5.1.3)

   System: RK3568_qemu: GNU/Linux
   OS: GNU/Linux -- 4.19.232 -- #3 SMP PREEMPT Wed Jan 3 10:24:38 CST 2024
   Machine: aarch64 (unknown)
   Language:  (charmap=, collate=)
   CPU 0:  (108.0 bogomips)
   CPU 1:  (108.0 bogomips)
   CPU 2:  (108.0 bogomips)
   CPU 3:  (108.0 bogomips)
   03:13:55 up 2 min,  1 user,  load average: 0.00, 0.00, 0.00; runlevel

------------------------------------------------------------------------
Benchmark Run: Thu Jan 04 2024 03:13:55 - 03:42:06
4 CPUs in system; running 1 parallel copy of tests

Dhrystone 2 using register variables       19022473.9 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                     3265.1 MWIPS (9.9 s, 7 samples)
Execl Throughput                               2390.8 lps   (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        436414.6 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks          156531.9 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks        816641.9 KBps  (30.0 s, 2 samples)
Pipe Throughput                              750129.7 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                  47353.1 lps   (10.0 s, 7 samples)
Process Creation                               4155.5 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   2867.3 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                    858.4 lpm   (60.0 s, 2 samples)
System Call Overhead                         639944.1 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   19022473.9   1630.0
Double-Precision Whetstone                       55.0       3265.1    593.6
Execl Throughput                                 43.0       2390.8    556.0
File Copy 1024 bufsize 2000 maxblocks          3960.0     436414.6   1102.1
File Copy 256 bufsize 500 maxblocks            1655.0     156531.9    945.8
File Copy 4096 bufsize 8000 maxblocks          5800.0     816641.9   1408.0
Pipe Throughput                               12440.0     750129.7    603.0
Pipe-based Context Switching                   4000.0      47353.1    118.4
Process Creation                                126.0       4155.5    329.8
Shell Scripts (1 concurrent)                     42.4       2867.3    676.2
Shell Scripts (8 concurrent)                      6.0        858.4   1430.7
System Call Overhead                          15000.0     639944.1    426.6
                                                                   ========
System Benchmarks Index Score                                         666.4

------------------------------------------------------------------------
Benchmark Run: Thu Jan 04 2024 03:42:06 - 04:10:19
4 CPUs in system; running 4 parallel copies of tests

Dhrystone 2 using register variables       75362148.4 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                    13078.8 MWIPS (10.0 s, 7 samples)
Execl Throughput                               4618.8 lps   (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks       1075601.7 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks          370916.8 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks       1761060.8 KBps  (30.0 s, 2 samples)
Pipe Throughput                             2935171.9 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                 505143.3 lps   (10.0 s, 7 samples)
Process Creation                               8491.8 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   6652.1 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                    931.5 lpm   (60.1 s, 2 samples)
System Call Overhead                        2401011.6 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   75362148.4   6457.8
Double-Precision Whetstone                       55.0      13078.8   2378.0
Execl Throughput                                 43.0       4618.8   1074.1
File Copy 1024 bufsize 2000 maxblocks          3960.0    1075601.7   2716.2
File Copy 256 bufsize 500 maxblocks            1655.0     370916.8   2241.2
File Copy 4096 bufsize 8000 maxblocks          5800.0    1761060.8   3036.3
Pipe Throughput                               12440.0    2935171.9   2359.5
Pipe-based Context Switching                   4000.0     505143.3   1262.9
Process Creation                                126.0       8491.8    674.0
Shell Scripts (1 concurrent)                     42.4       6652.1   1568.9
Shell Scripts (8 concurrent)                      6.0        931.5   1552.5
System Call Overhead                          15000.0    2401011.6   1600.7
                                                                   ========
System Benchmarks Index Score                                        1912.0
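
To make the Raspberry Pi 4B runs easier to compare, the sketch below collects the index scores reported above and computes two ratios: how well each system scales from 1 to 4 parallel copies, and how the 4-copy score of each run compares with a chosen reference run. The dictionary keys are shorthand labels introduced here, not names used by UnixBench, and the choice of reference run is only an example.

```python
# (1-copy score, 4-copy score) as reported in the tables above.
SCORES = {
    "debian-host-first-run": (356.9, 515.2),
    "debian-host-5v3a": (370.2, 1149.4),
    "ubuntu-host": (325.8, 871.8),
    "ubuntu-host-root": (326.4, 866.7),
    "kvm-guest-first-run": (663.1, 1878.7),
    "kvm-guest-ubuntu-host": (666.4, 1912.0),
}


def scaling(single, multi):
    """Ratio of the 4-copy score to the 1-copy score (4.0 would be ideal)."""
    return multi / single


def relative(name, reference, scores=SCORES):
    """4-copy score of `name` divided by the 4-copy score of `reference`."""
    return scores[name][1] / scores[reference][1]


if __name__ == "__main__":
    for name, (single, multi) in SCORES.items():
        print(f"{name:24s} scaling x{scaling(single, multi):.2f}  "
              f"vs debian-host-5v3a x{relative(name, 'debian-host-5v3a'):.2f}")
```

The guest scoring higher than the host is unusual for a virtualization overhead test, which is why the kernel differences discussed next matter.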

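Where does the difference come from? Most likely from the kernel version and configuration, which were not the same across the systems: the hosts ran kernels 6.1.0-rpi7 and 5.4.0-1100-raspi, while the guest ran 4.19.232.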
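
One way to follow up on this would be to compare the kernel configurations of host and guest option by option. The sketch below parses two `.config`-style texts and lists the options that differ. It is a minimal illustration: the sample fragments are invented, and on a real system the text would come from a file such as `/boot/config-$(uname -r)`, or from `/proc/config.gz` when the kernel was built with CONFIG_IKCONFIG_PROC.

```python
def parse_config(text):
    """Parse kernel .config text into a dict of option -> value."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(" is not set"):
            # Disabled options are written as comments; record them as "n".
            options[line[2:-len(" is not set")]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            key, value = line.split("=", 1)
            options[key] = value
    return options


def diff_configs(host, guest):
    """Return {option: (host_value, guest_value)} for options that differ."""
    keys = sorted(set(host) | set(guest))
    return {k: (host.get(k), guest.get(k)) for k in keys
            if host.get(k) != guest.get(k)}


if __name__ == "__main__":
    # Invented fragments for illustration; not taken from the tested kernels.
    host_text = "CONFIG_HZ=250\nCONFIG_PREEMPT=y\n# CONFIG_KVM is not set\n"
    guest_text = "CONFIG_HZ=1000\nCONFIG_PREEMPT=y\nCONFIG_KVM=y\n"
    changes = diff_configs(parse_config(host_text), parse_config(guest_text))
    for option, (host_value, guest_value) in changes.items():
        print(f"{option}: host={host_value} guest={guest_value}")
```

Options that affect system call and context switch cost are the first ones worth checking, since those tests show the largest gaps between host and guest.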

Written by xuxeu
