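ClickHouse load testing

I. Dataset preparation

You can use one of the datasets from the official site, or generate one with ssb-dbgen.

1. Preparing the data

The final table generated here has about 6 billion rows and takes up roughly 300 GB.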

git clone https://github.com/vadimtk/ssb-dbgen.git
cd ssb-dbgen/
make

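1.1 Generating the data

# -s is the scale factor, which controls how much data is generated
# -T selects the table: c = customer, l = lineorder, p = part, s = supplier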
$ ./dbgen -s 40 -T c
$ ./dbgen -s 40 -T l
$ ./dbgen -s 40 -T p
$ ./dbgen -s 40 -T s
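
As a rough sizing aid, the sketch below estimates the lineorder row count from the scale factor. It assumes the usual SSB rule of about 6,000,000 lineorder rows per unit of scale factor, which agrees with the figures in the ClickHouse SSB documentation (-s 100 gives 600 million rows, -s 1000 gives 6 billion). The function name is made up for illustration. By this estimate -s 40 yields about 240 million rows, while a table of 6 billion rows would correspond to -s 1000.

```python
# Assumption: lineorder holds roughly 6,000,000 rows per unit of scale factor.
LINEORDER_ROWS_PER_SF = 6_000_000


def estimate_lineorder_rows(scale_factor: int) -> int:
    """Return the approximate number of lineorder rows dbgen produces for -s."""
    if scale_factor <= 0:
        raise ValueError("scale factor must be positive")
    return LINEORDER_ROWS_PER_SF * scale_factor


if __name__ == "__main__":
    for sf in (40, 100, 1000):
        print(f"-s {sf}: about {estimate_lineorder_rows(sf):,} lineorder rows")
```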

1.2 Creating the tables

-- The import commands and benchmark queries use the db_bench database,
-- so the tables are created there.
CREATE DATABASE IF NOT EXISTS db_bench;

CREATE TABLE db_bench.customer
(
    C_CUSTKEY       UInt32,
    C_NAME          String,
    C_ADDRESS       String,
    C_CITY          LowCardinality(String),
    C_NATION        LowCardinality(String),
    C_REGION        LowCardinality(String),
    C_PHONE         String,
    C_MKTSEGMENT    LowCardinality(String)
)
ENGINE = MergeTree ORDER BY (C_CUSTKEY);

CREATE TABLE db_bench.lineorder
(
    LO_ORDERKEY             UInt32,
    LO_LINENUMBER           UInt8,
    LO_CUSTKEY              UInt32,
    LO_PARTKEY              UInt32,
    LO_SUPPKEY              UInt32,
    LO_ORDERDATE            Date,
    LO_ORDERPRIORITY        LowCardinality(String),
    LO_SHIPPRIORITY         UInt8,
    LO_QUANTITY             UInt8,
    LO_EXTENDEDPRICE        UInt32,
    LO_ORDTOTALPRICE        UInt32,
    LO_DISCOUNT             UInt8,
    LO_REVENUE              UInt32,
    LO_SUPPLYCOST           UInt32,
    LO_TAX                  UInt8,
    LO_COMMITDATE           Date,
    LO_SHIPMODE             LowCardinality(String)
)
ENGINE = MergeTree PARTITION BY toYear(LO_ORDERDATE) ORDER BY (LO_ORDERDATE, LO_ORDERKEY);

CREATE TABLE db_bench.part
(
    P_PARTKEY       UInt32,
    P_NAME          String,
    P_MFGR          LowCardinality(String),
    P_CATEGORY      LowCardinality(String),
    P_BRAND         LowCardinality(String),
    P_COLOR         LowCardinality(String),
    P_TYPE          LowCardinality(String),
    P_SIZE          UInt8,
    P_CONTAINER     LowCardinality(String)
)
ENGINE = MergeTree ORDER BY P_PARTKEY;

CREATE TABLE db_bench.supplier
(
    S_SUPPKEY       UInt32,
    S_NAME          String,
    S_ADDRESS       String,
    S_CITY          LowCardinality(String),
    S_NATION        LowCardinality(String),
    S_REGION        LowCardinality(String),
    S_PHONE         String
)
ENGINE = MergeTree ORDER BY S_SUPPKEY;

1.3 Importing the data

$ clickhouse-client --query "INSERT INTO db_bench.customer FORMAT CSV" < customer.tbl
$ clickhouse-client --query "INSERT INTO db_bench.part FORMAT CSV" < part.tbl
$ clickhouse-client --query "INSERT INTO db_bench.supplier FORMAT CSV" < supplier.tbl
$ clickhouse-client --query "INSERT INTO db_bench.lineorder FORMAT CSV" < lineorder.tbl
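
One way to check an import is to compare the number of lines in each .tbl file with the result of SELECT count() on the matching table. A minimal sketch of the file side, assuming every row ends with a newline; the function name and the sample rows are illustrative only.

```python
import io


def count_rows(stream) -> int:
    """Count newline-terminated rows in a binary stream, reading in chunks."""
    rows = 0
    while True:
        chunk = stream.read(1 << 20)  # 1 MiB at a time keeps memory use flat
        if not chunk:
            return rows
        rows += chunk.count(b"\n")


if __name__ == "__main__":
    # Tiny in-memory stand-in for a .tbl file; the rows are made up.
    sample = io.BytesIO(b"1,a\n2,b\n3,c\n")
    print(count_rows(sample))
    # For a real file: with open("customer.tbl", "rb") as f: count_rows(f)
```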

1.4 Joining the tables

This step took two hours and used 29 GB of memory.

-- This step is memory-hungry, so set the memory limit beforehand
SET max_memory_usage = 30000000000;

CREATE TABLE db_bench.lineorder_flat
ENGINE = MergeTree ORDER BY (LO_ORDERDATE, LO_ORDERKEY)
AS SELECT
    l.LO_ORDERKEY AS LO_ORDERKEY,
    l.LO_LINENUMBER AS LO_LINENUMBER,
    l.LO_CUSTKEY AS LO_CUSTKEY,
    l.LO_PARTKEY AS LO_PARTKEY,
    l.LO_SUPPKEY AS LO_SUPPKEY,
    l.LO_ORDERDATE AS LO_ORDERDATE,
    l.LO_ORDERPRIORITY AS LO_ORDERPRIORITY,
    l.LO_SHIPPRIORITY AS LO_SHIPPRIORITY,
    l.LO_QUANTITY AS LO_QUANTITY,
    l.LO_EXTENDEDPRICE AS LO_EXTENDEDPRICE,
    l.LO_ORDTOTALPRICE AS LO_ORDTOTALPRICE,
    l.LO_DISCOUNT AS LO_DISCOUNT,
    l.LO_REVENUE AS LO_REVENUE,
    l.LO_SUPPLYCOST AS LO_SUPPLYCOST,
    l.LO_TAX AS LO_TAX,
    l.LO_COMMITDATE AS LO_COMMITDATE,
    l.LO_SHIPMODE AS LO_SHIPMODE,
    c.C_NAME AS C_NAME,
    c.C_ADDRESS AS C_ADDRESS,
    c.C_CITY AS C_CITY,
    c.C_NATION AS C_NATION,
    c.C_REGION AS C_REGION,
    c.C_PHONE AS C_PHONE,
    c.C_MKTSEGMENT AS C_MKTSEGMENT,
    s.S_NAME AS S_NAME,
    s.S_ADDRESS AS S_ADDRESS,
    s.S_CITY AS S_CITY,
    s.S_NATION AS S_NATION,
    s.S_REGION AS S_REGION,
    s.S_PHONE AS S_PHONE,
    p.P_NAME AS P_NAME,
    p.P_MFGR AS P_MFGR,
    p.P_CATEGORY AS P_CATEGORY,
    p.P_BRAND AS P_BRAND,
    p.P_COLOR AS P_COLOR,
    p.P_TYPE AS P_TYPE,
    p.P_SIZE AS P_SIZE,
    p.P_CONTAINER AS P_CONTAINER
FROM db_bench.lineorder AS l
INNER JOIN db_bench.customer AS c ON c.C_CUSTKEY = l.LO_CUSTKEY
INNER JOIN db_bench.supplier AS s ON s.S_SUPPKEY = l.LO_SUPPKEY
INNER JOIN db_bench.part AS p ON p.P_PARTKEY = l.LO_PARTKEY;

II. Benchmarking

1. Using clickhouse-benchmark

1.1 Basic usage

# Any of the following forms works
$ clickhouse-benchmark --query ["single query"] [keys]
$ echo "single query" | clickhouse-benchmark [keys]
$ clickhouse-benchmark [keys] <<< "single query"
$ clickhouse-benchmark [keys] < queries_file
# Compare the performance of two ClickHouse servers
$ echo "SELECT * FROM system.numbers LIMIT 10000000 OFFSET 10000000" | clickhouse-benchmark --host=localhost --port=9001 --host=localhost --port=9000 -i 10

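1.2 Parameters in detail

--query=QUERY: The query to execute. If this parameter is not passed, clickhouse-benchmark reads queries from standard input.
-c N, --concurrency=N: Number of queries that clickhouse-benchmark sends simultaneously. Default: 1.
-d N, --delay=N: Interval in seconds between intermediate reports (set 0 to disable reports). Default: 1.
-h HOST, --host=HOST: Server host. Default: localhost. In comparison mode you can pass multiple -h keys.
-p N, --port=N: Server port. Default: 9000. In comparison mode you can pass multiple -p keys.
-i N, --iterations=N: Total number of queries. Default: 0 (repeat forever).
-r, --randomize: Execute the queries in random order if there is more than one input query.
-s, --secure: Use a TLS connection.
-t N, --timelimit=N: Time limit in seconds. clickhouse-benchmark stops sending queries once the limit is reached. Default: 0 (time limit disabled).
--confidence=N: Confidence level for the T-test. Possible values: 0 (80%), 1 (90%), 2 (95%), 3 (98%), 4 (99%), 5 (99.5%). Default: 5. In comparison mode, clickhouse-benchmark performs an independent two-sample Student's t-test to determine whether the two distributions do not differ at the selected confidence level.
--cumulative: Print cumulative data instead of per-interval data.
--database=DATABASE_NAME: ClickHouse database name. Default: default.
--json=FILEPATH: JSON output. When this key is set, clickhouse-benchmark writes the report to the specified JSON file.
--user=USERNAME: ClickHouse user name. Default: default.
--password=PSWD: ClickHouse user password. Default: empty string.
--stacktrace: Stack trace output. When this key is set, clickhouse-benchmark prints the stack traces of exceptions.
--stage=WORD: Query processing stage on the server. ClickHouse stops query processing at the specified stage and returns the answer to clickhouse-benchmark. Possible values: complete, fetch_columns, with_mergeable_state. Default: complete.
--help: Show the help message.
To apply settings to the queries, pass them as keys of the form --<session setting name>=SETTING_VALUE. For example, --max_memory_usage=1048576.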
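
The flags above can be assembled programmatically when a benchmark has to be repeated with different parameters. The following is a minimal sketch that only builds the argument list and does not run anything; the function name and its defaults are hypothetical, while the flags themselves are the ones listed above.

```python
import shlex


def build_benchmark_cmd(host="localhost", port=9000, iterations=10,
                        concurrency=1, settings=None):
    """Build an argv list for clickhouse-benchmark from the flags listed above."""
    cmd = [
        "clickhouse-benchmark",
        f"--host={host}",
        f"--port={port}",
        f"--iterations={iterations}",
        f"--concurrency={concurrency}",
    ]
    # Query settings are passed as --<setting name>=<value>.
    for name, value in (settings or {}).items():
        cmd.append(f"--{name}={value}")
    return cmd


if __name__ == "__main__":
    cmd = build_benchmark_cmd(settings={"max_memory_usage": 1048576})
    # Print a shell-ready line; queries would be fed on stdin, e.g. "< test1.sql".
    print(shlex.join(cmd))
```

Setting --iterations to a non-zero value matters in scripts, because the default of 0 repeats the queries forever.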

1.3 Interpreting the results

Sample output:

Queries executed: 72 (1800.000%).

localhost:9000, queries 2, QPS: 0.156, RPS: 432704682.870, MiB/s: 1370.478, result RPS: 2.185, result MiB/s: 0.000.

0.000%		0.217 sec.
10.000%		0.217 sec.
20.000%		0.217 sec.
30.000%		0.217 sec.
40.000%		0.217 sec.
50.000%		12.594 sec.
60.000%		12.594 sec.
70.000%		12.594 sec.
80.000%		12.594 sec.
90.000%		12.594 sec.
95.000%		12.594 sec.
99.000%		12.594 sec.
99.900%		12.594 sec.
99.990%		12.594 sec.

The report contains three parts.

The number of executed queries, in the Queries executed: field.

A status string containing, in order:
The endpoint of the ClickHouse server.
queries: the number of processed queries.
QPS: how many queries the server performed per second during the period specified by --delay.
RPS: how many rows the server read per second during the period specified by --delay.
MiB/s: how many mebibytes the server read per second during the period specified by --delay.
result RPS: how many rows the server placed into the query result per second during the period specified by --delay.
result MiB/s: how many mebibytes the server placed into the query result per second during the period specified by --delay.

Percentiles of the query execution time.
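
To post-process many reports, the status string and the percentile lines can be pulled apart with regular expressions. This is a sketch that assumes the output looks exactly like the sample shown in this section (the layout may differ between ClickHouse versions); the function and key names are illustrative.

```python
import re

NUM = r"\d+(?:\.\d+)?"
STATUS_RE = re.compile(
    rf"(?P<endpoint>[^,\s]+), queries (?P<queries>\d+), QPS: (?P<qps>{NUM}), "
    rf"RPS: (?P<rps>{NUM}), MiB/s: (?P<mib_s>{NUM}), "
    rf"result RPS: (?P<result_rps>{NUM}), result MiB/s: (?P<result_mib_s>{NUM})\."
)
PERCENTILE_RE = re.compile(rf"^({NUM})%\s+({NUM}) sec\.$", re.MULTILINE)

# Shortened copy of the sample report from this section.
SAMPLE = """\
localhost:9000, queries 2, QPS: 0.156, RPS: 432704682.870, MiB/s: 1370.478, result RPS: 2.185, result MiB/s: 0.000.

0.000%\t\t0.217 sec.
50.000%\t\t12.594 sec.
"""


def parse_report(text: str) -> dict:
    """Extract the status string fields and the percentile table from a report."""
    status = STATUS_RE.search(text)
    if status is None:
        raise ValueError("no status line found")
    report = {"endpoint": status["endpoint"], "queries": int(status["queries"])}
    for key in ("qps", "rps", "mib_s", "result_rps", "result_mib_s"):
        report[key] = float(status[key])
    # Map percentile (e.g. 50.0) to execution time in seconds.
    report["percentiles"] = {
        float(pct): float(sec) for pct, sec in PERCENTILE_RE.findall(text)
    }
    return report


if __name__ == "__main__":
    print(parse_report(SAMPLE))
```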

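2. Basic tests

The benchmark is described on the official site, which also lists the exact SQL. I wrote four SQL files in total, with the following contents.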

# test1.sql
SELECT sum(LO_EXTENDEDPRICE * LO_DISCOUNT) AS revenue FROM db_bench.lineorder_flat WHERE toYear(LO_ORDERDATE) = 1993 AND LO_DISCOUNT BETWEEN 1 AND 3 AND LO_QUANTITY < 25;
SELECT sum(LO_EXTENDEDPRICE * LO_DISCOUNT) AS revenue FROM db_bench.lineorder_flat WHERE toYYYYMM(LO_ORDERDATE) = 199401 AND LO_DISCOUNT BETWEEN 4 AND 6 AND LO_QUANTITY BETWEEN 26 AND 35;
SELECT sum(LO_EXTENDEDPRICE * LO_DISCOUNT) AS revenue FROM db_bench.lineorder_flat WHERE toISOWeek(LO_ORDERDATE) = 6 AND toYear(LO_ORDERDATE) = 1994 AND LO_DISCOUNT BETWEEN 5 AND 7 AND LO_QUANTITY BETWEEN 26 AND 35;

# test2.sql
SELECT sum(LO_REVENUE),toYear(LO_ORDERDATE) AS year,P_BRAND FROM db_bench.lineorder_flat WHERE P_CATEGORY = 'MFGR#12' AND S_REGION = 'AMERICA' GROUP BY year,P_BRAND ORDER BY year,P_BRAND;
SELECT sum(LO_REVENUE),toYear(LO_ORDERDATE) AS year,P_BRAND FROM db_bench.lineorder_flat WHERE P_BRAND >= 'MFGR#2221' AND P_BRAND <= 'MFGR#2228' AND S_REGION = 'ASIA' GROUP BY year,P_BRAND ORDER BY year,P_BRAND;
SELECT sum(LO_REVENUE), toYear(LO_ORDERDATE) AS year, P_BRAND FROM db_bench.lineorder_flat WHERE P_BRAND = 'MFGR#2239' AND S_REGION = 'EUROPE' GROUP BY year, P_BRAND ORDER BY year, P_BRAND;

# test3.sql
SELECT C_NATION, S_NATION, toYear(LO_ORDERDATE) AS year, sum(LO_REVENUE) AS revenue FROM db_bench.lineorder_flat WHERE C_REGION = 'ASIA' AND S_REGION = 'ASIA' AND year >= 1992 AND year <= 1997 GROUP BY C_NATION, S_NATION, year ORDER BY year ASC, revenue DESC;
SELECT C_CITY, S_CITY, toYear(LO_ORDERDATE) AS year, sum(LO_REVENUE) AS revenue FROM db_bench.lineorder_flat WHERE C_NATION = 'UNITED STATES' AND S_NATION = 'UNITED STATES' AND year >= 1992 AND year <= 1997 GROUP BY C_CITY, S_CITY, year ORDER BY year ASC, revenue DESC;
SELECT C_CITY, S_CITY, toYear(LO_ORDERDATE) AS year, sum(LO_REVENUE) AS revenue FROM db_bench.lineorder_flat WHERE (C_CITY = 'UNITED KI1' OR C_CITY = 'UNITED KI5') AND (S_CITY = 'UNITED KI1' OR S_CITY = 'UNITED KI5') AND year >= 1992 AND year <= 1997 GROUP BY C_CITY, S_CITY, year ORDER BY year ASC, revenue DESC;
SELECT C_CITY, S_CITY, toYear(LO_ORDERDATE) AS year, sum(LO_REVENUE) AS revenue FROM db_bench.lineorder_flat WHERE (C_CITY = 'UNITED KI1' OR C_CITY = 'UNITED KI5') AND (S_CITY = 'UNITED KI1' OR S_CITY = 'UNITED KI5') AND toYYYYMM(LO_ORDERDATE) = 199712 GROUP BY C_CITY, S_CITY, year ORDER BY year ASC, revenue DESC;

# test4.sql
SELECT toYear(LO_ORDERDATE) AS year, C_NATION, sum(LO_REVENUE - LO_SUPPLYCOST) AS profit FROM db_bench.lineorder_flat WHERE C_REGION = 'AMERICA' AND S_REGION = 'AMERICA' AND (P_MFGR = 'MFGR#1' OR P_MFGR = 'MFGR#2') GROUP BY year, C_NATION ORDER BY year ASC, C_NATION ASC;
SELECT toYear(LO_ORDERDATE) AS year, S_NATION, P_CATEGORY, sum(LO_REVENUE - LO_SUPPLYCOST) AS profit FROM db_bench.lineorder_flat WHERE C_REGION = 'AMERICA' AND S_REGION = 'AMERICA' AND (year = 1997 OR year = 1998) AND (P_MFGR = 'MFGR#1' OR P_MFGR = 'MFGR#2') GROUP BY year, S_NATION, P_CATEGORY ORDER BY year ASC, S_NATION ASC, P_CATEGORY ASC;
SELECT toYear(LO_ORDERDATE) AS year, S_CITY, P_BRAND, sum(LO_REVENUE - LO_SUPPLYCOST) AS profit FROM db_bench.lineorder_flat WHERE S_NATION = 'UNITED STATES' AND (year = 1997 OR year = 1998) AND P_CATEGORY = 'MFGR#14' GROUP BY year, S_CITY, P_BRAND ORDER BY year ASC, S_CITY ASC, P_BRAND ASC;

2.1 Test method

clickhouse-benchmark < test1.sql
clickhouse-benchmark < test2.sql
clickhouse-benchmark < test3.sql
clickhouse-benchmark < test4.sql

2.2 Test results

# test1
Queries executed: 921 (30700.000%).

localhost:9000, queries 2, QPS: 5.558, RPS: 263878534.377, MiB/s: 2012.050, result RPS: 5.558, result MiB/s: 0.000.

0.000%		0.091 sec.
10.000%		0.091 sec.
20.000%		0.091 sec.
30.000%		0.091 sec.
40.000%		0.091 sec.
50.000%		0.268 sec.
60.000%		0.268 sec.
70.000%		0.268 sec.
80.000%		0.268 sec.
90.000%		0.268 sec.
95.000%		0.268 sec.
99.000%		0.268 sec.
99.900%		0.268 sec.

# test2
Queries executed: 32 (1066.667%).

localhost:9000, queries 1, QPS: 0.054, RPS: 326066467.053, MiB/s: 2797.293, result RPS: 3.043, result MiB/s: 0.000.

0.000%		18.401 sec.
10.000%		18.401 sec.
20.000%		18.401 sec.
30.000%		18.401 sec.
40.000%		18.401 sec.
50.000%		18.401 sec.
60.000%		18.401 sec.
70.000%		18.401 sec.
80.000%		18.401 sec.
90.000%		18.401 sec.
95.000%		18.401 sec.
99.000%		18.401 sec.
99.900%		18.401 sec.
99.990%		18.401 sec.

# test3
localhost:9000, queries 73, QPS: 0.082, RPS: 340111314.396, MiB/s: 2527.187, result RPS: 15.938, result MiB/s: 0.000.

0.000%		0.182 sec.
10.000%		0.217 sec.
20.000%		0.230 sec.
30.000%		10.547 sec.
40.000%		12.614 sec.
50.000%		14.860 sec.
60.000%		16.560 sec.
70.000%		18.072 sec.
80.000%		18.285 sec.
90.000%		19.915 sec.
95.000%		19.962 sec.
99.000%		20.011 sec.
99.900%		20.059 sec.
99.990%		20.059 sec.

# test4
Queries executed: 3 (100.000%).

localhost:9000, queries 1, QPS: 0.474, RPS: 683988835.693, MiB/s: 9777.042, result RPS: 378.949, result MiB/s: 0.004.

0.000%		2.111 sec.
10.000%		2.111 sec.
20.000%		2.111 sec.
30.000%		2.111 sec.
40.000%		2.111 sec.
50.000%		2.111 sec.
60.000%		2.111 sec.
70.000%		2.111 sec.
80.000%		2.111 sec.
90.000%		2.111 sec.
95.000%		2.111 sec.
99.000%		2.111 sec.
99.900%		2.111 sec.
99.990%		2.111 sec.
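
Two derived figures help when reading these status lines. With one query in flight at a time (the default concurrency of 1), 1/QPS approximates the mean query time, and RPS/QPS approximates the number of rows read per query. The sketch below applies both to the numbers above; it assumes QPS and RPS in a status line cover the same set of queries, and since QPS is printed with three decimals the results are only approximate. The function names are illustrative.

```python
def mean_latency_seconds(qps: float, concurrency: int = 1) -> float:
    """Average time per query implied by QPS at the given concurrency."""
    return concurrency / qps


def rows_read_per_query(rps: float, qps: float) -> float:
    """Average number of rows read by one query."""
    return rps / qps


if __name__ == "__main__":
    # (QPS, RPS) pairs copied from the status lines above.
    results = {
        "test1": (5.558, 263878534.377),
        "test2": (0.054, 326066467.053),
        "test3": (0.082, 340111314.396),
        "test4": (0.474, 683988835.693),
    }
    for name, (qps, rps) in results.items():
        print(f"{name}: ~{mean_latency_seconds(qps):.2f} s per query, "
              f"~{rows_read_per_query(rps, qps) / 1e9:.2f} billion rows read per query")
```

For test2 this gives about 18.5 s and about 6.0 billion rows per query, in line with the 18.401 s percentiles and with a full scan of a table of roughly 6 billion rows.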

2.3 CPU usage

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 7031 999       20   0  0.257t 1.470g  99080 S  4656  0.8   3643:13 clickhouse-serv

2.4 Data read throughput

[Figure: data read throughput during the benchmark; image not included in the source]

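Conclusion: reads are very fast. In the runs above the server read roughly 264 to 684 million rows per second, or about 2,000 to 9,800 MiB/s. Reading is very CPU-intensive (top shows 4656% CPU for clickhouse-server), yet memory usage stays very low (about 1.47 GiB resident, 0.8% of memory).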
