DM SQL optimization case: like on a join column

2025/2/11 12:42:24 / Article source: https://www.cnblogs.com/yuzhijian/p/18709525

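1.1 SQL optimization background

A colleague at Dameng (DM) asked me to optimize a SQL statement. He reported that on a DM8 database it ran so slowly that it never returned a result, and the monitoring tool showed an execution time of more than 200,000 ms. So I took a look.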


1.2 The slow SQL and its execution time

select a.col1 as d_id,
a.col2 as s_id,
a.col3 as bm,
a.col4,
a.col5,
(select b.col1 from table2 b where b.col_itname = 'zb1' and b.col1 = a.col20) as bb,
a.col6 as dzzlxr,
a.col7 as dzzlxdh,
(select b.col1 from table2 b where b.col_itname = 'zb2' and b.col1 = a.col21) as bc,
(select b.col1 from table2 b where b.col_itname = 'zb3' and b.col1 = a.col22) as cb,
a.col8,
date_format(a.col9, '%Y-%m-%d %H:%i:%s') as gx,
a.col10 as cid,
a.col11 as tp,
(select b.col5 from table1 b where b.col1 = a.col2) as sj,
(select count(*) from table3 dy left join table1 dzz on dy.col1 = dzz.col1 where dzz.col11 like concat(a.col11,'%')) as rc
from table1 a
where 1 = 1
and a.col1 in ( /* 600 string literals in this IN list */ );

100 rows returned successfully. Elapsed: 1 min 28 s 248 ms. Execution ID: 1432757809

1.3 Execution plan of the slow SQL

1 #NSET2: [1330892675, 12345, 692]
2 #PIPE2: [1330892675, 12345, 692]
3 #PIPE2: [1330892669, 12345, 692]
4 #PIPE2: [1330892663, 12345, 692]
5 #PIPE2: [1330892657, 12345, 692]
6 #PIPE2: [1330892648, 12345, 692]
7 #PRJT2: [4, 12345, 692]; exp_num(17), is_atom(FALSE)
8 #NEST LOOP INDEX JOIN2: [4, 12345, 692]
9 #CONST VALUE LIST: [1, 600, 48]; row_num(600), col_num(1),
10 #BLKUP2: [3, 1, 0]; INDEX33571964(A)
11 #SSEK2: [3, 1, 0]; scan_type(ASC), INDEX33571964(table1 as A), scan_range[DMTEMPVIEW_22201688.colname,DMTEMPVIEW_22201688.colname]
12 #SPL2: [1330892644, 1, 852]; key_num(2), spool_num(4), is_atom(FALSE), has_variable(0)
13 #PRJT2: [1330892644, 1, 852]; exp_num(3), is_atom(FALSE)
14 #HAGR2: [1330892644, 1, 852]; grp_num(1), sfun_num(3); slave_empty(0) keys(A.ROWID)
15 #NEST LOOP LEFT JOIN2: [1327131762, 71772595, 852]; join condition(DZZ.col11 LIKE exp11) partition_keys_num(0) ret_null(0)
16 #NEST LOOP INDEX JOIN2: [4, 12345, 692]
17 #CONST VALUE LIST: [1, 600, 48]; row_num(600), col_num(1),
18 #BLKUP2: [3, 1, 0]; INDEX33571964(A)
19 #SSEK2: [3, 1, 0]; scan_type(ASC), INDEX33571964(table1 as A), scan_range[DMTEMPVIEW_22201689.colname,DMTEMPVIEW_22201689.colname]
20 #HASH2 INNER JOIN: [26, 116278, 160]; LKEY_UNIQUE KEY_NUM(1); KEY(DZZ.col1=DY.col1) KEY_NULL_EQU(0)
21 #CSCN2: [1, 12345, 104]; INDEX33571530(table1 as DZZ)
22 #SSCN: [13, 116278, 56]; IDX_DYJBXX_ORGID(table3 as DY)
23 #SPL2: [9, 9876, 740]; key_num(2), spool_num(3), is_atom(FALSE), has_variable(0)
24 #PRJT2: [9, 9876, 740]; exp_num(2), is_atom(FALSE)
25 #HASH RIGHT SEMI JOIN2: [9, 9876, 740]; n_keys(1) KEY(DMTEMPVIEW_22201694.colname=A.col1) KEY_NULL_EQU(0)
26 #CONST VALUE LIST: [1, 600, 48]; row_num(600), col_num(1),
27 #HASH2 INNER JOIN: [9, 9876, 740]; LKEY_UNIQUE KEY_NUM(1); KEY(B.col1=A.col2) KEY_NULL_EQU(0)
28 #CSCN2: [1, 12345, 96]; INDEX33571530(table1 as B)
29 #CSCN2: [2, 12345, 644]; INDEX33571530(table1 as A)
30 #SPL2: [5, 11618, 740]; key_num(2), spool_num(2), is_atom(FALSE), has_variable(0)
31 #PRJT2: [5, 11618, 740]; exp_num(2), is_atom(FALSE)
32 #HASH RIGHT SEMI JOIN2: [5, 11618, 740]; n_keys(1) KEY(DMTEMPVIEW_22201695.colname=A.col1) KEY_NULL_EQU(0)
33 #CONST VALUE LIST: [1, 600, 48]; row_num(600), col_num(1),
34 #HASH2 INNER JOIN: [5, 11618, 740]; KEY_NUM(1); KEY(B.col1=A.col22) KEY_NULL_EQU(0)
35 #SSEK2: [1, 120, 96]; scan_type(ASC), INDEX33572004(table2 as B), scan_range[('zb3',min),('zb3',max))
36 #CSCN2: [2, 12345, 644]; INDEX33571530(table1 as A)
37 #SPL2: [5, 11618, 740]; key_num(2), spool_num(1), is_atom(FALSE), has_variable(0)
38 #PRJT2: [5, 11618, 740]; exp_num(2), is_atom(FALSE)
39 #HASH RIGHT SEMI JOIN2: [5, 11618, 740]; n_keys(1) KEY(DMTEMPVIEW_22201696.colname=A.col1) KEY_NULL_EQU(0)
40 #CONST VALUE LIST: [1, 600, 48]; row_num(600), col_num(1),
41 #HASH2 INNER JOIN: [5, 11618, 740]; KEY_NUM(1); KEY(B.col1=A.col21) KEY_NULL_EQU(0)
42 #SSEK2: [1, 120, 96]; scan_type(ASC), INDEX33572004(table2 as B), scan_range[('zb2',min),('zb2',max))
43 #CSCN2: [2, 12345, 644]; INDEX33571530(table1 as A)
44 #SPL2: [5, 11618, 740]; key_num(2), spool_num(0), is_atom(FALSE), has_variable(0)
45 #PRJT2: [5, 11618, 740]; exp_num(2), is_atom(FALSE)
46 #HASH RIGHT SEMI JOIN2: [5, 11618, 740]; n_keys(1) KEY(DMTEMPVIEW_22201697.colname=A.col1) KEY_NULL_EQU(0)
47 #CONST VALUE LIST: [1, 600, 48]; row_num(600), col_num(1),
48 #HASH2 INNER JOIN: [5, 11618, 740]; KEY_NUM(1); KEY(B.col1=A.col20) KEY_NULL_EQU(0)
49 #SSEK2: [1, 120, 96]; scan_type(ASC), INDEX33572004(table2 as B), scan_range[('zb1',min),('zb1',max))
50 #CSCN2: [2, 12345, 644]; INDEX33571530(table1 as A)

1.4 Row counts of the tables involved

select count(1) from table1
union all
select count(1) from table2
union all
select count(1) from table3;

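1.5 Analysis

Eyeballing the statement, the scalar subqueries below looked like the likely cause of the slowness. (What is "eyeballing"? Simply the experience that comes from having tuned this many cases.)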

(select b.col1 from table2 b where b.col_itname = 'zb1' and b.col1 = a.col20) as bb,
(select b.col1 from table2 b where b.col_itname = 'zb2' and b.col1 = a.col21) as bc,
(select b.col1 from table2 b where b.col_itname = 'zb3' and b.col1 = a.col22) as cb,
(select count(*) from table3 dy left join table1 dzz on dy.col1 = dzz.col1 where dzz.col11 like concat(a.col11,'%')) as rc

Testing each scalar subquery on its own showed that the last one is the slow one:

-- (select b.col1 from table2 b where b.col_itname = 'zb1' and b.col1 = a.col20) as bb,
-- (select b.col1 from table2 b where b.col_itname = 'zb2' and b.col1 = a.col21) as bc,
-- (select b.col1 from table2 b where b.col_itname = 'zb3' and b.col1 = a.col22) as cb,
(select count(*) from table3 dy left join table1 dzz on dy.col1 = dzz.col1 where dzz.col11 like concat(a.col11,'%')) as rc

As a test, I changed like to =, and the result came back very quickly:

(select count(*) from table3 dy left join table1 dzz on dy.col1 = dzz.col1 where dzz.col11 = a.col11 ) as rc

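The dzz.col11 column does have an index, but nothing I tried got the optimizer to use it, so the only option left was to rewrite the SQL.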
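A rough way to picture the difference between the two predicates: with =, the inner rows can be hashed (or looked up in an index) once and then probed per outer row, while a like pattern built from the outer row has to be compared against every inner row. The Python sketch below is only a toy cost model of that idea, not a description of DM8 internals; the function names and sample keys are made up for illustration.

```python
from collections import Counter

def count_by_prefix_nested_loop(outer_keys, inner_keys):
    """Model of `inner LIKE outer || '%'`: every outer row scans every inner row."""
    comparisons = 0
    result = {}
    for o in outer_keys:
        matched = 0
        for i in inner_keys:
            comparisons += 1
            if i.startswith(o):
                matched += 1
        result[o] = matched
    return result, comparisons

def count_by_equality_hash(outer_keys, inner_keys):
    """Model of `inner = outer`: build a hash table once, then one probe per outer row."""
    table = Counter(inner_keys)            # one pass over the inner rows
    result = {o: table.get(o, 0) for o in outer_keys}
    work = len(inner_keys) + len(outer_keys)
    return result, work

outer = ["A", "A0", "B"]                   # hypothetical a.col11 values
inner = ["A01", "A02", "A0", "B1", "C9"]   # hypothetical dzz.col11 values

prefix_counts, prefix_work = count_by_prefix_nested_loop(outer, inner)
equal_counts, equal_work = count_by_equality_hash(outer, inner)
print(prefix_counts, prefix_work)
print(equal_counts, equal_work)
```

Scaled to the cardinalities in the plan in 1.3 (600 values in the IN list, 116278 rows out of the dy/dzz join), the nested-loop model gives 600 × 116278 = 69,766,800 row pairs, the same order of magnitude as the 71772595 rows estimated for the NEST LOOP LEFT JOIN2 operator.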

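2.1 Equivalent SQL rewrite

My idea was to turn the fuzzy like join into an exact = match, but after several hours of thinking I had made no headway.

In the end I went digging through 落总's blog and, to my surprise, actually found a similar case. That gave me the idea for the rewrite below: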

select a.col1 as d_id,
a.col2 as s_id,
a.col3 as bm,
a.col4,
a.col5,
(select b.col1 from table2 b where b.col_itname = 'zb1' and b.col1 = a.col20) as bb,
a.col6 as dzzlxr,
a.col7 as dzzlxdh,
(select b.col1 from table2 b where b.col_itname = 'zb2' and b.col1 = a.col21) as bc,
(select b.col1 from table2 b where b.col_itname = 'zb3' and b.col1 = a.col22) as cb,
a.col8,
date_format(a.col9, '%Y-%m-%d %H:%i:%s') as gx,
a.col10 as cid,
a.col11 as tp,
(select b.col5 from table1 b where b.col1 = a.col2) as sj,
b.cnt as rc
from table1 a
LEFT JOIN (
SELECT COUNT(*) cnt,
dzz.col11
FROM table3 dy
LEFT JOIN table1 dzz
ON dy.col1 = dzz.col1
GROUP BY dzz.col11
) b ON SUBSTR(b.col11, 1, LENGTH(a.col11)) = a.col11
where 1 = 1
and a.col1 in (-- 600 string literals in this IN list
);

100 rows returned successfully. Elapsed: 5 s 326 ms. Execution ID: 1435485506

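After the rewrite the result comes back in about 5 seconds, and a set-difference comparison of the two result sets shows that they are equivalent. Nice.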
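The set-difference check can be sketched as follows. This is a minimal, hedged reproduction that uses Python's built-in sqlite3 module as a stand-in for DM8 (so concat becomes ||), with a tiny made-up data set and only the columns the rc expression needs. It also shows a point worth checking on real data: the join form matches the scalar subquery only when each a.col11 matches at most one group of the derived table and every row has a match. When one value is a prefix of several others, the join returns several rows, and a row with no match gets NULL instead of 0. The SUMMED variant (my assumption, not part of the original post) adds SUM, COALESCE and a GROUP BY to cover those cases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (col1 TEXT PRIMARY KEY, col11 TEXT);
CREATE TABLE table3 (id INTEGER PRIMARY KEY, col1 TEXT);
-- Hypothetical data: 'A' is a prefix of 'A01' and 'A02'; 'B' has no table3 rows.
INSERT INTO table1 VALUES ('o1', 'A'), ('o2', 'A01'), ('o3', 'A02'), ('o4', 'B');
INSERT INTO table3 (col1) VALUES ('o1'), ('o2'), ('o2'), ('o3');
""")

# Derived table used by both rewrites: one pre-aggregated row per dzz.col11.
GROUPED = """
    SELECT COUNT(*) AS cnt, dzz.col11
    FROM table3 dy LEFT JOIN table1 dzz ON dy.col1 = dzz.col1
    GROUP BY dzz.col11
"""

# Original form: correlated scalar subquery with a prefix LIKE.
ORIGINAL = """
    SELECT a.col1,
           (SELECT COUNT(*)
              FROM table3 dy LEFT JOIN table1 dzz ON dy.col1 = dzz.col1
             WHERE dzz.col11 LIKE a.col11 || '%') AS rc
    FROM table1 a
"""

# Join form as posted in the article (reduced to the relevant columns).
AS_POSTED = f"""
    SELECT a.col1, b.cnt AS rc
    FROM table1 a
    LEFT JOIN ({GROUPED}) b ON SUBSTR(b.col11, 1, LENGTH(a.col11)) = a.col11
"""

# Assumed generalization: sum the matching groups and map "no match" to 0.
SUMMED = f"""
    SELECT a.col1, COALESCE(SUM(b.cnt), 0) AS rc
    FROM table1 a
    LEFT JOIN ({GROUPED}) b ON SUBSTR(b.col11, 1, LENGTH(a.col11)) = a.col11
    GROUP BY a.col1
"""

def rows(sql):
    # Sort by repr so that rows containing NULL (None) can still be ordered.
    return sorted(conn.execute(sql).fetchall(), key=repr)

def set_difference(q1, q2):
    """Rows returned by q1 but not by q2; run it in both directions."""
    return rows(f"SELECT * FROM ({q1}) EXCEPT SELECT * FROM ({q2})")

original = rows(ORIGINAL)
as_posted = rows(AS_POSTED)
summed = rows(SUMMED)

print("original :", original)
print("as posted:", as_posted)
print("summed   :", summed)
print("original EXCEPT as posted:", set_difference(ORIGINAL, AS_POSTED))
print("original EXCEPT summed   :", set_difference(ORIGINAL, SUMMED))
```

An empty difference in both directions, as the author obtained on the production data, indicates that the two statements agree for that data set; it does not by itself prove that they agree for every possible data set.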

2.2 Execution plan after the rewrite

1 #NSET2: [524737849, 358862, 740]
2 #PIPE2: [524737849, 358862, 740]
3 #PIPE2: [524737843, 358862, 740]
4 #PIPE2: [524737837, 358862, 740]
5 #PIPE2: [524737831, 358862, 740]
6 #PRJT2: [524737822, 358862, 740]; exp_num(16), is_atom(FALSE)
7 #NEST LOOP LEFT JOIN2: [524737822, 358862, 740]; join condition(A.col11 = exp11) partition_keys_num(0) ret_null(0)
8 #NEST LOOP INDEX JOIN2: [4, 12345, 692]
9 #CONST VALUE LIST: [1, 600, 48]; row_num(600), col_num(1),
10 #BLKUP2: [3, 1, 0]; INDEX33571964(A)
11 #SSEK2: [3, 1, 0]; scan_type(ASC), INDEX33571964(table1 as A), scan_range[DMTEMPVIEW_22201592.colname,DMTEMPVIEW_22201592.colname]
12 #PRJT2: [33, 1162, 48]; exp_num(2), is_atom(FALSE)
13 #HAGR2: [33, 1162, 48]; grp_num(1), sfun_num(1); slave_empty(0) keys(DZZ.col11)
14 #HASH RIGHT JOIN2: [25, 116278, 48]; key_num(1), ret_null(0), KEY(DZZ.col1=DY.col1)
15 #CSCN2: [1, 12345, 96]; INDEX33571530(table1 as DZZ)
16 #SSCN: [13, 116278, 48]; IDX_DYJBXX_ORGID(table3 as DY)
17 #SPL2: [9, 9876, 740]; key_num(2), spool_num(3), is_atom(FALSE), has_variable(0)
18 #PRJT2: [9, 9876, 740]; exp_num(2), is_atom(FALSE)
19 #HASH RIGHT SEMI JOIN2: [9, 9876, 740]; n_keys(1) KEY(DMTEMPVIEW_22201597.colname=A.col1) KEY_NULL_EQU(0)
20 #CONST VALUE LIST: [1, 600, 48]; row_num(600), col_num(1),
21 #HASH2 INNER JOIN: [9, 9876, 740]; LKEY_UNIQUE KEY_NUM(1); KEY(B.col1=A.col2) KEY_NULL_EQU(0)
22 #CSCN2: [1, 12345, 96]; INDEX33571530(table1 as B)
23 #CSCN2: [2, 12345, 644]; INDEX33571530(table1 as A)
24 #SPL2: [5, 11618, 740]; key_num(2), spool_num(2), is_atom(FALSE), has_variable(0)
25 #PRJT2: [5, 11618, 740]; exp_num(2), is_atom(FALSE)
26 #HASH RIGHT SEMI JOIN2: [5, 11618, 740]; n_keys(1) KEY(DMTEMPVIEW_22201598.colname=A.col1) KEY_NULL_EQU(0)
27 #CONST VALUE LIST: [1, 600, 48]; row_num(600), col_num(1),
28 #HASH2 INNER JOIN: [5, 11618, 740]; KEY_NUM(1); KEY(B.col1=A.col22) KEY_NULL_EQU(0)
29 #SSEK2: [1, 120, 96]; scan_type(ASC), INDEX33572004(table2 as B), scan_range[('zb3',min),('zb3',max))
30 #CSCN2: [2, 12345, 644]; INDEX33571530(table1 as A)
31 #SPL2: [5, 11618, 740]; key_num(2), spool_num(1), is_atom(FALSE), has_variable(0)
32 #PRJT2: [5, 11618, 740]; exp_num(2), is_atom(FALSE)
33 #HASH RIGHT SEMI JOIN2: [5, 11618, 740]; n_keys(1) KEY(DMTEMPVIEW_22201599.colname=A.col1) KEY_NULL_EQU(0)
34 #CONST VALUE LIST: [1, 600, 48]; row_num(600), col_num(1),
35 #HASH2 INNER JOIN: [5, 11618, 740]; KEY_NUM(1); KEY(B.col1=A.col21) KEY_NULL_EQU(0)
36 #SSEK2: [1, 120, 96]; scan_type(ASC), INDEX33572004(table2 as B), scan_range[('zb2',min),('zb2',max))
37 #CSCN2: [2, 12345, 644]; INDEX33571530(table1 as A)
38 #SPL2: [5, 11618, 740]; key_num(2), spool_num(0), is_atom(FALSE), has_variable(0)
39 #PRJT2: [5, 11618, 740]; exp_num(2), is_atom(FALSE)
40 #HASH RIGHT SEMI JOIN2: [5, 11618, 740]; n_keys(1) KEY(DMTEMPVIEW_22201600.colname=A.col1) KEY_NULL_EQU(0)
41 #CONST VALUE LIST: [1, 600, 48]; row_num(600), col_num(1),
42 #HASH2 INNER JOIN: [5, 11618, 740]; KEY_NUM(1); KEY(B.col1=A.col20) KEY_NULL_EQU(0)
43 #SSEK2: [1, 120, 96]; scan_type(ASC), INDEX33572004(table2 as B), scan_range[('zb1',min),('zb1',max))
44 #CSCN2: [2, 12345, 644]; INDEX33571530(table1 as A)

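2.3 Summary

Joining on like in this way is a clear sign of a non-standard business design that does not meet third normal form.

In the early stage of business design, follow third normal form as closely as possible, so that fuzzy predicates such as like are needed less often later on.

Where the business allows, prefer an exact = match over like.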
