文章目录
- 1、抓出外键没创建索引的表
- 2、抓出需要收集直方图的列
- 3、抓出必须创建索引的列
- 4、抓出SELECT * 的SQL
- 5、抓出有标量子查询的SQL
- 6、抓出带有自定义函数的SQL
- 7、抓出表被多次反复调用SQL
- 8、抓出走了FILTER的SQL
- 9、抓出返回行数较多的嵌套循环SQL
- 10、抓出NL被驱动表走了全表扫描的SQL
- 11、抓出走了TABLE ACCESS FULL的SQL
- 12、抓出走了INDEX FULL SCAN的SQL
- 13、抓出走了INDEX SKIP SCAN的SQL
- 14、抓出索引被哪些SQL引用
- 15、 抓出走了笛卡儿积的SQL
- 16、抓出走了错误的排序合并连接的SQL
- 17、抓出LOOP套LOOP的PSQL
- 18、抓出走了低选择性索引的SQL
- 19、抓出可以创建组合索引的SQL(回表再过滤选择性高的列)
- 20、抓出可以创建组合索引的SQL(回表只访问少数字段)
我们为大家分享一些常用的全自动SQL审核脚本,在实际工作中,我们可以对脚本进行适当修改,以便适应自己的数据库环境,从而提升工作效率。
1、抓出外键没创建索引的表
此脚本不依赖统计信息。
建议在外键列上创建索引,外键列不创建索引容易导致死锁。级联删除的时候,外键列没有索引会导致表被全表扫描。以下脚本抓出Scott账户下外键没创建索引的表:
with cons as (select /*+ materialize */ owner, table_name, constraint_namefrom dba_constraintswhere owner = 'SCOTT'AND constraint_type = 'R'),idx as (select /*+ materialize*/ table_owner,table_name, column_namefrom dba_ind_columnswhere table_owner = 'SCOTT')
select owner,table_name,constraint_name,column_namefrom dba_cons_columnswhere (owner,table_name, constraint_name) in(select * from cons)and (owner,table_name, column_name) not in(select * from idx);
在Scott账户中,EMP表的deptno列引用了DEPT表的deptno列,但是没有创建索引,因此我们通过脚本可以将其抓出:
2、抓出需要收集直方图的列
此脚本依赖统计信息。当一个表比较大,列选择性低于5%,而且列出现在where条件中,为了防止优化器估算Rows出现较大偏差,我们需要对这种列收集直方图。以下脚本抓出Scott账户下,表总行数大于5万行、列选择性低于5%并且列出现在where条件中的表以及列信息:
select a.owner,a.table_name,a.column_name,b.num_rows,a.num_distinct Cardinality,round(a.num_distinct / b.num_rows * 100, 2) selectivityfrom dba_tab_col_statistics a, dba_tables bwhere a.owner = b.ownerand a.table_name = b.table_nameand a.owner = 'SCOTT'and round(a.num_distinct / b.num_rows * 100, 2) < 5and num_rows > 50000and (a.table_name, a.column_name) in(select o.name, c.namefrom sys.col_usage$ u, sys.obj$ o, sys.col$ c, sys.user$ rwhere o.obj# = u.obj#and c.obj# = u.obj#and c.col# = u.intcol#and r.name = 'SCOTT');
在Scott账户中,test表总行数大于5万行,owner列选择性小于5%,而且出现在where条件中,通过以上脚本我们可以将其抓出:
3、抓出必须创建索引的列
此脚本依赖统计信息。当一个表比较大,列选择性超过20%,列出现在where条件中并且没有创建索引,我们可以对该列创建索引从而提升SQL查询性能。以下脚本抓出Scott账户下表总行数大于5万行、列选择性超过20%、列出现在where条件中并且没有创建索引:
select owner,table_name,column_name,num_rows,Cardinality,selectivity
from (select a.owner,a.table_name,a.column_name,b.num_rows,a.num_distinct Cardinality,round(a.num_distinct / b.num_rows * 100, 2) selectivityfrom dba_tab_col_statistics a,dba_tables bwhere a.owner = b.ownerand a.table_name = b.table_nameand a.owner = 'SCOTT')
where selectivity >= 20and num_rows > 50000and (table_name, column_name) not in(select table_name, column_namefrom dba_ind_columnswhere table_owner = 'SCOTT'and column_position = 1)and (table_name, column_name) in(select o.name, c.namefrom sys.col_usage$ u,sys.obj$ o,sys.col$ c,sys.user$ rwhere o.obj# = u.obj#and c.obj# = u.obj#and c.col# = u.intcol#and r.name = 'SCOTT');
4、抓出SELECT * 的SQL
此脚本不依赖统计信息。在开发过程中,我们应该尽量避免编写SELECT * 这种SQL。SELECT * 这种SQL,走索引无法避免回表,走HASH连接的时候会将驱动表所有的列放入PGA中,浪费PGA内存。执行计划中(V$SQL_PLAN/PLAN_TABLE),projection字段表示访问了哪些字段,如果projection字段中字段个数等于表的字段总个数,那么我们就可以判断SQL语句使用了SELECT *。以下脚本抓出SELECT * 的SQL:
select a.sql_id, a.sql_text, c.owner, d.table_name, d.column_cnt, c.size_mbfrom v$sql a,v$sql_plan b,(select owner, segment_name, sum(bytes / 1024 / 1024) size_mbfrom dba_segmentsgroup by owner, segment_name) c,(select owner, table_name, count(*) column_cntfrom dba_tab_colsgroup by owner, table_name) dwhere a.sql_id = b.sql_idand a.child_number = b.child_numberand b.object_owner = c.ownerand b.object_name = c.segment_nameand b.object_owner = d.ownerand b.object_name = d.table_nameand REGEXP_COUNT(b.projection, ']') = d.column_cntand c.owner = 'SCOTT'order by 6 desc;
我们在Scott账户中运行如下SQL:
select * from t where object_id<1000;
我们使用脚本将其抓出:
select a.sql_id, a.sql_text, c.owner, d.table_name, d.column_cnt, c.size_mb
from v$sql a,v$sql_plan b,(select owner, segment_name, sum(bytes / 1024 / 1024) size_mbfrom dba_segmentsgroup by owner, segment_name) c,(select owner, table_name, count(*) column_cntfrom dba_tab_colsgroup by owner, table_name) d
where a.sql_id = b.sql_idand a.child_number = b.child_numberand b.object_owner = c.ownerand b.object_name = c.segment_nameand b.object_owner = d.ownerand b.object_name = d.table_nameand REGEXP_COUNT(b.projection, ']') = d.column_cntand c.owner = 'SCOTT'
order by 6 desc;
5、抓出有标量子查询的SQL
此脚本不依赖统计信息。在开发过程中,我们应该尽量避免编写标量子查询。我们可以通过分析执行计划,抓出标量子查询语句。同一个SQL语句,执行计划中如果有两个或者两个以上的depth=1的执行计划就表示SQL中出现了标量子查询。以下脚本抓出Scott账户下在SQL*Plus中运行过的标量子查询语句:
select sql_id, sql_text, modulefrom v$sqlwhere parsing_schema_name = 'SCOTT'and module = 'SQL*Plus'AND sql_id in(select sql_idfrom (select sql_id,count(*) over(partition by sql_id, child_number, depth) cntfrom V$SQL_PLANwhere depth = 1and (object_owner = 'SCOTT' or object_owner is null))where cnt >= 2);
我们在SQL*Plus中运行如下标量子查询语句:
SQL> select dname,2 (select max(sal) from emp where deptno = d.deptno) max_sal3 from dept d;DNAME MAX_SAL
-------------- ----------
ACCOUNTING 5000
RESEARCH 3000
SALES 2850
OPERATIONS
我们利用以上脚本将刚运行过的标量子查询抓出:
SQL> select sql_id, sql_text, module2 from v$sql3 where parsing_schema_name = 'SCOTT'4 and module = 'SQL*Plus'5 AND sql_id in6 (select sql_id7 from (select sql_id,8 count(*) over(partition by sql_id, child_number, depth) cnt9 from V$SQL_PLAN10 where depth = 111 and (object_owner = 'SCOTT' or object_owner is null))12 where cnt >= 2);SQL_ID SQL_TEXT MODULE
--------------- ---------------------------------------------- ---------------------
739fhcu0pbz28 select dname, (select max(sal) from emp where SQL*Plusdeptno = d.deptno) max_sal from dept d
6、抓出带有自定义函数的SQL
此脚本不依赖统计信息。在开发过程中,我们应该避免在SQL语句中调用自定义函数。我们可以通过以下SQL语句抓出SQL语句中调用了自定义函数的SQL:
select distinct sql_id, sql_text, modulefrom V$SQL,(select object_namefrom DBA_OBJECTS Owhere owner = 'SCOTT'and object_type in ('FUNCTION', 'PACKAGE'))where (instr(upper(sql_text), object_name) > 0)and plsql_exec_time > 0and regexp_like(upper(sql_fulltext), '^[SELECT]')and parsing_schema_name = 'SCOTT';
我们在Scott账户中创建如下函数:
create or replace function f_getdname(v_deptno in number) return varchar2 asv_dname dept.dname%type;
beginselect dname into v_dname from dept where deptno = v_deptno;return v_dname;
end f_getdname;
/
然后我们在Scott账户中运行如下SQL:
SQL> select empno,sal,f_getdname(deptno) dname from emp;EMPNO SAL DNAME
---------- ---------- -------------------------7369 800 RESEARCH7499 1600 SALES7521 1250 SALES7566 2975 RESEARCH7654 1250 SALES7698 2850 SALES7782 2450 ACCOUNTING7788 3000 RESEARCH7839 5000 ACCOUNTING7844 1500 SALES7876 1100 RESEARCH7900 950 SALES7902 3000 RESEARCH7934 1300 ACCOUNTING
我们通过脚本抓出刚执行过的SQL语句:
SQL> select distinct sql_id, sql_text, module2 from V$SQL,3 (select object_name4 from DBA_OBJECTS O5 where owner = 'SCOTT'6 and object_type in ('FUNCTION', 'PACKAGE'))7 where (instr(upper(sql_text), object_name) > 0)8 and plsql_exec_time > 09 and regexp_like(upper(sql_fulltext), '^[SELECT]')10 and parsing_schema_name = 'SCOTT';SQL_ID SQL_TEXT MODULE
--------------- ------------------------------------------------------- ---------
2ck71xc69j49u select empno,sal,f_getdname(deptno) dname from emp SQL*Plus
7、抓出表被多次反复调用SQL
此脚本不依赖统计信息。在开发过程中,我们应该避免在同一个SQL语句中对同一个表多次访问。我们可以通过下面SQL抓出同一个SQL语句中对某个表进行多次扫描的SQL:
select a.parsing_schema_name schema,a.sql_id,a.sql_text,b.object_name,b.cntfrom v$sql a,(select *from (select sql_id,child_number,object_owner,object_name,object_type,count(*) cntfrom v$sql_planwhere object_owner = 'SCOTT'group by sql_id,child_number,object_owner,object_name,object_type)where cnt >= 2) bwhere a.sql_id = b.sql_idand a.child_number = b.child_number;
我们在Scott账户中运行如下SQL:
select ename,job,deptno from emp where sal>(select avg(sal) from emp);
以上SQL访问了emp表两次,我们可以通过脚本将其抓出:
SQL> select a.parsing_schema_name schema,2 a.sql_id,3 a.sql_text,4 b.object_name,5 b.cnt6 from v$sql a,7 (select *8 from (select sql_id,9 child_number,10 object_owner,11 object_name,12 object_type,13 count(*) cnt14 from v$sql_plan15 where object_owner = 'SCOTT'16 group by sql_id,17 child_number,18 object_owner,19 object_name,20 object_type)21 where cnt >= 2) b22 where a.sql_id = b.sql_id23 and a.child_number = b.child_number;SCHEMA SQL_ID SQL_TEXT OBJECT_NAME CNT
--------------- --------------- ----------------------------- ------------ ----------
SCOTT fdt0z70z43vgv select ename,job,deptno from EMP 2emp where sal>(select avg(sal)from emp)
8、抓出走了FILTER的SQL
此脚本不依赖统计信息。当where子查询没能unnest,执行计划中就会出现FILTER,对于此类SQL,我们应该在上线之前对其进行改写,避免执行计划中出现FILTER,以下脚本可以抓出where子查询没能unnest的SQL:
select parsing_schema_name schema, sql_id, sql_textfrom v$sqlwhere parsing_schema_name = 'SCOTT'and (sql_id, child_number) in(select sql_id, child_numberfrom v$sql_planwhere operation = 'FILTER'and filter_predicates like '%IS NOT NULL%'minusselect sql_id, child_numberfrom v$sql_planwhere object_owner = 'SYS');
9、抓出返回行数较多的嵌套循环SQL
此脚本不依赖统计信息。两表关联返回少量数据应该走嵌套循环,如果返回大量数据,应该走HASH连接,或者是排序合并连接。如果一个SQL语句返回行数较多(大于1万行),SQL的执行计划在最后几步(Id<=5)走了嵌套循环,我们可以判定该执行计划中的嵌套循环是有问题的,应该走HASH连接。以下脚本抓出返回行数较多的嵌套循环SQL:
select *from (select parsing_schema_name schema,sql_id,sql_text,rows_processed / executions rows_processedfrom v$sqlwhere parsing_schema_name = 'SCOTT'and executions > 0and rows_processed / executions > 10000order by 4 desc) awhere a.sql_id in (select sql_idfrom v$sql_planwhere operation like '%NESTED LOOPS%'and id <= 5);
10、抓出NL被驱动表走了全表扫描的SQL
此脚本不依赖统计信息。嵌套循环的被驱动表应该走索引,以下脚本抓出嵌套循环被驱动表走了全表扫描的SQL,同时根据表大小降序显示:
select c.sql_text, a.sql_id, b.object_name, d.mbfrom v$sql_plan a,(select *from (select sql_id,child_number,object_owner,object_name,parent_id,operation,options,row_number() over(partition by sql_id, child_number, parent_id order by id) rnfrom v$sql_plan)where rn = 2) b,v$sql c,(select owner, segment_name, sum(bytes / 1024 / 1024) mbfrom dba_segmentsgroup by owner, segment_name) dwhere b.sql_id = c.sql_idand b.child_number = c.child_numberand b.object_owner = 'SCOTT'and a.sql_id = b.sql_idand a.child_number = b.child_numberand a.operation like '%NESTED LOOPS%'and a.id = b.parent_idand b.operation = 'TABLE ACCESS'and b.options = 'FULL'and b.object_owner = d.ownerand b.object_name = d.segment_nameorder by 4 desc;
11、抓出走了TABLE ACCESS FULL的SQL
此脚本不依赖统计信息。如果一个大表走了全表扫描,会严重影响SQL性能。这时我们可以查看大表与谁进行关联。如果大表与小表(小结果集)关联,我们可以考虑让大表作为嵌套循环被驱动表,大表走连接列索引。如果大表与大表(大结果集)关联,我们可以检查大表过滤条件是否可以走索引,也要检查大表被访问了多少个字段。假设大表有50个字段,但是只访问了其中5个字段,这时我们可以建立一个组合索引,将where过滤字段、表连接字段以及select访问的字段组合在一起,这样就可以直接从索引中获取数据,避免大表全表扫描,从而提升性能。下面脚本抓出走了全表扫描的SQL,同时显示访问了表多少个字段,表一共有多少个字段以及表段大小:
select a.sql_id,a.sql_text,d.table_name,REGEXP_COUNT(b.projection, ']') ||'/'|| d.column_cnt column_cnt,c.size_mb,b.FILTER_PREDICATES filterfrom v$sql a,v$sql_plan b,(select owner, segment_name, sum(bytes / 1024 / 1024) size_mbfrom dba_segmentsgroup by owner, segment_name) c,(select owner, table_name, count(*) column_cntfrom dba_tab_colsgroup by owner, table_name) dwhere a.sql_id = b.sql_idand a.child_number = b.child_numberand b.object_owner = c.ownerand b.object_name = c.segment_nameand b.object_owner = d.ownerand b.object_name = d.table_nameand c.owner = 'SCOTT'and b.operation = 'TABLE ACCESS'and b.options = 'FULL'order by 5 desc;
12、抓出走了INDEX FULL SCAN的SQL
此脚本不依赖统计信息。INDEX FULL SCAN会扫描索引中所有的叶子块,单块读。如果索引很大,执行计划中出现了INDEX FULL SCAN,这时SQL会出现严重的性能问题,因此我们需要抓出走了INDEX FULL SCAN的SQL。以下脚本抓出走了INDEX FULL SCAN的SQL并且根据索引段大小降序显示:
select c.sql_text, c.sql_id, b.object_name, d.mbfrom v$sql_plan b,v$sql c,(select owner, segment_name, sum(bytes / 1024 / 1024) mbfrom dba_segmentsgroup by owner, segment_name) dwhere b.sql_id = c.sql_idand b.child_number = c.child_numberand b.object_owner = 'SCOTT'and b.operation = 'INDEX'and b.options = 'FULL SCAN'and b.object_owner = d.ownerand b.object_name = d.segment_nameorder by 4 desc;
13、抓出走了INDEX SKIP SCAN的SQL
此脚本不依赖统计信息。当执行计划中出现了INDEX SKIP SCAN,通常说明需要额外添加一个索引。以下脚本抓出走了INDEX SKIP SCAN的SQL:
select c.sql_text, c.sql_id, b.object_name, d.mbfrom v$sql_plan b,v$sql c,(select owner, segment_name, sum(bytes / 1024 / 1024) mbfrom dba_segmentsgroup by owner, segment_name) dwhere b.sql_id = c.sql_idand b.child_number = c.child_numberand b.object_owner = 'SCOTT'and b.operation = 'INDEX'and b.options = 'SKIP SCAN'and b.object_owner = d.ownerand b.object_name = d.segment_nameorder by 4 desc;
14、抓出索引被哪些SQL引用
此脚本不依赖统计信息。有时开发人员可能会胡乱建立一些索引,但是这些索引在数据库中可能并不会被任何一个SQL使用。这样的索引会增加维护成本,我们可以将其删掉。下面脚本查询SQL使用哪些索引:
select a.sql_text, a.sql_id, b.object_owner, b.object_name, b.object_typefrom v$sql a, v$sql_plan bwhere a.sql_id = b.sql_idand a.child_number = b.child_numberand object_owner = 'SCOTT'and object_type like '%INDEX%'
order by 3,4,5;
15、 抓出走了笛卡儿积的SQL
当两表没有关联条件的时候就会走笛卡儿积,当Rows被估算为1的时候,也可能走笛卡儿积连接。下面脚本抓出走了笛卡儿积的SQL:
select c.sql_text,a.sql_id,b.object_name,a.filter_predicates filter,a.access_predicates predicate,d.mbfrom v$sql_plan a,(select *from (select sql_id,child_number,object_owner,object_name,parent_id,operation,options,row_number() over(partition by sql_id, child_number, parent_id order by id) rnfrom v$sql_plan)where rn = 1) b,v$sql c,(select owner, segment_name, sum(bytes / 1024 / 1024) mbfrom dba_segmentsgroup by owner, segment_name) dwhere b.sql_id = c.sql_idand b.child_number = c.child_numberand b.object_owner = 'SCOTT'and a.sql_id = b.sql_idand a.child_number = b.child_numberand a.operation = 'MERGE JOIN'and a.id = b.parent_idand a.options = 'CARTESIAN'and b.object_owner = d.ownerand b.object_name = d.segment_nameorder by 4 desc;
16、抓出走了错误的排序合并连接的SQL
此脚本不依赖统计信息。排序合并连接一般用于非等值关联,如果两表是等值关联,我们建议使用HASH连接代替排序合并连接,因为HASH连接只需要将驱动表放入PGA中,而排序合并连接要么是将两个表放入PGA中,要么是将一个表放入PGA中、另外一个表走INDEX FULL SCAN,然后回表。如果两表是等值关联并且两表比较大,这时应该走HASH连接而不是排序合并连接。下面脚本抓出两表等值关联但是走了排序合并连接的SQL,同时显示离MERGE JOIN关键字较远的表的段大小(太大PGA放不下):
select c.sql_id, c.sql_text, d.owner, d.segment_name, d.mbfrom v$sql_plan a,v$sql_plan b,v$sql c,(select owner, segment_name, sum(bytes / 1024 / 1024) mbfrom dba_segmentsgroup by owner, segment_name) dwhere a.sql_id = b.sql_idand a.child_number = b.child_numberand b.operation = 'SORT'and b.options = 'JOIN'and b.access_predicates like '%"="%'and a.parent_id = b.idand a.object_owner = 'SCOTT'and b.sql_id = c.sql_idand b.child_number = c.child_numberand a.object_owner = d.ownerand a.object_name = d.segment_nameorder by 4 desc;
17、抓出LOOP套LOOP的PSQL
此脚本不依赖统计信息。在编写PLSQL的时候,我们应该尽量避免LOOP套LOOP,因为双层循环,最内层循环类似笛卡儿积。假设外层循环返回1 000行数据,内层循环返回1 000行数据,那么内层循环里面的代码就会执行1000*1000次。以下脚本可以抓出LOOP套LOOP的PLSQL:
with x as
(select / *+ materialize */ owner,name,type,line,text,rownum rn from dba_source where (upper(text) like '%END%LOOP%' or upper(text) like '%FOR%LOOP%'))
select a.owner,a.name,a.type from x a,x b
where ((upper(a.text) like '%END%LOOP%'
and upper(b.text) like '%END%LOOP%'
and a.rn+1=b.rn)
or (upper(a.text) like '%FOR%LOOP%'
and upper(b.text) like '%FOR%LOOP%'
and a.rn+1=b.rn))
and a.owner=b.owner
and a.name=b.name
and a.type=b.type
and a.owner='SCOTT';
18、抓出走了低选择性索引的SQL
此脚本依赖统计信息。如果一个索引选择性很低,说明列数据分布不均衡。当SQL走了数据分布不均衡列的索引,很容易走错执行计划,此时我们应该检查SQL语句中是否有其他过滤条件,如果有其他过滤条件,可以考虑建立组合索引,将选择性高的列作为引导列;如果没有其他过滤条件,应该检查列是否有收集直方图。以下脚本抓出走了低选择性索引的SQL:
select c.sql_id,c.sql_text,b.index_name,e.table_name,trunc(d.num_distinct / e.num_rows * 100, 2) selectivity,d.num_distinct,e.num_rowsfrom v$sql_plan a,(select *from (select index_owner,index_name,table_owner,table_name,column_name,count(*) over(partition by index_owner, index_name, table_owner, table_name) cntfrom dba_ind_columns)where cnt = 1) b,v$sql c,dba_tab_col_statistics d,dba_tables ewhere a.object_owner = b.index_ownerand a.object_name = b.index_nameand b.index_owner = 'SCOTT'and a.access_predicates is not nulland a.sql_id = c.sql_idand a.child_number = c.child_numberand d.owner = e.ownerand d.table_name = e.table_nameand b.table_owner = e.ownerand b.table_name = e.table_nameand d.column_name = b.column_nameand d.table_name = b.table_nameand d.num_distinct / e.num_rows < 0.1;
19、抓出可以创建组合索引的SQL(回表再过滤选择性高的列)
回表次数太多会严重影响SQL性能。当执行计划中发生了回表再过滤并且过滤字段的选择性比较高,我们可以将过滤字段包含在索引中避免回表再过滤,从而减少回表次数,提升查询性能。以下脚本抓出回表再过滤选择性较高的列:
select a.sql_id,a.sql_text,f.table_name,c.size_mb,e.column_name,round(e.num_distinct / f.num_rows * 100, 2) selectivityfrom v$sql a,v$sql_plan b,(select owner, segment_name, sum(bytes / 1024 / 1024) size_mbfrom dba_segmentsgroup by owner, segment_name) c,dba_tab_col_statistics e,dba_tables fwhere a.sql_id = b.sql_idand a.child_number = b.child_numberand b.object_owner = c.ownerand b.object_name = c.segment_nameand e.owner = f.ownerand e.table_name = f.table_nameand b.object_owner = f.ownerand b.object_name = f.table_nameand instr(b.filter_predicates, e.column_name) > 0and (e.num_distinct / f.num_rows) > 0.1and c.owner = 'SCOTT'and b.operation = 'TABLE ACCESS'and b.options = 'BY INDEX ROWID'and e.owner = 'SCOTT'order by 4 desc;
20、抓出可以创建组合索引的SQL(回表只访问少数字段)
此脚本不依赖统计信息。我们在第1章中讲到,回表次数太多会严重影响SQL性能。当SQL走索引回表只访问表中少部分字段,我们可以将这些字段与过滤条件组合起来建立为一个组合索引,这样就能避免回表,从而提升查询性能。下面脚本抓出回表只访问少数字段的SQL:
select a.sql_id,a.sql_text,d.table_name,REGEXP_COUNT(b.projection, ']') ||'/'|| d.column_cntcolumn_cnt,c.size_mb,...