你的位置:首页 > 数据库

[数据库]GROUP BY的扩展

GROUP BY的扩展主要包括ROLLUP,CUBE,GROUPING SETS三种形式。

ROLLUP

rollup相对于简单的分组合计增加了小计和合计,解释起来会比较抽象,下面我们来看看具体事例。

例1,统计不同部门工资的总和和所有部门工资的总和。

SQL> select deptno,sum(sal) from emp group by rollup(deptno);  DEPTNO  SUM(SAL)---------- ----------    10    8750    20   10875    30    9400        29025

例2,该例中先对deptno进行分组,再对job进行分组

SQL> select deptno,job,sum(sal) from emp group by rollup(deptno,job);  DEPTNO JOB      SUM(SAL)---------- --------- ----------    10 CLERK      1300  --10号部门中JOB为CLERK的工资的总和    10 MANAGER     2450    10 PRESIDENT    5000    10         8750  --10号所有工种工资的总和    20 CLERK      1900    20 ANALYST     6000    20 MANAGER     2975    20        10875    30 CLERK      950    30 MANAGER     2850    30 SALESMAN    5600    30         9400             29025  --所有部门,所有工种工资的总和13 rows selected.

如果要用普通的分组函数实现,可用UNION ALL语句:

--实现单个部门,单个工种的工资的总和
select deptno,job,sum(sal) from emp group by deptno,jobunion all
--实现单个部门工资的总和
select deptno,null,sum(sal) from emp group by deptnounion all
--实现所有部门工资的总和
select null,null,sum(sal) from emporder by 1,2

下面我们分别来看看两者的执行计划及统计信息,

ROLLUP语句:

Execution Plan-----------------------------------------------------------------------------| Id | Operation     | Name | Rows | Bytes | Cost (%CPU)| Time   |-----------------------------------------------------------------------------|  0 | SELECT STATEMENT   |    |   11 |  132 |   3 (34)| 00:00:01 ||  1 | SORT GROUP BY ROLLUP|    |   11 |  132 |   3 (34)| 00:00:01 ||  2 |  TABLE ACCESS FULL |  EMP |   14 |  168 |   2  (0)| 00:00:01 |-----------------------------------------------------------------------------Statistics----------------------------------------------------------   0 recursive calls   0 db block gets   2 consistent gets   0 physical reads   0 redo size  895 bytes sent via SQL*Net to client  519 bytes received via SQL*Net from client   2 SQL*Net roundtrips to/from client   1 sorts (memory)   0 sorts (disk)   13 rows processed

UNION ALL语句:

Execution Plan-----------------------------------------------------------------------------| Id | Operation     | Name | Rows | Bytes | Cost (%CPU)| Time   |-----------------------------------------------------------------------------|  0 | SELECT STATEMENT   |    |   15 |  150 |   9 (34)| 00:00:01 ||  1 | SORT ORDER BY    |    |   15 |  150 |   8 (75)| 00:00:01 ||  2 |  UNION-ALL     |    |    |    |       |     ||  3 |  HASH GROUP BY   |    |   11 |  132 |   3 (34)| 00:00:01 ||  4 |   TABLE ACCESS FULL|  EMP |   14 |  168 |   2  (0)| 00:00:01 ||  5 |  HASH GROUP BY   |    |   3 |   15 |   3 (34)| 00:00:01 ||  6 |   TABLE ACCESS FULL|  EMP |   14 |   70 |   2  (0)| 00:00:01 ||  7 |  SORT AGGREGATE  |    |   1 |   3 |       |     ||  8 |   TABLE ACCESS FULL|  EMP |   14 |   42 |   2  (0)| 00:00:01 |-----------------------------------------------------------------------------
Statistics---------------------------------------------------------- 0 recursive calls 0 db block gets 6 consistent gets 0 physical reads 0 redo size 895 bytes sent via SQL*Net to client 519 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 1 sorts (memory) 0 sorts (disk) 13 rows processed

不难看出,相同的功能实现,ROLLUP相对于UNION ALL效率有了极大的提升。

 

CUBE

cube相对于rollup,结果输出更加详细。

例1,在本例中还不是很明显。

SQL> select deptno,sum(sal) from emp group by cube(deptno);  DEPTNO  SUM(SAL)---------- ----------        29025    10    8750    20   10875    30    9400

例2,相对于rollup,cube还对工种这一列进行了专门的汇总。

SQL> select deptno,job,sum(sal) from emp group by cube(deptno,job);  DEPTNO JOB      SUM(SAL)---------- --------- ----------             29025      CLERK      4150      ANALYST     6000      MANAGER     8275      SALESMAN    5600      PRESIDENT    5000    10         8750    10 CLERK      1300    10 MANAGER     2450    10 PRESIDENT    5000    20         10875    20  CLERK      1900    20  ANALYST     6000    20 MANAGER     2975    30         9400    30 CLERK      950    30 MANAGER     2850    30 SALESMAN    560018 rows selected.

 

GROUPING SETS

GROUPING SETS相对于ROLLUP和CUBE,结果是分类统计的,可读性更好一些。

例1:

SQL> select deptno,job,to_char(hiredate,'yyyy')hireyear,sum(sal) from emp group by grouping sets(deptno,job,to_char(hiredate,'yyyy'));  DEPTNO JOB     HIRE  SUM(SAL)---------- ---------  ---- ----------      CLERK         4150      SALESMAN        5600      PRESIDENT       5000      MANAGER        8275      ANALYST        6000    30             9400    20            10875    10             8750            1987   4100            1980    800            1982   1300            1981   22825

例2:

SQL> select deptno,job,sum(sal) from emp group by grouping sets(deptno,job);  DEPTNO JOB      SUM(SAL)---------- ---------  ----------      CLERK       4150      SALESMAN     5600      PRESIDENT     5000      MANAGER      8275      ANALYST      6000    30          9400    20         10875    10          87508 rows selected.

对于该例,如何用UNION ALL实现呢?

select null deptno,job,sum(sal) from emp group by jobunion allselect deptno,null,sum(sal) from emp group by deptno;

两者的执行计划及统计信息分别如下:

GROUPING SETS:

Execution Plan--------------------------------------------------------------------------------------------------------| Id | Operation          | Name           | Rows | Bytes | Cost (%CPU)| Time   |--------------------------------------------------------------------------------------------------------|  0 | SELECT STATEMENT      |              |  11 |  352 |  10 (20)| 00:00:01 ||  1 | TEMP TABLE TRANSFORMATION |              |    |    |      |     ||  2 |  LOAD AS SELECT      | SYS_TEMP_0FD9D6795_E71F79 |    |    |      |     ||  3 |  TABLE ACCESS FULL    | EMP            |  14 |  168 |   2  (0)| 00:00:01 ||  4 |  LOAD AS SELECT      | SYS_TEMP_0FD9D6796_E71F79 |    |    |      |     ||  5 |  HASH GROUP BY      |              |   1 |  19 |   3 (34)| 00:00:01 ||  6 |   TABLE ACCESS FULL    | SYS_TEMP_0FD9D6795_E71F79 |   1 |  19 |   2  (0)| 00:00:01 ||  7 |  LOAD AS SELECT      | SYS_TEMP_0FD9D6796_E71F79 |    |    |      |     ||  8 |  HASH GROUP BY      |              |   1 |  26 |   3 (34)| 00:00:01 ||  9 |   TABLE ACCESS FULL    | SYS_TEMP_0FD9D6795_E71F79 |   1 |  26 |   2  (0)| 00:00:01 || 10 |  VIEW           |              |   1 |  32 |   2  (0)| 00:00:01 || 11 |  TABLE ACCESS FULL    | SYS_TEMP_0FD9D6796_E71F79 |   1 |  32 |   2  (0)| 00:00:01 |--------------------------------------------------------------------------------------------------------Statistics----------------------------------------------------------   4 recursive calls   24 db block gets   17 consistent gets   3 physical reads  1596 redo size  819 bytes sent via SQL*Net to client  519 bytes received via SQL*Net from client   2 SQL*Net roundtrips to/from client   0 sorts (memory)   0 sorts (disk)   8 rows processed

UNION ALL:

----------------------------------------------------------------------------| Id | Operation      | Name | Rows | Bytes | Cost (%CPU)| Time   |----------------------------------------------------------------------------|  0 | SELECT STATEMENT  |    |   8 |  65 |   6 (67)| 00:00:01 ||  1 | UNION-ALL     |    |    |    |      |     ||  2 |  HASH GROUP BY   |    |   5 |  50 |   3 (34)| 00:00:01 ||  3 |  TABLE ACCESS FULL| EMP |  14 |  140 |   2  (0)| 00:00:01 ||  4 |  HASH GROUP BY   |    |   3 |  15 |   3 (34)| 00:00:01 ||  5 |  TABLE ACCESS FULL| EMP |  14 |  70 |   2  (0)| 00:00:01 |----------------------------------------------------------------------------Statistics----------------------------------------------------------   0 recursive calls   0 db block gets   4 consistent gets   0 physical reads   0 redo size  819 bytes sent via SQL*Net to client  519 bytes received via SQL*Net from client   2 SQL*Net roundtrips to/from client   0 sorts (memory)   0 sorts (disk)   8 rows processed

和rollup不同的是,grouping sets的效率竟然比同等功能的union all语句低,这实现有点出乎意料。看来,也不可盲目应用Oracle提供的方案,至少,在本例中是如此。