Database Tutorial

Course Tutorial Site

Site Admin

SQL Statement Tuning

with 5 comments

SQL statement tuning is an advanced topic because it requires that you understand table definitions, constraints, and indexes plus how they interact through joins. As such, Oracle presents the most advanced set of tools for tuning SQL statements.

This example works through a demonstration of how you can tune statements in databases. The key elements of tuning SQL statements are understanding the structures and data. The key worry about tuning is that optimizing a query can harm the effectiveness of OLTP processing by INSERT and UPDATE statements. Any DML statement that adds or modifies data in tables also modifies indexes built to support queries against the table.

Oracle

You have two approaches in an Oracle database. The old way and the new way. They’re really not that different but the DBMS_XPLAN package provides a richer set of analytical data for less work.

The old way and the new way both use a single way for generating the SQL execution plan. You do that with the EXPLAIN PLAN statement.

The EXPLAIN PLAN statement is followed by a SET operator. You use the SET operator to assign a value to the STATEMENT_ID column. This is a column that uniquely identifies statements in the PLAN_TABLE. You must ensure that all STATEMENT_ID values are unique, or you’ll get incorrect results when you analyze plans. There’s no constraint that magically does this for you. After you set the statement identifier, you use the FOR keyword before the statement that you want to analyze.

An example follows below (tied to Lab #7).

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
EXPLAIN PLAN
SET STATEMENT_ID = '&input'
FOR
SELECT   DISTINCT
         r.rental_id
,        c.contact_id
,        tu.check_out_date AS check_out_date
,        tu.return_date AS return_date
,        3 AS created_by
,        TRUNC(SYSDATE) AS creation_date
,        3 AS last_updated_by
,        TRUNC(SYSDATE) AS last_update_date
FROM     member m INNER JOIN contact c
ON       m.member_id = c.member_id INNER JOIN transaction_upload tu
ON       c.first_name = tu.first_name
AND      NVL(c.middle_name,'x') = NVL(tu.middle_name,'x')
AND      c.last_name = tu.last_name
AND      tu.account_number = m.account_number LEFT JOIN rental r
ON       c.contact_id = r.customer_id
AND      tu.check_out_date = r.check_out_date
AND      tu.return_date = r.return_date;

After you explain a statement, you need to analyze it. Oracle uses hierarchical queries to analyze them. The following is a simple script that lets you display the execution plan by nesting the levels of the explained plan.

Here’s where you chose to follow the old or new way. Click on the link below to see the old way if you’re interested but I’d suggest you use the new way.

Oracle’s New Way

Oracle’s new way uses the DBMS_XPLAN package. In this post, you’ll examine only how to display results and check the impact of small changes that lead to improved query performance.

Assuming you entered Query1 as the &input response to the EXPLAIN PLAN statement above, then you’d call it with the following syntax:

1
2
3
4
5
6
7
8
-- Expand the width of displayed output.
SET LINESIZE 120
 
-- Query using the TABLE function that returns a user-defined object type (UDT)
-- into a SQL aggregate table, which is a fancy word for normal SQL output
-- (unfortunately, it's the technically accurate word).
SELECT *
FROM   TABLE(dbms_xplan.display(NULL,'Query1'));

The query using the DISPLAY function of the DBMS_XPLAN returns a table of data. It should generate an output stream like the following:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------
Plan hash value: 3289798709
 
-----------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |                    |   229 | 24274 |    50   (8)| 00:00:01 |
|   1 |  HASH UNIQUE                   |                    |   229 | 24274 |    50   (8)| 00:00:01 |
|*  2 |   HASH JOIN OUTER              |                    |   229 | 24274 |    49   (7)| 00:00:01 |
|   3 |    VIEW                        |                    |     4 |   248 |    37   (6)| 00:00:01 |
|*  4 |     HASH JOIN                  |                    |     4 |   572 |    37   (6)| 00:00:01 |
|*  5 |      HASH JOIN                 |                    |    15 |  1230 |     7  (15)| 00:00:01 |
|   6 |       TABLE ACCESS FULL        | MEMBER             |     9 |   180 |     3   (0)| 00:00:01 |
|   7 |       TABLE ACCESS FULL        | CONTACT            |    15 |   930 |     3   (0)| 00:00:01 |
|   8 |      EXTERNAL TABLE ACCESS FULL| TRANSACTION_UPLOAD |  8168 |   486K|    30   (4)| 00:00:01 |
|   9 |    TABLE ACCESS FULL           | RENTAL             |  4689 |   201K|    11   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("TU"."RETURN_DATE"="R"."RETURN_DATE"(+) AND
              "TU"."CHECK_OUT_DATE"="R"."CHECK_OUT_DATE"(+) AND "C"."CONTACT_ID"="R"."CUSTOMER_ID"(+))
   4 - access("C"."FIRST_NAME"="TU"."FIRST_NAME" AND
              NVL("C"."MIDDLE_NAME",'x')=NVL("TU"."MIDDLE_NAME",'x') AND "C"."LAST_NAME"="TU"."LAST_NAME"
              AND "TU"."ACCOUNT_NUMBER"="M"."ACCOUNT_NUMBER")
   5 - access("C"."MEMBER_ID"="M"."MEMBER_ID")
 
Note
-----
   - dynamic sampling used for this statement
 
30 rows selected.

The first thing to do is move the data immediately from an external table into a production table. You do that with the following command:

1
2
-- Create a new table from an existing one.
CREATE TABLE transaction_uploaded AS SELECT * FROM transaction_upload;

Truncating the PLAN_TABLE table, is the easiest way to start over by reusing the statement identifiers, like Query1. Then, you can edit your EXPLAIN PLAN script and change TRANSACTION_UPLOAD table reference to TRANSACTION_UPLOADED. Now, the query using the DISPLAY function of the DBMS_XPLAN returns a reduction in cost from 50 to 34.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------
Plan hash value: 3624831533
 
--------------------------------------------------------------------------------------------
| Id  | Operation              | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |                   |   400 | 42400 |    34   (6)| 00:00:01 |
|   1 |  HASH UNIQUE           |                   |   400 | 42400 |    34   (6)| 00:00:01 |
|*  2 |   HASH JOIN OUTER      |                   |   400 | 42400 |    33   (4)| 00:00:01 |
|   3 |    VIEW                |                   |     7 |   434 |    22   (5)| 00:00:01 |
|   4 |     NESTED LOOPS       |                   |     7 |  1001 |    22   (5)| 00:00:01 |
|*  5 |      HASH JOIN         |                   |    15 |  1230 |     7  (15)| 00:00:01 |
|   6 |       TABLE ACCESS FULL| MEMBER            |     9 |   180 |     3   (0)| 00:00:01 |
|   7 |       TABLE ACCESS FULL| CONTACT           |    15 |   930 |     3   (0)| 00:00:01 |
|*  8 |      INDEX RANGE SCAN  | IMPORT_DATE_RANGE |     1 |    61 |     1   (0)| 00:00:01 |
|   9 |    TABLE ACCESS FULL   | RENTAL            |  4689 |   201K|    11   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("TU"."RETURN_DATE"="R"."RETURN_DATE"(+) AND
              "TU"."CHECK_OUT_DATE"="R"."CHECK_OUT_DATE"(+) AND
              "C"."CONTACT_ID"="R"."CUSTOMER_ID"(+))
   5 - access("C"."MEMBER_ID"="M"."MEMBER_ID")
   8 - access("TU"."ACCOUNT_NUMBER"="M"."ACCOUNT_NUMBER" AND
              "C"."LAST_NAME"="TU"."LAST_NAME" AND "C"."FIRST_NAME"="TU"."FIRST_NAME" AND
              NVL("C"."MIDDLE_NAME",'x')=NVL("MIDDLE_NAME",'x'))
 
Note
-----
   - dynamic sampling used for this statement
 
31 rows selected.

A downside or quirk of thought, I’m never quite sure but folks look to the bottom not the middle first. Maybe in this case you tackle it first because it’s the attractive inefficiency due to the number of rows returned by a full table scan (most expensive).

You add a unique index that maps to the natural key for the RENTAL table, like this:

1
2
CREATE UNIQUE INDEX natural_key_rental
ON rental (rental_id, customer_id, check_out_date, return_date);

You run the EXPLAIN PLAN again, using Query2, and then analyze it. You’ll see that the query uses an index fast full scan and that the cost decreases by 2.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------
Plan hash value: 3402838417
 
---------------------------------------------------------------------------------------------
| Id  | Operation              | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |                    |   400 | 42400 |    32   (7)| 00:00:01 |
|   1 |  HASH UNIQUE           |                    |   400 | 42400 |    32   (7)| 00:00:01 |
|*  2 |   HASH JOIN OUTER      |                    |   400 | 42400 |    31   (4)| 00:00:01 |
|   3 |    VIEW                |                    |     7 |   434 |    22   (5)| 00:00:01 |
|   4 |     NESTED LOOPS       |                    |     7 |  1001 |    22   (5)| 00:00:01 |
|*  5 |      HASH JOIN         |                    |    15 |  1230 |     7  (15)| 00:00:01 |
|   6 |       TABLE ACCESS FULL| MEMBER             |     9 |   180 |     3   (0)| 00:00:01 |
|   7 |       TABLE ACCESS FULL| CONTACT            |    15 |   930 |     3   (0)| 00:00:01 |
|*  8 |      INDEX RANGE SCAN  | IMPORT_DATE_RANGE  |     1 |    61 |     1   (0)| 00:00:01 |
|   9 |    INDEX FAST FULL SCAN| NATURAL_KEY_RENTAL |  4689 |   201K|     9   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("TU"."RETURN_DATE"="R"."RETURN_DATE"(+) AND
              "TU"."CHECK_OUT_DATE"="R"."CHECK_OUT_DATE"(+) AND
              "C"."CONTACT_ID"="R"."CUSTOMER_ID"(+))
   5 - access("C"."MEMBER_ID"="M"."MEMBER_ID")
   8 - access("TU"."ACCOUNT_NUMBER"="M"."ACCOUNT_NUMBER" AND
              "C"."LAST_NAME"="TU"."LAST_NAME" AND "C"."FIRST_NAME"="TU"."FIRST_NAME" AND
              NVL("C"."MIDDLE_NAME",'x')=NVL("MIDDLE_NAME",'x'))
 
Note
-----
   - dynamic sampling used for this statement
 
31 rows selected.

You add a unique index that maps the surrogate and natural key for the MEMBER table, like this:

1
2
CREATE UNIQUE INDEX member_account
ON member (member_id, account_number);

You run the EXPLAIN PLAN again, using Query3, and then analyze it. You’ll see that the query uses an index range scan instead of a full table access of the MEMBER table. This decreases the cost by 2.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------
Plan hash value: 656512492
 
---------------------------------------------------------------------------------------------
| Id  | Operation              | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |                    |   400 | 42400 |    30   (7)| 00:00:01 |
|   1 |  HASH UNIQUE           |                    |   400 | 42400 |    30   (7)| 00:00:01 |
|*  2 |   HASH JOIN OUTER      |                    |   400 | 42400 |    29   (4)| 00:00:01 |
|   3 |    VIEW                |                    |     7 |   434 |    19   (0)| 00:00:01 |
|   4 |     NESTED LOOPS       |                    |     7 |  1001 |    19   (0)| 00:00:01 |
|   5 |      NESTED LOOPS      |                    |    15 |  1230 |     4   (0)| 00:00:01 |
|   6 |       TABLE ACCESS FULL| CONTACT            |    15 |   930 |     3   (0)| 00:00:01 |
|*  7 |       INDEX RANGE SCAN | MEMBER_ACCOUNT     |     1 |    20 |     1   (0)| 00:00:01 |
|*  8 |      INDEX RANGE SCAN  | IMPORT_DATE_RANGE  |     1 |    61 |     1   (0)| 00:00:01 |
|   9 |    INDEX FAST FULL SCAN| NATURAL_KEY_RENTAL |  4689 |   201K|     9   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("TU"."RETURN_DATE"="R"."RETURN_DATE"(+) AND
              "TU"."CHECK_OUT_DATE"="R"."CHECK_OUT_DATE"(+) AND
              "C"."CONTACT_ID"="R"."CUSTOMER_ID"(+))
   7 - access("C"."MEMBER_ID"="M"."MEMBER_ID")
   8 - access("TU"."ACCOUNT_NUMBER"="M"."ACCOUNT_NUMBER" AND
              "C"."LAST_NAME"="TU"."LAST_NAME" AND "C"."FIRST_NAME"="TU"."FIRST_NAME" AND
              NVL("C"."MIDDLE_NAME",'x')=NVL("MIDDLE_NAME",'x'))
 
Note
-----
   - dynamic sampling used for this statement
 
31 rows selected.

You add a unique index that maps the surrogate, foreign key to the MEMBER table, and the natural key for the CONTACT table, like this:

1
2
CREATE UNIQUE INDEX contact_member
ON contact (contact_id, member_id, last_name, first_name, NVL(middle_name,'x'));

You run the EXPLAIN PLAN again, using Query3, and then analyze it. You’ll see that the query uses an index full scan instead of a full table access of the CONTACT table. This decreases the cost by another 2.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------
Plan hash value: 1185696375
 
---------------------------------------------------------------------------------------------
| Id  | Operation              | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |                    |   400 | 42400 |    28   (8)| 00:00:01 |
|   1 |  HASH UNIQUE           |                    |   400 | 42400 |    28   (8)| 00:00:01 |
|*  2 |   HASH JOIN OUTER      |                    |   400 | 42400 |    27   (4)| 00:00:01 |
|   3 |    VIEW                |                    |     7 |   434 |    17   (0)| 00:00:01 |
|   4 |     NESTED LOOPS       |                    |     7 |  1001 |    17   (0)| 00:00:01 |
|   5 |      NESTED LOOPS      |                    |    15 |  1230 |     2   (0)| 00:00:01 |
|   6 |       INDEX FULL SCAN  | CONTACT_MEMBER     |    15 |   930 |     1   (0)| 00:00:01 |
|*  7 |       INDEX RANGE SCAN | MEMBER_ACCOUNT     |     1 |    20 |     1   (0)| 00:00:01 |
|*  8 |      INDEX RANGE SCAN  | IMPORT_DATE_RANGE  |     1 |    61 |     1   (0)| 00:00:01 |
|   9 |    INDEX FAST FULL SCAN| NATURAL_KEY_RENTAL |  4689 |   201K|     9   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("TU"."RETURN_DATE"="R"."RETURN_DATE"(+) AND
              "TU"."CHECK_OUT_DATE"="R"."CHECK_OUT_DATE"(+) AND
              "C"."CONTACT_ID"="R"."CUSTOMER_ID"(+))
   7 - access("C"."MEMBER_ID"="M"."MEMBER_ID")
   8 - access("TU"."ACCOUNT_NUMBER"="M"."ACCOUNT_NUMBER" AND
              "C"."LAST_NAME"="TU"."LAST_NAME" AND "C"."FIRST_NAME"="TU"."FIRST_NAME" AND
              NVL("MIDDLE_NAME",'x')=NVL("MIDDLE_NAME",'x'))
 
Note
-----
   - dynamic sampling used for this statement
 
31 rows selected.

The question evolves after you tune the first query. Then, you must check what impacts may exist on other queries, like the statement that you use to merge data into the TRANSACTION table. You may have to suppress indexes in a subsequent query, or after you tune it, return and suppress an index here. You suppress an index by adding a '0' to a NUMBER or DATE column, and by concatenating a null string to a character data type.

Examples of Suppressing Indexes

The following is an example of suppressing an index built on a string. It extends the tuning statement earlier in the blog post, and references line #17. The concatenation of a null string suppresses the index.

17
AND      c.last_name = tu.last_name||''

The following is an example of suppressing an index built on a date. It extends the tuning statement earlier in the blog post, and references line #20. Adding a zero leaves the value intact and suppresses the index.

20
AND      tu.check_out_date = r.check_out_date+0

Written by michaelmclaughlin

October 8th, 2009 at 9:30 pm

Posted in

5 Responses to 'SQL Statement Tuning'

Subscribe to comments with RSS or TrackBack to 'SQL Statement Tuning'.

  1. For personal clarification,your blog states “You suppress an index by adding a ‘0’ to a NUMBER or DATE column, and by concatenating a null string to a character data type.” I was honestly confused are your referring to the select statements we are using for the tuning or the table we are querying in order to retrieve the data? Also I couldn’t find an example on the blog, is there an example of adding ‘0’ that I could reference?

    Johnny M.

    9 Apr 10 at 11:11 am

  2. “Truncating the table to start over with Query1.” To which table are you referring to when you say “Truncate”?

    Johnny M.

    9 Apr 10 at 12:03 pm

  3. I’ve made the change, let me know if you like it.

    michaelmclaughlin

    10 Apr 10 at 6:01 pm

  4. I’ve added a section. Please let me know if you find it adequate.

    michaelmclaughlin

    10 Apr 10 at 6:11 pm

  5. That was well explained. I wish I would’ve known some of this for lab 7.

    Michael Higginson

    20 Jul 10 at 1:09 pm

Leave a Reply