One day, users angrily complained that the Oracle database had suddenly become unbearably slow. They could feel the lag but had no idea when the slowdown actually started.

The Standout Troublemaker

Below is the “SQL Statistics” section from the latest AWR report:

The top SQL consumed more than 99% of Total DB Time. It executed 19 times, with an average elapsed time of 66 seconds. The next two PL/SQL procedures also showed poor performance: one ran 18 times, the other once, both with similar runtime. It was obvious that these PL/SQL procedures were invoking the same SQL. That SQL was the main culprit behind the slowdown.

Let’s check its execution plan in the AWR repository:

SQL> select * from table(dbms_xplan.display_awr('g81cbrq5yamf5'));

                                                                                          PLAN_TABLE_OUTPUT
___________________________________________________________________________________________________________
SQL_ID g81cbrq5yamf5
--------------------
SELECT ADDRESS_ID, CUSTOMER_ID, DATE_CREATED, HOUSE_NO_OR_NAME,
STREET_NAME, TOWN, COUNTY, COUNTRY, POST_CODE, ZIP_CODE FROM ADDRESSES
WHERE CUSTOMER_ID = :B2 AND ROWNUM < :B1

Plan hash value: 1286489376

--------------------------------------------------------------------------------
| Id  | Operation          | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           |       |       |  2681K(100)|          |
|*  1 |  COUNT STOPKEY     |           |       |       |            |          |
|*  2 |   TABLE ACCESS FULL| ADDRESSES |     2 |   154 |  2681K  (1)| 00:01:45 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM<:B1)
   2 - filter("CUSTOMER_ID"=:B2)

SQL_ID g81cbrq5yamf5
--------------------
SELECT ADDRESS_ID, CUSTOMER_ID, DATE_CREATED, HOUSE_NO_OR_NAME,
STREET_NAME, TOWN, COUNTY, COUNTRY, POST_CODE, ZIP_CODE FROM ADDRESSES
WHERE CUSTOMER_ID = :B2 AND ROWNUM < :B1

Plan hash value: 2480532011

--------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                 |       |       |     6 (100)|          |
|*  1 |  COUNT STOPKEY                       |                 |       |       |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| ADDRESSES       |     2 |   154 |     6   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN                  | ADDRESS_CUST_IX |     2 |       |     4   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM<:B1)
   3 - access("CUSTOMER_ID"=:B2)

   PLAN_TABLE_OUTPUT
____________________

45 rows selected.

The output shows two execution plans for the same SQL:

Plan 1 (hash 1286489376): Performs a full table scan on the ADDRESSES table, with a cost of 2.68 million and an elapsed time of about 1 minute 45 seconds.

Plan 2 (hash 2480532011): Uses the ADDRESS_CUST_IX index, with a cost of only 6 and a runtime of around 1 second.

The difference is dramatic. Why did Plan 1 stop using the index? To find out, we checked the current index metadata:

SELECT index_name, index_type, uniqueness
FROM user_indexes
WHERE table_name = 'ADDRESSES';
        INDEX_NAME    INDEX_TYPE    UNIQUENESS
__________________ _____________ _____________
ADDRESS_PK         NORMAL/REV    UNIQUE

The result shows only one index, ADDRESS_PK. The ADDRESS_CUST_IX index from Plan 2 no longer exists. It might have been dropped or lost. That’s a clear lead.

Building a Timeline for the Plan Change

Now we know performance dropped because the SQL switched from Plan 2 to Plan 1. But when exactly did this switch happen? We can track it usingDBA_HIST_SQLSTAT, which stores historical SQL performance data, including plan hash values and elapsed times.

Here’s the query:

set veri off
define top_sql_id='g81cbrq5yamf5'

SELECT b.snap_id,
       TO_CHAR(b.end_interval_time, 'HH24:MI') AS snap_time,
       a.plan_hash_value,
       TRUNC(a.elapsed_time_delta / 1000000 / NULLIF(a.executions_delta, 0), 5) AS avg_elapsed_second
FROM dba_hist_sqlstat a,
     dba_hist_snapshot b
WHERE sql_id = '&top_sql_id'
  AND a.snap_id (+) = b.snap_id
  AND b.begin_interval_time > TRUNC(SYSDATE)
ORDER BY a.snap_id;

Output appears:

SNAP_ID    SNAP_TIME    PLAN_HASH_VALUE    AVG_ELAPSED_SECOND
__________ ____________ __________________ _____________________
   2932 06:00                2480532011               0.00034
   2933 06:30                2480532011               0.00121
   2934 07:00                2480532011               0.00124
   2935 07:30                2480532011               0.00124
   2936 08:00                2480532011               0.00119
   2937 08:30                2480532011               0.00112
   2938 09:00                2480532011               0.00114
   2939 09:30                2480532011               0.00115
   2940 10:00                2480532011                0.0011
   2941 10:30                1286489376              64.99351
   2942 11:00                1286489376              63.29527
   2943 11:30                1286489376              64.50286
   2944 12:00                1286489376              63.36064
   2945 12:30                1286489376              63.42192
   2946 13:00                1286489376              64.47289
   2947 13:30                1286489376              64.52588

16 rows selected.

The efficient Plan 2 (2480532011) ran smoothly until 10:00. At 10:30 (snapshot 2941), the plan switched to 1286489376, and the average runtime jumped from milliseconds to over a minute. The index loss likely occurred around that time.

Zooming In Further

At this point, we have nailed down that the execution plan switch occurred between 10:00 and 10:30. Can we narrow it further? Yes, with the DBA_HIST_ACTIVE_SESS_HISTORY view, which stores samples of active sessions every 10 seconds, we can sift through detailed execution data.

DEFINE top_sql_id='g81cbrq5yamf5'
SELECT sql_id,
       sql_child_number,
       TO_CHAR(sample_time, 'HH24:MI:SS') AS track_time,
       sql_plan_hash_value AS curr_sql_plan
FROM dba_hist_active_sess_history
WHERE snap_id = 2941
  AND sql_id = '&top_sql_id'
ORDER BY sample_time;

This SQL query retrieves execution details for the SQL statement from the view for snapshot ID 2941. It returns the SQL ID, child cursor number, sample time, and current plan hash value, ordered by sample time.

SQL_ID    SQL_CHILD_NUMBER    TRACK_TIME    CURR_SQL_PLAN
________________ ___________________ _____________ ________________
...
g81cbrq5yamf5                      0 10:12:22            2480532011
g81cbrq5yamf5                      0 10:12:32            2480532011
g81cbrq5yamf5                      0 10:14:25            2480532011
g81cbrq5yamf5                      1 10:15:26            1286489376
g81cbrq5yamf5                      1 10:15:26            1286489376
g81cbrq5yamf5                      1 10:15:26            1286489376
...

The output shows plan 2 (2480532011) was used until 10:14:25. By 10:15:26, plan 1 (1286489376) took over. Thus, the plan switch occurred around 10:15 — precisely when performance nosedived.

This example demonstrates how AWR and ASH can be used together — from a macro-level overview to a micro-level investigation — to pinpoint and resolve performance issues.

Leave a comment

I’m Yuan

Welcome to the blog of a performance maven in Oracle & MySQL. Feel free to contact me to share and discuss database knowledge.

My 20 Oracle Certifications

A screenshot from certview.oracle.com

My Book

MySQL 8.0 Operations and Optimization

Let’s connect