One day, users angrily complained that the Oracle database had suddenly become unbearably slow. They could feel the lag but had no idea when the slowdown actually started.
The Standout Troublemaker
Below is the “SQL Statistics” section from the latest AWR report:

The top SQL consumed more than 99% of Total DB Time. It executed 19 times, with an average elapsed time of 66 seconds. The next two PL/SQL procedures also showed poor performance: one ran 18 times, the other once, both with similar runtime. It was obvious that these PL/SQL procedures were invoking the same SQL. That SQL was the main culprit behind the slowdown.
Let’s check its execution plan in the AWR repository:
SQL> select * from table(dbms_xplan.display_awr('g81cbrq5yamf5'));
PLAN_TABLE_OUTPUT
___________________________________________________________________________________________________________
SQL_ID g81cbrq5yamf5
--------------------
SELECT ADDRESS_ID, CUSTOMER_ID, DATE_CREATED, HOUSE_NO_OR_NAME,
STREET_NAME, TOWN, COUNTY, COUNTRY, POST_CODE, ZIP_CODE FROM ADDRESSES
WHERE CUSTOMER_ID = :B2 AND ROWNUM < :B1
Plan hash value: 1286489376
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 2681K(100)| |
|* 1 | COUNT STOPKEY | | | | | |
|* 2 | TABLE ACCESS FULL| ADDRESSES | 2 | 154 | 2681K (1)| 00:01:45 |
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM<:B1)
2 - filter("CUSTOMER_ID"=:B2)
SQL_ID g81cbrq5yamf5
--------------------
SELECT ADDRESS_ID, CUSTOMER_ID, DATE_CREATED, HOUSE_NO_OR_NAME,
STREET_NAME, TOWN, COUNTY, COUNTRY, POST_CODE, ZIP_CODE FROM ADDRESSES
WHERE CUSTOMER_ID = :B2 AND ROWNUM < :B1
Plan hash value: 2480532011
--------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 6 (100)| |
|* 1 | COUNT STOPKEY | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID BATCHED| ADDRESSES | 2 | 154 | 6 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | ADDRESS_CUST_IX | 2 | | 4 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM<:B1)
3 - access("CUSTOMER_ID"=:B2)
PLAN_TABLE_OUTPUT
____________________
45 rows selected.
The output shows two execution plans for the same SQL:
Plan 1 (hash 1286489376): Performs a full table scan on the ADDRESSES table, with a cost of 2.68 million and an elapsed time of about 1 minute 45 seconds.
Plan 2 (hash 2480532011): Uses the ADDRESS_CUST_IX index, with a cost of only 6 and a runtime of around 1 second.
The difference is dramatic. Why did Plan 1 stop using the index? To find out, we checked the current index metadata:
SELECT index_name, index_type, uniqueness
FROM user_indexes
WHERE table_name = 'ADDRESSES';
INDEX_NAME INDEX_TYPE UNIQUENESS
__________________ _____________ _____________
ADDRESS_PK NORMAL/REV UNIQUE
The result shows only one index, ADDRESS_PK. The ADDRESS_CUST_IX index from Plan 2 no longer exists. It might have been dropped or lost. That’s a clear lead.
Building a Timeline for the Plan Change
Now we know performance dropped because the SQL switched from Plan 2 to Plan 1. But when exactly did this switch happen? We can track it usingDBA_HIST_SQLSTAT, which stores historical SQL performance data, including plan hash values and elapsed times.
Here’s the query:
set veri off
define top_sql_id='g81cbrq5yamf5'
SELECT b.snap_id,
TO_CHAR(b.end_interval_time, 'HH24:MI') AS snap_time,
a.plan_hash_value,
TRUNC(a.elapsed_time_delta / 1000000 / NULLIF(a.executions_delta, 0), 5) AS avg_elapsed_second
FROM dba_hist_sqlstat a,
dba_hist_snapshot b
WHERE sql_id = '&top_sql_id'
AND a.snap_id (+) = b.snap_id
AND b.begin_interval_time > TRUNC(SYSDATE)
ORDER BY a.snap_id;
Output appears:
SNAP_ID SNAP_TIME PLAN_HASH_VALUE AVG_ELAPSED_SECOND
__________ ____________ __________________ _____________________
2932 06:00 2480532011 0.00034
2933 06:30 2480532011 0.00121
2934 07:00 2480532011 0.00124
2935 07:30 2480532011 0.00124
2936 08:00 2480532011 0.00119
2937 08:30 2480532011 0.00112
2938 09:00 2480532011 0.00114
2939 09:30 2480532011 0.00115
2940 10:00 2480532011 0.0011
2941 10:30 1286489376 64.99351
2942 11:00 1286489376 63.29527
2943 11:30 1286489376 64.50286
2944 12:00 1286489376 63.36064
2945 12:30 1286489376 63.42192
2946 13:00 1286489376 64.47289
2947 13:30 1286489376 64.52588
16 rows selected.
The efficient Plan 2 (2480532011) ran smoothly until 10:00. At 10:30 (snapshot 2941), the plan switched to 1286489376, and the average runtime jumped from milliseconds to over a minute. The index loss likely occurred around that time.
Zooming In Further
At this point, we have nailed down that the execution plan switch occurred between 10:00 and 10:30. Can we narrow it further? Yes, with the DBA_HIST_ACTIVE_SESS_HISTORY view, which stores samples of active sessions every 10 seconds, we can sift through detailed execution data.
DEFINE top_sql_id='g81cbrq5yamf5'
SELECT sql_id,
sql_child_number,
TO_CHAR(sample_time, 'HH24:MI:SS') AS track_time,
sql_plan_hash_value AS curr_sql_plan
FROM dba_hist_active_sess_history
WHERE snap_id = 2941
AND sql_id = '&top_sql_id'
ORDER BY sample_time;
This SQL query retrieves execution details for the SQL statement from the view for snapshot ID 2941. It returns the SQL ID, child cursor number, sample time, and current plan hash value, ordered by sample time.
SQL_ID SQL_CHILD_NUMBER TRACK_TIME CURR_SQL_PLAN
________________ ___________________ _____________ ________________
...
g81cbrq5yamf5 0 10:12:22 2480532011
g81cbrq5yamf5 0 10:12:32 2480532011
g81cbrq5yamf5 0 10:14:25 2480532011
g81cbrq5yamf5 1 10:15:26 1286489376
g81cbrq5yamf5 1 10:15:26 1286489376
g81cbrq5yamf5 1 10:15:26 1286489376
...
The output shows plan 2 (2480532011) was used until 10:14:25. By 10:15:26, plan 1 (1286489376) took over. Thus, the plan switch occurred around 10:15 — precisely when performance nosedived.
This example demonstrates how AWR and ASH can be used together — from a macro-level overview to a micro-level investigation — to pinpoint and resolve performance issues.





Leave a comment