In the realm of database performance monitoring, identifying and resolving sporadic performance issues caused by outlier SQL statements can be particularly challenging. Traditional tools often rely on periodic snapshot and averages, which average out spikes and outliers, this limitation therefore obscures the identification of transient performance anomalies. This blog is going to explore a new method called “Sample-Based SQL Runtime Calculation”, which leverages granular raw data from ASH to deduce individual runtime of specific SQL statement. This approach can be used to locate culprit SQL statement responsible for fleeting performance issue.
Sample-Based SQL Runtime Calculation
The method of deducing a SQL runtime is conceptually simple. To get the runtime of a particular SQL with ASH, the SQL execution start time (sql_start_time column) is subtracted from the maximum sample time (sample_time column). The following SQL timelines a specific foreground session:
SELECT sample_id,
TO_CHAR(sample_time, 'HH24:MI:SS') AS sample_time,
TO_CHAR(sql_exec_start, 'HH24:MI:SS') AS sql_start_time,
sql_id,
sql_exec_id
FROM v$active_session_history
WHERE sample_time > SYSDATE - 30 / 60 / 24
AND session_id = 124
AND session_serial# = 39232
ORDER BY sample_id;
The output is:
SAMPLE_ID SAMPLE_T SQL_STAR SQL_ID SQL_EXEC_ID
---------- -------- -------- ------------- -----------
8937592 16:53:48 16:53:47 3wpnrxd1xpncv 16777216
8937593 16:53:49 16:53:47 3wpnrxd1xpncv 16777216
8937594 16:53:50 16:53:47 3wpnrxd1xpncv 16777216
8937595 16:53:51 16:53:47 3wpnrxd1xpncv 16777216
8937596 16:53:52 16:53:47 3wpnrxd1xpncv 16777216
8937597 16:53:53 16:53:47 3wpnrxd1xpncv 16777216
8937598 16:53:54 16:53:47 3wpnrxd1xpncv 16777216
8937599 16:53:55 16:53:47 3wpnrxd1xpncv 16777216
8937600 16:53:56 16:53:47 3wpnrxd1xpncv 16777216
8937601 16:53:57 16:53:47 3wpnrxd1xpncv 16777216
8937602 16:53:58 16:53:47 3wpnrxd1xpncv 16777216
8937603 16:53:59 16:53:47 3wpnrxd1xpncv 16777216
8937604 16:54:00 16:53:47 3wpnrxd1xpncv 16777216
8937605 16:54:01 16:53:47 3wpnrxd1xpncv 16777216
8937606 16:54:02 16:53:47 3wpnrxd1xpncv 16777216
8937607 16:54:03 16:53:47 3wpnrxd1xpncv 16777216
8937608 16:54:04 16:53:47 3wpnrxd1xpncv 16777216
8937609 16:54:05 16:53:47 3wpnrxd1xpncv 16777216
8937610 16:54:06 16:53:47 3wpnrxd1xpncv 16777216
8937611 16:54:07 16:53:47 3wpnrxd1xpncv 16777216
8937612 16:54:08 16:53:47 3wpnrxd1xpncv 16777216
8937613 16:54:09 16:53:47 3wpnrxd1xpncv 16777216
8937614 16:54:10 16:53:47 3wpnrxd1xpncv 16777216
8937615 16:54:11 16:53:47 3wpnrxd1xpncv 16777216
8937616 16:54:12 16:53:47 3wpnrxd1xpncv 16777216
8937617 16:54:13 16:53:47 3wpnrxd1xpncv 16777216
8937618 16:54:15 16:53:47 3wpnrxd1xpncv 16777216
8937619 16:54:16 16:53:47 3wpnrxd1xpncv 16777216
8937620 16:54:17 16:53:47 3wpnrxd1xpncv 16777216
8937621 16:54:18 16:53:47 3wpnrxd1xpncv 16777216
8937622 16:54:19 16:53:47 3wpnrxd1xpncv 16777216
8937623 16:54:20 16:53:47 3wpnrxd1xpncv 16777216
8937624 16:54:21 16:53:47 3wpnrxd1xpncv 16777216
8937625 16:54:22 16:53:47 3wpnrxd1xpncv 16777216
8937626 16:54:23 16:53:47 3wpnrxd1xpncv 16777216
8937627 16:54:24 16:53:47 3wpnrxd1xpncv 16777216
8937628 16:54:25 16:54:24 azrtjtkq79qa6 16777216
8937629 16:54:26 16:54:24 azrtjtkq79qa6 16777216
8937630 16:54:27 16:54:24 azrtjtkq79qa6 16777216
8937631 16:54:28 16:54:24 azrtjtkq79qa6 16777216
8937632 16:54:29 16:54:24 azrtjtkq79qa6 16777216
8937633 16:54:30 16:54:29 dkhax46cjukju 16777216
8937634 16:54:31 16:54:29 dkhax46cjukju 16777216
...
Based on the privided output, the following calculations can be made:
- For the SQL ID
3wpnrxd1xpncv, the start time is16:53:47, the last sample time is16:54:24, the deduced runtime is 37 seconds. - For the SQL ID
azrtjtkq79qa6, the start time is16:54:24,the last sample time is16:54:29, the runtime is 5 seconds.
The analyzed table can be changed from v$active_session_history to dba_hist_active_sess_history. The corresponding SQL query is as follows:
SELECT sample_id,
TO_CHAR(sample_time, 'HH24:MI:SS') AS sample_time,
TO_CHAR(sql_exec_start, 'HH24:MI:SS') AS sql_start_time,
sql_id,
sql_exec_id
FROM dba_hist_active_sess_history
WHERE session_id = 124
AND session_serial# = 39232
ORDER BY sample_id;
The output is :
SAMPLE_ID SAMPLE_T SQL_STAR SQL_ID SQL_EXEC_ID
---------- -------- -------- ------------- -----------
8937596 16:53:52 16:53:47 3wpnrxd1xpncv 16777216
8937606 16:54:02 16:53:47 3wpnrxd1xpncv 16777216
8937616 16:54:12 16:53:47 3wpnrxd1xpncv 16777216
8937626 16:54:23 16:53:47 3wpnrxd1xpncv 16777216
8937636 16:54:33 16:54:29 dkhax46cjukju 16777216
8937646 16:54:43 16:54:29 dkhax46cjukju 16777216
...
- For the SQL ID
3wpnrxd1xpncv, the start time is16:53:47, the maximum sample time is16:54:23, the deduced runtime is 36 seconds. - For the SQL ID
azrtjtkq79qa6, this ID is not present in the new output, indicating this SQL execution has not been captured due to the wider simpling interval.
Analyzing the prior two outputs yields the following observation:
- Actually, SQL ends running between the maximum sample time and the next sample time, so the actual execution is slighter longer than the execution time we deduced.
- Short-lived SQL statements might slip through the net if they start and finish between sampling points. However, if they are executed frequently enough, they will eventually be captured.
v$active_session_historyis more suitable for recent and active session analysis due to its higher sampling frequency (every second) and in-memory storage. This allows for more precise measurement of SQL execution times, especially for short-lived SQL statements.dba_hist_active_sess_historyis better suited for long-term historical analysis due to its lower sampling frequency (every 10 seconds) and disk storage, providing longer data retention. However, it may miss shorter SQL executions and provides less precise time.
By understanding these difference, we can choose the appropriate view based on the specific requirements of our performance analysis or monitor needs.
Use Cases
This method is very useful to locate culprit SQL statements behind random and transient slowdown. Image the following dialog:
User: The application suddenly took a long time to process a request!
DBA: Can you show me what happened?
User: No, it was a random occurrence.
DBA: Has this happened before?
User: No, it seemed like a one-off issue.
DBA: How long does this request usually take?
User: It usually takes a few seconds.
DBA: When did this slow request occur?
User: About two hours ago.
DBA: Can you provide a more specific time?
User: No, I’m not sure exactly when it happened.
To address the user’s complaint, we run the following query to capture 9 longest SQL executions within a one-hour window before and after two hours ago (you can easily change to your desired time window):
-- This query captures 9 longest SQL executions within a one-hour window before and after two hours ago
-- Dated: July 2024
-- Author: Yuan Yao
set tab off
set linesize 200
col run_seconds format 99999
select *
from (
select sql_id,
round((EXTRACT(DAY FROM (max(sample_time) - max(sql_exec_start))) * 86400) +
(EXTRACT(HOUR FROM (max(sample_time) - max(sql_exec_start))) * 3600) +
(EXTRACT(MINUTE FROM (max(sample_time) - max(sql_exec_start))) * 60) +
(EXTRACT(SECOND FROM (max(sample_time) - max(sql_exec_start)))), 2) as run_seconds,
to_char(max(sql_exec_start), 'YYYY-Mon-DD HH24:MI:SS') as exec_start,
to_char(max(sample_time), 'YYYY-Mon-DD HH24:MI:SS') as end_time,
sql_exec_id
from dba_hist_active_sess_history
where sql_exec_start between sysdate - interval '3' hour - interval '30' minute
and sysdate - interval '1' hour + interval '30' minute
and sql_exec_id > 0
group by sql_id, sql_exec_id
order by run_seconds desc
)
where rownum < 10;
An output example:
SQL_ID RUN_SECONDS EXEC_START END_TIME SQL_EXEC_ID
7nz11m353mj22 93.25 2024-Jul-10 13:03:53 2024-Jul-10 13:05:26 16777217
c1nxkb8m44cpz 33.61 2024-Jul-10 13:03:00 2024-Jul-10 13:03:33 16777217
as2dr3ag24gay 11.37 2024-Jul-10 13:03:12 2024-Jul-10 13:03:23 16777224
cx67k8h01fp7y 8.73 2024-Jul-10 13:05:38 2024-Jul-10 13:05:46 16777217
cguxmj11h9gjx 8.09 2024-Jul-10 13:03:46 2024-Jul-10 13:03:54 16777217
b6usrg82hwsa3 8.05 2024-Jul-10 13:12:59 2024-Jul-10 13:13:07 16777228
ajfdc0t0jh5z5 4.49 2024-Jul-10 13:05:32 2024-Jul-10 13:05:36 16777216
as2dr3ag24gay 3.45 2024-Jul-10 13:06:14 2024-Jul-10 13:06:17 16777225
6t9uu481ygn89 2.85 2024-Jul-10 13:03:41 2024-Jul-10 13:03:43 16777217
Let’s consider a scenario where we already know which SQL statement cased the transient performance issue. Although this SQL statement usually runs in an acceptable time range, we need to identify which specific execution is an outlier.
Here’s the query to find the outlier execution:
-- This query retrieves and calculates the execution time of a specific SQL statement on the current date
-- Dated: July 2024
-- Author: Yuan Yao
select sql_id,
sql_exec_id,
round((EXTRACT(DAY FROM (max(sample_time) - min(sql_exec_start))) * 86400) +
(EXTRACT(HOUR FROM (max(sample_time) - min(sql_exec_start))) * 3600) +
(EXTRACT(MINUTE FROM (max(sample_time) - min(sql_exec_start))) * 60) +
(EXTRACT(SECOND FROM (max(sample_time) - min(sql_exec_start)))), 2) as run_seconds,
to_char(min(sql_exec_start), 'YYYY-Mon-DD HH24:MI:SS') as exec_start,
to_char(max(sample_time), 'YYYY-Mon-DD HH24:MI:SS') as end_time
from dba_hist_active_sess_history
where sql_id = 'as2dr3ag24gay'
and trunc(sample_time) = trunc(sysdate)
and sql_exec_id > 0
group by sql_id, sql_exec_id
order by run_seconds desc;
The output is:
SQL_ID SQL_EXEC_ID RUN_SECONDS EXEC_START END_TIME
as2dr3ag24gay 16777224 45.12 2024-Jul-10 13:03:12 2024-Jul-10 13:03:57
as2dr3ag24gay 16777225 3.45 2024-Jul-10 13:06:14 2024-Jul-10 13:06:17
as2dr3ag24gay 16777226 3.67 2024-Jul-10 13:09:20 2024-Jul-10 13:09:24
as2dr3ag24gay 16777227 3.55 2024-Jul-10 13:12:31 2024-Jul-10 13:12:34
as2dr3ag24gay 16777228 3.60 2024-Jul-10 13:15:45 2024-Jul-10 13:15:48
as2dr3ag24gay 16777229 3.50 2024-Jul-10 13:18:50 2024-Jul-10 13:18:54
as2dr3ag24gay 16777230 3.48 2024-Jul-10 13:21:52 2024-Jul-10 13:21:55
as2dr3ag24gay 16777231 3.43 2024-Jul-10 13:24:58 2024-Jul-10 13:25:01
as2dr3ag24gay 16777232 3.46 2024-Jul-10 13:28:05 2024-Jul-10 13:28:08
as2dr3ag24gay 16777233 3.49 2024-Jul-10 13:31:10 2024-Jul-10 13:31:13
From the output, we can see that the execution of the SQL statement at “2024-Jul-10 13:03:12” is the outlier, with a runtime of 45.12. This execution is the likely cause of the transient performance issue.





Leave a comment