Pages

Monday, March 7, 2016

The simplest query for checking what’s happening in a database

 


When someone asks you to take a quick look into database performance and for whatever reason you can’t run your usual scripts or performance tools on there, ), then what query would you run first?
Yeah sometimes I’ve been not allowed to run custom scripts nor even touch the keyboard due security policies in effect.
Whenever you’re in such situation you want the command to be both short and effective for showing the database state.
The simplest query for determining database state performance wise would be this:
SQL> select event, state, count(*) from v$session_wait group by event, state order by 3 desc;

EVENT                                                            STATE                 COUNT(*)
---------------------------------------------------------------- ------------------- ----------
rdbms ipc message                                                WAITING                      9
SQL*Net message from client                                      WAITING                      8
log file sync                                                    WAITING                      6
gcs remote message                                               WAITING                      2
PL/SQL lock timer                                                WAITING                      2
PL/SQL lock timer                                                WAITED KNOWN TIME            2
Streams AQ: qmn coordinator idle wait                            WAITING                      1
smon timer                                                       WAITING                      1
log file parallel write                                          WAITING                      1
ges remote message                                               WAITING                      1
SQL*Net message to client                                        WAITED SHORT TIME            1
DIAG idle wait                                                   WAITING                      1
pmon timer                                                       WAITING                      1
db file sequential read                                          WAITING                      1
Streams AQ: waiting for messages in the queue                    WAITING                      1
rdbms ipc message                                                WAITED KNOWN TIME            1
jobq slave wait                                                  WAITING                      1
Streams AQ: qmn slave idle wait                                  WAITING                      1
Streams AQ: waiting for time management or cleanup tasks         WAITING                      1

19 rows selected.

It uses the Oracle wait interface to report what all database sessions are currently doing wait/CPU usage wise. Whenever there’s a systemic issue (like extremely slow log file writes) this query will give good hint towards the cause of problem. Of course just running couple of queries against wait interface doesn’t give you the full picture (as these kinds of database wide “healthchecks” can be misleading as we should be really measuring end user response time breakdown at session level and asking questions like what throughput/response time do you normally get) but nevertheless, if you want to see an instance sessions state overview, this is the simplest query I know.
Interpreting this query output should be combined with reading some OS performance tool output (like vmstat or perfmon), in order to determine whether the problem is induced by CPU overload. For example, if someone is running a parallel backup compression job on the server which is eating all CPU time, some of these waits may be just a side-effect of CPU overload).
Below is a cosmetically enhanced version of this command, as one thing I decode the “WAITED FOR xyz TIME” wait states to “WORKING” and “On CPU / runqueue” as event name as otherwise it’s easy to miss by accident that some sessions are not actually waiting on previous event anymore:

SQL> select
  2     count(*),
  3     CASE WHEN state != 'WAITING' THEN 'WORKING'
  4          ELSE 'WAITING'
  5     END AS state,
  6     CASE WHEN state != 'WAITING' THEN 'On CPU / runqueue'
  7          ELSE event
  8     END AS sw_event
  9  FROM
 10     v$session_wait
 11  GROUP BY
 12     CASE WHEN state != 'WAITING' THEN 'WORKING'
 13          ELSE 'WAITING'
 14     END,
 15     CASE WHEN state != 'WAITING' THEN 'On CPU / runqueue'
 16          ELSE event
 17     END
 18  ORDER BY
 19     1 DESC, 2 DESC
 20  /

  COUNT(*) STATE   EVENT
---------- ------- ----------------------------------------
        11 WAITING log file sync
         9 WAITING rdbms ipc message
         4 WAITING SQL*Net message from client
         3 WAITING PL/SQL lock timer
         2 WORKING On CPU / runqueue
         2 WAITING gcs remote message
         1 WAITING ges remote message
         1 WAITING pmon timer
         1 WAITING smon timer
         1 WAITING jobq slave wait
         1 WAITING Streams AQ: waiting for messages in the
         1 WAITING DIAG idle wait
         1 WAITING Streams AQ: qmn slave idle wait
         1 WAITING Streams AQ: waiting for time management
         1 WAITING db file sequential read
         1 WAITING log file parallel write
         1 WAITING Streams AQ: qmn coordinator idle wait

17 rows selected.

SQL>

Also, sometimes you might want to exclude the background processes and idle sessions from the picture:
SQL> select
  2     count(*),
  3     CASE WHEN state != 'WAITING' THEN 'WORKING'
  4          ELSE 'WAITING'
  5     END AS state,
  6     CASE WHEN state != 'WAITING' THEN 'On CPU / runqueue'
  7          ELSE event
  8     END AS sw_event
  9  FROM
 10     v$session
 11  WHERE
 12      type = 'USER'
 13  AND status = 'ACTIVE'
 14  GROUP BY
 15     CASE WHEN state != 'WAITING' THEN 'WORKING'
 16          ELSE 'WAITING'
 17     END,
 18     CASE WHEN state != 'WAITING' THEN 'On CPU / runqueue'
 19          ELSE event
 20     END
 21  ORDER BY
 22     1 DESC, 2 DESC
 23  /

  COUNT(*) STATE   EVENT
---------- ------- ----------------------------------------
         6 WAITING PL/SQL lock timer
         4 WORKING On CPU / runqueue
         3 WAITING db file sequential read
         1 WAITING read by other session
         1 WAITING Streams AQ: waiting for messages in the
         1 WAITING jobq slave wait

6 rows selected.

By the way, the above scripts report quite similar data what ASH is actually using (especially the instance performance graph which shows you the instance wait summary). ASH nicely puts the CPU count of server into the graph as well (that you would be able to put the number of “On CPU” sessions into perspective), so another useful command to run after this script is “show parameter cpu_count” or better yet, check at OS level to be sure :)
Note that you can use similar technique for easily viewing the instance activity from other perspectives/dimensions, like which SQL is being executed:
SQL> select sql_hash_value, count(*) from v$session
  2  where status = 'ACTIVE' group by sql_hash_value order by 2 desc;

SQL_HASH_VALUE   COUNT(*)
-------------- ----------
             0         20
     966758382          8
    2346103937          2
    3393152264          1
    3349907142          1
    2863564559          1
    4030344732          1
    1631089791          1

8 rows selected.

SQL> select sql_text,users_executing from v$sql where hash_value = 966758382;

SQL_TEXT                                                     USERS_EXECUTING
------------------------------------------------------------ ---------------
BEGIN :1 := orderentry.neworder(:2,:3,:4); END;                           10

So while these queries are nothing advanced nor really new, I hope this helps with the question “The whole database is suddenly slow – where to start?”
NB! If you want to move to the "New World" - and benefit from the awesomeness of Hadoop, without having to re-engineer your existing applications - check out Gluent, my new startup that will make history! ;-)

No comments:

Post a Comment