
How to Find IOWAIT on Solaris?

This article is a follow-up to the I/O bottleneck article. Once you have determined that the system is in I/O wait, you are ready to proceed with this article. I/O wait means the processor is sitting idle while waiting for I/O to complete, either on disk or on the network. In production systems, most applications are configured to write to local disks or LUNs rather than to NFS. So here we will see how to find out which processes are causing the iowait.


A quick wrap-up on determining the iowait. You can use a number of commands to check it.
1. top

root@SOLG:~# top
last pid:  1823;  load avg:  0.01,  0.04,  0.08;  up 140+02:17:37                                                                   20:27:42
96 processes: 95 sleeping, 1 on cpu
CPU states: 93.1% idle,  2.9% user,  3.9% kernel,  0.0% iowait,  0.0% swap
Kernel: 332 ctxsw, 244 trap, 533 intr, 2023 syscall, 241 flt
Memory: 952M phys mem, 107M free mem, 1024M total swap, 1024M free swap

Check the “iowait” field. This value should normally be less than 2 or 3 percent.
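
If you only want that one line without the interactive screen, you can try a batch-mode snapshot. This is a minimal sketch, assuming the bundled top supports batch mode (-b), which most builds of top do:

       top -b | grep "CPU states"     # print only the CPU-states line from a single snapshot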

2. iostat -xn

root@SOLG:~# iostat -xn
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    1.6    1.1   43.3   13.5  0.0  0.1    0.0   43.2   0   3 c8t0d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c7t0d0
    0.0    0.0    0.3    0.0  0.0  0.0    0.0   30.7   0   0 c8t1d0
root@SOLG:~#

Here you need to look at the “wait” field.
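
To spot busy devices quickly on a system with many LUNs, a small filter helps. A minimal sketch, assuming the standard iostat -xn column layout shown above (where %w is column 9 and %b is column 10):

       iostat -xn 5 3 | awk '$9+0 > 0 || $10+0 > 0'     # print only devices with a non-zero %w or %b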

3. sar -d 5 5

root@SOLG:/usr# sar -d 5 5
SunOS SOLG 5.11 11.1 i86pc    11/29/2013
20:38:43   device        %busy   avque   r+w/s  blks/s  avwait  avserv
20:38:48   ata1              0     0.0       0       0     0.0     0.0
           iscsi0            0     0.0       0       0     0.0     0.0
           mpt0              0     0.0       0       0     0.0     0.0
           scsi_vhc          0     0.0       0       0     0.0     0.0
           sd0               0     0.0       0       0     0.0     0.0
           sd0,a             0     0.0       0       0     0.0     0.0
           sd0,b             0     0.0       0       0     0.0     0.0
           sd0,h             0     0.0       0       0     0.0     0.0
           sd0,i             0     0.0       0       0     0.0     0.0
           sd0,q             0     0.0       0       0     0.0     0.0
           sd0,r             0     0.0       0       0     0.0     0.0
           sd1               0     0.0       0       0     0.0     0.0
           sd2               0     0.0       0       0     0.0     0.0
           sd2,a             0     0.0       0       0     0.0     0.0
           sd2,h             0     0.0       0       0     0.0     0.0
           sd2,i             0     0.0       0       0     0.0     0.0
           sd2,q             0     0.0       0       0     0.0     0.0
           sd2,r             0     0.0       0       0     0.0     0.0

Here you need to look at the “avwait” field.
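
On a host with many disk slices, most of the sar -d lines are all zeros. A hedged sketch to show only devices that actually queued I/O, assuming avwait stays the second-to-last column as in the output above:

       sar -d 5 5 | awk 'NF > 2 && $(NF-1)+0 > 0'     # keep only lines with a non-zero avwait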

Now let us come to the main subject: finding which processes are causing the iowait.

There is no direct command or simple loop to determine which processes are causing the iowait, but you can download a small DTrace script that will help with this. The script is designed to highlight the processes that are causing the most disk I/O. This version of psio uses DTrace (Solaris 10 & 11).

Download the script and copy it to /var/tmp. Change its permissions to 700 (accessible only by root).

PSIO10 Download Here

WARNING: psio may use a large amount of memory if long samples are used on busy systems.

Author: Brendan Gregg [Sydney, Australia]
Website: http://www.brendangregg.com/psio.html
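
A minimal setup sketch, assuming the script was saved to /var/tmp under the name psio10:

       cd /var/tmp
       chmod 700 psio10     # restrict the script to root, as suggested above
       ./psio10 -h          # verify it runs and print the usage summary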

1. The script name is psio10. Here is the help command output.

root@SOLG:~# ./psio10 -h
psio10 ver 0.71
USAGE: psio10 [-efhmnx] [seconds]
       psio10 [-efhnx] -i infile
       psio10 -o outfile [seconds]
   eg,
      psio10 5           # 5 second sample
      psio10 -x          # extended output, %I/Ot %I/Os %I/Oc %CPU and %MEM
      psio10 -e          # event listing (raw and verbose)
      psio10 --help      # print full help
root@SOLG:~#


2. I have initiated some I/O using the cp command. Let's see how the DTrace script pulls out the necessary information.

root@SOLG:~# ./psio10 5
     UID   PID  PPID %I/O    STIME TTY      TIME CMD
    root  1888  1780 15.5 22:01:41 pts/2   00:25 cp -R adm apache2 
    root     5     0 11.6   Jul 12 ?       00:30 zpool-rpool
    root     0     0  0.0   Jul 12 ?       00:01 sched
    root     1     0  0.0   Jul 12 ?       00:00 /usr/sbin/init
    root     2     0  0.0   Jul 12 ?       00:00 pageout
    root     3     0  0.0   Jul 12 ?       00:11 fsflush
    root     6     0  0.0   Jul 12 ?       00:06 kmem_task
    root     7     0  0.0   Jul 12 ?       00:00 intrd
    root     8     0  0.0   Jul 12 ?       00:00 vmtasks
    root    11     1  0.0   Jul 12 ?       00:03 /lib/svc/bin/svc.startd
    root    13     1  0.0   Jul 12 ?       00:13 /lib/svc/bin/svc.configd
  netcfg    40     1  0.0   Jul 12 ?       00:00 /lib/inet/netcfgd
    root    46     1  0.0   Jul 12 ?       00:01 /usr/sbin/dlmgmtd
  daemon    69     1  0.0   Jul 12 ?       00:01 /lib/crypto/kcfd
  netadm    92     1  0.0   Jul 12 ?       00:00 /lib/inet/ipmgmtd

That's great. You can see the cp command process at the top, with 15.5% I/O for that process. Ignore the zpool-rpool process here, because zpool is the ZFS process that actually writes the data to disk.
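
If you want to reproduce this test yourself, the idea is simply to start a large copy in the background and sample while it runs. A hedged example, assuming /var/tmp has enough free space (the source and destination paths are only illustrative):

       cp -R /usr/share /var/tmp/iotest &     # generate some disk I/O in the background
       ./psio10 5                             # take a 5-second sample while the copy runs
       rm -rf /var/tmp/iotest                 # clean up the test data afterwards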

3. For extended output, use "-x".

root@SOLG:~# ./psio10 -x
     UID   PID %CPU %I/Ot %I/Os %I/Oc %MEM S   TIME CMD
    root     0  0.0   0.0   0.0   0.0  0.0 T  00:01 sched
    root     1  0.0   0.0   0.0   0.0  0.1 S  00:00 /usr/sbin/init
    root     2  0.0   0.0   0.0   0.0  0.0 S  00:00 pageout
    root     3  0.1   0.0   0.0   0.0  0.0 S  00:11 fsflush
    root     5  0.4   0.0   0.0   0.0  0.0 S  00:31 zpool-rpool
    root     6  0.0   0.0   0.0   0.0  0.0 S  00:06 kmem_task
    root     7  0.0   0.0   0.0   0.0  0.0 S  00:00 intrd
    root     8  0.0   0.0   0.0   0.0  0.0 S  00:00 vmtasks
    root    11  0.0   0.0   0.0   0.0  0.4 S  00:03 /lib/svc/bin/svc.startd
    root    13  0.0   0.0   0.0   0.0  1.4 S  00:13 /lib/svc/bin/svc.configd
  netcfg    40  0.0   0.0   0.0   0.0  0.1 S  00:00 /lib/inet/netcfgd
    root    46  0.0   0.0   0.0   0.0  0.1 S  00:01 /usr/sbin/dlmgmtd
  daemon    69  0.0   0.0   0.0   0.0  0.4 S  00:02 /lib/crypto/kcfd


4. Understanding the fields.

FIELDS:
        %I/O    %I/O by time taken - duration of disk operation over
                available time (most useful field)
        %I/Ot   same as above
        %I/Os   %I/O by size - number of bytes transferred over total bytes
                in sample
        %I/Oc   %I/O by count - number of operations over total operations
                in sample
        IOTIME  Time taken for I/O (ms)
        IOSIZE  Size of I/O (bytes)
        IOCOUNT Count of I/O (number of operations)
        DEVICE  Device number or mount point name, eg "/var".
        BLOCK   Block address on disk
        INODE   Inode number

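If you prefer the raw numbers (IOTIME, IOSIZE and IOCOUNT) over the percentage view, the -n option listed in the next section can be combined with a sample interval. A small sketch:

       ./psio10 -n 5     # 5-second sample with numeric Time(ms), Size(bytes) and Count columns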

5. Here is some more information about psio.

        psio10             # default "ps -ef" style output, 1 second sample
        psio10 5           # sample for 5 seconds
        psio10 -e          # event listing (raw and verbose)
        psio10 -f          # full device output, print lines per device
        psio10 -h          # print usage
        psio10 --help      # print full help
        psio10 -i infile   # read from infile (a psio dump)
        psio10 -n          # numbered output, Time(ms) Size(bytes) and Count
        psio10 -o outfile  # write to outfile (create a psio dump)
        psio10 -s          # reduced output, PID, %I/O and CMD only
        psio10 -x          # extended output, %I/Ot %I/Os %I/Oc %CPU and %MEM

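The -o and -i options above are handy when you want to capture a busy window once and analyse it later. A hedged sketch (the dump file name is only an example):

       ./psio10 -o /var/tmp/psio.dump 30     # capture a 30-second sample into a dump file
       ./psio10 -x -i /var/tmp/psio.dump     # re-read the same dump with extended output
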
There is no doubt that psio10 will make our job easier. I hope you will use it as well.

If you are running Solaris 11, you can use a built-in DTraceToolkit script (rwtop) to find which processes are generating the most I/O.

root@UAAIS:~# /usr/dtrace/DTT/rwtop -Z 10 10
Tracing... Please wait.
2013 Dec 26 09:34:16,  load: 2.24,  app_r:      0 KB,  app_w:      0 KB

 ZONE    PID   PPID CMD              D            BYTES
    0    104      1 in.mpathd        R                1
    0    104      1 in.mpathd        W                1
    0   8167   8166 sshd             R               25
    0   8167   8166 sshd             W               68
2013 Dec 26 09:34:26,  load: 1.93,  app_r:      4 KB,  app_w:      0 KB

 ZONE    PID   PPID CMD              D            BYTES
    0   8167   8166 sshd             R              369
    0   8167   8166 sshd             W              420
    0   1169      1 vmtoolsd         R             4314
2013 Dec 26 09:34:36,  load: 1.64,  app_r:      0 KB,  app_w:      0 KB

 ZONE    PID   PPID CMD              D            BYTES
    0    104      1 in.mpathd        R                1
    0    104      1 in.mpathd        W                1
    0    179      1 utmpd            R                4
    0   8167   8166 sshd             R              312
    0   8167   8166 sshd             W              356
2013 Dec 26 09:34:46,  load: 1.39,  app_r:      0 KB,  app_w:      0 KB

 ZONE    PID   PPID CMD              D            BYTES
    0   8167   8166 sshd             R              426
    0   8167   8166 sshd             W              468
2013 Dec 26 09:34:56,  load: 1.27,  app_r:      4 KB,  app_w:      0 KB

 ZONE    PID   PPID CMD              D            BYTES
    0    104      1 in.mpathd        R                1
    0    104      1 in.mpathd        W                1
    0   8167   8166 sshd             R              255
    0   8167   8166 sshd             W              308
    0   1476      1 gnome-settings-d R              381
    0   1169      1 vmtoolsd         R             4314
^C
root@UAAIS:~#
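
rwtop is one of the DTraceToolkit scripts that Solaris 11 bundles under /usr/dtrace/DTT, so it is worth seeing what else ships alongside it; a quick, hedged look:

       ls /usr/dtrace/DTT     # list the other bundled DTrace scripts (iosnoop and iotop are typically here too)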



Thank you for visiting UnixArena. Please leave a comment if you have any issues.
