Solaris memory management – Performance issue

Memory bottlenecks show up as two different things happening on the system: paging and swapping. Paging refers to pages of memory being reclaimed by the page daemon when the system starts to get low on free memory. Swapping is more extreme, and refers to entire processes being swapped out.

To determine whether you are only paging or also swapping, examine two columns in the vmstat output. The first is the sr (scan rate) column. If the value in this column is greater than zero, the page scanner is scanning memory pages to put them back on the free list for reuse. The page scanner runs when free memory falls below the value of a system parameter known as lotsfree (default: 1/64th of physical memory), or below cachefree if priority_paging is enabled (default: twice lotsfree).
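As a rough illustration of the lotsfree default mentioned above, the sketch below computes 1/64th of a given amount of physical memory. The helper name lotsfree_default and the 256 GB example size are assumptions made for illustration, not values from a real system. On a live Solaris host the actual threshold (in pages) can usually be read with kstat -p unix:0:system_pages:lotsfree.

```shell
# Sketch: default page-scanner threshold, assuming lotsfree = physical memory / 64.
# lotsfree_default is a hypothetical helper name, not a Solaris command.
lotsfree_default() {
  # $1 = physical memory in any whole unit (MB here); result is in the same unit
  echo $(( $1 / 64 ))
}

# Hypothetical server with 256 GB (262144 MB) of RAM
lotsfree_default 262144   # prints 4096, so scanning starts below about 4 GB free
```

If the value has been tuned in /etc/system, the computed default will not match the running system, so treat the result only as a starting point.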

You should not worry about high scan rates if you are using the file system heavily; high scan rates can be normal in many circumstances. If priority_paging is enabled, the page scanner steals pages more effectively, so file system I/O does not cause unnecessary paging of applications. priority_paging makes the scan rate higher by design. Solaris 8 introduces the cyclic page cache. With the cyclic cache, the scanner is not used to reclaim pages during file system I/O, so an sr value greater than 0 is an indication that the system is running low on memory.

To see if the system is swapping, refer to the w column. It is the third column of the output, and counts entire processes that are swapped out. You can determine which processes these are by running the command '/usr/bin/ps -e -o pid,rss,args' and looking for an RSS of 0 (the sched, pageout and fsflush processes should always have an RSS of 0).
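The RSS check described above can be sketched as a small filter. The helper name swapped_out and the sample process list are made up for illustration. The awk one-liner assumes a POSIX awk (on Solaris, nawk or /usr/xpg4/bin/awk rather than the old /usr/bin/awk).

```shell
# Sketch: list processes that report an RSS of 0, skipping sched, pageout and
# fsflush. swapped_out is a hypothetical helper name.
swapped_out() {
  # Input: output of "ps -e -o pid,rss,args" (the first line is the header)
  awk 'NR > 1 && $2 == 0 && $3 !~ /^(sched|pageout|fsflush)$/ { print $1, $3 }'
}

# On a live system: /usr/bin/ps -e -o pid,rss,args | swapped_out
# Demonstration on made-up sample data:
swapped_out <<'EOF'
  PID  RSS COMMAND
    0    0 sched
    2    0 pageout
    3    0 fsflush
  412    0 /usr/lib/example-daemon -f
  500 2048 /usr/sbin/nscd
EOF
# prints: 412 /usr/lib/example-daemon
```

Newer releases may run additional kernel processes that also report an RSS of 0, so the exclusion list may need extending on your system.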

If you have anything in the w column, the system is either low on memory right now, or has been in the past. If your system gets low on memory and processes are swapped out, it may take a long time for them to get back into memory.

This is especially true if there are daemons running infrequently, because they have to receive an event in order to try to run again. This is not necessarily bad, as long as when they need to run, they will have the memory to do so.
If, over time, you see swapping, you should probably consider adding memory to the system or devising a strategy to lower overall memory usage on the system.

                                                      Credit to https://support.oracle.com 

1. Check for degraded memory on the system using the command below.

# fmadm faulty

Log in to the console and run the showstatus command (for example, if the server is an M-series system).

XSCF> showstatus
CMU#1 Status:Normal;
* MEM#11B Status:Degraded;

If you see a degraded part, raise a support case with Oracle to replace the unit.
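When the showstatus output has been captured to a file or a pipe, a filter along the following lines can pick out the units that are not Normal. This is only a sketch: degraded_units is a hypothetical helper name, and it assumes the "Status:...;" line format shown in the sample above and an awk that supports sub() (nawk or /usr/xpg4/bin/awk on Solaris).

```shell
# Sketch: keep only units whose status is not Normal.
# degraded_units is a hypothetical helper name; the input is showstatus output
# that has already been captured from the XSCF console.
degraded_units() {
  # Strip the leading "*" marker and indentation before printing
  awk '/Status:/ && !/Status:Normal;/ { sub(/^[* ]+/, ""); print }'
}

degraded_units <<'EOF'
CMU#1 Status:Normal;
* MEM#11B Status:Degraded;
EOF
# prints: MEM#11B Status:Degraded;
```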

2. Run the command below to find the scan rate and system swap activity.

# vmstat 5 5
kthr       memory                  page               disk          faults         cpu
r b w     swap     free  re   mf  pi  po  fr de sr m1 m1 m1 m2    in    sy    cs us sy id
5 0 0 91618488 83691688 497 1634 263 145 143  0  0  4  4  4 81 35121 55605 42293  6  2 93
2 0 0 60244800 52376544 448 3410   5  17  15  0  0  0  0  0  0 55068 72720 62179 15  3 81
9 0 0 60082296 52388200 519 2080   2  39  39  0  0  0  0  0  0 54136 82583 60776 19  3 78
2 0 0 60236480 52381312 934 3421   8  82  82  0  0  0  0  0  0 57312 81429 64225 20  4 76
2 0 0 60245224 52382864 624 2833   0  68  68  0  0  0  0  0  0 53064 84185 58746 18  3 79
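Reading the sr and w columns by eye gets tedious over long runs, so the check can be sketched as a filter that counts samples with a non-zero value in either column. The helper name vmstat_pressure is an assumption made for illustration. It locates the columns by their header names rather than by position, since the number of disk columns varies between systems. Keep in mind that the first vmstat sample is an average since boot.

```shell
# Sketch: count vmstat samples that show scanning (sr > 0) or swapped-out
# processes (w > 0). vmstat_pressure is a hypothetical helper name.
vmstat_pressure() {
  awk '
    # Column-name line: remember where w and sr are.
    $1 == "r" && $2 == "b" {
      for (i = 1; i <= NF; i++) {
        if ($i == "w")  wcol = i
        if ($i == "sr") scol = i
      }
      next
    }
    # Data lines start with a number and come after the column-name line.
    scol && $1 ~ /^[0-9]+$/ {
      n++
      if ($scol > 0) scanning++
      if ($wcol > 0) swapped++
    }
    END { printf "samples=%d scanning=%d swapped=%d\n", n, scanning, swapped }'
}

# On a live system: vmstat 5 5 | vmstat_pressure
# Demonstration on the first two samples from the output above:
vmstat_pressure <<'EOF'
kthr memory page disk faults cpu
r b w swap free re mf pi po fr de sr m1 m1 m1 m2 in sy cs us sy id
5 0 0 91618488 83691688 497 1634 263 145 143 0 0 4 4 4 81 35121 55605 42293 6 2 93
2 0 0 60244800 52376544 448 3410 5 17 15 0 0 0 0 0 0 55068 72720 62179 15 3 81
EOF
# prints: samples=2 scanning=0 swapped=0
```

A result with scanning and swapped both at 0, as here, suggests the system is not short of memory during the sampled interval.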

3. The command below helps you identify which local zones are using the most memory.

[root@Arena~]# prstat -s size -Z
   PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP
 21092 dbusr      17G  999M sleep    2    0   0:00:15 0.0% disp+work/1
 13224 dbusr      17G 1082M sleep   24    0   0:00:51 0.0% disp+work/1
 13162 dbusr      17G 1039M sleep   53    0   0:03:57 0.0% disp+work/1
 13199 dbusr      17G 1038M sleep   54    0   0:04:14 0.0% disp+work/1
 13174 dbusr      17G  957M sleep   54    0   0:02:10 0.0% disp+work/1
 13155 dbusr      17G 1020M sleep   38    0   0:03:16 0.0% disp+work/1
 13172 dbusr      17G 1034M sleep   49    0   0:04:10 0.0% disp+work/1
 13169 dbusr      17G  996M sleep   51    0   0:02:17 0.0% disp+work/1
 13888 dbusr      17G  986M sleep   59    0   0:00:16 0.0% disp+work/1
 10877 dbusr      17G 1027M sleep   47    0   0:00:12 0.0% disp+work/1
ZONEID    NPROC  SWAP   RSS MEMORY      TIME  CPU ZONE
    21      644   88G   44G    17% 357:55:56  15% arenaz1
     1      101  186G   60G    24%  76:14:21 0.6% arenaz2
     2       75   25G   16G   6.3%  95:16:41 0.9% arenaz3
    11       70   15G   13G   5.1%  20:18:25 1.4% arenaz4
     3       47   14G 7809M   3.0%  69:54:52 0.1% arenaz5
As per the above output, arenaz2 is consuming the most physical memory (24%).
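Picking the heaviest zone out of the summary can also be scripted. The sketch below is one possible approach, with top_zone as a hypothetical helper name. It assumes the eight-column zone summary layout shown above and an awk that supports sub() (nawk or /usr/xpg4/bin/awk on Solaris).

```shell
# Sketch: report the zone with the highest MEMORY percentage in prstat -Z output.
# top_zone is a hypothetical helper name.
top_zone() {
  awk '
    $1 == "ZONEID" { inzones = 1; next }        # the zone summary starts here
    inzones && NF == 8 && $1 ~ /^[0-9]+$/ {
      pct = $5
      sub(/%$/, "", pct)                        # strip the trailing percent sign
      if (pct + 0 > max) { max = pct + 0; zone = $8; rss = $4 }
    }
    END { if (zone != "") print zone, rss, max "%" }'
}

# On a live system: prstat -s size -Z 1 1 | top_zone
# Demonstration on the zone summary from the output above:
top_zone <<'EOF'
ZONEID NPROC SWAP RSS MEMORY TIME CPU ZONE
21 644 88G 44G 17% 357:55:56 15% arenaz1
1 101 186G 60G 24% 76:14:21 0.6% arenaz2
2 75 25G 16G 6.3% 95:16:41 0.9% arenaz3
EOF
# prints: arenaz2 60G 24%
```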


4. Once you find the problem local zone, log in to that zone and run the command below to determine which user's processes are holding the most memory.

root@Arenaz1 ~$ prstat -t -s size -c 1 1
 NPROC USERNAME  SWAP   RSS MEMORY      TIME  CPU
   854 dbusr1     18G   16G   6.1% 246:40:45 0.2%
   310 mdbusr    132G   34G    13% 679:57:37 1.1%
   103 mdbusr2    20G 9613M   3.7%  53:41:06 0.1%
    61 mdbusr3  2114M 2209M   0.8%   7:20:57 0.0%
    46 adm2       11G 7839M   3.0%  59:11:41 0.2%
    63 mntg2    1470M 1368M   0.5%  12:36:33 0.0%
     3 dbusr2   2331M 2588M   1.0%   0:08:50 0.0%
    23 mdadm    1825M 1576M   0.6%  38:56:13 0.1%
As per the above command output, user mdbusr is consuming the most memory.
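Because prstat mixes M and G suffixes in the RSS column, comparing users by eye is error-prone. The sketch below converts each RSS value to megabytes before picking the largest. The helper name top_user_by_rss is hypothetical, and the script assumes the seven-column per-user layout shown above and an awk with user-defined functions (nawk or /usr/xpg4/bin/awk on Solaris).

```shell
# Sketch: find the user with the largest RSS in "prstat -t" output.
# top_user_by_rss is a hypothetical helper name.
top_user_by_rss() {
  awk '
    # Convert a prstat size such as 9613M or 16G to megabytes.
    function mb(s,    n, u) {
      u = substr(s, length(s))
      n = substr(s, 1, length(s) - 1) + 0
      if (u == "K") return n / 1024
      if (u == "M") return n
      if (u == "G") return n * 1024
      if (u == "T") return n * 1024 * 1024
      return s + 0
    }
    $1 == "NPROC" { seen = 1; next }
    seen && NF == 7 && $1 ~ /^[0-9]+$/ {
      r = mb($4)
      if (r > max) { max = r; user = $2 }
    }
    END { if (user != "") printf "%s %d MB\n", user, max }'
}

# On a live system: prstat -t -s size -c 1 1 | top_user_by_rss
# Demonstration on three rows from the output above:
top_user_by_rss <<'EOF'
NPROC USERNAME SWAP RSS MEMORY TIME CPU
854 dbusr1 18G 16G 6.1% 246:40:45 0.2%
310 mdbusr 132G 34G 13% 679:57:37 1.1%
103 mdbusr2 20G 9613M 3.7% 53:41:06 0.1%
EOF
# prints: mdbusr 34816 MB
```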

5. On database servers, IPC resources such as shared memory segments and semaphores sometimes hold most of the physical memory. Please see the link below for more information.

6. If you need to monitor memory throughout the day, it is better to enable sar on the system. Please check the link below for more information about sar.
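Once sar is collecting data, the day's memory low point can be pulled out of the sar -r report. The sketch below is an illustration only: lowest_freemem is a hypothetical helper name, the sample figures are made up, and it assumes the Solaris sar -r layout of a timestamp followed by freemem and freeswap. On Solaris 10, collection is typically enabled with svcadm enable system/sar together with the sa1 and sa2 entries in the sys crontab, but verify this on your release.

```shell
# Sketch: find the sample with the lowest freemem in "sar -r" output.
# lowest_freemem is a hypothetical helper name; the figures below are made up.
lowest_freemem() {
  awk '
    # Data lines look like: HH:MM:SS freemem freeswap
    $1 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ && $2 ~ /^[0-9]+$/ {
      if (!seen || $2 + 0 < min) { min = $2 + 0; at = $1; seen = 1 }
    }
    END { if (seen) print at, min }'
}

# On a live system: sar -r | lowest_freemem
# Demonstration on made-up sample data:
lowest_freemem <<'EOF'
00:00:00 freemem freeswap
01:00:00  812345 40112233
02:00:00  120456 39887766
03:00:00  640987 40001122
EOF
# prints: 02:00:00 120456
```

The freemem figure is reported in pages, so multiply it by the output of the pagesize command to get bytes.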

Thank you for reading this article.

Please leave a comment if you have any questions, and I will get back to you as soon as possible.