Fork Error on Solaris SPARC/X86

On Solaris, fork errors are common when the system is not tuned for its workload. Sometimes a poorly configured application exhausts a resource, and the system can no longer allocate what a new process needs. The root cause is usually a design in which "system load" and "system configuration" are not well balanced. For example, if system memory is full, the kernel cannot create a new process or thread until enough memory is free.

Fork error messages are logged in the /var/adm/messages file.

Possible causes of fork errors:

1. /tmp may be full
2. Swap space may be full
3. The system has reached the maximum NLWP (number of lightweight processes)
4. Insufficient file descriptors (ulimit -a)

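1.Check whether /tmp is full
If the shell itself cannot fork, prefix the command with exec. The command then replaces the current shell instead of starting a new process, so the shell session ends when the command exits.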
bash-3.00$ df -h /tmp
bash: fork: Resource temporarily unavailable

# exec df -h /tmp
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/rootvg-tmplv 3.9G 144M 3.6G 4% /tmp
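For routine monitoring, before the system is too starved to fork at all, the same check can be scripted. The following is a minimal sketch, not a definitive tool: the function names fs_use_pct and fs_is_full and the 90% threshold are illustrative assumptions, and it assumes the use percentage is the fifth column of a single-line df entry, as in the output above.

```shell
# Read df output on stdin and print the use percentage of each
# file system line as a bare number (the header line is skipped).
# Adding 0 makes awk drop the trailing "%" sign.
fs_use_pct() {
    awk 'NR > 1 { print $5 + 0 }'
}

# Succeed (exit status 0) when the use percentage read from stdin
# is at or above the limit given as $1 (hypothetical default: 90).
fs_is_full() {
    limit=${1:-90}
    pct=$(fs_use_pct)
    [ -n "$pct" ] && [ "$pct" -ge "$limit" ]
}

# Example use (not run here):
#   df -h /tmp | fs_is_full 90 && echo "/tmp is almost full"
```

Because the pipeline itself needs to fork, this only helps as an early warning, for example from cron.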
2.Check swap space
If the system does not have enough room to create a new process, it logs fork errors in /var/adm/messages. Check swap usage with the prstat, top, or swap -s commands, and find out which processes are consuming the most virtual memory.
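The summary line printed by swap -s can be turned into a single percentage. This is a rough sketch: the function name swap_used_pct is made up for illustration, and it assumes the usual one-line format "total: Nk bytes allocated + Nk reserved = Nk used, Nk available", where the used and available figures are the 9th and 11th fields.

```shell
# Read one line of `swap -s` output on stdin and print the share of
# virtual swap in use, as a whole-number percentage.
# Field 9 is "<n>k" used, field 11 is "<n>k" available; adding 0
# makes awk ignore the trailing "k".
swap_used_pct() {
    awk '/used/ {
        used  = $9 + 0
        avail = $11 + 0
        if (used + avail > 0)
            printf "%d\n", used * 100 / (used + avail)
    }'
}

# Example use (not run here):
#   swap -s | swap_used_pct
```

To see which processes are the largest, prstat can sort by virtual size with prstat -s size.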
3.Checking NLWP
Find the current NLWP usage (NLWP = number of lightweight processes):
# ps -ef -o nlwp|grep -v NLWP|awk '{ sum += $1 } END { print sum}'
If you have Solaris zones, it is better to get the NLWP count per zone.

Script to find the NLWP per zone (zoneadm list without options prints only the names of running zones, so zlogin is not attempted on halted ones):

# for i in `zoneadm list | grep -v '^global$'`; do echo $i; zlogin $i ps -ef -o nlwp|grep -v NLWP|awk '{sum += $1} END {print sum}'; done
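The per-zone totals can also be collected in one pass from the global zone, without a zlogin per zone. The sketch below assumes that ps accepts the zone output format and that ps -e in the global zone lists the processes of all zones; the function name nlwp_by_zone is illustrative.

```shell
# Read two-column "ZONE NLWP" output on stdin (header on line 1)
# and print one "zone total" line per zone.
nlwp_by_zone() {
    awk 'NR > 1 { sum[$1] += $2 }
         END    { for (z in sum) print z, sum[z] }'
}

# Example use from the global zone (not run here):
#   ps -e -o zone,nlwp | nlwp_by_zone
```

The order of the output lines is not defined, so pipe the result through sort if a stable listing is needed.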

To increase the NLWP,
When the limit is reached, the system is not able to create new threads. You may also see allocation failures in segkp, the memory segment used for allocating thread stacks. It has a default size of 2 GB, which is sufficient for about 64k threads. If a system needs to run more threads, the size of segkp has to be tuned. This is done with an /etc/system entry, which takes effect after a reboot:
set segkpsize=0x80000
The above value doubles the size of segkp to 4 GB, so about 128k threads can be started. Larger systems should have at least 4 GB of segkp to run more threads.
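The arithmetic behind the value can be sketched as follows, assuming segkpsize is expressed in pages (as the Solaris tunable parameters documentation describes it). Under that assumption 0x80000 equals 4 GB only with an 8 KB page size, as on SPARC; with 4 KB pages the same size needs twice as many pages. The function name segkp_pages is made up for illustration.

```shell
# Print the segkpsize value (in pages, as hex) for a segkp of the
# given size. $1 = size in gigabytes, $2 = page size in bytes.
segkp_pages() {
    gbytes=$1
    pagesize=$2
    printf '0x%x\n' $(( gbytes * 1024 * 1024 * 1024 / pagesize ))
}

# Example use (not run here); pagesize(1) reports the page size:
#   echo "set segkpsize=$(segkp_pages 4 $(pagesize))"
```

With these inputs, 4 GB and 8192-byte pages give 0x80000, which matches the entry above, while 4096-byte pages give 0x100000.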

The NLWP count can be restricted at the zone level with the zone.max-lwps parameter, or per project with project.max-lwps.

kstat provides failure counters for segkp, which confirm whether segkpsize needs to be increased.

bash-3.00# kstat -p |grep segkp|grep fail
unix:0:segkp_12288:alloc_fail 0
unix:0:segkp_16384:alloc_fail 0
unix:0:segkp_20480:alloc_fail 0
unix:0:segkp_4096:alloc_fail 0
unix:0:segkp_8192:alloc_fail 0
vmem:37:segkp:fail 0
vmem:37:segkp:populate_fail 0

In the output above there are no allocation failures, so there is no need to increase the segkpsize value. More information is available with the command below.
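Rather than reading every counter by eye, the check can be reduced to "print only the counters that are not zero". This is a small sketch with an illustrative function name, segkp_failures; it assumes the two-column name and value layout of kstat -p shown above.

```shell
# Read `kstat -p` output on stdin and print every segkp failure
# counter (statistic name ending in "fail") whose value is above 0.
# No output means no allocation failures were recorded.
segkp_failures() {
    awk '$1 ~ /segkp/ && $1 ~ /fail$/ && $2 + 0 > 0 { print $1, $2 }'
}

# Example use (not run here):
#   kstat -p | segkp_failures
```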

bash-3.00#  kstat -p -n segkp
vmem:37:segkp:alloc 6270
vmem:37:segkp:class vmem
vmem:37:segkp:contains 0
vmem:37:segkp:contains_search 0
vmem:37:segkp:crtime 0
vmem:37:segkp:fail 0
vmem:37:segkp:free 5348
vmem:37:segkp:lookup 331
vmem:37:segkp:mem_import 0
vmem:37:segkp:mem_inuse 25235456
vmem:37:segkp:mem_total 780140544
vmem:37:segkp:populate_fail 0
vmem:37:segkp:populate_wait 0
vmem:37:segkp:search 133339
vmem:37:segkp:snaptime 713462.654771966
vmem:37:segkp:vmem_source 0
vmem:37:segkp:wait 0
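One way to read this output is to compare mem_inuse with mem_total. The sketch below does that; treating the ratio as "how full segkp is" is an interpretation on my part rather than something the kstat output states, and the function name segkp_inuse_pct is illustrative.

```shell
# Read `kstat -p -n segkp` output on stdin and print mem_inuse as a
# percentage of mem_total, with one decimal place.
segkp_inuse_pct() {
    awk '$1 ~ /:mem_inuse$/ { inuse = $2 }
         $1 ~ /:mem_total$/ { total = $2 }
         END { if (total > 0) printf "%.1f\n", inuse * 100 / total }'
}

# Example use (not run here):
#   kstat -p -n segkp | segkp_inuse_pct
```

For the sample above (25235456 of 780140544 bytes) this prints 3.2.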


4.Checking file descriptors
If a process reaches its maximum number of file descriptors, you will get fork errors even though other system resources are free. It is better to set the file descriptor limit according to the applications installed on your system.

To count the open file descriptors of a process:
# ls /proc/$PID/fd | wc -l
To check the soft limit for FD:
# ulimit -Sn
256
To check the hard limit for FD:
# ulimit -Hn
65536
Increase the FD limit according to the requirement.
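The count and the limit can be put side by side to see how close a process is to its ceiling. This is a minimal sketch with made-up function names (fd_count, fd_report). Note that ulimit -Sn reports the limit of the current shell, which is not necessarily the limit of the process being inspected.

```shell
# Print the number of open file descriptors of the process whose
# PID is given as $1, by counting the entries in /proc/<pid>/fd.
fd_count() {
    ls "/proc/$1/fd" | wc -l | awk '{ print $1 + 0 }'
}

# Print a one-line summary. $1 = descriptors in use, $2 = limit.
fd_report() {
    count=$1
    limit=$2
    echo "$count of $limit descriptors in use ($(( count * 100 / limit ))%)"
}

# Example use (not run here), with PID set to the target process:
#   fd_report "$(fd_count $PID)" "$(ulimit -Sn)"
```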

How to increase the file descriptor limit using the ulimit command?
To increase the soft limit,

bash-3.00# ulimit -S -n 8192
bash-3.00# ulimit -Sn
8192

To increase the hard limit (raising it requires root),

bash-3.00# ulimit -H -n 65540
bash-3.00# ulimit -Hn
65540
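A limit set with ulimit applies to the current shell and to the processes started from it, so it is usually set in the script that launches the application. The sketch below sets the soft limit to a requested value but never above the hard limit; the function name set_soft_fd is an illustrative assumption.

```shell
# Set the soft file descriptor limit of the current shell to $1,
# capped at the hard limit, then print the resulting soft limit.
set_soft_fd() {
    want=$1
    hard=$(ulimit -Hn)
    # An unprivileged process cannot raise the soft limit past the
    # hard limit, so cap the request instead of failing.
    if [ "$hard" != "unlimited" ] && [ "$want" -gt "$hard" ]; then
        want=$hard
    fi
    ulimit -Sn "$want" && ulimit -Sn
}

# Example use in an application start script (not run here):
#   set_soft_fd 8192
```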


You can also use prctl to view and modify this value.
To view,

bash-3.00# prctl -n process.max-file-descriptor  -i process $$
process: 2332: bash
NAME PRIVILEGE VALUE FLAG ACTION RECIPIENT
process.max-file-descriptor
basic 8.19K - deny 2332
privileged 65.5K - deny -
system 2.15G max deny -


To modify the soft limit,

bash-3.00# prctl -n process.max-file-descriptor -t basic -v  4096 -r -i process $$
bash-3.00# prctl -n process.max-file-descriptor -i process $$
process: 2332: bash
NAME PRIVILEGE VALUE FLAG ACTION RECIPIENT
process.max-file-descriptor
basic 4.10K - deny 2332
privileged 65.5K - deny -
system 2.15G max deny -

To modify the hard limit,

bash-3.00# prctl -n process.max-file-descriptor -t privileged -v  32786 -r -i process $$
bash-3.00# prctl -n process.max-file-descriptor -i process $$
process: 2332: bash
NAME PRIVILEGE VALUE FLAG ACTION RECIPIENT
process.max-file-descriptor
basic 4.10K - deny 2332
privileged 32.8K - deny -
system 2.15G max deny -

So in prctl terms, the soft limit is the "basic" value and the hard limit is the "privileged" value.

Thank you for reading this article. Please leave a comment if you have any questions, and I will get back to you as soon as possible.