I have seen many candidates who are strong in theory but struggle to answer scenario-based and ITIL process-based questions in Unix interviews. Reading textbooks and memorizing commands is not enough; you need to apply logical thinking to succeed. Be confident about what you know, and do not comment on topics where you are weak. Last but not least, don't be overconfident. Overconfidence can get you rejected, and it calls your listening skills into question. Listening skills play a vital role in Unix administration. Here are some scenario-based questions and answers that will help you in a Unix interview.
If a Unix server is not accessible (that is, you cannot ssh or telnet to it), what will you check first?
1. Ping the server to check whether it responds. If it does not, log in to the console for further troubleshooting.
2. Log in to the hardware console (ILOM or XSCF) and try to access the OS from there (for example, console -d 0 on XSCF). If the server is up, log in and check the IP address flags (ifconfig -a) and router connectivity.
3. If the interface flags show UP and RUNNING and router connectivity is fine, check the ssh/telnet services (for example, svcs -xv).
4. If everything is fine, check the system load (w or uptime).
5. You will see fork errors on the OS console if the system is running out of resources. In that case you may not be able to log in at all until you free up some memory and reduce CPU usage.
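The first-pass checks above can be sketched as a small shell helper. This is only an illustration: triage_host and console_checks are hypothetical names, the ping -c 1 form is the Linux-style syntax (an assumption; on Solaris the usual form is ping host timeout), and netstat -rn is one assumed way to look at the route to the router.

```shell
#!/bin/sh
# Sketch only: the function names are hypothetical, not part of any standard tool.

# Run from a jump host: decide where to troubleshoot next.
triage_host() {
    host=$1
    # Assumption: Linux-style ping; on Solaris use: ping "$host" 5
    if ping -c 1 "$host" >/dev/null 2>&1; then
        echo "ping ok: check ssh/telnet services and load on $host"
    else
        echo "no ping reply: use the ILOM/XSCF console for $host"
    fi
}

# Run on the server itself once you are in through the console.
console_checks() {
    ifconfig -a     # interface flags should include UP and RUNNING
    netstat -rn     # assumed check for the default route to the router
    svcs -xv        # explains services that are enabled but not running
    uptime          # load averages
}
```

You would call triage_host with the server name from another machine, and run console_checks only after logging in through the hardware console.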
The application team asks you to reboot the Solaris global zone. Will you reboot it immediately, or will you wait for stakeholder approval?
1. Do not reboot the server without proper stakeholder approval. Involve the incident management team and the SDM before rebooting the server.
2. Snooze the alerts in the monitoring system so that the reboot does not generate false alerts.
3. If any cluster service is running, make sure you halt the cluster properly.
4. Last but not least, take a backup of all configuration before rebooting the system.
5. It is better to record the root disk details (eeprom) before rebooting.
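The configuration backup in steps 4 and 5 could look roughly like the sketch below. The helper names (save, pre_reboot_snapshot), the default directory under /var/tmp, and the exact set of commands captured are all assumptions for illustration; adjust them to whatever your site considers "configuration".

```shell
#!/bin/sh
# Sketch only: helper names and the output directory are hypothetical.

# Run a command and store its output; note a failure instead of aborting.
save() {
    _file=$1; shift
    "$@" >"$_file" 2>&1 || echo "command failed: $*" >>"$_file"
}

# Capture pre-reboot state into a directory and print its path.
pre_reboot_snapshot() {
    dir=${1:-/var/tmp/pre-reboot.$(date +%Y%m%d%H%M%S)}
    mkdir -p "$dir" || return 1
    save "$dir/eeprom.out"   eeprom            # boot-device and other OBP settings
    save "$dir/ifconfig.out" ifconfig -a       # interface addresses and flags
    save "$dir/routes.out"   netstat -rn       # routing table
    save "$dir/df.out"       df -h             # mounted filesystems
    save "$dir/zones.out"    zoneadm list -cv  # state of every zone
    save "$dir/vfstab.out"   cat /etc/vfstab   # Solaris mount table
    echo "$dir"
}
```

Running pre_reboot_snapshot with no argument writes to a timestamped directory; passing a path lets you keep the snapshot alongside the change ticket.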
The application team complains about a system performance issue. What will you check first?
1. Run vmstat to identify the system bottleneck.
2. Run iostat (for example, iostat -xn) to check for disk I/O issues.
3. Check the NIC status (dladm show-dev). If any link is down, involve the network team to check it.
4. Perform a hardware check (fmadm faulty).
5. List the processes consuming the most memory and CPU, and ask the process owners to check them.
6. If the global zone hosts many local zones, identify which local zones are consuming the most resources (prstat -Z) and fix the issue in those zones first.
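As a rough illustration, the checks above can be collected in one pass. The function names (section, perf_first_look) are hypothetical, and the sampling intervals, counts, and prstat sort keys are assumptions picked for the example, not values from this article.

```shell
#!/bin/sh
# Sketch only: names, intervals and counts are illustrative assumptions.

# Print a header, then run the command; keep going if it fails.
section() {
    echo "== $* =="
    "$@" 2>&1 || echo "(command failed: $*)"
}

perf_first_look() {
    section vmstat 5 3               # run queue, paging, CPU split
    section iostat -xn 5 3           # per-device service times and %busy
    section dladm show-dev           # link state (show-phys on Solaris 11)
    section fmadm faulty             # hardware faults diagnosed by FMA
    section prstat -s cpu -n 10 1 1  # top CPU consumers
    section prstat -s rss -n 10 1 1  # top resident-memory consumers
    section prstat -Z 1 1            # usage summarised per zone
}
```

Redirecting the output of perf_first_look to a file gives you something to attach to the incident before you start changing anything.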
The system is completely hung. How do you reboot the server after getting proper approval?
1. If the server is hung, do not perform a hard reboot as the first step. Force the system to panic instead, so that it writes a crash dump.
2. Raise a support case with Oracle to find the root cause from the generated crash dump.
In some situations the system does not respond properly (for example, you cannot kill a process, you cannot unmount a filesystem, or a local zone enters the shutting_down state and never halts).
1. Most Unix admins face such a situation at some point. In these cases, generate a live crash dump with the savecore -L command and upload it to Oracle support to find the root cause. In most cases you will end up rebooting the system to fix this kind of issue.
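A minimal sketch of the live-dump step follows. savecore -L needs a dedicated dump device, so the sketch checks dumpadm first. The function name live_dump is hypothetical, and matching on the string "(dedicated)" assumes that dumpadm labels the dump device that way on your release; verify the output format before relying on it.

```shell
#!/bin/sh
# Sketch only: live_dump is a hypothetical helper name.

live_dump() {
    cfg=$(dumpadm) || return 1   # current dump device and savecore directory
    echo "$cfg"
    case "$cfg" in
        # Assumption: dumpadm marks a dedicated device as "(dedicated)".
        *"(dedicated)"*)
            savecore -L          # dump the running kernel without rebooting
            ;;
        *)
            echo "no dedicated dump device; set one with dumpadm -d first" >&2
            return 1
            ;;
    esac
}
```

The dump lands in the savecore directory reported by dumpadm, which is the file set you would upload to the support case.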
What challenges have you faced in Unix administration?
You can impress the interviewer by describing a couple of tough challenges you have faced in the past. Here are some example challenges.
1. Recovered a VxFS filesystem with the help of Symantec.
2. Recovered Solaris 10 using a ZFS snapshot.
3. Recovered a destroyed zpool.
4. Fixed a strange issue with ifconfig.
5. Found the root cause of high kernel usage using DTrace.
6. Plumbed a new IP address on a local zone.
7. Completed a successful migration from UFS to ZFS.
8. Resolved Live Upgrade issues with a local zone sitting on top of a zpool.
9. Patched a Solaris global zone on ZFS with a non-global zone's root filesystem on VxFS.
For example, suppose you keep getting incidents from one server. They may be false alerts, or there may be an issue on the server that needs a Unix admin's attention. What process will you follow to fix the repetitive alerts?
1. In this kind of situation, involve the problem management team to fix the repetitive alerts. The problem management team will bring in multiple teams if required.
Note: Problem management is not a technical team; it helps you engage multiple teams to fix the issue.
Thank you for reading this article. Please share any questions you have, and I will update the article as soon as possible.