Q1. Explain the application / server architecture used in your project.
Ans. We use a cluster of web servers and application servers, with a load balancer distributing the load between them. Below that layer we have a middleware server, and then a DB server that provides access to the database.
Q2. Which Web and Application server is being used by your application ?
Ans. We are using Apache 2.3 and Tomcat 5.6.
Q3. What steps do you follow when you receive an application outage ticket?
Ans. We follow these steps:
a. Inform the stakeholders that the issue is being worked on.
b. Log in to the server to see if it is responding.
c. Check the application and web server logs to see if the application is receiving requests.
d. If it is not, involve the appropriate network team.
e. Keep the stakeholders informed of the progress.
f. Bounce (restart) the web / application server instance, if required.
g. Close the ticket with the steps taken to resolve the problem.
h. Complete the RCA (Root Cause Analysis) and submit the report to the stakeholders.
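As a rough illustration of step b, the check that a server instance is still accepting connections can be scripted. This is only a sketch: the function name is_port_open and the host and port in the usage note are hypothetical, and a successful TCP connection shows that something is listening, not that the application is healthy.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and attempts a TCP connect
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name resolution failures
        return False
```

For example, is_port_open("app01.example.internal", 8080) could be run for each instance behind the load balancer before deciding whether the network team needs to be involved.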
Q4. How do you monitor server resources when unexpectedly high traffic is reported?
Ans. We use the sar command for that purpose. We also have a GUI system monitoring tool that gives a real-time view of requests, load and memory usage.
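To make the sar check concrete, here is a small sketch that scans the text produced by sar -u and reports the intervals where the CPU was nearly saturated. The sample output, the busy_samples name and the 20% idle threshold are assumptions for illustration. The exact column layout of sar varies between versions and locales, so the code looks up the %idle column from the header row instead of hard-coding a position.

```python
# Assumed sample of `sar -u` output (24-hour clock); real layouts may differ.
SAMPLE = """\
Linux 5.4.0 (app01)   01/01/2024   _x86_64_   (4 CPU)

10:00:01     CPU     %user     %nice   %system   %iowait    %steal     %idle
10:10:01     all     12.50      0.00      3.20      0.30      0.00     84.00
10:20:01     all     88.10      0.00      6.40      0.50      0.00      5.00
Average:     all     50.30      0.00      4.80      0.40      0.00     44.50
"""


def busy_samples(sar_output, min_idle=20.0):
    """Return (timestamp, idle) pairs where %idle is below min_idle."""
    results = []
    idle_col = None
    for line in sar_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        if "%idle" in fields:
            # Header row: remember the %idle position, counted from the end
            idle_col = fields.index("%idle") - len(fields)
            continue
        if idle_col is None or fields[0].startswith("Average"):
            continue  # banner line before the header, or the summary row
        try:
            idle = float(fields[idle_col])
        except ValueError:
            continue  # not a data row
        if idle < min_idle:
            results.append((fields[0], idle))
    return results
```

With the sample above, only the 10:20:01 interval falls below the threshold. In practice the input would come from running sar on the server and passing its output to the function.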
Q5. How do you monitor your logs while investigating a high-severity problem?
Ans. We look for errors logged in the n minutes leading up to the time the issue occurred. If the issue is still occurring intermittently, we tail the logs of the different application server instances to watch the error snippets as they appear in the live logs.
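The "errors in the last n minutes" search can be sketched as a small filter. It assumes each log entry starts with a timestamp such as 2024-01-01 10:28:00, which is an assumption about the log format and not something the application is known to use. The recent_errors name and the keyword list are likewise hypothetical.

```python
from datetime import datetime, timedelta

TS_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed timestamp layout at the start of a line


def recent_errors(lines, now, minutes, keywords=("ERROR", "Exception")):
    """Return lines stamped within the last `minutes` that contain a keyword."""
    cutoff = now - timedelta(minutes=minutes)
    hits = []
    for line in lines:
        try:
            stamp = datetime.strptime(line[:19], TS_FORMAT)
        except ValueError:
            continue  # stack-trace continuation or a line without a timestamp
        if cutoff <= stamp <= now and any(k in line for k in keywords):
            hits.append(line.rstrip("\n"))
    return hits
```

Passing the time of the incident as now, instead of the current time, gives the window around when the issue occurred.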
Q6. Have you automated any of your system monitoring tasks?
Ans. Yes, we have created system monitoring and log monitoring scripts to keep track of exceptions. We also use a tool that notifies the stakeholders if an exceptional event occurs on the system.
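A log monitoring script of the kind described might, in its simplest form, count exceptions by type and flag those that cross a threshold. The sketch below assumes Java-style class names ending in Exception or Error, which fits a Tomcat application but is still an assumption. The function names and the threshold are hypothetical, and the notification step is left out.

```python
import re
from collections import Counter

# Matches names such as java.lang.NullPointerException or OutOfMemoryError
EXCEPTION_RE = re.compile(r"\b([A-Za-z_][\w.]*(?:Exception|Error))\b")


def count_exceptions(lines):
    """Count exception occurrences by simple class name."""
    counts = Counter()
    for line in lines:
        for name in EXCEPTION_RE.findall(line):
            # Drop the package prefix so variants group together
            counts[name.rsplit(".", 1)[-1]] += 1
    return counts


def alerts(counts, threshold):
    """Return the exception names seen at least `threshold` times, sorted."""
    return sorted(name for name, n in counts.items() if n >= threshold)
```

A scheduled job could run count_exceptions over the latest portion of each log and pass the result of alerts to whatever notification tool is in place.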
Q7. What caching is used in your application?
Ans. We use Akamai as the web server cache.
Q8. Have you ever faced any problems due to caching?
Ans. Yes, sometimes we receive issues about outdated pages being served to the user. In those cases we clear the cache and then investigate the reason. Sometimes the cause is a comparatively long cache refresh interval, and in those cases we reduce it.
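The link between the refresh interval and stale pages can be shown with a toy in-process cache. This is not how Akamai is configured or operated. It is only a minimal sketch of the time-to-live idea, with a hypothetical TTLCache class and an injectable clock so the behaviour can be followed step by step.

```python
import time


class TTLCache:
    """Serve cached values until they are older than ttl_seconds."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch      # function that loads a page from the origin
        self._ttl = ttl_seconds  # the refresh interval
        self._clock = clock
        self._store = {}         # key -> (value, time stored)

    def get(self, key):
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and now - entry[1] < self._ttl:
            return entry[0]      # still fresh as far as the cache knows
        value = self._fetch(key)
        self._store[key] = (value, now)
        return value

    def purge(self, key=None):
        """Clear one entry, or the whole cache when no key is given."""
        if key is None:
            self._store.clear()
        else:
            self._store.pop(key, None)
```

If the origin page changes 100 seconds after it was cached with a 300 second interval, get keeps returning the old copy until the interval expires or purge is called. A shorter interval narrows that window at the cost of more requests to the origin.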
Q9. What if the issue is related to the database server?
Ans. We involve the DBA and resolve it through them. While they are working on it, we keep the stakeholders informed of the progress.
Q10. Do you use command aliases in your work?
Ans. Yes, I have created many aliases and saved them in my .profile file so that the system loads them each time I log on to the server.
Q11. What are your responsibilities after the ticket has been closed?
Ans. We inform the stakeholders of the resolution and the steps taken to reach it. We update the ticket notes and link the ticket to the master / related tickets. An RCA is done for high-priority and critical issues, and a report is submitted.