Server Analysis Process

Introduction

Facing an issue on the server side is not that different from dealing with problems on the client side. We use the very same technique to troubleshoot the server that we use for the clients. In this case too, we identify a problem starting from an incident, that is, its symptom.

On the server it can be easier to spot an incident, because it is possible to read the documentation about the expected behavior and quickly pinpoint where the server deviates from it. On the other hand, it can be more difficult to track down the problem that originated the incident.

When facing an issue on the server, take care to collect as much data about it as possible, using the classical Kepner and Tregoe Help process, which lists 5 points:

  1. Define the Incident

  2. Describe the Incident

  3. Troubleshooting

  4. Try to solve the incident using the most probable cause

  5. If point 4 fails, report the incident to the higher level of problem handling service desk

Server support is performed by PrivateWave only as the third level, so before submitting a new request make sure that you:

  • collect all relevant information

  • make a first attempt at troubleshooting

  • perform a deeper analysis and carry out the troubleshooting

Communication between the second and the third line is carried out through a ticketing service. The second line has to report the issue by properly filling out all fields of the report. The third line has to report back through the same ticketing service, updating the problem status. The ticketing service also records timings and serves as an incidents/problems database that the second level should consult before escalating a problem.

1. Define the Incident

This is the first phase of the troubleshooting process. The entry point is the call from the user reporting an Incident. In this phase the user describes the issue(s) experienced in his/her own words, and the Help Desk guides him/her to:

Gather data

This is a platform-specific procedure that guides the user in collecting the following information:

  1. Server version number

  2. Number of Network Interfaces

  3. Network Segregation configuration

  4. Possible route specific configurations

  5. Short description of malfunction

  6. Server name
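The six items above could be held in a small record, so that anything still missing is easy to spot before the ticket moves on. The sketch below is only an illustration: the class name `ServerIncidentData`, its field names, the `missing_fields` helper, and the sample values are all assumptions, not part of any PrivateWave tool.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class ServerIncidentData:
    """Hypothetical record for the six items gathered in phase 1."""
    server_version: Optional[str] = None
    network_interface_count: Optional[int] = None
    network_segregation: Optional[str] = None
    route_configuration: Optional[str] = None
    malfunction_description: Optional[str] = None
    server_name: Optional[str] = None

    def missing_fields(self) -> List[str]:
        """Return the names of the fields that have not been collected yet."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]


# Sample values are placeholders, not real server data.
data = ServerIncidentData(server_version="1.0", server_name="example-server")
print(data.missing_fields())
```

A record like this lets the Help Desk ask only for the items that are still missing instead of repeating the whole questionnaire.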

Verifying operational requirements

  1. Is the device supported?

  2. Is the OS version supported?

  3. Is the application installed?

  4. Are the application edition and version OK?

  5. Is a data plan option available?

  6. Is the Internet connection functioning?

If any of these requirement checks is answered with "no", the user is informed about the requirements and the ticket is closed. Otherwise we go to phase 2.
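The gate described above (any "no" closes the ticket, otherwise the process moves to phase 2) can be sketched as follows. The function name `check_requirements`, the short requirement labels, and the returned step names are hypothetical. Treating an unanswered check as "no" is an assumption of this sketch.

```python
from typing import Dict, List, Tuple

# Short labels paraphrasing the six operational requirement checks.
REQUIREMENTS = [
    "device supported",
    "OS version supported",
    "application installed",
    "application edition and version OK",
    "data plan option available",
    "Internet connection functioning",
]


def check_requirements(answers: Dict[str, bool]) -> Tuple[str, List[str]]:
    """Return the next step and the list of unmet requirements.

    A missing answer is treated as "no" (an assumption of this sketch).
    """
    unmet = [req for req in REQUIREMENTS if not answers.get(req, False)]
    if unmet:
        # The user is informed about the requirements and the ticket is closed.
        return "close_ticket", unmet
    return "go_to_phase_2", []


answers = {req: True for req in REQUIREMENTS}
answers["Internet connection functioning"] = False
print(check_requirements(answers))
```

Returning the unmet requirements along with the decision gives the Help Desk the exact list to communicate to the user when closing the ticket.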

2. Describe the Incident

In this phase the Help Desk asks the User to retrieve as much useful information as possible, aiming to completely fill in the following Incident table.

 

        | IS                            | IS NOT                                   | DIFFERENCES | CHANGES
--------+-------------------------------+------------------------------------------+-------------+--------
WHAT    | • Exact error message         | Similar systems/situations not failed    | ?           | ?
        | • What is not working?        |                                          |             |
WHERE   | • Failure location            | Where or in which network does it work?  | ?           | ?
        | • Connection type             |                                          |             |
        | • Operator                    |                                          |             |
        | • Contract type               |                                          |             |
WHEN    | • Incident date and time      | When does it work?                       | ?           | ?
        | • Occurrence frequency        |                                          |             |
EXTENT  | • Which parts are involved?   | Which parts and/or systems do work well? | ?           | ?
        | • Which systems are involved? |                                          |             |
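Since the goal of this phase is to fill the Incident table completely, a simple completeness check can be useful. The following is a minimal sketch in which the table is a nested dictionary. The helper names `empty_incident_table` and `unfilled_cells` and the sample answer are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

ROWS = ("WHAT", "WHERE", "WHEN", "EXTENT")
COLUMNS = ("IS", "IS NOT", "DIFFERENCES", "CHANGES")


def empty_incident_table() -> Dict[str, Dict[str, str]]:
    """Build an Incident table with every cell empty."""
    return {row: {col: "" for col in COLUMNS} for row in ROWS}


def unfilled_cells(table: Dict[str, Dict[str, str]]) -> List[Tuple[str, str]]:
    """List the (row, column) pairs that still have no answer."""
    return [
        (row, col)
        for row in ROWS
        for col in COLUMNS
        if not table[row][col].strip()
    ]


table = empty_incident_table()
table["WHEN"]["IS"] = "Example: every morning since the last upgrade"
print(len(unfilled_cells(table)))  # 15 of the 16 cells are still empty
```

The list of unfilled cells tells the Help Desk which questions are still open before moving on to phase 3.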

3. Troubleshooting

This is a crucial phase, as the Help Desk has to figure out the possible causes of the Incident with the help of the previously collected data. We provide a decision workflow that lists the most common complaints, which are the entry points for the troubleshooting. The following complaints are examined:

Call Performance        | Connection               | Application local issues               | Server problems
------------------------+--------------------------+----------------------------------------+-----------------------------------------
User can't call         | Can't Connect to Server  | Application doesn't start              | Server is unreachable
Bad Call Quality        | Can't Register to Server | Application closes abruptly or crashes | Server is too slow answering
Call interrupts         | -                        | Application hangs at some point        | Can't login to the Server
User can't receive call | -                        | Application disappeared or missing     | Can't find the User/Callee on the Server
-                       | -                        | Application goes into an infinite loop | -
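To use the complaints as entry points, the Help Desk first needs to place a reported complaint in one of the four categories. The sketch below shows one possible lookup. The dictionary mirrors the complaint list above, while the function name `category_of` and the case-insensitive exact matching are assumptions of this example.

```python
from typing import Dict, List, Optional

# Complaints grouped by category, mirroring the complaint list above.
COMPLAINTS: Dict[str, List[str]] = {
    "Call Performance": [
        "User can't call",
        "Bad Call Quality",
        "Call interrupts",
        "User can't receive call",
    ],
    "Connection": [
        "Can't Connect to Server",
        "Can't Register to Server",
    ],
    "Application local issues": [
        "Application doesn't start",
        "Application closes abruptly or crashes",
        "Application hangs at some point",
        "Application disappeared or missing",
        "Application goes into an infinite loop",
    ],
    "Server problems": [
        "Server is unreachable",
        "Server is too slow answering",
        "Can't login to the Server",
        "Can't find the User/Callee on the Server",
    ],
}


def category_of(complaint: str) -> Optional[str]:
    """Return the category of a complaint (case-insensitive), or None if unknown."""
    wanted = complaint.strip().lower()
    for category, complaints in COMPLAINTS.items():
        if any(wanted == known.lower() for known in complaints):
            return category
    return None


print(category_of("server is unreachable"))  # Server problems
```

A complaint that returns `None` does not match any known entry point and would need to be classified manually.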

4. Try to solve the incident using the most probable cause

Using the information collected and with the help of the troubleshooting workflow, the Help Desk should be able to list the most probable causes of the Incident and propose the related procedure(s) to the User, trying the solutions in the provided order. If the first solution fails, move on to the second one and so on, until either the Incident is solved or the list ends. In the latter case the Incident should be escalated to the higher level of Service Desk.
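The "try in order, escalate when the list ends" rule can be expressed as a short loop. This is a minimal sketch: the `Solution` type, the `try_solutions` function, and the placeholder procedures are hypothetical and stand in for the real procedures proposed to the User.

```python
from typing import Callable, Iterable, Optional, Tuple

# A candidate solution: a label plus a procedure that returns True when
# the Incident is solved. Both are placeholders for real procedures.
Solution = Tuple[str, Callable[[], bool]]


def try_solutions(solutions: Iterable[Solution]) -> Optional[str]:
    """Apply the solutions in the given order.

    Return the label of the first one that solves the Incident,
    or None if the list ends without success (escalation needed).
    """
    for label, procedure in solutions:
        if procedure():
            return label
    return None


candidates = [
    ("most probable cause", lambda: False),
    ("second most probable cause", lambda: True),
    ("third most probable cause", lambda: True),
]
solved_by = try_solutions(candidates)
print(solved_by if solved_by else "escalate to the higher level")
```

The loop stops at the first successful procedure, so later candidates are never tried once the Incident is solved.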

5. Report the Incident to the higher level of problem handling service desk

Even if the first/second level has found a solution (and so can close the Incident reporting success), the Incident must be reported as a Problem and passed to the proper team, along with the Incident data collected during the process and the steps taken during the troubleshooting; the solution should still be tested. It is then up to the Problem team to close the ticket, investigate further, or take other appropriate action. The same practice applies if no workaround or solution has been found and the Incident is still open.

6. Close the Incident

After the Incident has been declared as solved, the answers of the third level can be:

  • not a problem: no further actions required by editor.

  • workaround identified and provided.

  • wait for the problem resolution: a software component upgrade will be provided.

Timings will be provided along with the answers. The second level has to report back to the User and ask for the acceptance of the answer.
