5.2 Service Management Practices — Detail Practices
Key Takeaways
- An incident is an unplanned interruption to a service or a reduction in its quality; incident management's purpose is to minimize negative impact by restoring normal service operation as quickly as possible.
- A problem is a cause, or potential cause, of one or more incidents; a known error is a problem that has been analysed but not resolved; a workaround reduces or eliminates impact when no full resolution exists yet.
- Change enablement recognizes three change types: standard (pre-authorized, low-risk), normal (assessed and authorized, may use a CAB), and emergency (expedited, may use an ECAB).
- The service desk is the single point of contact (SPOC) between the provider and users, capturing demand for incident resolution and service requests.
- The watermelon SLA is green on the outside but red on the inside: every operational metric is met yet the customer is unhappy, showing SLAs must measure outcomes, not just component metrics.
Detail Service Management Practices
The five service management practices in this section must be understood in detail for the exam: you need their purpose plus their key terms and behaviours. Read scenario stems carefully, because the examiner tests whether you can tell these practices apart.
Incident Management
An incident is an unplanned interruption to a service, or a reduction in the quality of a service. The purpose of incident management is to minimize the negative impact of incidents by restoring normal service operation as quickly as possible. Note the emphasis on speed of restoration, not on finding the underlying cause — that is problem management's job. Incidents are prioritized using impact and urgency. Major incidents have high impact and follow a separate, faster procedure with dedicated resources.
ITIL encourages swarming (specialists collaborating until resolution), escalation to the right people, good tools and knowledge, and self-help. A workaround may be applied to restore service quickly even before the cause is understood.
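Prioritizing by impact and urgency can be pictured as a simple lookup. The sketch below assumes a three-level scale and a 1 (highest) to 5 (lowest) priority range, which is a common service desk convention rather than something ITIL prescribes. The names `Level`, `priority`, and `is_major_incident`, and the rule that priority 1 triggers the major incident procedure, are hypothetical choices made for illustration.

```python
from enum import IntEnum


class Level(IntEnum):
    """Assumed three-level scale used for both impact and urgency."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def priority(impact: Level, urgency: Level) -> int:
    """Return a priority from 1 (handle first) to 5 (handle last).

    The sum of impact and urgency ranges from 2 to 6, so subtracting it
    from 7 maps HIGH/HIGH to 1 and LOW/LOW to 5.
    """
    return 7 - (int(impact) + int(urgency))


def is_major_incident(impact: Level, urgency: Level) -> bool:
    """Hypothetical rule: only priority 1 follows the major incident procedure."""
    return priority(impact, urgency) == 1


# Payroll down for all users an hour before the pay run: high impact, high urgency.
payroll_priority = priority(Level.HIGH, Level.HIGH)
# One user cannot change a desktop wallpaper: low impact, low urgency.
wallpaper_priority = priority(Level.LOW, Level.LOW)
```

A real organization would agree its own scale and thresholds with the business; the point is only that priority is derived from two inputs, not from who shouts loudest.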
Problem Management
The purpose of problem management is to reduce the likelihood and impact of incidents by identifying actual and potential causes of incidents, and managing workarounds and known errors. Three definitions are frequently tested and easy to confuse:
| Term | Definition |
|---|---|
| Problem | A cause, or potential cause, of one or more incidents |
| Known error | A problem that has been analysed but has not been resolved |
| Workaround | A solution that reduces or eliminates the impact of an incident or problem for which a full resolution is not yet available |
Problem management has three phases: problem identification, problem control (analysing problems and documenting workarounds and known errors), and error control (managing known errors and potential permanent resolutions). A workaround created during problem control can be used by incident management to restore service faster. The exam trap: incident management restores service; problem management finds and removes causes. A single dramatic outage is an incident; the recurring underlying fault is a problem.
Worked example: a website crashes three times in a week (three incidents). Each time, incident management restarts the server to restore service; the restart is a workaround, and once documented it can be reused on later occurrences. Problem management investigates and finds a memory leak in a library; because the cause has been analysed but the fix is not yet deployed, the memory leak is recorded as a known error. Once a permanent fix is released, the known error is closed. Notice how the same event stream is handled by two practices with two different goals: fast restoration versus lasting cause removal.
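The three definitions can be modelled as states of a single record, which may help keep them apart. This is a minimal sketch, not a real ITSM tool's data model: the `Problem` class, its field names, and the incident identifiers are all hypothetical. It encodes only the rule from the table above, that a known error is a problem that has been analysed but not resolved.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Problem:
    """Hypothetical problem record linking incidents to a cause."""
    description: str
    incident_ids: List[str] = field(default_factory=list)
    cause: Optional[str] = None        # filled in once the problem is analysed
    workaround: Optional[str] = None   # reduces impact until a full fix exists
    resolved: bool = False             # True once a permanent fix is deployed

    @property
    def is_known_error(self) -> bool:
        """Analysed (cause recorded) but not yet resolved."""
        return self.cause is not None and not self.resolved


# The website example: three incidents, one underlying problem.
leak = Problem("Website crashes repeatedly", ["INC-1", "INC-2", "INC-3"])
leak.workaround = "Restart the server"      # usable by incident management
before_analysis = leak.is_known_error       # cause not yet identified

leak.cause = "Memory leak in a library"     # problem control completes analysis
after_analysis = leak.is_known_error        # now a known error

leak.resolved = True                        # permanent fix released
after_fix = leak.is_known_error             # known error is closed
```

Note that the workaround exists independently of the known-error status: it can be documented before the cause is understood and remains useful until the fix is deployed.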
Change Enablement
The purpose of change enablement is to maximize the number of successful service and product changes by ensuring that risks are properly assessed, authorizing changes to proceed, and managing the change schedule. A change is the addition, modification, or removal of anything that could have a direct or indirect effect on services. (The practice was briefly called 'change control' in early ITIL 4 material and was renamed change enablement — do not use the old ITIL v3 name 'change management' on the exam.) There are three types of change:
- Standard change — low-risk, pre-authorized, well understood; follows a documented, repeatable procedure and needs no separate risk assessment each time. It may be initiated as a service request.
- Normal change — must be scheduled, assessed, and authorized; ranges from minor to major. A change authority (person or group) approves it, sometimes with advice from a change advisory board (CAB).
- Emergency change — must be implemented as soon as possible, for example to fix a major incident or apply a security patch. Assessment and authorization are expedited, often by an emergency change advisory board (ECAB).
A change authority is whoever authorizes a change; for low-risk changes this is often decentralized to speed flow. The change schedule plans and communicates upcoming changes. ITIL 4 de-emphasizes a single heavyweight CAB that approves everything, favouring delegated authority and automation for standard changes. Do not confuse change enablement with organizational change management, which handles the human/people side of change.
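The three change types differ mainly in how authorization happens, which can be sketched as a small routing function. This is a deliberate simplification under stated assumptions: real classification depends on an organization's own change models, and the two boolean inputs, the function names, and the returned descriptions are hypothetical.

```python
from enum import Enum


class ChangeType(Enum):
    STANDARD = "standard"
    NORMAL = "normal"
    EMERGENCY = "emergency"


def classify_change(pre_authorized: bool, must_implement_asap: bool) -> ChangeType:
    """Simplified classification using two assumed yes/no questions."""
    if pre_authorized:
        # Low-risk, well understood, documented procedure already approved.
        return ChangeType.STANDARD
    if must_implement_asap:
        # For example, fixing a major incident or applying a security patch.
        return ChangeType.EMERGENCY
    return ChangeType.NORMAL


def authorization_path(change_type: ChangeType) -> str:
    """Describe how each type is authorized, following the text above."""
    paths = {
        ChangeType.STANDARD: "Pre-authorized: follow the documented procedure",
        ChangeType.NORMAL: "Assess, schedule, and authorize via the change authority; a CAB may advise",
        ChangeType.EMERGENCY: "Expedited assessment and authorization; an ECAB may be used",
    }
    return paths[change_type]


# A routine password reset procedure versus an urgent security patch.
reset = classify_change(pre_authorized=True, must_implement_asap=False)
patch = classify_change(pre_authorized=False, must_implement_asap=True)
```

In exam scenarios, look for the same cues the function uses: "pre-authorized" or "routine" points to a standard change, and "as soon as possible" points to an emergency change.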
Service Desk
The purpose of the service desk is to capture demand for incident resolution and service requests. It is the single point of contact (SPOC) between the service provider and its users, and the practical face of the provider. Channels include phone, chat, self-service portals, email, walk-up, and increasingly AI chatbots. A crucial exam point: the service desk's value comes from understanding the business and showing empathy, not only technical skill — its staff need excellent communication and emotional intelligence. Service desks can be centralized, virtual, or follow-the-sun.
Service Level Management
The purpose of service level management is to set clear business-based targets for service levels and to ensure that delivery of services is properly assessed, monitored, and managed against these targets. A service level agreement (SLA) is a documented agreement between a service provider and a customer that identifies both the services required and the expected level of service. A good SLA relates to a defined service, is tied to defined outcomes (not just single operational metrics), reflects genuine engagement, and is written simply.
Beware the watermelon SLA (the watermelon effect): a report that is green on the outside — every individual technical metric is met — but red on the inside because the customer is still dissatisfied. It warns against measuring only component-level metrics while ignoring the customer's actual experience and outcomes. The fix is to balance operational measures with outcome and experience measures, sometimes captured in an experience level agreement (XLA).
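The watermelon effect can be expressed as a check that compares operational results with an experience measure. The sketch below is illustrative only: the `Metric` class, the `is_watermelon` function, the customer satisfaction (CSAT) scale, and every number are assumptions, not values from any real SLA.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Metric:
    """One operational target and the value actually achieved."""
    name: str
    actual: float
    target: float
    higher_is_better: bool = True

    def met(self) -> bool:
        if self.higher_is_better:
            return self.actual >= self.target
        return self.actual <= self.target


def is_watermelon(operational: List[Metric], csat: float, csat_target: float) -> bool:
    """True when every operational metric is green but the experience measure is red."""
    all_green = bool(operational) and all(m.met() for m in operational)
    return all_green and csat < csat_target


# Hypothetical monthly report: every technical target is met.
report = [
    Metric("Uptime (%)", actual=99.95, target=99.9),
    Metric("Response time (s)", actual=1.2, target=2.0, higher_is_better=False),
    Metric("Ticket closure (hours)", actual=6.0, target=8.0, higher_is_better=False),
]

# Assumed CSAT on a 1 to 5 scale, well below an assumed target of 4.0.
watermelon = is_watermelon(report, csat=2.8, csat_target=4.0)
```

The check only reports a watermelon when an outcome or experience measure is collected alongside the component metrics, which is exactly the balance the text recommends.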
Practice Questions
A payroll application has stopped responding for all users an hour before the pay run. Which practice's purpose most directly applies to getting the service working again as fast as possible?
A recurring fault has been analysed and its cause identified, but a permanent fix has not yet been implemented. In ITIL terms, this is best described as a:
A monthly SLA report shows every technical target (uptime, response time, ticket-closure speed) was met, yet the customer is clearly frustrated and considering leaving. What does this illustrate?