6.3 Problem Management

Key Takeaways

  • The purpose of problem management is to reduce the likelihood and impact of incidents by identifying actual and potential causes, and managing workarounds and known errors
  • A problem is a cause, or potential cause, of one or more incidents
  • A known error is a problem that has been analyzed but not yet resolved; a workaround reduces or eliminates impact without a full resolution
  • Problem management has three phases: problem identification, problem control, and error control
  • Incident management restores service fast; problem management investigates root cause to prevent recurrence
Last updated: June 2026

Purpose and Key Definitions

Quick Answer: The purpose of the problem management practice is to reduce the likelihood and impact of incidents by identifying actual and potential causes of incidents and managing workarounds and known errors. A problem is a cause, or potential cause, of one or more incidents.

Problem management is the second practice covered in detail, and the exam almost always contrasts it with incident management. The distinction is the single most-tested idea in this area: incident management restores service as fast as possible; problem management hunts the root cause to stop incidents from recurring. Problem management trades speed for prevention.

Three terms must be precise:

  • Problem — a cause, or potential cause, of one or more incidents. Note that a problem can exist before any incident occurs.
  • Known error — a problem that has been analyzed but not yet resolved. Its cause is understood and documented.
  • Workaround — a solution that reduces or eliminates the impact of an incident or problem for which a full resolution is not yet available. Workarounds may become permanent if fixing the underlying problem is not cost-justified.

These three terms form a chain the exam loves to test: a problem that has been analyzed but not yet resolved becomes a known error, and a workaround is the relief applied while the known error remains open. A known error and its workaround are typically stored together in a known error database (KEDB) so that when the same symptoms recur, the service desk can apply the documented workaround immediately instead of re-diagnosing. This is one of the most practical links between problem management and incident management: problem management supplies the workarounds that incident management uses to restore service quickly.
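The KEDB lookup described above can be sketched in a few lines of Python. This is only an illustration: the `KnownError` record, the keyword-tag matching, and the `PRB-001` identifier are assumptions made for the example, not part of ITIL 4 or of any real service desk tool.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class KnownError:
    """A problem that has been analyzed but not yet resolved (hypothetical record)."""
    problem_id: str
    symptoms: frozenset[str]  # assumed: symptoms stored as simple keyword tags
    workaround: str


def find_workaround(kedb: list[KnownError], reported: set[str]) -> str | None:
    """Return the workaround of the first known error whose symptoms are all
    present in the reported incident, or None when nothing matches."""
    for entry in kedb:
        if entry.symptoms <= reported:  # subset test: every symptom was reported
            return entry.workaround
    return None


kedb = [
    KnownError("PRB-001", frozenset({"order-entry", "crash"}),
               "Restart the order-entry service"),
]

print(find_workaround(kedb, {"order-entry", "crash", "peak-hours"}))
print(find_workaround(kedb, {"email", "slow"}))
```

The first call matches the stored known error and returns its workaround; the second finds no match, which in practice would mean normal incident diagnosis continues.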

The Three Phases of Problem Management

Problem management activities run across three phases. Knowing the phase names and what happens in each is a frequent exam item.

| Phase | What happens | Key output |
|---|---|---|
| 1. Problem identification | Activities that detect and log problems: trend analysis of incidents, reviews of major incidents, supplier/partner information, recurring patterns | A logged problem record |
| 2. Problem control | Problem analysis across all four dimensions of service management; prioritization by risk; identifying root cause and documenting workarounds and known errors | Documented workaround / known error |
| 3. Error control | Managing known errors, reassessing the status of workarounds, and identifying potential permanent solutions that may be raised as change requests | A change request (to permanently fix) |
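As a rough illustration of the trend analysis used in problem identification, the sketch below counts incidents by service and symptom and flags any pair that recurs often enough to justify logging a problem. The tuple layout, the sample data, and the threshold of three are assumptions chosen for the example.

```python
from collections import Counter


def recurring_candidates(incidents, threshold=3):
    """Return the (service, symptom) pairs seen at least `threshold` times,
    sorted alphabetically. Each pair is a candidate problem record."""
    counts = Counter(incidents)
    return sorted(pair for pair, seen in counts.items() if seen >= threshold)


# Hypothetical incident log: (service, symptom)
incidents = [
    ("order-entry", "crash"),
    ("order-entry", "crash"),
    ("email", "slow"),
    ("order-entry", "crash"),
    ("vpn", "timeout"),
]

print(recurring_candidates(incidents))
```

Only the order-entry crashes reach the threshold here, so they are the only candidate flagged for a problem record.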

How the phases connect to other practices

Problem identification often pulls data from incident records and monitoring & event management. Error control hands off to change enablement: when a permanent fix is found, problem management does not implement it directly — it raises a change request so the fix is properly assessed and authorized. This hand-off is a favorite exam scenario: problem management finds the fix; change enablement controls deploying it.

Problem control deserves extra attention because it is where prioritization happens. Not every problem warrants the same effort — ITIL 4 advises prioritizing problems by the risk they pose, considering both how likely the related incidents are to recur and how much impact they cause. Some problems are deliberately left as documented known errors with a standing workaround, because the cost of a permanent fix outweighs the residual risk.
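One simple way to picture risk-based prioritization is a likelihood-times-impact score. The sketch below assumes 1 to 5 scales and a plain product; ITIL 4 does not prescribe a particular scoring formula, so treat both as illustrative choices.

```python
from dataclasses import dataclass


@dataclass
class Problem:
    problem_id: str
    likelihood: int  # assumed scale: 1 (rare) to 5 (almost certain to recur)
    impact: int      # assumed scale: 1 (minor) to 5 (severe)

    @property
    def risk(self) -> int:
        # Assumed scoring rule: risk is the product of likelihood and impact.
        return self.likelihood * self.impact


def prioritize(problems):
    """Return problems ordered from highest to lowest risk score."""
    return sorted(problems, key=lambda p: p.risk, reverse=True)


backlog = [
    Problem("PRB-001", likelihood=4, impact=5),
    Problem("PRB-002", likelihood=2, impact=2),
    Problem("PRB-003", likelihood=5, impact=3),
]

for problem in prioritize(backlog):
    print(problem.problem_id, problem.risk)
```

In this backlog, PRB-002 scores lowest, which makes it the kind of problem that might be left as a documented known error with a standing workaround.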

Error control then periodically reassesses those known errors: a workaround that was acceptable last quarter may become intolerable as incident frequency climbs, prompting a change request that was previously deferred.
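The reassessment can be thought of as a periodic cost comparison. The function below is a sketch under stated assumptions: it compares the projected cost of living with the workaround over a chosen horizon against a one-off cost for the permanent fix. The cost figures, the four-quarter horizon, and the decision rule itself are hypothetical.

```python
def should_raise_change(incidents_per_quarter, cost_per_incident, fix_cost,
                        horizon_quarters=4):
    """Return True when the projected cost of keeping the workaround over the
    horizon exceeds the one-off cost of the permanent fix (assumed rule)."""
    projected = incidents_per_quarter * cost_per_incident * horizon_quarters
    return projected > fix_cost


# Last quarter: 5 incidents at 200 each, fix costs 10,000.
print(should_raise_change(5, 200.0, 10_000.0))   # False: workaround still cheaper

# This quarter: incident frequency has climbed to 20.
print(should_raise_change(20, 200.0, 10_000.0))  # True: fix is now justified
```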

Problem Management vs. Incident Management

Because these two practices are constantly confused, lock in the contrast:

  • Incident management = react fast, restore service. Success measured by speed of restoration and meeting SLA resolution targets. May apply a temporary workaround supplied by problem management.
  • Problem management = investigate, prevent recurrence. Success measured by reduced incident volume and impact over time. Works in parallel with, not instead of, incident management.

A worked example: users report repeated crashes of an order-entry app (multiple incidents). Incident management restores each session quickly. Problem management opens a problem, analyzes it, finds a memory leak in a library, documents a workaround (restart the service nightly), and logs a known error while pursuing a vendor patch. When the patch is available, problem management raises a change request to change enablement.
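The worked example can be traced as a small state model. The status names, the `ProblemRecord` class, and the rule that a change request requires prior analysis are simplifications assumed for illustration; real tools and ITIL 4 itself allow more flexibility.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    LOGGED = "logged"                        # problem identification
    KNOWN_ERROR = "known error"              # problem control
    CHANGE_RAISED = "change request raised"  # error control


@dataclass
class ProblemRecord:
    title: str
    status: Status = Status.LOGGED
    root_cause: str | None = None
    workaround: str | None = None

    def record_analysis(self, root_cause: str, workaround: str) -> None:
        """Document the cause and workaround; the problem becomes a known error."""
        self.root_cause = root_cause
        self.workaround = workaround
        self.status = Status.KNOWN_ERROR

    def raise_change(self, fix: str) -> str:
        """Hand the permanent fix to change enablement as a change request."""
        if self.status is not Status.KNOWN_ERROR:
            raise ValueError("analyze the problem before proposing a permanent fix")
        self.status = Status.CHANGE_RAISED
        return f"Change request: {fix}"


problem = ProblemRecord("Order-entry app crashes repeatedly")
problem.record_analysis("Memory leak in a library", "Restart the service nightly")
print(problem.status.value)
print(problem.raise_change("Apply vendor patch"))
print(problem.status.value)
```

Note that `raise_change` only produces a request. Deploying the fix is left to change enablement, mirroring the hand-off in the example.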

Reactive and proactive problem management

Problem management is both reactive (analyzing incidents that have already happened) and proactive (analyzing trends, risks, and vulnerable areas to prevent incidents that have not yet happened). Proactive problem management connects strongly to risk management and continual improvement, because eliminating problems before they cause incidents is the highest-value outcome of the practice. Not every problem will be resolved — some are accepted as known errors when fixing them is not cost-justified, and the documented workaround simply stays in place.

A final distinction the exam may probe: problem management often involves collaboration across many teams and even suppliers, because root causes can sit anywhere in the technology stack. It is rarely a solo activity, and ITIL 4 encourages bringing diverse expertise together — analogous to swarming in incident management — to analyze stubborn problems.

Test Your Knowledge

What is the purpose of the problem management practice?

Which ITIL 4 term means 'a problem that has been analyzed but has not been resolved'?

In which order do the three phases of problem management occur?

When problem management identifies a permanent solution to a known error, what should it do?
