Problem Management ITIL SOP

Problem Management ITIL SOP template for logging, triaging, investigating, and documenting recurring IT issues so teams can reduce repeat incidents and captu...

Steps

The Service Desk Analyst reviews recurring incidents, major incidents, customer escalations, and monitoring alerts to identify a potential problem record. Record the triggering evidence, affected service, and incident pattern in the problem ticket. Link related incidents to the problem record where available.
The Problem Manager verifies that the issue is recurring, high-impact, or likely to recur and that it is appropriate for problem management. Confirm the affected service, user population, and business impact. Reject or redirect items that are single incidents without recurrence or systemic risk.
The Problem Manager assigns priority using impact and urgency criteria. Document the rationale for the priority, including service criticality, frequency, and business risk. Set escalation thresholds for major incidents, regulatory exposure, or widespread user impact.
The Application Support Engineer gathers logs, alerts, incident timelines, configuration details, and recent changes related to the problem. Capture evidence from affected systems, users, and support teams. Preserve timestamps, versions, and change references for traceability.
The Problem Manager leads root cause analysis using an appropriate method such as 5 Whys, fishbone analysis, or fault tree analysis. Identify the underlying cause, contributing factors, and any control failures. Document assumptions, exclusions, and unresolved questions separately from confirmed findings.
The Incident Manager defines a workaround that restores service or reduces impact without removing the root cause. Validate the workaround with the support team and confirm any limitations, side effects, or rollback conditions. Publish the workaround in the knowledge base and link it to the problem and incident records.
The Problem Manager determines whether the root cause is confirmed and whether a permanent fix is available or planned.
The Problem Manager records the known error in the ITSM system. Include the root cause summary, affected services, symptoms, workaround, and any monitoring or detection rules. Link the known error to all related incidents and change records.
The Problem Manager escalates the corrective action to the appropriate resolver group or change authority. Define the target fix, owner, due date, and risk of delay. Escalate immediately if the problem affects critical services, creates compliance risk, or exceeds the agreed tolerance for recurrence.
The Problem Manager verifies that the workaround, known error record, and any permanent fix are documented, linked, and communicated to stakeholders. Confirm that related incidents are updated and that monitoring or alerting reflects the final status. Close the problem only when the closure criteria in the record are satisfied.
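
The prioritization and closure steps above can be sketched in code. The example below is a minimal illustration, not part of the SOP itself: the three-point scale, the P1 to P5 labels, the `ProblemRecord` field names, and the sample values `payroll-api` and `INC-1` are all assumptions chosen for the example. A real ITSM tool will define its own matrix and closure criteria.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical three-point scale used for both impact and urgency."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def assign_priority(impact: Level, urgency: Level) -> str:
    """Map impact and urgency to a priority label (assumed P1..P5 scheme)."""
    # The sum ranges from 2 (low/low) to 6 (high/high);
    # a higher sum yields a lower P number, i.e. a higher priority.
    score = impact + urgency
    return f"P{7 - score}"


@dataclass
class ProblemRecord:
    """Illustrative subset of a problem ticket; field names are assumptions."""
    service: str
    impact: Level
    urgency: Level
    linked_incidents: list[str] = field(default_factory=list)
    workaround_published: bool = False
    known_error_recorded: bool = False
    stakeholders_notified: bool = False

    def closure_blockers(self) -> list[str]:
        """Return the unmet closure criteria; an empty list means ready to close."""
        checks = {
            "link related incidents": bool(self.linked_incidents),
            "publish workaround": self.workaround_published,
            "record known error": self.known_error_recorded,
            "notify stakeholders": self.stakeholders_notified,
        }
        return [name for name, done in checks.items() if not done]


# Example usage with made-up values.
record = ProblemRecord(
    "payroll-api", Level.HIGH, Level.MEDIUM, linked_incidents=["INC-1"]
)
print(assign_priority(record.impact, record.urgency))  # P2
print(record.closure_blockers())
```

Returning the list of unmet criteria, rather than a single yes/no flag, lets the Problem Manager see what still blocks closure.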
