Problem Management

Problem Management (PM) is the process of documenting information about system or application problems, along with appropriate workarounds for the incidents they cause. Its primary objective is to prevent problems and recurring incidents in order to minimize impact to the business. The PM process should also include the activities required to diagnose the root cause of incidents and determine a resolution. Problem resolution should be documented through the change management and release management processes.

A sound Problem Management process should start with a mechanism to document and audit new incidents as they occur. For example, a "trouble ticket" system can be used to log new incidents, recording the date, time, system names, locations, system/application owners, description of the outage, status of resolution, and impact, to name a few fields. Some ticket solutions can help automate the process by generating tickets when a system is degraded (e.g. a production system is offline or an unauthorized change is detected). Self-service websites or help desks can also allow end users to report incidents, which are then logged in the ticket system for problem resolution.

Changes to systems and applications should also be verified against the change management system to ensure that they are authorized. Deviations should be monitored and reported to a security or compliance organization for incident handling and remediation. It is important to prioritize and monitor system files and configurations that should rarely change (e.g. system configuration files used for access control, or system binaries). Unauthorized changes to such files could be caused by a system breach by a malicious user or software, and should be remediated quickly, before sensitive information can be accessed.
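The change-verification step described above can be illustrated with a short Python sketch. It is only a minimal example under stated assumptions, not a reference to any particular ticketing or monitoring product: the Ticket record, the find_unauthorized_changes function, the baseline mapping of file paths to known-good SHA-256 hashes, and the approved_paths set (standing in for a lookup against a change management system) are all hypothetical names introduced for illustration.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone
from pathlib import Path


@dataclass
class Ticket:
    """Hypothetical incident record holding a few of the fields named above."""
    system: str
    owner: str
    description: str
    impact: str
    status: str = "open"
    opened_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


def sha256_of(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def find_unauthorized_changes(baseline, approved_paths, system, owner):
    """Open a ticket for each monitored file that deviates from the baseline
    and has no approved change request.

    baseline:       dict mapping file path -> known-good SHA-256 hex digest
    approved_paths: set of paths with an authorized change (assumed to come
                    from the change management system)
    """
    tickets = []
    for path, expected in baseline.items():
        p = Path(path)
        actual = sha256_of(p) if p.exists() else None
        if actual == expected:
            continue  # file is unchanged
        if path in approved_paths:
            continue  # change is authorized; nothing to report
        what = "missing" if actual is None else "modified"
        tickets.append(
            Ticket(
                system=system,
                owner=owner,
                description=f"Unauthorized change: {path} is {what}",
                impact="high",
            )
        )
    return tickets
```

In practice the baseline would be captured while the system is in a known-good state, and the check would run on a schedule, with the resulting tickets routed to the security or compliance team. Treating every unapproved deviation as high impact is a simplification made here; a real process would likely rank files by sensitivity.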