What is the problem with Problem?

As you adequately put it, the problem is choice. From The Matrix Reloaded (2003).

The Architect: The first Matrix was quite naturally perfect; it was a work of art, flawless, sublime. A triumph equalled only by its monumental failure… (sound like any ITIL programmes you’ve encountered?)

Neo: … Choice. The problem is choice.

The Architect: … While this answer functioned, it was obviously fundamentally flawed, thus creating the otherwise-contradictory systemic anomaly that if left unchecked might threaten the system itself. Ergo, those that refused the program, while a minority, if unchecked would constitute an escalating probability of disaster.

The Architect: … You are the eventuality of an anomaly, which despite my sincerest efforts I have been unable to eliminate from what is otherwise a harmony of mathematical precision. While it remains a burden assiduously avoided, it is not unexpected, and thus not beyond a measure of control, …

The problem with Problem is that it exposes the choices. A failed disk brings the whole system down: who chose not to pay for, or take the time to implement, mirroring?

Solutions will always involve choices and, no matter how good your systems are (ITIL, Six Sigma, CMMI, etc.), choices will always lead to undesirable consequences.

The Merovingian: We are forever slaves to it. Our only hope, our only peace is to understand it, to understand the why… (not what the Merovingian meant, but hey-ho, I’m writing this!).

ITILExpert’s take: Problem Management is the measure that controls the assiduously avoided, but not unexpected, errors which, if left unchecked, would constitute an escalating probability of disaster. The process exposes choices. Our challenge is to make it a learning experience rather than a witch-hunt.

What is the most useful information in a Known Error?

Isn’t it the combination of the symptom and the workaround? For example, you have a blue screen with Error #1234. We know that error and we have a workaround, e.g. reboot. Unfortunately, we don’t yet know the root cause.

This is why I advocate raising the Known Error early. The Known Error can be useful before one knows the root cause.

If your ITSM system separates Problem and Known Error (implements them as separate ticket types), then my advice is to raise both at the same time. Use the Problem ticket to manage the Root Cause Analysis (RCA) and use the Known Error to manage the workaround. This is especially useful if the workaround is being worked on by a different group from the one coordinating the RCA.
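To make the idea concrete, here is a minimal sketch of raising the two tickets together. It is not taken from any real ITSM product: the class names, fields, group names and the `raise_problem_with_known_error` and `find_workaround` helpers are all hypothetical, invented purely for illustration. The point is simply that the Known Error carries the symptom and the workaround, and can be matched against incidents while the root cause on the Problem is still empty.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import List, Optional, Tuple

# Hypothetical ticket-number generator; a real ITSM tool would assign its own IDs.
_ticket_ids = count(1)


@dataclass
class Problem:
    """Hypothetical Problem ticket: used to manage the Root Cause Analysis."""
    summary: str
    owner_group: str
    root_cause: Optional[str] = None  # stays empty until the RCA completes
    ticket_id: int = field(default_factory=lambda: next(_ticket_ids))


@dataclass
class KnownError:
    """Hypothetical Known Error ticket: symptom plus workaround."""
    symptom: str
    workaround: str
    owner_group: str
    problem_id: int  # link back to the Problem coordinating the RCA
    ticket_id: int = field(default_factory=lambda: next(_ticket_ids))


def raise_problem_with_known_error(
    symptom: str, workaround: str, rca_group: str, workaround_group: str
) -> Tuple[Problem, KnownError]:
    """Raise both tickets at the same time, each owned by its own group."""
    problem = Problem(summary=symptom, owner_group=rca_group)
    known_error = KnownError(
        symptom=symptom,
        workaround=workaround,
        owner_group=workaround_group,
        problem_id=problem.ticket_id,
    )
    return problem, known_error


def find_workaround(known_errors: List[KnownError], symptom: str) -> Optional[str]:
    """Return the workaround for a matching symptom, or None if there is none."""
    for known_error in known_errors:
        if known_error.symptom == symptom:
            return known_error.workaround
    return None


# Example: the blue screen from above, raised before the root cause is known.
problem, known_error = raise_problem_with_known_error(
    symptom="Blue screen with Error #1234",
    workaround="Reboot",
    rca_group="Platform Engineering",  # assumed group name
    workaround_group="Service Desk",   # assumed group name
)
```

In this sketch the Service Desk can already look up the workaround with `find_workaround([known_error], "Blue screen with Error #1234")`, even though `problem.root_cause` is still `None`.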

ITIL Expert’s take: Raise Known Errors early. They can be useful well before the actual root cause is known.