How to properly do CPR (when software problems require CPR)

It’s Saturday morning and it’s raining, both literally and figuratively.

Your cell phone rings. It’s M. from the technical support team. M. just received a call from one of your most important customers. The new software release was installed two weeks ago and had not exhibited any problems – until now. The rain just turned into a monsoon.

The problem is urgent, time sensitive, and ambiguous. A certain process has to run before Monday morning, yet the symptoms reported by the customer are completely unfamiliar. All critical scenarios were extensively tested prior to delivering the software to the customer.

It’s clear that something is wrong, but in due time the problem will be found.

Before we discuss CPR in detail, do you have an emergency notification plan in your Engineering organization? If yes – ensure it’s up to date. If no – it will be difficult to reach the right engineers and solve the problem before Monday morning, and the customer’s business will be severely impacted.

CPR – or Customer Problem Report – is a must-do step after the solution has been found.

What is a CPR and why is it so valuable?

A CPR is a detailed (by definition – very binary and to the point) and actionable document created whenever a software problem has a significant impact on the customer.

The primary purpose of the CPR is …

–          Document all root causes and contributing factors: describe every action item required to mitigate or eliminate each root cause

–          Identify any necessary technology changes

–          Identify any process changes within the organization. For example, if the root cause of the problem turns out to be an improper installation procedure, the CPR clearly identifies every recommendation required to prevent the problem from recurring.

–          Identify any organizational changes. For example, if the root cause turns out to be an inexperienced QA manager who did not account for a customer-specific scenario, even this information has to be captured

Finally – a CPR will not be effective without these 3 items:

–          A clear owner must be assigned to each action item identified above

–          Then – “must complete by” dates must be established for each action item

–          A single accountable individual must be assigned to monitor progress, which may involve (and usually does) members of multiple organizations: Product Management, Engineering, QA, Technical Support, and Professional Services.

Until all action items identified by the CPR have been completed, the CPR is treated like any other open software defect and receives the same attention during defect / issue review meetings.
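For teams that track CPRs in a tool rather than a document, the rules above can be sketched as a small data model. This is only an illustration, not a prescribed format: the class names, field names, owners, and dates below are all hypothetical. It assumes each action item carries an owner and a "must complete by" date, each CPR names one accountable individual, and the CPR stays open until every action item is completed.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class ActionItem:
    """One action item from a CPR (hypothetical structure)."""
    description: str
    owner: str                            # the clear owner of this item
    due: date                             # the "must complete by" date
    completed_on: Optional[date] = None   # None while the item is open

    def is_done(self) -> bool:
        return self.completed_on is not None

    def is_overdue(self, today: date) -> bool:
        # An item is overdue only if it is still open past its due date.
        return not self.is_done() and today > self.due


@dataclass
class CustomerProblemReport:
    """A CPR with its root causes and action items (hypothetical structure)."""
    title: str
    accountable: str                      # the single individual monitoring progress
    root_causes: List[str] = field(default_factory=list)
    action_items: List[ActionItem] = field(default_factory=list)

    def is_open(self) -> bool:
        # The CPR stays open until every action item has been completed.
        # Note: a CPR with no action items is reported as closed here.
        return any(not item.is_done() for item in self.action_items)

    def overdue_items(self, today: date) -> List[ActionItem]:
        return [item for item in self.action_items if item.is_overdue(today)]


# Illustrative data only; the names and dates are made up.
cpr = CustomerProblemReport(
    title="Weekend batch process failed after new release",
    accountable="A. Example",
    root_causes=["Improper installation procedure"],
)
cpr.action_items.append(
    ActionItem("Revise the installation checklist",
               owner="Professional Services", due=date(2010, 11, 15))
)
cpr.action_items.append(
    ActionItem("Add the customer scenario to regression tests",
               owner="QA", due=date(2010, 11, 30))
)

# What a defect / issue review meeting might look at.
for item in cpr.overdue_items(date(2010, 11, 20)):
    print(f"OVERDUE: {item.description} (owner: {item.owner}, due {item.due})")
```

Keeping the overdue check on the action item and the open/closed check on the CPR mirrors the split in responsibilities: owners answer for their own dates, while the accountable individual answers for the report as a whole.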

CPRs are an invaluable tool to enable and sustain a completely transparent organizational culture.

No one wants to see the customer’s business impacted as a result of the software not working as expected. By definition, a CPR is a cause for change, for the right reasons, as well as a vehicle to ensure that change happens on time. Inability to change or unwillingness to improve will be quickly noted. Those who cannot change will simply be replaced.

If you do not have a CPR process currently in place, try it. The next time your Engineering team has to perform CPR on an ailing software release, the other kind of CPR – the subject of this discussion – will be very helpful.

Categories: Software Engineering
  1. November 3, 2010 at 3:19 pm

    Another ‘brief and great’ article. Thanks.
