Home > Error Management, Software Engineering > Why runtime awareness in any code is important

Why runtime awareness in any code is important

This is a classic scenario that repeats itself over and over again.

The team spent a lot of time testing a new and very important software release.   Everyone is tired but happy.  It should be a good release.   But – shortly after the long-awaited  release has been installed by  THE customer (THE customer waited for this release 2 months), one of the processes performing a critical function does not work as expected.   Worse – this process performs critical transaction integrity functions and transaction verification logs do not contain valid data.

What happened?  Exhaustive regression tests executed perfectly.  Yet, something in the customer environment caused the software to work incorrectly.

The problem was eventually traced to a missing configuration file.  When the process started in customer’s own environment, the code looked for a specific configuration file and could not find it (even more troublesome – the installation procedure failed and no one noticed).   Then – the code proceeded to read default configuration parameters from a file so old that some of the parameters caused the process to cause multiple errors much later during the execution.

First – the solution (and lessons learned at the end):

– This process was immediately changed to report critical information when it started, while it was running, and when it ended.

– When the process started, it reported – among many critical items – location and contents of all configuration parameters read, accepted, rejected, defaults used instead, or ignored (and action taken afterwards)

– In addition, the process initialization was gracefully terminated – with detailed error messages – if a critical parameter was incorrect.  This is a good practice to ensure the customer continues to maintain continued awareness of all elements essential to the operation of any mission critical software product.

– While the process was running, any dynamic parameter changes were logged and reported

Second – what was the cost of not doing this right the first time?

– 3 days of exhaustive research by an experienced software engineer who had to jump on a plane and visit the customer.  No one else could help other than the code owner.

– 6 person-days of 2 other software engineers working around the clock to support their colleague at the customer location

– Delayed customer reference to a prospective client

Third – lessons learned:

– Make runtime awareness one of the design objectives

– Challenge the team during design reviews (“how would you know if these conditions may be present …”)

– Reject the release if insufficient runtime awareness has been engineered in the code

Runtime awareness in software products is not a new concept .  Yet, it’s importance has never been higher.

  1. November 1, 2009 at 9:19 pm

    When you install ClamAV, it installs a default configuration file that contains this line:

    # Comment or remove the line below.

    None of the processes associated with ClamAV will run if they detect an “example” configuration file.

    This is a good way to (1) deliver a high-quality, well documented configuration file and (2) to make sure that the ultimate user has actually looked at the file and contemplated its contents.

  1. November 1, 2009 at 11:46 pm

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: