To test or not to test, that is not the question (part 1)

It is interesting to find out what approaches different people use to test their application code prior to delivering it to the marketplace. For example, if you were planning to introduce a radiation therapy machine into hospitals around the world, you’d probably want to be pretty sure that it would work safely under all conditions. Yet back in 1982 the Therac-25 was introduced to the market with a whole host of critical errors, some of which were traced back to the software.

Failures in the software of the Therac-25 radiation therapy machine led to the standards for software safety we have today. (Link: technet.hu)

In fact, the entire debacle surrounding this medical device is often used as a case study in software engineering courses, and it led to the creation of the IEC 62304 standard, “Medical device software – Software life cycle processes”. You would think, then, that over thirty years later the focus on software quality would be much higher, and that testing would be an integral part of the product development process for any product that could potentially harm or kill. Imagine my surprise at a recent exhibition when I asked a visitor how they tested their (potentially harmful to life) laser-based product and they replied, “well, we power it up and run the equipment as per the user manual”…

With many embedded projects these days reaching several hundred thousand lines of code, with developers spread across multiple countries and continents, and with code being reused, it can be difficult to know where to begin when it comes to testing. Industries such as automotive are provided with some pretty clear guidelines in part 6 of ISO 26262 – Road Vehicles – Functional Safety. Even if your application does not demand the fulfilment of a particular safety standard, following best practice can deliver higher-quality code that suffers less frequently from coding failures discovered once the end product is in the marketplace.

In Table 10, ISO 26262 provides a list of suitable methods for software unit testing. These include requirements-based testing, interface testing and fault injection testing. Both requirements-based testing and interface testing are listed as “highly recommended” for all ASIL safety levels. So, before starting, it is necessary to go back and analyse the original specifications used to build the product and write the code. Based upon this information, test cases can be developed that show the application code does what it was intended to do; suitable methods for deriving them are listed in Table 11 of the standard.
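
As a minimal sketch of what such a requirements-based test case might look like (the requirement, function name and limit values here are invented purely for illustration), consider a hypothetical requirement stating that “the demanded output shall be clamped to the range 0 to 100 %”:

    #include <assert.h>

    /* Hypothetical function under test: clamps a demanded output to 0..100 %. */
    static int clamp_output(int demand)
    {
        if (demand < 0)   { return 0; }
        if (demand > 100) { return 100; }
        return demand;
    }

    /* Requirements-based test cases derived from the (invented) requirement:
       a nominal value plus the lower and upper boundary cases. */
    int main(void)
    {
        assert(clamp_output(50)  == 50);    /* nominal value passes through  */
        assert(clamp_output(-10) == 0);     /* below range is clamped to 0   */
        assert(clamp_output(250) == 100);   /* above range is clamped to 100 */
        return 0;
    }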

Once the test cases have been defined, we also need to ensure that they exercise all possible paths through the application code. Various coverage metrics are available in Table 12, Section 9.4.5, of ISO 26262. In total there are three structural coverage metrics, ranked in an order related to the ASIL level of the application being tested. The coverage metrics are listed here in that order, with an explanation of their meaning:

  • Statement Coverage – (highly recommended for ASIL A and B)
    With this metric, the code is analysed to ensure that all statements within the code have been executed during the tests. In a language such as C, the complexity can vary from simple statements through to compound statements that themselves contain further simple or compound statements.
  • Branch Coverage – (highly recommended for ASIL B, C and D)
    With this metric the goal is to prove that the tests resulted in the execution of all possible outcomes of control structures. Typically, this requires showing, for example, that if statements have been executed for both their true and false outcomes. This is fine for simple conditions, but the short-circuit evaluation that C applies to Boolean expressions can leave potential coding issues untested. For example, consider the following statement:

    if ((a || b) && c) { …

    If ‘a’ and ‘c’ are both true when entering this if statement, the value of the remaining variable, ‘b’, is irrelevant and the statement evaluates to true. When ‘a’ and ‘b’ are both false, the value of variable ‘c’ is irrelevant and the statement evaluates to false. With these two test cases, TxT and FFx (where ‘x’ marks the variable that is never evaluated), both possible outcomes of the if statement have been exercised. However, these test cases do not check what influence ‘b’ might have on the code in the first test, or ‘c’ in the second.

  • MC/DC or Modified Condition/Decision Coverage – (highly recommended for ASIL D)
    With this metric we need to show, through our test cases, that each variable in the expression can independently influence the result of the expression. So, rather than merely showing that the if statement above can evaluate to both true and false, we need to show that each of the three variables ‘a’, ‘b’ and ‘c’ can independently change the outcome of the evaluation. Typically, it is necessary to define n+1 tests, where n is the number of atomic Boolean conditions in the expression; our example therefore requires 4 tests for its 3 atomic Boolean conditions.
    So, for our example statement:

    if ((a || b) && c) { …

    we could use the following set of tests:
    Test No.   a   b   c   Outcome
    1          F   F   T   F
    2          F   T   T   T
    3          F   T   F   F
    4          T   F   T   T

As we can see, pairs of these tests demonstrate that each condition independently influences the outcome (a small C harness exercising the four vectors is sketched below):

Tests 1 & 2 – only ‘b’ changes, and the outcome changes
Tests 2 & 3 – only ‘c’ changes, and the outcome changes
Tests 1 & 4 – only ‘a’ changes, and the outcome changes
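
Here is a minimal C harness (our own sketch, not taken from the standard) that runs the four MC/DC vectors against the example decision and checks the expected outcomes:

    #include <assert.h>
    #include <stdbool.h>

    /* The example decision from the text, wrapped in a function. */
    static bool decision(bool a, bool b, bool c)
    {
        return (a || b) && c;
    }

    int main(void)
    {
        /* The four MC/DC test vectors and their expected outcomes. */
        struct { bool a, b, c, expected; } tests[] = {
            { false, false, true,  false },   /* Test 1: FFT -> F */
            { false, true,  true,  true  },   /* Test 2: FTT -> T */
            { false, true,  false, false },   /* Test 3: FTF -> F */
            { true,  false, true,  true  },   /* Test 4: TFT -> T */
        };

        for (unsigned i = 0u; i < sizeof tests / sizeof tests[0]; i++) {
            assert(decision(tests[i].a, tests[i].b, tests[i].c)
                   == tests[i].expected);
        }
        return 0;
    }

Note that tests 1 and 4 alone (FFT and TFT) would already satisfy branch coverage, since the decision evaluates to both false and true; it is the additional vectors that demonstrate the independent influence of each condition.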


All of the above coverage metrics assume that the analysis is being performed at source-code level. Typical tool suites that can implement such testing require that the original application code be ‘instrumented’, meaning that extra code is embedded in your application’s source code for the purpose of determining the coverage results.

The 'exit' of a function both before and after instrumentation code has been added.
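
As an illustration of what this transformation can look like (the probe macro and coverage bitmap below are invented for this sketch; real tools generate their own, tool-specific probe code), consider a simple function before and after instrumentation:

    /* Before instrumentation: */
    int max_value(int x, int y)
    {
        if (x > y) {
            return x;
        }
        return y;
    }

    /* After instrumentation (sketch): each probe records that a particular
       point in the code has been reached. */
    static unsigned char cov_probes[1];   /* coverage bitmap for 3 probes */
    #define COV_PROBE(n) (cov_probes[(n) / 8] |= (unsigned char)(1u << ((n) % 8)))

    int max_value_instrumented(int x, int y)
    {
        COV_PROBE(0);                     /* function entry           */
        if (x > y) {
            COV_PROBE(1);                 /* 'true' branch executed   */
            return x;
        }
        COV_PROBE(2);                     /* 'false' branch executed  */
        return y;
    }

Each probe sets one bit in a coverage bitmap, so the per-probe overhead is small but not zero, and the bitmap itself consumes RAM on the target.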

Once this instrumentation code has been inserted into the application, the code and test vectors will be:

  1. Executed on the host PC, using a microprocessor simulator.
  2. Either:

a. executed on the target embedded system using the original application code, or

b. executed on the target embedded system using an instrumented version of the application code, in which case the coverage data collected by the instrumentation code has to be uploaded to the host development PC for evaluation.
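
How that upload happens varies from tool to tool and target to target; as a simple sketch (uart_send_byte is a hypothetical stand-in for whatever communications interface the target actually provides), the instrumented application might dump its coverage bitmap to the host like this:

    #include <stddef.h>

    #define NUM_PROBES 64u

    /* Coverage bitmap filled in by the instrumentation probes. */
    static unsigned char cov_probes[(NUM_PROBES + 7u) / 8u];

    /* Hypothetical board-support routine: sends one byte to the host PC. */
    extern void uart_send_byte(unsigned char byte);

    /* Called once the test vectors have been executed on the target. */
    void cov_upload(void)
    {
        for (size_t i = 0u; i < sizeof cov_probes; i++) {
            uart_send_byte(cov_probes[i]);
        }
    }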

This approach to code coverage measurement has several downsides, however:

  • Recompilation is required to support coverage measurement:

- Code size increases.

- A communications interface must be available (to transfer results to the host PC).

  • Real-time behaviour changes:

- Additional processor cycles are needed to execute the “code coverage” code.

- Code can be pushed across page boundaries.

- More cache misses can occur due to the extra code.

- System testing may become impractical or impossible because executing the extra code influences real-time performance too heavily.

  • Different compiler switches may be required:

- The “code coverage” code may require different compiler switch settings.

- Code execution order may change due to different evaluation orders.

  • Simulation environment:

- Does the simulated MPU/MCU reflect any bugs present in the actual silicon device being used?

Because of this, we can’t simply sweep the differences between our test environments and reality under the carpet. Section 9.4.6 of ISO 26262 requires that “…the differences in the source and object code, and differences between the test environment and target environment, shall be analysed…” with the aim of specifying further tests that ensure the target environment is being fully tested.

This blog post will be continued and concluded in part 2 very soon - in the meantime, Happy BugHunting!

The content of this post would not have been possible without the help of Anja Visnikar, iSYSTEM's Test and Qualification Manager.