Measuring real-time performance of safety-critical multi-core automotive applications

The development of multi-core applications is difficult at the best of times. In order to share best practice for tackling these challenges, iSYSTEM and some of its closest partners founded the EMCC conference. In recent years, one of the main areas of focus has been how to successfully partition applications that were developed for single-core microcontrollers so that they execute on multi-core architectures. In the automotive domain, the application code is of course still expected to fulfil all of the requirements of functional safety standards such as ISO 26262. By cleverly combining various software and hardware tools, it is still possible to deliver the required proof of the system’s performance and timing behaviour, on which the application’s safety depends.

In this latest BugHunter blog post, we examine the real-life challenges surrounding the development of an automotive steering system. The platform in question was based on a Leopard dual-core microcontroller from NXP. Measuring and proving the timing performance of this particular application was made more difficult by the way in which simultaneous accesses from various software components to the device’s shared memory regions and peripherals influenced one another. At times during application execution, individual cores on the dual-core device were registering a utilisation of 90% or more. It was becoming difficult to prove that the system could always fulfil its real-time criteria in a manner that would be acceptable for an ASIL D category application (Figure 1).

Figure 1 - Trace analysis showing blocking effects (top), CPU utilisation (middle) and requirements fulfilment (bottom)
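
A utilisation figure such as the 90% quoted above is typically derived from trace data by summing the time a core spends executing tasks and ISRs within an observation window. The following is a minimal sketch of that calculation, not the method used by the tools in this project; the function name `core_utilisation`, the interval representation and the sample numbers are all assumptions made for illustration.

```python
def core_utilisation(busy_intervals, window_start, window_end):
    """Return the fraction of [window_start, window_end) in which the core was busy.

    busy_intervals: iterable of (start, end) timestamps, in the same unit as
    the window bounds, during which the core executed a task or ISR.
    Intervals may overlap or extend beyond the window (hypothetical input format).
    """
    if window_end <= window_start:
        raise ValueError("window must have positive length")

    # Clip each interval to the window and drop those entirely outside it.
    clipped = sorted(
        (max(start, window_start), min(end, window_end))
        for start, end in busy_intervals
        if end > window_start and start < window_end
    )

    busy = 0
    current_end = window_start
    for start, end in clipped:
        # Count only the part not already covered by an earlier interval.
        start = max(start, current_end)
        if end > start:
            busy += end - start
            current_end = end
    return busy / (window_end - window_start)


# Illustrative values only: timestamps in microseconds over a 1 ms window.
intervals = [(0, 400), (350, 700), (750, 950)]
utilisation = core_utilisation(intervals, 0, 1000)
print(f"utilisation: {utilisation:.0%}")
```

Overlapping intervals are merged before summing so that nested activity, such as an ISR interrupting a task, is not counted twice.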

The project started with a review of the requirements laid down for the steering system. The timing requirements, which had already been entered informally into the requirements management system, had to be converted into a more formal digital document that could be used as part of the toolchain. The team decided to make use of the AUTOSAR Timing Extensions, a portable data format that can be used across various tools.
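
To give a feel for what a machine-checkable timing requirement looks like, the sketch below models a single latency constraint and checks it against measured timestamps. This is not the AUTOSAR Timing Extensions schema; the class `LatencyConstraint`, its fields, the event names and all values are hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LatencyConstraint:
    """Hypothetical stand-in for a formal latency requirement."""
    name: str
    stimulus: str        # event that starts the measured chain
    response: str        # event that ends the measured chain
    max_latency_us: int  # upper bound on stimulus-to-response time


def check_latency(constraint, events):
    """Return (stimulus_time, latency) for every violation in a time-ordered event list.

    events: list of (timestamp_us, event_name) tuples.
    Each stimulus is paired with the next response that follows it.
    """
    violations = []
    pending = None  # timestamp of the stimulus still awaiting its response
    for timestamp, name in events:
        if name == constraint.stimulus and pending is None:
            pending = timestamp
        elif name == constraint.response and pending is not None:
            latency = timestamp - pending
            if latency > constraint.max_latency_us:
                violations.append((pending, latency))
            pending = None
    return violations


# Illustrative requirement and measurements; none of these values come from the project.
req = LatencyConstraint("torque_path", "sensor_read", "motor_update", 500)
trace = [
    (0, "sensor_read"), (420, "motor_update"),
    (1000, "sensor_read"), (1650, "motor_update"),
]
violations = check_latency(req, trace)
print(violations)
```

Holding the requirement as data rather than prose is what allows the same constraint to be evaluated by different tools and re-evaluated automatically after every software change.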

The next step in the process was to select a suitable trace method, one that would allow a high measurement depth, width and length while remaining robust against measurement failures. Three different approaches were considered:

  • Software trace: this method requires the source code of the application to be instrumented so that it logs the moment in time at which each event of interest occurs.
  • Hardware trace: by making use of the microcontroller’s on-chip trace hardware together with appropriate debugging tools, it is possible to export both program execution and data access information.
  • Hybrid trace: an alternative approach that makes use of hardware trace for some elements of the analysis and software trace for the remainder.

This particular application was based upon AUTOSAR, with over 300 runnables implemented. Various purely software-trace-based solutions were considered, but none was found capable of instrumenting such a complex application. One of the key bottlenecks in the software trace method is the lack of available bandwidth to get the trace results off the microcontroller and out to the PC being used for the analysis. Typically, such approaches either buffer the trace data in the microcontroller’s on-chip SRAM, which limits the amount that can be recorded, or make use of a serial interface, such as CAN, whose data transfer speed prohibits the transfer of such large quantities of trace data.
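
A back-of-the-envelope estimate illustrates the bottleneck. Only the count of 300 runnables comes from the project; the average activation rate, the size of a trace record and the buffer size are assumptions chosen for illustration, and the 1 Mbit/s figure is the nominal maximum bit rate of classical CAN.

```python
# Rough estimate of the data rate produced by software trace instrumentation.
RUNNABLES = 300               # from the application described above
AVG_RATE_HZ = 100             # assumption: each runnable runs every 10 ms on average
EVENTS_PER_RUN = 2            # one start and one stop record per execution
BYTES_PER_EVENT = 8           # assumption: 4-byte timestamp + 4-byte event ID
BUFFER_BYTES = 64 * 1024      # assumption: 64 KiB of SRAM reserved for trace
CAN_BITS_PER_S = 1_000_000    # nominal maximum bit rate of classical CAN

events_per_s = RUNNABLES * EVENTS_PER_RUN * AVG_RATE_HZ
trace_bytes_per_s = events_per_s * BYTES_PER_EVENT
trace_bits_per_s = trace_bytes_per_s * 8
buffer_fill_s = BUFFER_BYTES / trace_bytes_per_s

print(f"events per second:   {events_per_s:,}")
print(f"trace data rate:     {trace_bits_per_s / 1e6:.2f} Mbit/s")
print(f"CAN raw bit rate:    {CAN_BITS_PER_S / 1e6:.2f} Mbit/s")
print(f"buffer full after:   {buffer_fill_s * 1000:.0f} ms")
```

Under these assumptions the instrumentation generates almost four times the raw CAN bit rate, before any protocol overhead is considered, and the on-chip buffer fills in well under a second.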

Figure 2 - iSYSTEM's Leopard Emulation Adapter brings all 12 NEXUS trace channels out to a connector

Hardware trace seemed to be the way forward on this particular occasion. However, the device being used, an NXP MPC5643L, only offered a limited number of message data out channels. In such cases, as here, the semiconductor manufacturer typically offers a special development version of the microcontroller in which additional interfaces are brought out to pins specifically for the purpose of making a full trace interface available. The next challenge was how to make the trace interface available on a product that was, to all intents and purposes, finished. The development version of the MPC5643L could not be soldered onto the PCB, and the necessary traces were neither implemented nor routed to a suitable connector (Figure 2). Because of the high data rates involved, iSYSTEM always recommends keeping trace port connections as short as possible, so routing these signals out to a connector was not considered likely to succeed.

Luckily, there was one way forward. By carefully cutting a hole in the housing of the steering system, it was possible to affix iSYSTEM’s “Leopard Nexus Emulation Adapter” to a socket on the PCB without having to make any layout changes to the board (Figure 3). Now, with the 12-bit NEXUS port exposed, it would be possible to execute live tests whilst simultaneously recording hardware trace data for later analysis.

Figure 3 - After a little surgery, the emulation adapter can be connected to the On-Chip Analyser

Despite the use of the hardware trace capability, various technical issues made it necessary to fall back on a hybrid trace approach. It was decided to instrument both the runnables and the interrupt service routines (ISRs). Of course, taking this approach required full access to the source code in order to insert the instrumentation code. Additionally, it was imperative to ensure that the instrumentation code did not significantly influence the execution time of the application code. At this point another challenge arose: the operating system being used was not compliant with the OSEK standard. As a result, modifications were made so that the state of the tasks could be recorded in an OSEK-compliant manner.

After suitable modifications had been made to provide a more conformant registration of operating system task states, it was possible to record the hybrid trace data during active use of the steering wheel, in which the wheel was turned to the full extent of its travel. The raw trace data then required one further step to transform it into the Best Trace Format (BTF) required for further analysis. This involved making use of the known addresses assigned to certain structures and variables, or substituting definitions for raw data values. Timing-Architects’ Inspector tool could now be used to analyse the performance of the steering system during actual use and provide feedback on whether the timing requirements were being fulfilled. Furthermore, it was possible to integrate this testing into the Jenkins Continuous Integration (CI) environment so that automated timing measurements could be undertaken on all further software changes. At last the development team had the necessary tools available to determine which software changes were causing unacceptable utilisation demands on the core, and the teams responsible could be informed of the impact their code was having.

Figure 4 - Transformation of raw trace data to BTF
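
The transformation can be pictured as a lookup from raw trace values to named entities, followed by formatting. The sketch below assumes a setup in which the instrumentation writes an ID to a traced variable; the raw record layout, the ID encoding, the names in `RUNNABLE_NAMES` and `RUNNABLE_TASK`, and the sample data are all hypothetical. The output is a simplified rendering of BTF’s comma-separated event lines, with header lines omitted and instance counters fixed at 0, rather than a complete, validated BTF file.

```python
START_FLAG = 0x8000_0000  # assumption: top bit set marks "start", clear marks "terminate"
ID_MASK = 0x7FFF_FFFF     # assumption: remaining bits carry the runnable ID

# Hypothetical symbol table, of the kind that could be derived from build output.
RUNNABLE_NAMES = {1: "Rn_ReadTorqueSensor", 2: "Rn_ComputeAssist"}
# Hypothetical mapping of each runnable to the task that calls it.
RUNNABLE_TASK = {1: "Task_1ms", 2: "Task_1ms"}


def to_btf_lines(raw_records):
    """Convert (timestamp_ns, raw_value) pairs into simplified BTF event lines.

    Line layout: time,source,sourceInstance,type,target,targetInstance,event
    """
    lines = []
    for timestamp, value in raw_records:
        runnable_id = value & ID_MASK
        event = "start" if value & START_FLAG else "terminate"
        # Replace the raw ID with readable names; keep unknown IDs visible.
        name = RUNNABLE_NAMES.get(runnable_id, f"unknown_{runnable_id}")
        task = RUNNABLE_TASK.get(runnable_id, "unknown_task")
        lines.append(f"{timestamp},{task},0,R,{name},0,{event}")
    return lines


# Illustrative raw records: (timestamp in ns, value written to the traced variable).
raw = [
    (1000, 0x8000_0001), (1800, 0x0000_0001),
    (1900, 0x8000_0002), (2600, 0x0000_0002),
]
btf = to_btf_lines(raw)
print("\n".join(btf))
```

Unknown IDs are passed through with a placeholder name rather than dropped, so that gaps in the symbol table show up in the analysis instead of silently distorting it.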

This study shows that it is possible to combine software and hardware trace methods in a hybrid approach to collect the data required to prove the performance of a real-time application. Additionally, the approach can be applied to single-core, multi-core or even many-core systems. Provided that suitably prepared timing requirements are available in a portable format, trace data can be collected under real usage conditions, both for proving fulfilment of safety-critical demands and as part of automated testing. Thanks to the depth and recording length, analysis can be undertaken at every level, from the system view down to the individual software unit. The resulting insights highlight everything from resource bottlenecks through to performance-sapping code, as well as the location of potential run-time optimisations. However, the work also shows the importance of planning for such tests and analyses early in the project to ensure that the appropriate technical solution, such as the use of an emulation adapter or other access to hardware trace information, can be implemented.

Many thanks for this post go to our friends at Timing-Architects, especially Felix Martin, Max Hempe and Michael Deubzer. If you have any questions on multi-core or many-core timing analysis, just drop us a line and we'll be more than happy to help.

Until next time, as always, Happy BugHunting!
