Vehicle ECU development: Hardware debugging for functional safety

August 17, 2023

Tests regarding functional safety play an important role in the development of vehicle ECUs. Tests at various stages help the developer to ensure conformity with ASIL standards. What role does hardware debugging play here? An overview.

By Armin Stingl

An increasing number of electronic control units (ECUs) are distributed in modern vehicles. These embedded systems provide various functions, from basic driving functions and safety measures to comfort and entertainment functions. Faults in the ECUs can have different impacts on passenger safety depending on the function. For this reason, ECUs are classified according to the various Automotive Safety Integrity Levels (ASIL).

For example: The ECU for electric steering obviously has a strong influence on safety, as a malfunction of the steering can lead to serious injuries or even the death of a passenger. Therefore, this ECU is classified as ASIL-D, the highest level of safety requirements.

When developing an ECU with such an ASIL classification, it is necessary to consider safety from the very beginning. For ASIL-D, the development process must meet the requirements specified in the ISO 26262 standard. The entire software design should therefore follow an appropriate development methodology, and the tools used may need to be qualified to prove that they are suitable for this purpose.

The ECU is then tested together with the corresponding software at several stages using various methods. These include unit testing, system integration tests and hardware-in-the-loop tests. Many of these tests can be supported by a hardware debugger to test as close as possible to the real hardware, as recommended by ISO 26262.

Various requirements in one ECU

Today’s high-performance ECUs typically implement not just one type of application, but several. These can have different levels of functional safety. For example, in addition to an ASIL-D application, an ECU can also implement a QM-level application (quality management, i.e., without special functional safety requirements). What does this mean for software development?

Figure 1: Different applications within an ECU can have different functional safety requirements.

In this case, there are two ways to meet the requirements of the ISO 26262 standard: either all components are treated as if they had to meet the highest ASIL requirements, or Freedom from Interference (FFI) must be guaranteed. This means it must be ensured that the QM part cannot interfere with the ASIL part in the same ECU (Figure 1).

First, FFI must be guaranteed with regard to memory use: the QM part must not be able to corrupt the memory allocated to an ASIL part. The second aspect is timing and execution. Here it must be ensured, for example, that the execution of QM software cannot block the execution of ASIL software. The third aspect, information exchange, refers to the disruption of data communication between a sender and a receiver, e.g., by inserting invalid data or blocking a communication path.
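The information-exchange aspect can be made concrete with a small sketch: a receiver that validates a sequence counter and a checksum can detect messages that were corrupted or inserted by another partition. The message layout and all names below are illustrative and not taken from any specific E2E protection profile.

```c
#include <stdint.h>

/* Illustrative message with protection fields, as a sender in an ASIL
 * partition might transmit it to a receiver in another partition. */
typedef struct {
    uint8_t  seq;       /* rolling sequence counter set by the sender */
    uint16_t payload;   /* application data, e.g. a sensor value      */
    uint8_t  checksum;  /* XOR checksum over seq and payload          */
} SafetyMsg;

static uint8_t msg_checksum(const SafetyMsg *m)
{
    return (uint8_t)(m->seq ^ (m->payload & 0xFFu) ^ (m->payload >> 8));
}

/* Sender side: fill in the protection fields. */
void msg_protect(SafetyMsg *m, uint8_t seq, uint16_t payload)
{
    m->seq = seq;
    m->payload = payload;
    m->checksum = msg_checksum(m);
}

/* Receiver side: returns 1 if the message is consistent and in
 * sequence, 0 if interference is detected. */
int msg_check(const SafetyMsg *m, uint8_t expected_seq)
{
    if (m->checksum != msg_checksum(m))
        return 0;   /* corrupted data, e.g. written by another partition */
    if (m->seq != expected_seq)
        return 0;   /* lost, repeated or inserted message */
    return 1;
}
```

With such a check in place, a blocked or manipulated communication path no longer leads to silently consumed invalid data but to a detectable error.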

Analysis of source code

For the successful development of safety-critical applications of an ECU, the compiler must be qualified accordingly. It is helpful if the compiler is certified, such as the compiler tool sets from TASKING.

However, even for a qualified compiler, as for all tools used, a risk assessment must be carried out in accordance with ISO 26262. Every compiler has bugs, too; all known ones are listed in the so-called errata sheet. For safety-critical applications, the source code must be analyzed and checked to see whether it could be affected by known compiler bugs.

Usually, this is done manually. However, there are tools, such as TASKING's TriCore Inspector, that automatically inspect the source code for all known compiler problems and generate a corresponding report. This report can then either be used to adjust the source code or simply be attached to the risk assessment report.

After the compiler, the code itself must be checked for errors, among other things with regard to the FFI requirements. Tools such as the Safety Checker from TASKING help here; it works like a static code analysis that is completely compiler-independent. The developer specifies to the tool the intended access rights of all safety and QM partitions in the system, i.e., read/write and execute rights for specific memory areas. The tool then examines the entire source code and tries to identify potential leaks, i.e., possible interference between partitions, caused for example by unsafe, insufficiently secured use of pointers, global variables, or shared memory.
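As an illustration of the kind of leak such a checker looks for, consider this hypothetical fragment. The partition assignment exists here only as comments, whereas a real tool would receive it as configuration; all names are invented. The unchecked array index lets QM code write past its own buffer and, depending on the linker layout, into ASIL-owned memory.

```c
#include <stdint.h>

/* ASIL partition: owns this state; QM code must never write it. */
volatile uint16_t asil_torque_limit = 500;

/* QM partition: maintains a small log buffer of four samples. */
static uint16_t qm_log[4];

void qm_store_sample(int index, uint16_t value)
{
    /* No bounds check: with index == 4 this write lands directly
     * behind qm_log and can corrupt ASIL-owned memory, depending on
     * the memory layout. A static FFI checker flags this access
     * because the write target cannot be proven to stay inside the
     * QM partition's memory area. */
    qm_log[index] = value;
}

/* Accessor used only to observe the buffer from outside this file. */
uint16_t qm_read_sample(int index)
{
    return qm_log[index];
}
```

A fixed version would clamp or reject out-of-range indices, which is exactly the kind of correction the checker's report drives before an MPU is even enabled.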

The assumption here is that separation between partitions is not achieved by hardware methods such as a memory protection unit (MPU) or a hypervisor. Either such methods are not planned or available, or they are not yet enabled. In the latter case, the tool helps troubleshoot or prepare the software for an MPU: instead of debugging one MPU exception after another once the MPU is enabled, the software can be prepared in advance, without running it on any real hardware.

Unit testing on the target hardware

The next test step in ECU development is the unit test. In most cases, unit tests are performed at source code level and on a host PC. This means that the source code to be tested is packaged in a test framework. Stubs (additional code that replaces another code component during the test run, e.g., to simulate a component that has not yet been implemented or hardware-dependent components such as I/Os) can then be added. Together with the test cases, the entire package is compiled and executed on the host computer, such as a Windows or Linux PC. The result is a test report that essentially gives a pass/fail for all test cases, usually along with a code coverage report. Since execution takes place on the PC, execution on the real hardware is not covered, and the tests may not produce identical results there.
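This host-based setup can be sketched in a few lines, assuming a hypothetical hardware-dependent driver function `read_adc_raw()` that the stub replaces; the names and the 10-bit/5000 mV conversion are invented for illustration.

```c
/* Stub: replaces the hardware-dependent ADC driver during the test
 * run, so the unit can execute on a host PC without any hardware. */
static int stub_adc_value;
int read_adc_raw(void)
{
    return stub_adc_value;
}

/* Unit under test: converts a raw 10-bit ADC reading (0..1023) to
 * millivolts for a 5000 mV reference. */
int read_voltage_mv(void)
{
    return (read_adc_raw() * 5000) / 1023;
}

/* A test case as a framework might generate it: set the input vector
 * via the stub, call the unit, check the result. Returns 1 on pass. */
int test_read_voltage_mv(void)
{
    stub_adc_value = 1023;          /* full scale */
    if (read_voltage_mv() != 5000) return 0;
    stub_adc_value = 0;             /* zero input */
    if (read_voltage_mv() != 0) return 0;
    return 1;
}
```

The framework compiles unit, stubs and test cases together into one host executable, which is exactly why behavior on the real target CPU remains unverified.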

That’s why ISO 26262 recommends: “The test environment for software unit testing shall correspond as closely as possible to the target environment.” So why not run the test directly on the target?

Hardware debugging tools are traditionally used for the development and debugging of drivers, board/hardware bring-up, boot processes and much more; in other words, for the “minimally invasive”, hardware-oriented development of embedded software. In addition to these standard methods, hardware debuggers today also offer methods for controlling software tests on the target system. Here, the debugger connects to the actual target hardware via standard debug interfaces, with the purpose of developing and testing embedded software as close as possible to the actual hardware. This helps specifically with the safety requirements, i.e., FFI, execution and timing, and information exchange. Let's look at some specific test use cases.

When setting up for unit testing, the source code of the software under test is cross-compiled for the target device and not instrumented. This means that the original production code is tested on the target device.

Figure 2: Hardware debugging and test setup using the example of an Infineon AURIX-based target system.

The actual control of the target to execute the test, i.e., downloading the code, calling the unit (the C function under test), setting the test input vectors and reading back the test results, is done by the underlying debugger, such as winIDEA with BlueBox from TASKING (Figure 2).

As with execution on a host PC, similar test results are generated here: pass/fail results for each test case and a code coverage report. However, here code coverage is measured based on a hardware trace recording and, as mentioned earlier, without any source code instrumentation.

Figure 3: Example of debugging in unit testing: a C function “calculateFuelEfficiency”.

For a deeper understanding of how a unit test is executed on a real target without code instrumentation, let's consider the unit test for the C function “calculateFuelEfficiency” as a specific example (Figure 3). When using a debugger, the entire application does not have to be executed up to the function call. The debugger can set the instruction pointer of the CPU directly to the function entry. Following the C calling conventions, the debugger sets up the stack frame for the function and then starts the CPU. The debugger stops the CPU when stubbing or data injection is required, for example when subfunctions are called. Instead of calling these subfunctions, the debugger skips them and injects the desired return value directly into the designated CPU register. The CPU executes until the function returns, where the debugger reads the result value, which can be checked against pass/fail criteria. All this works with unchanged production code.
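The article only names the function, so the following body is a hypothetical stand-in to make the flow concrete: the debugger would place the two arguments in the locations designated by the calling convention, run to the return, and compare the read-back result against the pass/fail criterion.

```c
/* Hypothetical implementation of the unit under test from Figure 3:
 * fuel efficiency in km per liter, computed from distance in meters
 * and consumed fuel in milliliters. Units and body are assumptions;
 * only the function name appears in the article. */
float calculateFuelEfficiency(unsigned int distance_m, unsigned int fuel_ml)
{
    if (fuel_ml == 0u)
        return 0.0f;    /* guard against division by zero */

    /* km driven divided by liters consumed */
    return ((float)distance_m / 1000.0f) / ((float)fuel_ml / 1000.0f);
}
```

A test vector such as 100000 m and 10000 ml would be written into the argument registers by the debugger; the expected result of 10 km/l is then the pass criterion read back from the return-value register.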

Fault injection

After unit testing, we move on to system-level testing, for example within a hardware-in-the-loop configuration. In this case, a debugger can be very useful to inject faults into the system to test the impact on FFI in terms of memory corruption, execution, and information exchange. The debugger is used to perform on-the-fly manipulations of on-chip resources such as CPU core registers and memory and then examine the effects (Figure 4). But faults can also be injected at external interfaces: add-on modules for CAN/LIN and analog/digital signals, connected directly to the debugger hardware and target system, can be used to inject faulty data into an ADC or from external sensors connected via CAN or SPI.

Figure 4: Injected faults can be used to check the effects on a system.
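One classic fault-injection target is a redundantly stored value (value plus bitwise complement), a pattern often used for safety-relevant state. A debugger would flip bits via an on-the-fly memory write while the target runs; the sketch below performs the same corruption in software so the detection path can be exercised. All names are illustrative.

```c
#include <stdint.h>

/* Safety-relevant value stored redundantly: the complement field must
 * always equal ~value, so single-variable corruption is detectable. */
typedef struct {
    uint32_t value;
    uint32_t complement;
} RedundantU32;

void red_write(RedundantU32 *r, uint32_t v)
{
    r->value = v;
    r->complement = ~v;
}

/* Returns 1 if the pair is still consistent, 0 if corruption was
 * detected, i.e. the safety mechanism triggered. */
int red_check(const RedundantU32 *r)
{
    return r->complement == ~r->value;
}

/* Simulated fault injection: flip a single bit in the stored value,
 * as a debugger would do with a memory write on the running target. */
void inject_bitflip(RedundantU32 *r, int bit)
{
    r->value ^= (uint32_t)1u << bit;
}
```

The system-level test then checks not the corruption itself but the reaction to it: does the ASIL software detect the inconsistency and enter its safe state?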

For example, it is checked whether a malfunction of QM software can have an impact on the execution of ASIL functionality. How the system reacts to such manipulations is observed, for example, with on-chip trace: a real-time recording of the program flow with execution times resolved down to the range of clock cycles. The hardware debugger connection between the PC and the real target hardware is essential here. This means that a corresponding on-chip debug trace interface must be available and led out on the processor installed in the ECU.

Timing analysis

But on-chip traces not only make it possible to monitor the behavior of the software. Since hardware traces are non-intrusive, i.e., have no influence on the runtime, they are also ideal for timing analysis. Timing tests should be performed as part of system integration testing: the increasing load from operating system tasks naturally affects the entire timing schedule, until at some point critical timing requirements can no longer be met.

In timing analyses regarding safety and FFI, it is also useful to look at the timing margins, i.e., to check the robustness of the entire software timing.

A use case shows what a combined debug and trace tool can do: three tasks are running on a CPU, a 100ms, a 50ms and a 10ms task. The 10ms task has the highest priority. More and more functions are added to the runnable of the 10ms task, and the effect on the response time of the 100ms task is measured.

Such a test can be implemented effortlessly by adding instrumentation code to the runnable to extend its runtime. A debugger can change this instrumentation code during the test run, so the runtime of the runnable can be varied without having to rebuild the software.

An example result is shown in Figure 5. The green curve shows the response time of the 100ms task versus the runtime of the 10ms runnable. Assume that the 100ms task has a time constraint that its response time must not exceed 85ms: this constraint can be met if the runtime of the 10ms runnable remains below 4.5ms. It is also interesting to note that at a certain point the response time increases almost exponentially, and at this point the CPU load also stops increasing. This is a clear indication that the OS scheduling no longer works reliably because the system is already overloaded.

Figure 5: Hardware traces can be used to measure the timing conditions of various tasks.
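The measured trend can also be reproduced analytically with classic response-time analysis for fixed-priority scheduling: the response time of the lowest-priority task is the fixed point of an iteration over the preemptions by higher-priority tasks. The task set (10/50/100ms) comes from the article; the concrete runtimes used in the usage example below are assumptions for illustration.

```c
#include <stdint.h>

/* Response time of the lowest-priority task, given its own runtime
 * c_low and the (period, runtime) pairs of all higher-priority tasks.
 * All times in microseconds. Returns 0 if the iteration exceeds the
 * deadline, i.e. no schedulable fixed point exists: the overload case
 * visible at the right edge of Figure 5. */
uint32_t response_time(uint32_t c_low, uint32_t deadline,
                       const uint32_t periods[], const uint32_t runtimes[],
                       int n_higher)
{
    uint32_t r = c_low;
    uint32_t prev = 0;

    while (r != prev) {
        prev = r;
        r = c_low;
        for (int j = 0; j < n_higher; j++) {
            /* number of preemptions by task j within window prev */
            uint32_t preemptions = (prev + periods[j] - 1u) / periods[j];
            r += preemptions * runtimes[j];
        }
        if (r > deadline)
            return 0;   /* system overloaded, deadline missed */
    }
    return r;
}
```

With an assumed 10ms runtime for the 100ms task and 8ms for the 50ms task, growing the 10ms runnable from 1ms upward makes the computed response time climb until the iteration no longer converges below the 85ms constraint, mirroring the steep rise in the measured curve.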

Conclusion

The hardware debugger is increasingly becoming a process tool: the basic functions of a debugger find their usual application and are supplemented with powerful analysis functions.

All the use cases and tools presented can be combined into a continuous integration pipeline that is explicitly focused on software safety testing. Thus, a test flow can be created, from static code analysis to unit tests on the target system to debug and trace (timing) tests.

Of course, it does not make sense to run all these tests on every commit. One strategy, for example, would be to run code-level tests only at night or only on software releases, and integration tests every time software branches are merged into the master trunk.

