# A novel Adaptive Fault Tolerant Flip-Flop Architecture based on TMR



### Luca Cassano<sup>1,2</sup>, Alberto Bosio<sup>3</sup>, Giorgio Di Natale<sup>3</sup>

<sup>1</sup>Dep. of Information Engineering, University of Pisa, Italy



<sup>2</sup>Dep. di Elettronica, Informatica e Bioingegneria, Politecnico di Milano, Italy

<sup>3</sup>Dep. de Microelectronique, Lab. d'Informatique, de Robotique et de Microelectronique de Montpellier, France

<sup>1,2</sup>luca.cassano@{ing.unipi.it, polimi.it}, <sup>3</sup>{alberto.bosio, giorgio.dinatale}@lirmm.fr

#### Goal of the work

Provide designers of dependable systems with an adaptive fault-tolerance mechanism allowing:

- Error detection
- Error correction
- Testability
- Graceful degradation
- Anti-aging

#### Motivations

Modern ubiquitous computing requires tiny, low-power and reliable devices.

Advances in CMOS miniaturization have made electronic devices more and more unreliable.

#### The Proposed Architecture



The classical TMR architecture provides high reliability, but at the cost of **high power consumption**.

Many applications do not require high reliability all the time.

In a classical TMR architecture the whole system is triplicated and its outputs are voted by a *fault-free* voter.

In the proposed approach, combinational and sequential logic are triplicated separately and the voting is carried out by the flip-flops themselves.

The proposed architecture relies on an adaptive TMR flip-flop.

The reliability services offered by the proposed architecture are dynamically enabled/disabled by a controller:

- Single channel (no redundancy enabled, 1oo3)
- Error detection (2oo2 redundancy)
- Error correction (2oo3 redundancy)
- Sequential test of the flip-flops
- Parallel test of the flip-flops
- Anti-aging: alternate use of the three available flip-flops
- Graceful degradation after a fault occurrence
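The three redundancy modes can be illustrated with a small behavioural model of one stored bit. This is only a sketch under stated assumptions, not the poster's circuit: the names `Mode`, `resolve` and `next_active` are hypothetical, the pairing of flip-flops in the detection mode is assumed, and the round-robin rotation used for anti-aging is one possible policy (the poster does not specify one).

```python
from enum import Enum


class Mode(Enum):
    SINGLE = "1oo3"    # one channel active, no redundancy
    DETECT = "2oo2"    # two channels compared, mismatch only flagged
    CORRECT = "2oo3"   # three channels, majority vote


def next_active(active):
    """Assumed anti-aging policy: rotate the active flip-flop round-robin."""
    return (active + 1) % 3


def resolve(mode, q, active=0):
    """Return (value, error) for the bits q = (q1, q2, q3) held by the three FFs."""
    q1, q2, q3 = q
    if mode is Mode.SINGLE:
        # Only the active flip-flop is used; no error can be detected.
        return q[active], False
    if mode is Mode.DETECT:
        # Assumption: the active flip-flop is compared with its successor.
        first, second = q[active], q[next_active(active)]
        return first, first != second
    # Mode.CORRECT: bitwise majority masks a single faulty flip-flop.
    value = (q1 & q2) | (q1 & q3) | (q2 & q3)
    return value, not (q1 == q2 == q3)
```

For instance, `resolve(Mode.CORRECT, (1, 0, 1))` returns the majority value together with an error flag, whereas the same disagreement in `Mode.DETECT` can only be flagged, not corrected.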

*Figure: a) Classical TMR scheme; b) Proposed TMR scheme. Label recovered from the figure: iMuxScan.*

#### The TMR Flip-Flop





*D1, D2, D3*: data from the comb. logic.

When no redundancy is required, two of the three FFs are fed with '0' so that the downstream comb. logic does not switch.

*Error*: error detection signal (available when either 2oo2 or 2oo3 redundancy is enabled).

*C1, C2, C3*: error correction signals (when 2oo3 redundancy is enabled).

*d12, d13, d23*: faulty FF identification signals (when graceful degradation is enabled).

*Q1, Q2, Q3*: data to the comb. logic.

*Scan\_in*: test input. *Scan\_out*: test output (when sequential/parallel test is enabled).
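The signal names above suggest how the flip-flop status could be derived. The following is a hedged behavioural sketch, not the actual gate-level design: it assumes that *d12*, *d13*, *d23* are pairwise mismatch (XOR) flags, that *Error* is raised on any mismatch, and that *Ci* marks flip-flop *i* as the one disagreeing with the other two. The function names `gate_inputs` and `ff_status` are hypothetical.

```python
def gate_inputs(d1, d2, d3, enabled):
    """Feed '0' to disabled flip-flops so their downstream logic does not switch.

    enabled is a tuple of three booleans, one per flip-flop.
    """
    return tuple(d if en else 0 for d, en in zip((d1, d2, d3), enabled))


def ff_status(q1, q2, q3):
    """Derive detection/correction flags from the three stored bits (0 or 1)."""
    # Assumption: dij is a pairwise mismatch flag between FF i and FF j.
    d12 = q1 ^ q2
    d13 = q1 ^ q3
    d23 = q2 ^ q3
    # Assumption: Error is raised whenever any pair disagrees.
    error = d12 | d13 | d23
    # Assumption: FF i needs correction when it disagrees with both others.
    c1 = d12 & d13
    c2 = d12 & d23
    c3 = d13 & d23
    return {"d12": d12, "d13": d13, "d23": d23,
            "Error": error, "C1": c1, "C2": c2, "C3": c3}
```

With single-bit values, a flip-flop that differs from both others is necessarily the minority, which is why the pairwise flags are sufficient in this model to point at the faulty flip-flop.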



- Case study: MiniMIPS@65nm
- 3 workloads:
  - Matrix Multiplication
  - Quick Sort
  - MD5

| Smart TMR             | 1oo3 | 2oo2 | 2oo3 | TMR  |
|-----------------------|------|------|------|------|
| Matrix Multiplication | 185% | 308% | 443% | 326% |
| Quick Sort            | 182% | 324% | 472% | 306% |
| MD5                   | 184% | 323% | 474% | 341% |

