## Adapting Synchronizers to the Effects of On Chip Variability

Jun Zhou, David Kinniment, Gordon Russell and Alex Yakovlev

> Microelectronics System Design Group Newcastle University



### **On-chip Variability**

- Process Variation: $V_{th}$, $L_{eff}$, $W_{eff}$
- Voltage Variation
  - Non-uniform Power Supply Distribution
  - Switching Activity
  - IR drop
- Temperature Variation





## Why Synchronizer?



**Two Flip-flop Synchronizer** 

## Synchronization Time Constant τ & Synchronization time

The synchronizer time constant $\tau$ determines how quickly metastability resolves in a synchronizer. Normally a synchronization time of 30 to 40 $\tau$ is required to give a 4-month Mean Time Between Failures (MTBF).



Input Time vs Output Time

## Effects of On-chip Variability on Synchronizer Performance

#### **Process Variation**

|        | 180nm | 90nm | 45nm |
|--------|-------|------|------|
| σ of τ | 4%    | 8%   | 16%  |

M. Garg et al., ISCAS 2005, May 2005 & International Technology Roadmap for Semiconductors 2005

#### Voltage & Temperature Variation

| Vdd (V)             | 1.1   | 1.0   | 0.9   | 0.8   | 0.7   | 0.6   | 0.5    | 0.4     |
|---------------------|-------|-------|-------|-------|-------|-------|--------|---------|
| τ (ps) at 27 °C     | 12.19 | 13.67 | 15.46 | 19.64 | 30.71 | 60.55 | 159.45 | 525.82  |
| τ (ps) at −25 °C    | 10.24 | 12.06 | 14.28 | 18.66 | 36.33 | 97.81 | 338.43 | 1403.76 |

Simulation results of Jamb latch at 90nm
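To make the trend in the table easier to read, the Python sketch below re-enters the simulated values and computes the slowdown of $\tau$ relative to the 1.1 V column, together with the supply voltages at which the colder corner is the slower one. Taking 1.1 V as the reference is a choice made here for illustration, not something the slides state, and the function names are hypothetical.

```python
# Simulated tau (ps) of the Jamb latch at 90 nm, copied from the table above.
VDD = [1.1, 1.0, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
TAU_27C = [12.19, 13.67, 15.46, 19.64, 30.71, 60.55, 159.45, 525.82]
TAU_M25C = [10.24, 12.06, 14.28, 18.66, 36.33, 97.81, 338.43, 1403.76]


def slowdown(taus):
    """Ratio of each tau to the value in the first (1.1 V) column.

    Using 1.1 V as the reference is an assumption made for illustration.
    """
    return [t / taus[0] for t in taus]


def cold_slower_voltages():
    """Supply voltages at which tau at -25 C exceeds tau at 27 C."""
    return [v for v, warm, cold in zip(VDD, TAU_27C, TAU_M25C) if cold > warm]


for v, s27, sm25 in zip(VDD, slowdown(TAU_27C), slowdown(TAU_M25C)):
    print(f"Vdd={v:.1f} V  27C: x{s27:.1f}  -25C: x{sm25:.1f}")
print("cold corner slower at:", cold_slower_voltages())
```

In this data the −25 °C corner is the faster one down to 0.8 V and the slower one from 0.7 V downwards.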



## Synchronizer Selection Scheme

#### Problem

- Technology: 90nm
- Mean Value of $\tau$: 11 ps
- Standard Deviation of $\tau$: 8%

Allowing for a worst-case $\tau$ that is 3.09 $\sigma$ above the mean, i.e. 13.72 ps, ensures that the probability of a synchronizer having a $\tau$ worse than this is 0.001. For a 100-synchronizer system with 5 GHz clock and data rates, a synchronization time of 40 $\tau$ is required to give a 4-month system MTBF. The synchronization time of every synchronizer on the chip therefore has to be increased by:

$$(13.72 - 11) \times 40 \approx 108.77 \text{ ps}$$

This adds to the delay of every synchronizer on the chip and therefore affects system performance.
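The worst-case arithmetic above can be reproduced with a few lines of Python. This is a minimal sketch that assumes $\tau$ is normally distributed (which the 3.09 $\sigma$ figure implies); the function names are hypothetical.

```python
from statistics import NormalDist


def worst_case_tau(mean_tau_ps, sigma_fraction, tail_probability):
    """Tau (ps) exceeded with probability `tail_probability`.

    Assumes tau is normally distributed around `mean_tau_ps` with a
    standard deviation of `sigma_fraction` times the mean.
    """
    z = NormalDist().inv_cdf(1.0 - tail_probability)  # about 3.09 for 0.001
    return mean_tau_ps * (1.0 + z * sigma_fraction)


def extra_sync_time(mean_tau_ps, worst_tau_ps, n_tau=40):
    """Extra synchronization time (ps) when n_tau time constants are needed."""
    return (worst_tau_ps - mean_tau_ps) * n_tau


tau_wc = worst_case_tau(11.0, 0.08, 0.001)
extra = extra_sync_time(11.0, tau_wc)
print(f"worst-case tau = {tau_wc:.2f} ps, extra time = {extra:.2f} ps")
```

The 108.77 ps figure follows from the unrounded worst-case $\tau$; using the rounded 13.72 ps gives 108.8 ps.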

# Synchronizer Selection Scheme Solution 1

Increase the transistor size in the synchronizer to, say, 4 times its original value. We simply assume that this will reduce the standard deviation of $\tau$ from 8% to:

$$\sigma = \frac{8\%}{\sqrt{4}} = 4\%$$

Now, instead of 108 ps, the synchronization time of every synchronizer on the chip only needs to be increased by:

$$(12.36 - 11) \times 40 = 54.4 \text{ ps}$$

Improvement: 54 ps

#### Disadvantages:

1. Power consumption also increases by 4 times.
2. Increasing transistor size cannot reduce every kind of process variation, so the actual standard deviation of $\tau$ after resizing is more than 4%.
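The Solution 1 numbers can be checked with the sketch below. It uses the slide's simplifying assumption that the standard deviation scales with the inverse square root of the size factor, and it assumes a normal distribution for $\tau$; the helper names are hypothetical.

```python
import math
from statistics import NormalDist

# z-score exceeded with probability 0.001 (about 3.09), assuming a normal tau.
Z_WORST = NormalDist().inv_cdf(0.999)


def sigma_after_upsizing(sigma_fraction, size_factor):
    """Slide's simplifying assumption: sigma scales as 1/sqrt(size_factor)."""
    return sigma_fraction / math.sqrt(size_factor)


def extra_time_ps(mean_tau_ps, sigma_fraction, n_tau=40):
    """Extra synchronization time (ps) needed to cover the worst-case tau."""
    return mean_tau_ps * Z_WORST * sigma_fraction * n_tau


sigma_sized = sigma_after_upsizing(0.08, 4)
extra_base = extra_time_ps(11.0, 0.08)
extra_sized = extra_time_ps(11.0, sigma_sized)
improvement = extra_base - extra_sized
print(f"sigma = {sigma_sized:.0%}, extra = {extra_sized:.1f} ps, "
      f"improvement = {improvement:.1f} ps")
```

As the second disadvantage notes, the real standard deviation after resizing is expected to be above 4%, so this improvement is an upper bound under the stated assumption.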

# Synchronizer Selection Scheme Solution 2

Solution 2 is the Synchronizer Selection Scheme itself: make 4 standard-size synchronizers, measure their $\tau$ on chip, and select the best one.

The probability of one synchronizer having a $\tau$ worse than 11.81 ps is 17.8%, but the probability of all four synchronizers having a $\tau$ worse than this is $0.178^4$, or 0.001.

The synchronization time of every synchronizer on the chip now only needs to be increased by:

$$(11.81 - 11) \times 40 = 32.4 \text{ ps}$$

Improvement: 76 ps, which is 22 ps better than Solution 1 (54 ps).

In addition, after the selection, all the other synchronizers are powered down, as is the measurement circuitry. Power during operation is therefore the same as for a single synchronizer.
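The selection argument can be sketched in Python as follows. The sketch assumes that the $\tau$ values of the redundant synchronizers are independent and normally distributed, which is what multiplying the four probabilities implies; the function name is hypothetical.

```python
from statistics import NormalDist


def best_of_n_worst_tau(mean_tau_ps, sigma_fraction, n, tail_probability):
    """Tau (ps) that the best of n synchronizers exceeds with `tail_probability`.

    Assumes independent, normally distributed tau values. All n must be
    worse than the threshold, so p_single ** n == tail_probability.
    """
    p_single = tail_probability ** (1.0 / n)
    z = NormalDist().inv_cdf(1.0 - p_single)
    return p_single, mean_tau_ps * (1.0 + z * sigma_fraction)


p_single, tau_sel = best_of_n_worst_tau(11.0, 0.08, 4, 0.001)
extra_sel = (tau_sel - 11.0) * 40
print(f"p_single = {p_single:.3f}, tau = {tau_sel:.2f} ps, "
      f"extra = {extra_sel:.1f} ps")
```

Increasing `n` lowers the threshold further, at the cost of more redundant synchronizers to build and measure.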

## Synchronization Time Adjustment Scheme

#### Problem

- Process Variation $\longrightarrow$ 25% worse value of $\tau$
- Voltage Variation & Temperature Variation $\longrightarrow$ 25% worse value of $\tau$

To achieve the required MTBF, the synchronization times of all synchronizers on the chip need to be extended to over 1.5 times their original values ($1.25 \times 1.25 \approx 1.56$).

However, the actual amount of the variations for some of the synchronizers on the chip may be less than the worst case. So the extended synchronization time may be wasted.

**Solution:** Adjust the synchronization time of each synchronizer on the chip according to the actual process, voltage, temperature and data rate variations to improve the system performance on the condition that the required MTBF is still met.
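As a rough illustration of what adjusting the synchronization time means, the sketch below inverts the relation $MTBF = e^{t/\tau} / (T_w f_c f_d)$, which appears in the derivation at the end of these slides, to get the time needed for a target MTBF. The metastability window $T_w$ = 10 ps is a hypothetical value chosen only for this example (the slides do not give one), the month is taken as 30 days, and the function name is an assumption.

```python
import math


def required_sync_time_ps(tau_ps, mtbf_s, t_w_s, f_clock_hz, f_data_hz):
    """Synchronization time (ps) from MTBF = exp(t/tau) / (T_w * f_c * f_d)."""
    return tau_ps * math.log(mtbf_s * t_w_s * f_clock_hz * f_data_hz)


MONTH_S = 30 * 24 * 3600
# A 4-month system MTBF over 100 synchronizers needs 100x that per synchronizer.
target_mtbf_s = 100 * 4 * MONTH_S
T_W_S = 10e-12  # hypothetical metastability window, not from the slides

t_nominal = required_sync_time_ps(11.0, target_mtbf_s, T_W_S, 5e9, 5e9)
# Worst case: 25% from process and 25% from voltage and temperature.
t_worst = required_sync_time_ps(11.0 * 1.25 * 1.25, target_mtbf_s, T_W_S, 5e9, 5e9)

print(f"nominal: {t_nominal:.0f} ps ({t_nominal / 11.0:.1f} tau)")
print(f"worst case: {t_worst:.0f} ps, ratio {t_worst / t_nominal:.4f}")
```

A synchronizer whose measured $\tau$ is close to nominal could run with a time near `t_nominal` rather than `t_worst`, which is the saving the scheme is after.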

## On-chip Measurement of Failure Rates

Both schemes are based on the on-chip measurement of failure rates.



# Calculation of $\tau$ and MTBF from Failure Rates



## **FPGA** Implementation

To assess their feasibility, the two proposed adaptation schemes have been implemented on a Xilinx Spartan-3 FPGA.



Synchronizer Selection Scheme



Synchronization Time Adjustment Scheme

## **FPGA** Implementation

|                                        | On-chip Overhead                             | Off-chip Overhead            |
|----------------------------------------|----------------------------------------------|------------------------------|
| Synchronizer Selection Scheme          | 9 flip-flops and 6 gates per synchronizer    | 34 flip-flops and 110 gates  |
| Synchronization Time Adjustment Scheme | 33 flip-flops and 104 gates per synchronizer | 436 flip-flops and 732 gates |

The Synchronizer Selection Scheme has a small on-chip and off-chip overhead, and it can be put entirely on chip.

The Synchronization Time Adjustment Scheme has a relatively large overhead. When used to deal with process variation, infrequent voltage variation or temperature variation, the major part of it can be put off chip. When used to track frequent voltage variation or data rate variation, it has to be put entirely on chip. However, the overhead can be reduced, for example by trading off the accuracy of the MTBF calculation against the hardware overhead, or by mapping failure rates directly to MTBF.



### Conclusions

• Two adaptation schemes have been proposed to reduce the effects of on-chip variability on synchronizers. To assess their feasibility, the two schemes have been implemented on a Xilinx Spartan-3 FPGA.

• The Synchronizer Selection Scheme mitigates the effects of process variation by selecting the best synchronizer from a set of redundant synchronizers. It has a small overhead and can be put entirely on chip. It only needs to operate once, when setting up the chip. After selection, all the redundant synchronizers can be powered down, as can the measurement circuitry, so the power consumption during operation is the same as for a single synchronizer.

• The Synchronization Time Adjustment Scheme improves system performance by reducing the overdesigned synchronization time according to the actual on-chip variability, on the condition that the required MTBF is still met. It has a relatively large overhead. When used to deal with process variation, infrequent voltage variation or temperature variation, the major part of it can be put off chip. When used to track frequent voltage variation or data rate variation, it has to be put entirely on chip. However, the overhead can be reduced, for example by reducing the accuracy of the MTBF calculation or by mapping failure rates directly to MTBF.

## Thanks!

**Questions?** 

## Calculation of $\tau$ and MTBF from Failure Rates

Calculate τ

$$\therefore MTBF = \frac{e^{\frac{t}{\tau}}}{T_w f_c f_d}$$
$$\therefore \frac{MTBF2}{MTBF1} = e^{\frac{T2-T1}{\tau}}$$
$$\therefore \frac{Failure\_Rate1}{Failure\_Rate2} = e^{\frac{T2-T1}{\tau}}$$
$$\therefore \tau = \frac{T2-T1}{\ln \frac{Failure\_Rate1}{Failure\_Rate2}}$$
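The relation $\tau = (T2 - T1) / \ln(Failure\_Rate1 / Failure\_Rate2)$ can be sketched as a small Python function. The function name, the input checks and the numbers in the example are hypothetical and only illustrate the formula.

```python
import math


def tau_from_failure_rates(t1, t2, failure_rate1, failure_rate2):
    """Estimate tau from failure rates measured at two synchronization times.

    t1, t2: synchronization times (any time unit; tau comes out in that unit).
    failure_rate1, failure_rate2: failure rates observed at t1 and t2.
    """
    if failure_rate1 <= 0 or failure_rate2 <= 0:
        raise ValueError("failure rates must be positive")
    if t1 == t2 or failure_rate1 == failure_rate2:
        raise ValueError("need two distinct measurement points")
    return (t2 - t1) / math.log(failure_rate1 / failure_rate2)


# Hypothetical example: a true tau of 12 ps, measurements at 100 ps and 160 ps.
true_tau_ps = 12.0
rate1 = 1.0e3                                        # failures/s at T1
rate2 = rate1 * math.exp(-(160.0 - 100.0) / true_tau_ps)  # failures/s at T2
tau_est = tau_from_failure_rates(100.0, 160.0, rate1, rate2)
print(f"estimated tau = {tau_est:.2f} ps")
```

Only the ratio of the two failure rates matters, so $T_w$, $f_c$ and $f_d$ do not need to be known as long as they are the same for both measurements.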

Calculate MTBF  

$$\therefore MTBF3 = MTBF1 * e^{\frac{T3-T1}{\tau}}$$

$$\therefore MTBF3 = \frac{Counter3\_output * Clock\_period}{Counter1\_output(known)} * e^{\frac{T3-T1}{\tau}}$$

$$\therefore \frac{Counter1\_output * MTBF3}{Clock\_period} = Counter3\_output * e^{\frac{T3-T1}{\tau}}$$

$$\text{Let} \quad X = \ln \frac{Counter1\_output * MTBF3}{Clock\_period}, \quad Y = \ln(Counter3\_output)$$

$$\therefore e^{X} = e^{Y} * e^{\frac{T3-T1}{\tau}}$$

$$\therefore X = Y + \frac{T3-T1}{\tau}$$
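The MTBF extrapolation and its log-domain form $X = Y + (T3 - T1)/\tau$ can be checked numerically with the sketch below. It reads $Counter3\_output \times Clock\_period / Counter1\_output$ as $MTBF1$, following the substitution made in the derivation; the function names and all numbers are hypothetical.

```python
import math


def mtbf3_from_counters(counter3_output, counter1_output, clock_period_s,
                        t1_s, t3_s, tau_s):
    """MTBF at synchronization time T3, extrapolated from a measurement at T1."""
    mtbf1_s = counter3_output * clock_period_s / counter1_output
    return mtbf1_s * math.exp((t3_s - t1_s) / tau_s)


def log_domain_terms(counter3_output, counter1_output, clock_period_s, mtbf3_s):
    """X and Y as defined in the derivation above."""
    x = math.log(counter1_output * mtbf3_s / clock_period_s)
    y = math.log(counter3_output)
    return x, y


# Hypothetical measurement values, for illustration only.
c3, c1, period_s = 1_000_000, 100, 5e-9
t1_s, t3_s, tau_s = 100e-12, 400e-12, 12e-12

mtbf3_s = mtbf3_from_counters(c3, c1, period_s, t1_s, t3_s, tau_s)
x, y = log_domain_terms(c3, c1, period_s, mtbf3_s)
print(f"MTBF3 = {mtbf3_s:.3e} s, X - Y = {x - y:.3f}")
```

Working with $X$ and $Y$ replaces the exponential and the multiplication by a logarithm and an addition, which is one way to read the remark about reducing the hardware needed for the MTBF calculation.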