# Opportunities and Challenges for Ultra-Low Voltage Digital IC Design

Jun Zhou, Scientist / PhD Supervisor Institute of Microelectronics, A\*STAR, Singapore zhouj@ime.a-star.edu.sg

*Graduated in Nov. 2008 from Microelectronics System Design Group, Newcastle University* 

Abstract—Ultra-low voltage digital IC design is promising in achieving ultra-low power consumption for emerging applications such as IoT, smart sensor and wearable computing. This paper discusses the opportunities and challenges of ultra-low voltage digital IC design by reviewing and discussing the major design techniques for enabling ultra-low voltage operation, including ultra-low voltage device sizing, ultra-low voltage level shifter design, ultra-low voltage SRAM design and variation-resilient techniques for ultra-low voltage design.

# 1 Introduction

Emerging applications such as IoT, smart sensors and wearable computing require advanced built-in digital signal processing and control capability for "intelligent" and "connected" devices. Meanwhile, ultra-low power consumption is demanded to prolong battery life or to achieve perpetual operation with an energy harvester. This makes the design of ultra-low power digital ICs a must. Ultra-low voltage design has been shown to reduce power consumption by 5 to $10\times$ [1][2], making it a very promising candidate for power-constrained emerging applications. This paper reviews and discusses the major design techniques for ultra-low voltage digital IC design. In Section 2, ultra-low voltage device sizing methods are reviewed. In Section 3, ultra-low voltage level shifter designs are reviewed. In Section 4, the design techniques of ultra-low voltage SRAM are discussed. In Section 5, variation-resilient design techniques for ultra-low voltage operation are discussed, and in Section 6, conclusions are drawn.

# 2 Ultra-Low Voltage Device Sizing for Logic Gates

Previous studies [2][3] have shown that logic gates with low fan-in (e.g. fan-in less than 4) usually do not have functionality problems at ultra-low voltage, even under PVT variations. However, in terms of performance they are not optimal, as the device size is optimized for super-threshold operation. For example, the PMOS is usually sized around twice as large as the NMOS in order to achieve balanced rise and fall delays. In the near/sub-threshold region, the strength ratio of PMOS to NMOS changes due to the different current-voltage characteristic, leading to unbalanced rise and fall delays [2][3][4]. Another issue is that the impact of parasitic effects is different in the near/sub-threshold region. For example, the impact of drain-induced barrier lowering (DIBL) becomes weaker due to the reduced drain-to-source voltage, and the impact of the reverse short-channel effect (RSCE) becomes stronger due to the exponential dependence of current on threshold voltage. Sub-threshold device sizing methods that consider the change in current-voltage characteristic and parasitic effects have been proposed to achieve optimal performance for near/sub-threshold operation. In [4], RSCE is utilized to boost the current by using non-minimum transistor length. This also mitigates the impact of process variations due to the increased transistor area. In [5], the inverse narrow width effect (INWE) is utilized to increase the current by using minimum-size fingers. For a constant current, this results in reduced load capacitance and area. The same device sizing method is combined with minimum sizing in a dual-width sub-threshold standard cell library [6]. The INWE-aware cells are used in critical paths to achieve small delay, while the minimum-sizing cells are used in non-critical paths to reduce power and area. Based on these works, future ultra-low voltage device sizing methods may further explore circuit partitioning and hybrid device sizing to achieve co-optimized delay, power and area.
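The exponential dependence of sub-threshold current on threshold voltage, which is what makes RSCE-based sizing effective, can be illustrated with the standard first-order sub-threshold current model. The sketch below is illustrative only: the function names and the parameter values (`i0`, the slope factor `n`, the temperature) are assumptions chosen for the example and are not taken from [4], [5] or [6].

```python
import math

K_B_OVER_Q = 8.617333262e-5  # Boltzmann constant over electron charge, in V/K


def thermal_voltage(temp_kelvin=300.0):
    """Thermal voltage kT/q in volts."""
    return K_B_OVER_Q * temp_kelvin


def subthreshold_current(vgs, vds, vth, i0=1e-7, n=1.5, temp_kelvin=300.0):
    """First-order sub-threshold drain current in amperes.

    i0 (current at vgs == vth) and n (slope factor) are assumed,
    illustrative values, not data from a real process.
    """
    vt = thermal_voltage(temp_kelvin)
    return i0 * math.exp((vgs - vth) / (n * vt)) * (1.0 - math.exp(-vds / vt))


def current_gain_from_vth_shift(delta_vth, n=1.5, temp_kelvin=300.0):
    """Current multiplier obtained when Vth is lowered by delta_vth volts."""
    return math.exp(delta_vth / (n * thermal_voltage(temp_kelvin)))


if __name__ == "__main__":
    # A hypothetical 50 mV reduction in Vth (e.g. from a longer channel under RSCE).
    print(current_gain_from_vth_shift(0.05))
```

Under these assumed parameters, a 50 mV reduction in threshold voltage multiplies the sub-threshold current by roughly 3.6, which is why small threshold shifts from RSCE or INWE matter far more here than in super-threshold operation.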

# 3 Ultra-Low Voltage Level Shifter

Fig. 1. Ultra-low voltage level shifters. Recoverable panel labels: (a) Type I; (b) Type II; (c) RSI based LS [7]; (e) body bias based LS [11]; (h) revised Wilson current mirror based LS [14].

The level shifter is a crucial component in ultra-low voltage digital ICs for voltage conversion between different voltage domains, including core-to-core and core-to-I/O. The conventional level shifter topologies for super-threshold level shifting have functionality or performance issues when operating at ultra-low voltage. For example, for the cross-coupled level shifting structure (i.e. Type I) shown in Fig. 1(a), when the input voltage is extremely low, the pull-down NMOS cannot overcome the strength of the pull-up PMOS even after heavy upsizing, which causes functional failures. For the current-mirror level shifting structure (i.e. Type II) shown in Fig. 1(b), the static source current causes significant standby power, which diminishes the power saving from ultra-low voltage operation. To address these issues, several ultra-low voltage level shifters have been proposed, as shown in Fig. 1(c)-(h). In [7][8], the pull-up network is weakened by using a reduced swing inverter (RSI) so as to prevent functional failure in Type I level shifters. However, in this topology the delay does not scale with supply voltage, as the pull-up network is constantly weakened. In [9][10], a multi-stage level shifter is used to reduce the effort for wide-range level shifting. This effectively avoids the heavy upsizing of the pull-down NMOS, at the cost of increased complexity and relatively long delay compared with single-stage level shifters. In [11], forward body bias is applied to help the level shifting, at the price of increased area and power due to body bias control. For Type II level shifters, the major effort has been spent on reducing the static current. In [12], the source current is enabled/disabled based on the detection of an input transition. This significantly reduces the standby power while increasing the delay and dynamic energy due to the operation of the detection circuits. In [13], feedback control is used to cut off the source current after the output of the level shifter flips. However, the feedback structure causes output drop and charge sharing issues, resulting in non-optimal delay and energy consumption. These issues were addressed in [14] via a revised Wilson current mirror based level shifter, which also uses mixed-$V_T$ devices to achieve wide-range voltage conversion up to I/O voltage. While the existing work focuses on performance optimization for ultra-low voltage operation, what is missing is how to achieve optimal operation over a wide range of supply voltages across the sub-threshold and super-threshold regions. This needs to be considered especially for wide-range dynamic voltage scaling (DVS) applications.
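The contention problem of the Type I structure can be made concrete with a rough first-order estimate of how much wider the pull-down NMOS must be than the pull-up PMOS. The sketch below assumes an exponential sub-threshold model for the NMOS (driven by the low input voltage) and an alpha-power-law model for the PMOS (driven from the high supply). All function names and parameter values are hypothetical and chosen only for illustration; they do not describe any of the cited designs.

```python
import math


def nmos_subthreshold_current_per_um(vin, vth=0.45, i0=1e-6, n=1.5, vt=0.026):
    """Pull-down NMOS current per um of width for a sub-threshold input vin.

    i0 is the assumed current per um at vin == vth (illustrative value).
    """
    return i0 * math.exp((vin - vth) / (n * vt))


def pmos_on_current_per_um(vddh, vth=0.45, k=3e-4, alpha=1.3):
    """Pull-up PMOS on-current per um of width (alpha-power law, assumed k, alpha)."""
    overdrive = max(vddh - vth, 0.0)
    return k * overdrive ** alpha


def required_upsizing(vin, vddh):
    """Width ratio W_n / W_p at which the NMOS just matches the PMOS current."""
    return pmos_on_current_per_um(vddh) / nmos_subthreshold_current_per_um(vin)


if __name__ == "__main__":
    for vin in (0.4, 0.3, 0.2):
        print(f"vin = {vin:.1f} V -> W_n/W_p of about {required_upsizing(vin, 1.2):.0f}")
```

In this simplified model, every 100 mV reduction of the input voltage raises the required width ratio by more than an order of magnitude, which is consistent with the observation that upsizing alone stops being practical at extremely low input voltages.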

# 4 Ultra-Low Voltage SRAM

SRAM is heavily used in digital ICs. Conventional 6T SRAM cannot work at ultra-low voltage due to several issues such as read disturbance, degraded sensing margin and degraded writability. To address these issues, the SRAM cell and the write/read circuits need to be re-designed. In this section, several techniques are presented for energy-efficient ultra-low voltage SRAMs.



Fig. 2. Summary of normalized minimum energy consumption over various device combinations [15].

## 4.1 MTCMOS

Multi-threshold CMOS (MTCMOS) devices are commonly available in advanced CMOS technologies. Using these devices properly can improve energy efficiency significantly. In general, devices with higher threshold voltage are used in non-critical paths while devices with lower threshold voltage are employed in critical paths. This improves energy efficiency by reducing the leakage in the non-critical paths and maximizing the performance of the critical paths. In SRAMs, the read delay is larger than the write delay. Therefore, higher-$V_{th}$ devices are preferred in the write paths while lower-$V_{th}$ devices are better suited to the read paths. The variation in energy across device combinations is illustrated in Fig. 2. Note that the minimum energy occurs at the combination of standard-$V_{th}$ devices in the write paths and lower-$V_{th}$ devices in the read paths (SVT(W)-LVT(R)), which is $6.24\times$ better than that of LVT(W)-HVT(R). This indicates that device selection has a significant impact on energy efficiency. The optimal device combination can also be affected by the circuit techniques used for leakage reduction, dynamic power reduction, etc.

## 4.2 Read Assist Circuits



Fig. 3. Proposed SRAM cell with equalized bitline [15].



Fig. 4. Principle of the equalized bitline [15].

Scaling the supply voltage degrades the Ion-to-Ioff ratio, which reduces the read bitline sensing margin. This limits the number of SRAM cells per bitline, the maximum temperature, the operating supply voltage, etc. One technique to reduce the impact of bitline leakage on sensing is to equalize the bitline leakage. Fig. 3 shows a 9T SRAM cell that generates the same leakage regardless of the stored data. In unselected SRAM cells, either M7 or M9 is turned on while RVDD and /SEL are grounded. Compared with conventional bitline sensing (Fig. 4(left)), where the sensing margin depends on the amount of data-dependent bitline leakage, the equalized bitline leakage always preserves a sensing margin (Fig. 4(right)). Another technique for improving the sensing margin is to realize a static bitline. Fig. 5 explains the principle of the static bitline. Unlike the conventional dynamic read operation, the static bitline is implemented by turning on the pull-up PMOS devices with proper strength adjustment. This prevents the read bitline from being fully discharged to GND. The final bitline voltage levels are determined by the relative strength of the pull-down paths and the pull-up paths, which makes the bitline levels static. As shown in Fig. 6, the static bitline provides a larger sensing margin and sensing timing window than the conventional bitline structure. However, this requires additional power during the read operation. Therefore, it is necessary to turn off the pull-up devices or the read wordline (RWL) quickly after the read operation completes.
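The effect of data-dependent bitline leakage on the sensing margin, and why equalizing the leakage removes the dependence on the number of cells per bitline, can be sketched with a simple worst-case current budget. This is a simplified illustration under assumed conditions (a single read bitline, one selected cell, all unselected cells at their worst-case leakage); the function name and the numbers in the usage example are hypothetical and are not measurements from [15] or [16].

```python
def worst_case_current_margin(i_on, leak_max, leak_min, cells_per_bitline):
    """Worst-case difference between the two bitline currents being sensed.

    Case A (bitline should discharge): the selected cell sinks i_on while
    every unselected cell contributes only its minimum leakage.
    Case B (bitline should hold): the selected cell sinks nothing while
    every unselected cell contributes its maximum leakage.
    A non-positive result means the two cases cannot be told apart.
    All currents share one arbitrary unit.
    """
    unselected = cells_per_bitline - 1
    discharging = i_on + unselected * leak_min
    holding = unselected * leak_max
    return discharging - holding


if __name__ == "__main__":
    # Hypothetical currents in nA at a low supply voltage.
    for n_cells in (16, 64, 256):
        conventional = worst_case_current_margin(100, 2, 0, n_cells)
        equalized = worst_case_current_margin(100, 2, 2, n_cells)
        print(n_cells, conventional, equalized)
```

With data-dependent leakage the margin shrinks linearly with the number of cells and eventually becomes negative, whereas with equalized leakage (`leak_max == leak_min`) it stays equal to the cell read current for any bitline length.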

Fig. 5. (a) Schematic of the boosted bitline scheme [16] (b) timing diagram during a read operation of the conventional 8T and boosted bitline 8T.



Fig. 6. Simulated RBL waveforms of the proposed design and RBL swing of the conventional 8T at 27 °C (time axis in units of $10^{-6}$ s). The RBL level for data '1' is higher than that for data '0' in the proposed design. In the 8T design the order is reversed, indicating wrong sensing [16].

## 4.3 Write Assist Circuits

Write operation is equally critical for reliable ultra-low voltage operation. Circuit techniques such as boosted wordline [17] and floating supply [17] improve the write margin. However, they also degrade the stability of half-selected cells. One technique for enhancing the write margin without degrading stability is write-back. The write-back operation is achieved by executing a read operation before the write operation. By writing the read data into the write bitlines of the unselected columns, the cell stability disturbance caused by the conventional half-selection issue can be eliminated. This requires additional delay for the inserted read operation. However, by reducing the read delay through hierarchical bitline structures, the overall performance of the write-back operation becomes comparable to the read performance, so the performance of the overall SRAM is not degraded. Fig. 7 explains this write-back scheme, which is called 'fast local write-back'. A sample read/write timing diagram of the fast local write-back scheme is presented in Fig. 8.
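The read-before-write sequence described above can be sketched as a small behavioral model. The model is purely logical and makes no attempt to represent the hierarchical bitlines or the timing of [15]; the function name and the list-of-lists array representation are assumptions made for this example.

```python
def write_row_with_write_back(array, row, selected_cols, new_bits):
    """Behavioral model of one write cycle using write-back.

    array         : list of rows, each row a list of stored bits
    row           : index of the row whose write wordline is asserted
    selected_cols : columns that receive new data
    new_bits      : new data, one bit per selected column
    Returns the data read from the row before the write.
    """
    # Step 1: read the whole row before writing.
    read_data = list(array[row])

    # Step 2: drive the write bitlines. Unselected columns are driven with
    # the data just read, so half-selected cells are rewritten with their
    # own content instead of being disturbed.
    write_bitlines = list(read_data)
    for col, bit in zip(selected_cols, new_bits):
        write_bitlines[col] = bit

    # Step 3: assert the write wordline; every cell in the row takes the
    # value on its write bitline.
    array[row] = write_bitlines
    return read_data


if __name__ == "__main__":
    sram = [[1, 1, 0, 1], [0, 0, 1, 1]]
    write_row_with_write_back(sram, 0, [1, 2], [0, 1])
    print(sram)  # columns 0 and 3 of row 0 keep their original data
```

In this model the half-selected columns are always written with their own stored value, which is the property that removes the half-selection disturbance in the scheme described above.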



Fig. 7. Fast local write-back for improving cell stability [15].



Fig. 8. Timing diagram of the proposed fast local write-back scheme [15].

# 5 Variation-Resilient Design

One of the largest challenges for ultra-low voltage digital IC design is the dramatically increased delay variation, which can be up to $100\times$ that of nominal voltage operation [1]. The conventional worst-case design method results in significant design overhead in this case. Earlier work addressed the issue by on-chip timing error monitoring using replica critical paths together with adaptive clock/voltage tuning. However, this cannot capture the local variations in the actual critical paths. In-situ timing monitoring techniques have been proposed to tackle this problem. In [19], the Razor technique is used to capture late-arriving signals (i.e. timing errors) with a shadow flip-flop and to correct them by architectural replay. However, this requires the minimum path delay to be increased in order to differentiate late-arriving from early-arriving signals, leading to significant overhead due to buffer insertion. Also, its application is limited to high-performance processors where architectural replay is available. In [20], the canary flip-flop technique is used to predict errors by monitoring an artificially delayed signal. This eliminates the need for increasing the minimum path delay. The drawback is that there are some errors it cannot prevent, such as errors caused by fast variations or suddenly activated critical paths. In [21], half-path error monitoring is proposed to address the disadvantages of the Razor and canary flip-flop techniques. As the error is detected before the rising clock edge, it does not need to differentiate late-arriving from early-arriving signals, which reduces the overhead of buffer insertion. It is also able to deal with errors caused by fast variations and suddenly activated critical paths. Another advantage is that it is applicable to any digital design, as the error correction is done through general clock gating.

For variation-resilient ultra-low voltage design techniques, the most important considerations include overhead, effectiveness and compatibility with the standard design flow, which together determine how readily a technique will be adopted by industry.
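The difference between error detection (Razor-style) and error prediction (canary-style) can be sketched as simple timing checks on a data arrival time. This is a simplified behavioral illustration that ignores setup and hold times and the actual circuit implementations in [19] and [20]; the function names, the detection `window` and the `guard_delay` are assumptions introduced for the example. The half-path technique of [21] is not modeled.

```python
def razor_check(arrival, period, window):
    """Razor-style detection: a shadow element samples `window` after the edge.

    Returns "ok" if data arrives by the clock edge, "error_detected" if it
    arrives late but within the detection window, and "error_missed" otherwise.
    """
    if arrival <= period:
        return "ok"
    if arrival <= period + window:
        return "error_detected"
    return "error_missed"


def razor_min_delay_ok(min_path_delay, window):
    """Short paths must be slower than the window, otherwise next-cycle data
    arriving early would be mistaken for a late arrival (hence buffer insertion)."""
    return min_path_delay > window


def canary_check(arrival, period, guard_delay):
    """Canary-style prediction: a copy of the data delayed by guard_delay is
    also sampled at the clock edge.

    Returns "ok" if even the delayed copy meets timing, "warning" if only the
    delayed copy fails (the main flip-flop is still correct), and
    "error_missed" if the main data itself is late, e.g. after a sudden
    delay increase that skipped the warning region.
    """
    if arrival + guard_delay <= period:
        return "ok"
    if arrival <= period:
        return "warning"
    return "error_missed"


if __name__ == "__main__":
    # Times in picoseconds, hypothetical values.
    print(razor_check(1050, 1000, 100))   # late but inside the window
    print(canary_check(950, 1000, 100))   # close to the edge: warning
    print(canary_check(1050, 1000, 100))  # jumped past the edge: missed
```

The last case shows the limitation noted above: a canary monitor only helps when the delay degrades gradually enough to pass through the warning region before an actual error occurs.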

# 6 Conclusions

In this paper, the major design techniques for ultra-low voltage digital IC design are reviewed. The device sizing methods considering current behavior and parasitic effects in the near/sub-threshold region are reviewed and discussed. The ultra-low voltage level shifter design techniques employing revised cross-coupled and current mirror structures are discussed. Various design techniques for ultra-low voltage SRAM are reviewed, including the adoption of multi-threshold CMOS devices, read assist circuits and write assist circuits. The variation-resilient design techniques for ultra-low voltage operation are also reviewed and discussed.

### References

- [1] R. G. Dreslinski, et al., "Near-Threshold Computing: Reclaiming Moore's Law Through Energy Efficient Integrated Circuits," Proceedings of the IEEE, vol.98, no.2, pp.253,266, Feb. 2010
- [2] B.H. Calhoun, et al., "Modeling and sizing for minimum energy operation in subthreshold circuits," Solid-State Circuits, IEEE Journal of, vol.40, no.9, pp.1778,1786, Sept. 2005
- [3] J. Kwong, et al., "Variation-Driven Device Sizing for Minimum Energy Sub-threshold Circuits," Low Power Electronics and Design, 2006. ISLPED'06. Proceedings of the 2006 International Symposium on , vol., no., pp.8,13, 4-6 Oct. 2006
- [4] T. Kim, et al., "Utilizing Reverse Short Channel Effect for Optimal Subthreshold Circuit Design," Low Power Electronics and Design, 2006. ISLPED'06. Proceedings of the 2006 International Symposium on , vol., no., pp.127,130, 4-6 Oct. 2006

- [5] J. Zhou, et al., "A 40 nm inverse-narrow-width-effect-aware sub-threshold standard cell library," Design Automation Conference (DAC), 2011 48th ACM/EDAC/IEEE, vol., no., pp.441,446, 5-9 June 2011
- [6] J. Zhou, et al., "A 40 nm Dual-Width Standard Cell Library for Near/Sub-Threshold Operation," Circuits and Systems I: Regular Papers, IEEE Transactions on , vol.59, no.11, pp.2569,2577, Nov. 2012
- [7] Y. S. Lin, et al., "Single stage static level shifter design for subthreshold to I/O voltage conversion," in Proc. ACM/IEEE Int. Symp. Low Power Electron. Design (ISLPED), Aug. 11–13, 2008, pp. 197–200.
- [8] I. J. Chang, et al., "Robust level converter for subthreshold/super-threshold operation: 100 mV to 2.5 V," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 19, no. 8, pp. 1429–1437, Aug. 2011.
- [9] B. Zhai, et al., "Energy efficient subthreshold processor design," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 17, no. 8, pp. 1127–1137, Aug. 2009.
- [10] S. N. Wooters, et al., "An energy-efficient subthreshold level converter in 130-nm CMOS," IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 57, no. 4, pp. 290–294, Apr. 2010.
- [11] "Level shifter circuit," U.S. Patent 7924080, 2011, Toshiba.
- [12] Y. Osaki, et al., "A low-power level shifter with logic error correction for extremely low-voltage digital CMOS LSIs," IEEE J. Solid-State Circuits, vol. 47, no. 7, pp.1776–1783, Jul. 2012.
- [13] S. Lutkemeier, et al., "A subthreshold to above-threshold level shifter comprising a wilson current mirror," IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 57, no. 9, pp. 721–724, Sep. 2010.
- [14] J. Zhou, et al., "An Ultra-Low Voltage Level Shifter Using Revised Wilson Current Mirror for Fast and Energy-Efficient Wide-Range Voltage Conversion from Sub-Threshold to I/O Voltage," Circuits and Systems I: Regular Papers, IEEE Transactions on , vol.62, no.3, pp.697,706, March 2015
- [15] B. Wang, et al., "Maximization of SRAM Energy Efficiency Utilizing MTCMOS Technology," Asia Symposium on Quality Electronic Design (ASQED), pp. 35-40, July 2012
- [16] Q. Li, et al., "A 5.61 pJ, 16 kb 9T SRAM with Single-ended Equalized Bitlines and Fast Local Write-back for Cell Stability Improvement," IEEE European Solid-State Device Research Conference (ESSDERC), pp. 201-204, Sept. 2012
- [17] A. Do, et al., "A 32kb 9T SRAM with PVT-tracking Read Margin Enhancement for Ultra-low Voltage Operation," IEEE International Symposium on Circuits and Systems (ISCAS), pp. 2553-2556, May 2015
- [18] B.H. Calhoun, et al., "A 256-kb 65-nm Sub-threshold SRAM Design for Ultra-Low-Voltage Operation," Solid-State Circuits, IEEE Journal of, vol.42, no.3, pp.680,688, March 2007
- [19] S. Das, et al., "Razor II: In situ error detection and correction for PVT and SER tolerance," JSSC, vol. 44, no. 1, pp. 32–48, Jan. 2009.
- [20] H. Fuketa, et al., "Adaptive performance compensation with in-situ timing error predictive sensors for subthreshold circuits," TVLSI, vol. 20, no. 2, pp. 333–343, Feb. 2012.
- [21] J. Zhou, et al., "HEPP: A new in-situ timing-error prediction and prevention technique for variation-tolerant ultra-low-voltage designs," Solid-State Circuits Conference (A-SSCC), 2013 IEEE Asian, vol., no., pp.129,132, 11-13 Nov. 2013