# A 0.3-V Operating, $V_{th}$-Variation-Tolerant SRAM under DVS Environment for Memory-Rich SoC in 90-nm Technology Era and Beyond

Yasuhiro MORITA<sup>†a)</sup>, Hidehiro FUJIWARA<sup>††</sup>, Student Members, Hiroki NOGUCHI<sup>†††</sup>, Nonmember, Kentaro KAWAKAMI<sup>††</sup>, Junichi MIYAKOSHI<sup>††</sup>, Shinji MIKAMI<sup>†</sup>, Student Members, Koji NII<sup>††</sup>, Hiroshi KAWAGUCHI<sup>†††</sup>, Nonmembers, and Masahiko YOSHIMOTO<sup>†††</sup>, Member

**SUMMARY** We propose a voltage control scheme for 6T SRAM cells that lowers the minimum operation voltage to 0.3 V under a DVS environment. The supply voltage of the memory cells and wordline drivers, the bitline voltage, and the body bias voltage of the load pMOSFETs are controlled according to read and write operations, which secures operation margins even at a low operation voltage. A self-aligned timing control with a dummy wordline and its feedback is also introduced to guarantee stable operation over a wide range of supply voltages. Measurement results of a 64-kb SRAM in a 90-nm process technology show that a power reduction of 30% can be achieved at 100 MHz. In a 65-nm 64-Mb SRAM, a 74% power saving is expected at 1/6 of the maximum operating frequency. The performance penalty of the proposed scheme is less than 1%, and the area overhead is 5.6%. *key words: SRAM, DVS, $V_{th}$-variation-tolerant, low power*

## 1. Introduction

To save power in an SoC, dynamic voltage scaling (DVS), which adaptively controls the operating frequency and supply voltage ( $V_{dd}$ ), has been implemented in a mobile system [1]. However, the minimum operation voltage ( $V_{min}$ ) is becoming higher as fabrication technology is scaled down, since the operation margins of memory cells in an embedded SRAM are degraded under both read and write conditions by the threshold-voltage ( $V_{th}$ ) variation of MOSFETs. Figure 1 illustrates so-called Pelgrom plots [2], which show the characteristics of the $V_{th}$ standard deviation in 90-nm and 65-nm CMOS process technologies. The standard deviation of $V_{th}$ ( $\sigma_{V_{th}}$ ) has been formulated as follows [3].

$$\sigma_{V_{\rm th}} \propto T_{\rm OX} \cdot \frac{\sqrt[4]{NT \cdot \ln(N/n_i)}}{\sqrt{L_{\rm eff} W_{\rm eff}}},\tag{1}$$

where $T_{OX}$ is the gate oxide thickness, N is the channel dopant concentration, T is the absolute temperature, $n_i$ is the intrinsic carrier concentration, and $L_{eff}$ and $W_{eff}$ are the effective

Manuscript received March 10, 2006.

Manuscript revised June 13, 2006.

<sup>†</sup>The authors are with the Graduate School of Natural Science and Technology, Kanazawa University, Kanazawa-shi, 920-1192 Japan.

<sup>††</sup>The authors are with the Graduate School of Science and Technology, Kobe University, Kobe-shi, 657-8501 Japan.

<sup>†††</sup>The authors are with the Department of Computer and Systems Engineering, Kobe University, Kobe-shi, 657-8501 Japan.

a) E-mail: y-morita@cs28.cs.kobe-u.ac.jp

DOI: 10.1093/ietfec/e89-a.12.3634



Fig. 1 Pelgrom plots in 90-nm and 65-nm process technologies.

channel length and width of MOSFETs, respectively. The 65-nm process technology has smaller slopes in the Pelgrom plots thanks to its thinner $T_{OX}$. However, the $V_{th}$ variation in the 65-nm technology is larger than that in the 90-nm one because of its smaller $L_{eff}W_{eff}$. Since 80% or more of the chip area is expected to be occupied by memories [4] and the $V_{th}$ deviation is getting larger, $V_{min}$ will be restricted by the $V_{th}$ variation, which hinders wide-range power scaling of a future SoC with DVS capability.
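The scaling behavior implied by Eq. (1) can be checked numerically. The following is a minimal sketch that evaluates only the right-hand side of Eq. (1) with the proportionality constant omitted; the function name and all device parameter values are hypothetical placeholders chosen for illustration, not values from the measured processes.

```python
import math

def relative_sigma_vth(t_ox, n_dop, temp, n_i, l_eff, w_eff):
    """Right-hand side of Eq. (1); the proportionality constant is omitted."""
    doping_term = (n_dop * temp * math.log(n_dop / n_i)) ** 0.25
    return t_ox * doping_term / math.sqrt(l_eff * w_eff)

# Hypothetical device parameters, chosen only to exercise the formula.
params = dict(t_ox=2.0e-9, n_dop=1.0e18, temp=300.0, n_i=1.0e10)

large = relative_sigma_vth(l_eff=0.10e-6, w_eff=0.20e-6, **params)
small = relative_sigma_vth(l_eff=0.10e-6, w_eff=0.10e-6, **params)

# Halving the gate area raises sigma by sqrt(2).
area_ratio = small / large

# Halving the oxide thickness halves sigma (the slope of the Pelgrom plot).
thin = relative_sigma_vth(l_eff=0.10e-6, w_eff=0.20e-6, **dict(params, t_ox=1.0e-9))
oxide_ratio = thin / large
```

Under these assumptions, a thinner oxide lowers the slope while a smaller gate area moves the device to a larger $\sigma_{V_{th}}$, which is the trade-off described for the 65-nm technology.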

Some techniques have been proposed to improve the operation margins of memory cells. One is a power-line-floating write technique for conventional 6T memory cells [5], and another is a 7T cell that secures the read margin by cutting the loop of the memory-cell flip-flop in read operation [6]. However, each of these techniques improves only the read or the write margin, and thus $V_{min}$ is still limited by the other operation. Both read and write margins are important. Voltage switching using a dual- $V_{cc}$ scheme has also been proposed to improve both read and write margins [7], but it is not sufficient for DVS.

In this paper, we report an optimum voltage control scheme for 6T memory cells that lowers $V_{min}$ under the DVS environment. This scheme selectively controls voltages in the memory cell to expand the read and write margins. A self-aligned timing control is also developed so as to secure

Final manuscript received August 1, 2006.

a timing sequence of signals against process, voltage and temperature (PVT) variation.

The rest of this paper is organized as follows. Section 2 introduces the proposed optimum voltage control scheme and shows the improvement of the operation margins in simulation. The circuit implementation of the proposed scheme and the self-aligned timing control are also discussed in Sect. 2. In Sect. 3, experimental results of a 64-kb SRAM in a 90-nm CMOS technology are presented, including a measured fail-bit count of memory cells and the relationship between power and operating frequency. Simulation results for the case in which the memory capacity increases and the process technology is scaled down to 65 nm are also shown in Sect. 3. Conclusions are given in Sect. 4.

## 2. Optimum Voltage Control Scheme

### 2.1 Concept

Under the proposed DVS environment, a high supply voltage ( $V_{max}$ ) is applied externally, and a variable supply voltage ( $V_a$ ) from a DC/DC converter is also provided to the logic and SRAM modules in an SoC, as shown in Fig. 2. $V_a$ is adaptively controlled between $V_{min}$ and $V_{max}$. In this paper, $V_{max}$ is set to 1.0 V as the nominal voltage in a 90-nm process technology. In the proposed optimum voltage control scheme, both $V_a$ and $V_{max}$ are provided to the SRAM, and the supply voltage, wordline (WL) voltage, and bitline (BL) voltage of the memory cells are switched according to the read and write conditions, as shown in Table 1 and Fig. 3. Figure 3 shows schematics of a 6T memory cell, where $V_{mc}$, $V_{wl}$, and $V_{bl}$ are the supply voltage, wordline voltage, and bitline voltage of a cell under the read and write conditions, respectively, and V1 and V2 are the voltages of the data-storage nodes.

Fig. 2 Block diagram of SoC with DVS.

The optimum voltage control scheme improves operation margins in both read and write operations. The supply voltage of the memory cells is set to $V_{\text{max}}$ in the read operation to maximize the read margin. In a write cycle, in contrast, the WL voltage is set to $V_{\text{max}}$ to obtain the write margin. The n-well bias $(V_{\text{bp}})$ of the load pMOSFETs in the memory cells is tied to $V_{\text{max}}$, which increases the pMOSFET $V_{\text{th}}$ and the write margin when $V_{\text{a}}$ is less than $V_{\text{max}}$. Although a negative $V_{bp}$ is utilized, no serious problem of gate-induced drain leakage (GIDL) or negative bias temperature instability (NBTI) occurs since $V_{bp}$ does not exceed $V_{max}$. Moreover, this scheme can be implemented with a twin-tub process technology, and its soft-error immunity is higher than that in a triple-well technology. The improvement of the operation margins with the proposed voltage control scheme is given in the next subsection.
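The voltage assignments of the proposed scheme listed in Table 1 can be summarized as a small lookup. This is an illustrative sketch only: the names `Mode` and `proposed_voltages` and the dictionary keys are hypothetical, and the function merely restates the "Proposed" columns of Table 1 for one memory-cell block.

```python
from enum import Enum

class Mode(Enum):
    READ = "read"
    WRITE = "write"
    NON_ACCESS = "non-access"

def proposed_voltages(mode, v_a, v_max=1.0):
    """Return the Table 1 'Proposed' voltage assignment for one block."""
    if not 0.0 < v_a <= v_max:
        raise ValueError("V_a must lie in (0, V_max]")
    if mode is Mode.READ:
        wl_swing = v_a        # WL swings to V_a in a read cycle
    elif mode is Mode.WRITE:
        wl_swing = v_max      # WL is boosted to V_max in a write cycle
    else:
        wl_swing = None       # WL is not driven when the block is not accessed
    return {
        "peripheral_vdd": v_a,
        "bl_precharge": v_a,
        "wl_swing": wl_swing,
        "mc_vdd": v_max if mode is Mode.READ else v_a,  # cell supply raised only for read
        "mc_vbp": v_max,      # n-well bias is tied to V_max in every state
    }

read_cfg = proposed_voltages(Mode.READ, v_a=0.4)
write_cfg = proposed_voltages(Mode.WRITE, v_a=0.4)
idle_cfg = proposed_voltages(Mode.NON_ACCESS, v_a=0.4)
```

With $V_a$ = 0.4 V, for example, the read configuration raises only the memory-cell supply to 1.0 V, while the write configuration raises only the WL swing.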

### 2.2 Improvement of Operation Margins

In order to estimate operation margins at each process corner, the worst case of local $V_{\rm th}$ variations (random variations) of the MOSFETs in a memory cell is set as indicated in Fig. 3. $\sigma_{V_{\rm th}}$ is the standard deviation of $V_{\rm th}$ in a MOSFET; strictly, its value differs among process corners, which will be discussed later. n is a coefficient. Four transistors in a memory cell (AC1, DR1, DR2 and LD2 for read; AC1, LD1, DR2 and LD2 for write in Fig. 3) affect the operation margins [8], [9], and n = 3, for example, indicates that a local $V_{\rm th}$ variation of $6\sigma_{V_{\rm th}}$ is considered in a memory cell. The read margin is determined by the logical $V_{\rm th}$ of an inverter in the memory-cell flip-flop (LD2 and DR2) and the voltage of a data-storage node, V1, when "L" is stored in V1 and a WL is activated. The worst case of the local $V_{\rm th}$ fluctuation for the read margin is that the logical $V_{\rm th}$ of the inverter is lowest and the conductance ratio between DR1 and AC1 is smallest. As for the write margin, the logical $V_{\rm th}$ of the inverter and the voltage of V1 are important when V1 is already "H" and "L" is then being written. The write margin gets worse at a lower logical $V_{\rm th}$ of the inverter and a larger conductance ratio between LD1 and AC1. Figure 4(a), called a butterfly plot, illustrates the read margin in a memory cell at n = 0 (no $V_{\text{th}}$ variation) and n = 3. Figure 4(b) shows the definition of the write margin [10]. The read and write margins worsen if $V_{\rm th}$ variation occurs, which raises $V_{min}$.

Table 1 Voltage controls in the conventional and proposed SRAM.

| Part | Conv. | Proposed: Read | Proposed: Write | Proposed: Non-access |
|---------------------|-------|------|-------|------|
| Peripheral Vdd      | Va    | Va   | Va    | Va   |
| BL pre-charge level | Va    | Va   | Va    | Va   |
| WL swing            | Va    | Va   | Vmax  | -    |
| MC Vdd              | Va    | Vmax | Va    | Va   |
| MC Vbp              | Va    | Vmax | Vmax  | Vmax |

Vmax: maximum supply voltage. Va: variable supply voltage under DVS environment. BL: bitline. WL: wordline. MC: memory cell.

**Fig. 3** Schematics of a 6T memory cell, the worst-case conditions of local $V_{\rm th}$ variations, and voltage controls in the proposed scheme.

Figure 5 is a so-called milky-way plot of a memory cell, and demonstrates the operation margins at three $V_{\rm th}$ points (A-C). The diamond at the center of Fig. 5 shows the process corners (FF, FS, SF, SS, and CC). Global $V_{th}$ variation (wafer-to-wafer/lot-to-lot variation) is reflected in the process corners. The local $V_{\rm th}$ variation of $6\sigma_{V_{\rm th}}$ described above is also considered in this figure. Point (A) is outside the read limit, and there is no read margin, as shown in the bottom-left butterfly curves. Point (A) represents the situation in which no read margin can be obtained and an "L" on V1 will not be kept during a read operation, which is due to the lower logical $V_{\text{th}}$ of the inverter. Similarly, Point (B) is outside the write limit and the write operation will fail, since the voltage of V1 is much higher than 0 V when "L" is written, which is caused by the larger conductance ratio between LD1 and AC1. Both read and write operations will pass at Point (C), where the logical $V_{\rm th}$ of the inverter is adequately high and the conductance ratio between LD1 and AC1 is much smaller. In the region between the read and write limit lines, a memory cell works properly.

Figures 6(a) and (b) show the detailed milky-way plots in the conventional and proposed optimum voltage control schemes. In the conventional SRAM, the margins become smaller as $V_a$ is decreased. In particular, at a $V_a$ of 0.6 V, there is neither read nor write margin. On the contrary, the proposed scheme ensures sufficient read and write margins



**Fig. 4** (a) Read margins and (b) write margins at n = 0 (no  $V_{\text{th}}$  variation) and n = 3.



**Fig. 5** Milky-way plot of a memory cell when  $V_{dd}$ =1.0 V. At Point (A), there is no read margin while there is at Point (C). Write margin exists at Point (C) while it does not at Point (B).



**Fig. 6** Milky-way plots of (a) the conventional memory cell, and (b) the proposed one using optimum voltage control scheme.

even if $V_a$ is low, and the margins rather become larger as $V_a$ is reduced. The expansion of the read margin is due to a lower voltage of V1 when V1 is "L." The gate voltages of DR1 and AC1 are $V_{max}$ and $V_a$, respectively; hence the conductance ratio between DR1 and AC1 is larger. Similarly, in the write operation of "L," the voltage of V1 is sufficiently low since the conductance ratio between LD1 and AC1 is small, which is attributed to the gate voltage of AC1 ( $V_{max}$ ). This is why the write margin widens when $V_a$ is low.

### 2.3 Circuit Design

Figure 7 illustrates a block diagram of the proposed SRAM to which the optimum voltage control scheme is applied. In the proposed SRAM, the memory cell array is divided into 64 blocks so that one block has 128 words by 8 bits. The voltage controls are done on a block-by-block basis since the $V_{dd}$ lines in the memory cells run along the bitlines (BLs) [11], unlike the row-by-row $V_{dd}$ control [12]. $V_{dd}$ selectors are implemented in order to change the voltages ($V_{\rm mc}$ and the WL voltage). A $V_{dd}$ selector consists of p-channel MOSFET switches, and the total channel width of the pMOSFET switches is set to 20 times the total channel width of the MOSFETs in a memory cell. This value minimizes the $V_{dd}$ switching time. If the switch width is small, the $V_{dd}$ switching time becomes longer. On the other hand, if the switch width is too large, it takes a longer time to drive the gate of the switch.



**Fig. 7** Block diagram of the proposed 64-kb SRAM with the optimum voltage control scheme. The control signals WE, SE, and PC\_n are input to the write circuit, sense amplifier, and bitline pre-charger, respectively. These signals are also utilized as control signals for other logic gates.



**Fig. 8** Butterfly plots when $V_a=0.4$ V and the WL voltage is varied.

Level shifters are introduced just after the X decoders to amplify the WL voltages to $V_{\text{max}}$ in a write cycle. The output voltage of the $V_{dd}$ selector for WLs ( $V_{wl}$ ) is provided to an AND gate of GWL and Clwl, which controls the local WL (LWL) voltages. Figure 8 demonstrates the read margins in a write cycle when $V_a$ is fixed to 0.4 V and the voltage on a WL is varied. No margin can be obtained if the WL voltage is $V_{\text{max}}$ (1.0 V). This means that, under the write condition, data stored in other blocks would be destroyed if the conventional single-WL scheme were used. Therefore, the divided-WL structure [13], which can hierarchically access a local WL, is applied in the proposed SRAM in order to ground the WL voltage in non-accessed memory cells. In addition, the channel length of the access MOSFETs (AC1 and AC2 in Fig. 3) is set to 0.15 $\mu$m, which is longer than the minimum rule of 0.10 $\mu$m. This not only improves the read margin but also reduces the BL leakage current with a marginal delay overhead. The delay overhead is discussed in Sect. 3.3.

### 2.4 Self-Aligned Timing Control

A proper sequence of the voltage controls is important to avoid unexpected flips in memory cells. As pointed out in the previous subsection, when a cycle changes from a write (or non-access) to a read operation, a destructive read might occur because the WL voltage gets higher than $V_{mc}$. To resolve this issue, a timing adjustment between the WL voltage and $V_{mc}$, shown in Fig. 9(a), is necessary. Moreover, as shown in Fig. 9(b), another sequence, in which the WE signal is negated after the WL voltage is grounded, is required when a write operation is concluded. Otherwise, the bitline voltages would be floating while the WL voltage is high, and wrong data would be written to memory cells.

In order to secure these sequences, a self-aligned timing control is implemented with a dummy WL and its feedback (see Fig. 7). $C_{vmc}$ in Fig. 7 is a control signal for the memory-cell power supply ( $V_{mc}$ ), and the $V_{dd}$ selector for memory cells chooses $V_{max}$ if $C_{vmc}$ is grounded. $C_{vwl}$ is a control signal for the local WL voltage ( $V_{wl}$ ), and $V_{wl}$ is raised to $V_{max}$ when $C_{vwl}$ is pulled down. At the beginning of a



Fig. 9 Timing chart of voltage controls.

read cycle, when a row and a column are selected, the control signals RS·Read and CS are pulled up and $C_{vmc}$ is grounded, which results in selecting $V_{max}$ for $V_{mc}$. Then a local WL (LWL) is pulled up to $V_a$ after $C_{vmc}$ and the global WL (GWL) are activated, which secures the timing sequence in Fig. 9(a). We confirmed in a simulation that the time required for the voltage switching is negligible compared to the cycle time because of the small capacitance of the memory-cell block power supply. In contrast, after data are written to cells, the clock signal is pulled down and the dummy WL starts falling. The WE signal is deactivated after receiving the feedback signal of the dummy WL. The use of the dummy WL guarantees the timing sequence in Fig. 9(b) against PVT fluctuation.
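The two ordering constraints of Fig. 9 can be expressed as simple checks on event times. The sketch below is a behavioral illustration, not the circuit itself: the event names, the delay values, and the assumption that the dummy WL tracks the real WL exactly are all hypothetical. It shows why deriving the WE deactivation from the dummy-WL feedback keeps the write-exit order intact when all delays scale together under PVT variation.

```python
def occurs_before(events, first, second):
    """True if event `first` happens strictly before event `second`."""
    times = dict(events)
    return times[first] < times[second]

def read_entry_ok(events):
    # Fig. 9(a): V_mc must reach V_max before the local WL rises.
    return occurs_before(events, "vmc_at_vmax", "lwl_rise")

def write_exit_ok(events):
    # Fig. 9(b): the WL must be grounded before WE is negated.
    return occurs_before(events, "wl_grounded", "we_negated")

def self_aligned_write_exit(wl_fall_delay, feedback_delay):
    """WE is negated only after the dummy-WL feedback arrives (assumed model)."""
    wl_grounded = wl_fall_delay               # dummy WL assumed to track the real WL
    we_negated = wl_grounded + feedback_delay  # feedback path adds a positive delay
    return [("wl_grounded", wl_grounded), ("we_negated", we_negated)]

# Scale every delay by a common factor to mimic slow and fast PVT conditions.
results = [
    write_exit_ok(self_aligned_write_exit(1.0 * scale, 0.2 * scale))
    for scale in (0.5, 1.0, 5.0)
]

# A fixed-delay WE (not derived from the dummy WL) can violate the order
# when the WL falls more slowly than expected.
fixed_delay_case = write_exit_ok([("wl_grounded", 5.0), ("we_negated", 1.2)])
```

Because the WE edge is generated from the dummy-WL edge in this model, the order holds for any positive feedback delay, whereas a fixed timing margin does not.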

## 3. Simulation and Measurement Results

### 3.1 Chip Implementation

A 64-kb SRAM test chip was designed and fabricated in a 90-nm CMOS process technology to verify the feasibility of the proposed scheme. Figure 10 shows a micrograph of the test chip and a layout view of a memory-cell block. The area overhead of the proposed SRAM is 5.6%, which is mainly caused by the $V_{dd}$ selectors and level shifters.

### 3.2 Fail-Bit Count

Figure 11(a) demonstrates, by means of a measured fail-bit count (FBC) in a read operation, that the proposed scheme improves the operation margins. $V_{min}$ is limited by the read operation in our design, since the process corners are closer to the read limit curve in the milky-way plot, as indicated in Fig. 6(a). The clock-cycle time in this measurement is as slow as 1 $\mu$s in order to evaluate $V_{min}$, which is 0.55 V in the conventional SRAM, while that of the proposed one is as low as 0.3 V at the CC process corner. In the conventional scheme, the $V_{th}$ variation of the memory cells governs $V_{min}$. In the proposed scheme, however, the value of 0.3 V represents the lower limit of the peripheral circuits, since the memory cells have a larger margin as $V_a$ is decreased, as already shown in Fig. 6(b). In



Fig. 10 Chip micrograph and layout view of a memory-cell block.



Fig. 11 (a) Measured FBC, and (b) simulated BER when a process corner and memory capacity are varied.

the proposed SRAM, the GWL level shifter does not work below 0.3 V.

**Fig. 12** *P*-*f* curves in (a) 90-nm 64-kb SRAM and (b) various capacities and process technologies.

Note that in the conventional read/write operation, $V_{min}$ gets higher at the FS corner, as shown in Fig. 11(b), because the FS corner is closest to the read limit line in the milky-way plot. In addition, the bit error rate in data retention should be considered because the transistor $V_{th}$ of the cross-coupled inverters in a memory cell gets unbalanced as the $V_{\rm th}$ variation increases. Thus, the stored data cannot be kept even if the memory cells are not accessed. The simulated curves in the conventional read/write operation and data retention are obtained as follows:

- Minimum operation voltages of memory cells are estimated at a bit error rate (BER) defined as the inverse of the capacity. The coefficient *n* in Fig. 3 is set to 2.17 for 64-kb, 2.52 for 2-Mb, and 2.83 for 64-Mb SRAMs.
- At the FS corner, the read margin is degraded further since $\sigma_{V_{\text{th}}}$ of the load pMOSFETs, as expressed in (1), is larger due to a higher channel dopant concentration, *N*. This makes the BER-$V_{\text{a}}$ curve more gradual and raises $V_{\text{min}}$.
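The text does not state how the coefficient *n* is obtained from the capacity. One reading that reproduces the quoted values to within about 0.01 is sketched below, purely as an assumption: the cell-level variation is taken as normally distributed, a BER of 1/capacity is treated as a two-sided tail probability, and the resulting total deviation is divided by $\sqrt{4}$ for the four transistors (consistent with n = 3 corresponding to $6\sigma_{V_{th}}$). The function name `coefficient_n` is hypothetical.

```python
from statistics import NormalDist

def coefficient_n(capacity_bits, transistors=4):
    """Per-transistor sigma multiple n for a BER of 1/capacity (assumed model)."""
    one_sided_tail = 1.0 / (2.0 * capacity_bits)   # two-sided BER split into two tails
    total_sigma = -NormalDist().inv_cdf(one_sided_tail)
    return total_sigma / transistors ** 0.5        # sqrt(4) = 2, so n = 3 maps to 6 sigma

capacities = {
    "64 kb": 64 * 1024,
    "2 Mb": 2 * 1024 ** 2,
    "64 Mb": 64 * 1024 ** 2,
}
estimated_n = {name: coefficient_n(bits) for name, bits in capacities.items()}
```

Under this assumed model, a larger capacity requires a larger *n*, which is why the worst-case margins, and hence $V_{min}$, tighten as the memory grows.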

If the memory capacity is assumed to be 64 Mb, the conventional scheme does not work below 0.79 V in a 90-nm process technology, which hinders the DVS advantages. On the contrary, $V_{min}$ of the proposed scheme in a 64-Mb SRAM can be as low as 0.36 V, which is restricted by the data-retention characteristics. The BER in a 65-nm process technology is also shown in Fig. 11(b). Since the local $V_{th}$ variation becomes larger as process technology is scaled down, as illustrated in Fig. 1, $V_{min}$ becomes as high as 0.95 V in the conventional 65-nm 64-Mb SRAM, which means that there is only 0.05 V of room to change the supply voltage, and thus DVS cannot function under that condition.
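The available DVS range for each 64-Mb case follows directly from the quoted $V_{min}$ values. The short sketch below tabulates it, assuming an upper limit of $V_{max}$ = 1.0 V in every case (stated for the 90-nm process and consistent with the 0.05-V room quoted for the 65-nm case); the dictionary keys and function name are hypothetical labels.

```python
V_MAX = 1.0  # assumed upper supply limit in volts

# V_min values quoted in the text for 64-Mb SRAMs.
v_min_64mb = {
    ("90 nm", "conventional"): 0.79,
    ("90 nm", "proposed"): 0.36,
    ("65 nm", "conventional"): 0.95,
}

def dvs_room(v_min, v_max=V_MAX):
    """Voltage range over which the supply can be scaled."""
    return v_max - v_min

rooms = {case: round(dvs_room(v), 2) for case, v in v_min_64mb.items()}

for (node, scheme), room in rooms.items():
    print(f"{node:>5} {scheme:<12} DVS room = {room:.2f} V")
```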

### 3.3 Power-versus-Frequency Characteristics

Figure 12(a) shows the dependence of power on the operating frequency (*P*-*f* curves) in the designed 90-nm 64-kb SRAM. $V_a$ is implicitly adjusted according to the operating frequency. The performance penalty of applying the proposed scheme is less than 1% when $V_a$ is 1.0 V. $V_a$ must not be lowered below 0.66 V in the conventional scheme, which results in a higher power at frequencies below 300 MHz. The proposed scheme allows lower-voltage operation at lower frequencies, and we confirmed by measurement that the 100-MHz 0.45-V operation has an advantage of a 30% power reduction compared with the conventional scheme. As indicated in Fig. 12(b), the power savings are higher as the memory capacity increases and the process technology is scaled down, since $V_{min}$ of the conventional scheme becomes higher in such situations. A power saving of 57% in a 90-nm 64-Mb SRAM and 74% in a 65-nm 64-Mb SRAM can be achieved at 1/6 of the maximum operating frequency.
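For orientation, the voltage dependence can be compared with the textbook first-order relation for dynamic power, $P \propto C V^2 f$. The sketch below applies that relation to the two supply voltages quoted for 100-MHz operation. It is a rough estimate under stated assumptions, not the measurement: it treats all switched capacitance as supplied from $V_a$ and ignores leakage and the parts of the proposed SRAM that are driven from $V_{max}$.

```python
def dynamic_power_ratio(v_low, v_ref):
    """First-order dynamic power ratio at equal capacitance and frequency."""
    return (v_low / v_ref) ** 2

# Supply voltages quoted in the text for 100-MHz operation.
V_PROPOSED = 0.45      # proposed scheme
V_CONVENTIONAL = 0.66  # lower limit of the conventional scheme

ratio = dynamic_power_ratio(V_PROPOSED, V_CONVENTIONAL)
ideal_saving = 1.0 - ratio
print(f"ideal V^2 saving: {ideal_saving:.1%}")
```

This idealized figure is about 53%, which is larger than the measured 30% reduction; the simple model leaves out the circuits that stay at $V_{max}$ in the proposed scheme as well as leakage power.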

## 4. Conclusions

This paper presented an optimum voltage control scheme for memory cells and a self-aligned timing control method under a DVS environment, which expand the read and write operation margins and allow operation at as low as 0.3 V in a 90-nm SRAM. The proposed scheme achieves a power saving of 30% at 100 MHz in a 90-nm 64-kb SRAM, and a power reduction of 74% at 1/6 of the maximum frequency in a 65-nm 64-Mb SRAM. Since the proposed scheme keeps $V_{min}$ much lower as memory capacity increases and process technology is scaled down, DVS remains effective even on a future memory-rich SoC.

## Acknowledgments

The VLSI chip in this study was fabricated through the chip fabrication program of VLSI Design and Education Center (VDEC), the University of Tokyo, with the collaboration of STARC, Fujitsu Limited, Matsushita Electric Industrial Company Limited, NEC Electronics Corporation, Renesas Technology Corporation, and Toshiba Corporation. The authors would like to thank Dr. K. Kobayashi of Kyoto University and Kyoto VDEC Sub-Center for measuring the test chips.

## References

- [1] H. Ohira, K. Kawakami, M. Kanamori, Y. Morita, M. Miyama, and M. Yoshimoto, "A feed-forward dynamic voltage control algorithm for low power MPEG4 on multi-regulated voltage CPU," IEICE Trans. Electron., vol.E87-C, no.4, pp.457–465, April 2004.
- [2] M.J.M. Pelgrom, A.C.J. Duinmaijer, and A.P.G. Welbers, "Matching properties of MOS transistors," IEEE J. Solid-State Circuits, vol.24, no.5, pp.1433–1440, Oct. 1989.
- [3] P.A. Stolk, F.P. Widdershoven, and D.B.M. Klaassen, "Modeling statistical dopant fluctuations in MOS transistors," IEEE Trans. Electron Devices, vol.45, no.9, pp.1960–1971, Sept. 1998.
- [4] International Technology Roadmap for Semiconductors 2005, http://www.itrs.net/Common/2005ITRS/Home2005.htm
- [5] M. Yamaoka, N. Maeda, Y. Shinozaki, Y. Shimazaki, K. Nii, S. Shimada, K. Yanagisawa, and T. Kawahara, "90-nm process-variation adaptive embedded SRAM modules with power-line-floating write technique," IEEE J. Solid-State Circuits, vol.41, no.3, pp.705–711, March 2006.
- [6] K. Takeda, Y. Hagihara, Y. Aimoto, M. Nomura, Y. Nakazawa, T. Ishii, and H. Kobatake, "A read-static-noise-margin-free SRAM cell for low-VDD and high-speed applications," IEEE J. Solid-State Circuits, vol.41, no.1, pp.113–121, Jan. 2006.
- [7] K. Zhang, U. Bhattacharya, Z. Chen, F. Hamzaoglu, D. Murray, N. Vallepalli, Y. Wang, B. Zheng, and M. Bohr, "A 3-GHz 70-Mb SRAM in 65-nm CMOS technology with integrated column-based dynamic power supply," IEEE J. Solid-State Circuits, vol.41, no.1, pp.146–151, Jan. 2006.

- [8] F. Tachibana and T. Hiramoto, "Re-examination of impact of intrinsic dopant fluctuations on SRAM static noise margin," Proc. Int. Conf. on Solid State Devices and Materials, pp.192–193, Sept. 2004.
- [9] Y. Tsukamoto, K. Nii, S. Imaoka, Y. Oda, S. Ohbayashi, T. Yoshizawa, H. Makino, K. Ishibashi, and H. Shinohara, "Worst-case analysis to obtain stable read/write DC margin of high density 6T-SRAM-array with local V<sub>th</sub> variability," Proc. Int. Conf. on Computer Aided Design, 5A.2, Nov. 2005.
- [10] T. Douseki and S. Mutoh, "Static-noise margin analysis for a scaled-down CMOS memory cell," IEICE Trans. Electron. (Japanese Edition), vol.J75-C-II, no.7, pp.350–361, July 1992.
- [11] K. Osada, J.L. Shin, M. Khan, Y. Liou, K. Wang, K. Shoji, K. Kuroda, S. Ikeda, and K. Ishibashi, "Universal-V<sub>dd</sub> 0.65-2.0-V 32-kB cache using a voltage-adapted timing-generation scheme and a lithographically symmetrical cell," IEEE J. Solid-State Circuits, vol.36, no.11, pp.1738–1744, Nov. 2001.
- [12] F.R. Saliba, H. Kawaguchi, and T. Sakurai, "Experimental verification of row-by-row variable VDD scheme reducing 95% active leakage power of SRAM's," IEEE Symp. VLSI Circuits, pp.162–165, June 2005.
- [13] M. Yoshimoto, K. Anami, H. Shinohara, T. Yoshihara, H. Takagi, S. Nagao, S. Kayano, and T. Nakano, "A divided word-line structure in the static RAM and its application to a 64K full CMOS RAM," IEEE J. Solid-State Circuits, vol.18, no.5, pp.479–485, Oct. 1983.



**Yasuhiro Morita** received the M.E. degree in electronics and computer science from Kanazawa University, Ishikawa, Japan, in 2005. He is currently working in the doctoral course at the same university. His current research interests include high-performance and low-power multimedia VLSI designs. Mr. Morita is a student member of IEEE.



**Hidehiro Fujiwara** received the B.E. degree in Computer and Systems Engineering from Kobe University, Hyogo, Japan, in 2005. He is currently working in the M.E. course at the same university. His current research is high-performance and low-power SRAM designs.



**Hiroki Noguchi** received the B.E. degree in Computer and Systems Engineering from Kobe University, Hyogo, Japan, in 2006. He is currently working in the M.E. course at the same university. His current research is high-performance and low-power SRAM designs.



**Kentaro Kawakami** received the B.E. degree in electrical and information engineering in 2002 and the M.E. degree in electronic and information system in 2004 from Kanazawa University, Ishikawa, Japan. He transferred from Kanazawa University to Kobe University in 2005. He is currently a Ph.D. candidate at Kobe University, Kobe, Japan. His research interests include low power circuits, motion video codec and LSI design methodology.



**Junichi Miyakoshi** received the B.E. degree in electrical and information engineering in 2002 and the M.E. degree in electronics and computer science from Kanazawa University, Ishikawa, Japan, in 2004. He is currently a Ph.D. candidate at Kobe University, Kobe, Japan. His current research interests include high-performance and low-power multimedia VLSI designs.



**Shinji Mikami** received the B.E. degree in electrical and information engineering in 2002 and the M.E. degree in electronic and information system in 2004, both from Kanazawa University, Ishikawa, Japan. He is currently a Ph.D. candidate at the same university. Currently, he is involved in a research project of Kobe University to develop an ultra-low-power wireless-network sensor node. His research interests include low-power RF circuit designs, media access controls and routing for sensor networks.



**Koji Nii** was born in Tokushima, Japan, in 1965. He received the B.E. and M.E. degrees in electrical engineering from Tokushima University, Tokushima, Japan, in 1988 and 1990, respectively. In 1990, he joined the ASIC Design Engineering Center, Mitsubishi Electric Corporation, Itami, Japan, where he has been working on designing embedded SRAMs for advanced CMOS logic process. In 2003, Renesas Technology made a start. He currently works on the research and development of 45nm Embedded SRAM in the Design Technology Div., Renesas Technology Corp. Also, he is currently a doctoral student of Kobe University, Hyogo, Japan. Mr. Nii is a member of the IEEE Solid-State Circuits Society, and Electron Devices Society.



**Hiroshi Kawaguchi** received the B.E. and M.E. degrees in electronic engineering from Chiba University, Chiba, Japan, in 1991 and 1993, respectively. He received the Ph.D. degree in engineering from the University of Tokyo, Tokyo, Japan, in 2006. He joined Konami Corporation, Kobe, Japan, in 1993, where he developed arcade entertainment systems. He moved to the Institute of Industrial Science, the University of Tokyo, as a Technical Associate in 1996, and was appointed to be a Research Associate in 2003. Since 2005, he has been a Research Associate in the Department of Computer and Systems Engineering, Kobe University, Kobe, Japan. He is also a Collaborative Researcher in the Institute of Industrial Science, the University of Tokyo. He is a recipient of the IEEE ISSCC 2004 Takuo Sugano Award for Outstanding Far-East Paper. He has served as a program committee member for IEEE Symposium on Low-Power and High-Speed Chips (COOL Chips). He is a guest associate editor of IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences. His current research interests include low-power VLSI design, hardware design for wireless sensor network, and recognition processor. Dr. Kawaguchi is a member of the IEEE and ACM.



**Masahiko Yoshimoto** received the B.S. degree in electronic engineering from Nagoya Institute of Technology, Nagoya, Japan, in 1975, and the M.S. degree in electronic engineering from Nagoya University, Nagoya, Japan, in 1977. He received a Ph.D. degree in Electrical Engineering from Nagoya University, Nagoya, Japan in 1998. He joined the LSI Laboratory, Mitsubishi Electric Corp., Itami, Japan, in April 1977. From 1978 to 1983 he was engaged in the design of NMOS and CMOS static RAM including a 64 K full CMOS RAM with the world's first divided-word-line structure. From 1984, he was involved in research and development of multimedia ULSI systems for digital broadcasting and digital communication systems based on MPEG2 and MPEG4 Codec LSI core technology. Since 2000, he has been a Professor of the Dept. of Electrical and Electronic Systems Engineering at Kanazawa University, Japan. Since 2004, he has been a Professor of the Dept. of Computer and Systems Engineering at Kobe University, Japan. His current activity is focused on research and development of multimedia and ubiquitous media VLSI systems including an ultra-low-power image compression processor and a low power wireless interface circuit. He holds 70 registered patents. He served on the Program Committee of the IEEE International Solid State Circuit Conference from 1991 to 1993. In addition, he has served as a Guest Editor for special issues on Low-Power System LSI, IP, and Related Technologies of IEICE Transactions in 2004. He received the R&D100 awards from R&D Magazine for development of the DISP and development of a realtime MPEG2 video encoder chipset in 1990 and 1996, respectively.