Advanced CMOS Cell Design


Author: Etienne Sicard | Sonia Delmas Bendhia



Authors’ Profiles

Etienne Sicard is currently professor at the National Institute of Applied Sciences (INSA) of Toulouse, Department of Electrical and Computer Engineering. He has been a visiting professor at the Department of Electronics, Carleton University, Ottawa, Canada, since 2004. Prior to this, he was Professor of Electronics at the Department of Physics, University of Balearic Islands, Spain. Etienne received his BS (1984) and PhD (1987) in Electrical Engineering from the University of Toulouse while working in the laboratory LAAS of Toulouse. Upon being awarded the Monbusho scholarship he worked at Osaka University during 1988-89. His research interests include several aspects of CAD tools for the design of integrated circuits, including signal integrity in deep sub-micron CMOS ICs and electromagnetic compatibility. Etienne is the author of several books, and of software for micro-electronics (Microwind, Dsch) and speech therapy (Vocalab). He also has to his credit several technical papers on electromagnetic compatibility of CMOS integrated circuits. A member of the French SEE and the IEEE EMC society, Etienne was elected, in 2006, the distinguished IEEE lecturer for EMC of IMCs. He can be reached at: [email protected]

Sonia Delmas Bendhia is Assistant Professor at INSA-Toulouse, Department of Electrical and Computer Engineering, where she teaches digital electronics, IC testing and reliability, analog and RF CMOS design. She is on the INSA Studies Directorate Board that organizes transversal educational courses, and is also in charge of promoting IT in teaching. Sonia holds an engineering diploma (1995) and a PhD (1998) in Electronic Design from INSA, Toulouse, France. Her research interests include signal integrity in deep sub-micron CMOS ICs and electromagnetic compatibility of ICs. She has authored several technical papers on signal integrity and EMC, and has also contributed to three books. She can be reached at [email protected]

Copyright © 2007 by The McGraw-Hill Companies, Inc. Click here for terms of use.

Advanced CMOS Cell Design

Etienne Sicard Professor INSA Electronic Engineering School of Toulouse, France

Sonia Delmas Bendhia Assistant Professor INSA Electronic Engineering School of Toulouse, France

McGraw-Hill New York Chicago San Francisco Lisbon London Madrid Mexico City Milan New Delhi San Juan Seoul Singapore Sydney Toronto

To Vinay, Bhupesh, Brijesh and Tarun


Contents

Preface
Acknowledgments
Abbreviations and Symbols

1. Technology Scale-down
   1.1 Recent Trends in CMOS Technology
   1.2 Introducing the 90 nm Technology
   References

2. Embedded Memories
   2.1 The World of Memory
   2.2 RAM Memory
   2.3 RAM Array
   2.4 Dynamic RAM Memory
   2.5 EEPROM
   2.6 Flash Memories
   2.7 Ferroelectric RAM Memories
   2.8 Memory Interface
   References
   Exercises

3. A Very-Simple-Microprocessor
   3.1 Introduction
   3.2 Instructions
   3.3 Program Memory
   3.4 Executing Instructions
   3.5 Basic Block Design
   3.6 Conclusion
   References
   Exercises

4. Field-Programmable Gate Array
   4.1 Introduction
   4.2 Configurable Logic Circuits
   4.3 Programmable Logic Block
   4.4 Interconnection Between Blocks
   4.5 Conclusion
   References
   Exercises

5. Radio-Frequency Circuits
   5.1 Target Radio-Frequencies
   5.2 Inductor
   5.3 Power Amplifier
   5.4 Oscillators
   5.5 Phase-Lock Loop
   5.6 Frequency Converter
   5.7 Sub-sampling Frequency Converter
   5.8 Conclusion
   References
   Exercises

6. Converters and Sensors
   6.1 Introduction
   6.2 Digital-Analog Converter Architectures
   6.3 Sample and Hold Circuits
   6.4 Analog-Digital Converter Architectures
   6.5 Temperature Sensor
   6.6 Image Sensors
   6.7 Conclusion
   References
   Exercises

7. Input/Output Interfacing
   7.1 Power Supply
   7.2 The Bonding Pad
   7.3 The Pad Ring
   7.4 Input Structures
   7.5 Digital Output Structures
   7.6 Pull-up, Pull-down
   7.7 Low Voltage Differential Swing
   7.8 Power Clamp
   7.9 Core/Pad Limitation
   7.10 I/O Pad Description Using IBIS
   7.11 Connecting to the Package
   7.12 Signal Propagation Between Integrated Circuits
   7.13 Conclusion
   References
   Exercises

8. Silicon on Insulator
   8.1 Introduction
   8.2 SOI Technology Issues
   8.3 SOI Device Model
   8.4 SOI Design
   8.5 The Tera-Hertz MOS Device
   8.6 Conclusion
   References
   Exercises

9. Future and Conclusion
   9.1 Predicting the Unpredictable
   9.2 Conclusion
   References

Appendix A: Design Rules
   Lambda Units
   Layout Design Rules
   Pads
   Electrical Extraction Principles
   Node Capacitance Extraction
   Resistance Extraction
   Simulation Parameters
   Technology Files for DSCH

Appendix B: MICROWIND31 Program Operation and Commands
   Getting Started
   List of Commands in MICROWIND31

Appendix C: DSCH31 Logic Editor Operation and Commands
   Getting Started
   Commands

Appendix D: Quick Reference Sheet
   MICROWIND31 Menus
   MICROWIND3.1 Simulation Menu
   DSCH3.1 Menus
   Silicon Tool
   List of Files
   File Organization

Appendix E: Interface to WinSpice
   About WinSpice3
   SPICE Syntax
   Generate a SPICE File with DSCH3.1
   Generate a SPICE File with MICROWIND3.1
   References

Glossary

Index

Preface

Our first book, Basics of CMOS Cell Design, covered integrated circuit technology scale-down, the MOS device model, layout and performance perspectives. It also included an extensive study of basic gates, interconnect and analog cells. We introduced basic cell design and simulation using user-friendly educational tools, Microwind and Dsch, developed by us. Advanced CMOS Cell Design takes the discussion further and illustrates how Microwind and Dsch versions 3.1 can be used to solve design problems.

The book begins with an introduction to novel concepts in nano-scale technology, with a focus on the 90 nm CMOS generation. In Chapter 2, various kinds of memories are discussed. Chapter 3 uses a project to explain microprocessor architecture at the logic level. We would like to reiterate that this chapter would not have been possible without the able assistance and guidance of Dr Mahfuz Aziz. The subject of Chapter 4 is field-programmable gate arrays, described at the switch level. In Chapter 5, RF analog cells are described, including extensive details of mixers, voltage-controlled oscillators, phase-lock loops and power amplifiers. The focus of Chapter 6 is on the principles of analog-to-digital and digital-to-analog converters; the chapter also introduces CMOS sensors. Input-output interfacing principles are detailed in Chapter 7, including an in-depth study of I/O structures and technology refinements. Silicon-on-insulator technology is described in Chapter 8. Appendix A explains design rules, while details of all Microwind and Dsch commands are provided in Appendices B and C respectively. A quick reference sheet of the companion tools is provided in Appendix D.

Students and practising electronic engineers will find this a useful reference to learn the practical aspects of CMOS cell design. We welcome feedback, suggestions for improvements, and comments on anything that could have been done better.

About Microwind and Dsch

The exercises and samples in this book require extensive use of the software tools Microwind and Dsch versions 3.1. A lite version of these tools, together with all schematic and layout files of the examples in the book, can be downloaded from: http://www.microwind.org The URL to download the latest version of these tools is http://www.microwind.net

ETIENNE SICARD [email protected] SONIA DELMAS BENDHIA [email protected]


Acknowledgments

Our special thanks to Bhupesh Purohit, Vinay Sharma and Brijesh Shah, the team from ni2designs, for putting in their best effort to promote the tools, Microwind and Dsch, as well as our two books (Basics of CMOS Cell Design and the current book, Advanced CMOS Cell Design). We thank the many reviewers of the tools, especially Charles Wagner, Joao Paulo Teixeira, Mahfuz Aziz, Saeed Dubas, Fernando Moraes, Gert Voland, Gerald Huguenin, Javier Garcia Zubia, Mario della Ragione, Ndubuisi Ekekwe, and S Natarajan. We also thank Marie-Agnes Detourbe for diligently reviewing the manuscript. Thanks are due to Salman Zaffar for introducing a discussion on microprocessors. The chapter on the Very Simple Microprocessor was significantly improved by Dr Mahfuz Aziz, who strongly supported the project and provided valuable comments and suggestions. We would also like to thank R Chandra Sekhar and Vibhor Kataria of Tata McGraw-Hill, India, for publishing this book. Our acknowledgments would not be complete without thanking our parents, colleagues and friends for their constant support.

ETIENNE SICARD
SONIA DELMAS BENDHIA


Abbreviations and Symbols

MULTIPLIERS

Value     Name     Standard Notation
10^18     EXA      E
10^15     PETA     P
10^12     TERA     T
10^9      GIGA     G
10^6      MEGA     M (MEG in SPICE)
10^3      KILO     K
10^0      –        –
10^-3     MILLI    m
10^-6     MICRO    u
10^-9     NANO     n
10^-12    PICO     p
10^-15    FEMTO    f
10^-18    ATTO     a
10^-21    ZEPTO    z

PHYSICAL CONSTANTS AND PARAMETERS

Name              Value                 Description
ε0                8.85e–12 Farad/m      Vacuum dielectric constant
εr SiO2           3.9 – 4.2             Relative dielectric constant of SiO2
εr Si             11.8                  Relative dielectric constant of silicon
εr ceramic        12                    Relative dielectric constant of ceramic
k                 1.381e–23 J/K         Boltzmann’s constant
q                 1.6e–19 Coulomb       Electron charge
µn                600 cm²/(V·s)         Mobility of electrons in silicon
µp                270 cm²/(V·s)         Mobility of holes in silicon
γ al              36.5 × 10^6 S/m       Aluminum conductivity
γ si              4 × 10^–4 S/m         Silicon conductivity
ni                1.02 × 10^10 cm^–3    Intrinsic carrier concentration in silicon at 300 K
ρ al              0.0277 Ω·µm           Aluminum resistivity
γ cu              58 × 10^6 S/m         Copper conductivity
ρ cu              0.0172 Ω·µm           Copper resistivity
ρ tungsten (W)    0.0530 Ω·µm           Tungsten resistivity
ρ gold (Au)       0.0220 Ω·µm           Gold resistivity
µ0                1.257e–6 H/m          Vacuum permeability
T                 300 K (27°C)          Operating temperature


Copyright © 2007 by The McGraw-Hill Companies, Inc. All rights reserved. Manufactured in the United States of America. Except as permitted under the United States Copyright Act of 1976, no part of this publication may be reproduced or distributed in any form or by any means, or stored in a database or retrieval system, without the prior written permission of the publisher. 0-07-150905-4 The material in this eBook also appears in the print version of this title: 0-07-148836-7. All trademarks are trademarks of their respective owners. Rather than put a trademark symbol after every occurrence of a trademarked name, we use names in an editorial fashion only, and to the benefit of the trademark owner, with no intention of infringement of the trademark. Where such designations appear in this book, they have been printed with initial caps. McGraw-Hill eBooks are available at special quantity discounts to use as premiums and sales promotions, or for use in corporate training programs. For more information, please contact George Hoare, Special Sales, at [email protected] or (212) 9044069. TERMS OF USE This is a copyrighted work and The McGraw-Hill Companies, Inc. (“McGraw-Hill”) and its licensors reserve all rights in and to the work. Use of this work is subject to these terms. Except as permitted under the Copyright Act of 1976 and the right to store and retrieve one copy of the work, you may not decompile, disassemble, reverse engineer, reproduce, modify, create derivative works based upon, transmit, distribute, disseminate, sell, publish or sublicense the work or any part of it without McGraw-Hill’s prior consent. You may use the work for your own noncommercial and personal use; any other use of the work is strictly prohibited. Your right to use the work may be terminated if you fail to comply with these terms. 
THE WORK IS PROVIDED “AS IS.” McGRAW-HILL AND ITS LICENSORS MAKE NO GUARANTEES OR WARRANTIES AS TO THE ACCURACY, ADEQUACY OR COMPLETENESS OF OR RESULTS TO BE OBTAINED FROM USING THE WORK, INCLUDING ANY INFORMATION THAT CAN BE ACCESSED THROUGH THE WORK VIA HYPERLINK OR OTHERWISE, AND EXPRESSLY DISCLAIM ANY WARRANTY, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. McGraw-Hill and its licensors do not warrant or guarantee that the functions contained in the work will meet your requirements or that its operation will be uninterrupted or error free. Neither McGraw-Hill nor its licensors shall be liable to you or anyone else for any inaccuracy, error or omission, regardless of cause, in the work or for any damages resulting therefrom. McGraw-Hill has no responsibility for the content of any information accessed through the work. Under no circumstances shall McGraw-Hill and/or its licensors be liable for any indirect, incidental, special, punitive, consequential or similar damages that result from the use of or inability to use the work, even if any of them has been advised of the possibility of such damages. This limitation of liability shall apply to any claim or cause whatsoever whether such claim or cause arises in contract, tort or otherwise. DOI: 10.1036/0071488367


1 Technology Scale-down

This chapter describes the recent improvements in technology scale-down in terms of density and speed. It introduces the 90 nm technology.

1.1 Recent Trends in CMOS Technology

In this chapter, we give an updated overview of the evolution of important parameters such as Integrated Circuit (IC) complexity, gate length, switching delay and supply voltage, with a prospective vision down to the 22 nm CMOS technology. The book Basics of CMOS Cell Design [1] focused on 130 nm technology.

Recognizing a trend in IC complexity, Intel co-founder Gordon Moore extrapolated it to predict an exponential growth in the available memory and calculation speed of microprocessors. This, he said in 1965, would double every year [2]. With a slight correction (i.e. doubling every 18 months, see Fig. 1.1), Moore’s Law has held up to the Itanium® 2 processor, which has around 400 million transistors.

The trend of CMOS technology improvement continues to be driven by the need to integrate more functions into a given area of silicon. Table 1.1 gives an overview of the key parameters for technological nodes from 180 nm, introduced in 1999, down to 22 nm, which is expected to be in production by around 2011. The physical gate length is slightly smaller than the technological node, as illustrated in Fig. 1.2.

The gate material has long been polysilicon, with silicon dioxide (SiO2) as the insulator between the gate and the channel. The atom is a convenient measuring stick for the insulating material beneath the transistor gate. In 90 nm technology, the gate oxide consists of about five atomic layers, for a thickness of 1.2 nm. The thinner the gate oxide, the higher the transistor current and consequently the switching speed. The SiO2 oxide has been regularly scaled down over the last decade, but has reached a physical limit of five atoms with the 90 nm CMOS process. With the 45 nm technology, new materials such as metal gates together with high-permittivity oxides should be introduced.


Fig. 1.1 Moore’s law compared to Intel processor complexity from 1970 to 2005

At each lithography scaling, the linear dimensions are reduced by a factor of approximately 0.7 and the areas are reduced by a factor of two. Smaller cell sizes lead to a higher integration density. The density has thus risen from 100 kilogates per mm² for the 180 nm technology to almost one million gates per mm² in 45 nm technology (Table 1.1). In parallel, the size of a six-transistor memory point, such as those used in static RAM memories, passed below the 1 µm² limit with the introduction of the 65 nm technology. The IC market has been growing steadily for many years, due to an ever-increasing demand for electronic devices. The production of ICs for various technologies over the years is illustrated in Fig. 1.3.

Table 1.1 Technological evolution and forecast up to 2011

Technology node                   180 nm   130 nm   90 nm    65 nm    45 nm    32 nm    22 nm
First production                  1999     2001     2003     2005     2007     2009     2011
Gate length                       130 nm   70 nm    50 nm    35 nm    25 nm    17 nm    12 nm
Gate material                     Poly     Poly     Poly     Poly     Metal    Metal    Metal
Gate dielectric                   SiO2     SiO2     SiO2     SiON     High K   High K   High K
Atoms stacked on the gate oxide   10       8        5        5        5–10     5–10     5–10
kgates/mm²                        100      200      350      500      900      1500     3000
Memory point (µm²)                4.5      2.4      1.3      0.6      0.3      0.15     0.08
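The scaling rule stated above (linear dimensions × 0.7, area × 0.5 per generation) can be checked against the node names and gate densities of Table 1.1. The following is a minimal Python sketch, not part of the original text; the function and variable names are illustrative assumptions, and only the numbers come from the table.

```python
# Technology nodes (nm) and gate densities (kgates/mm2) copied from Table 1.1.
NODES_NM = [180, 130, 90, 65, 45, 32, 22]
KGATES_PER_MM2 = [100, 200, 350, 500, 900, 1500, 3000]


def scaling_steps(nodes, densities):
    """Return (from_node, to_node, linear_ratio, area_ratio, density_gain) per step."""
    steps = []
    for i in range(1, len(nodes)):
        linear = nodes[i] / nodes[i - 1]         # shrink of linear dimensions
        area = linear ** 2                       # shrink of cell area
        gain = densities[i] / densities[i - 1]   # increase in gates per mm2
        steps.append((nodes[i - 1], nodes[i], linear, area, gain))
    return steps


if __name__ == "__main__":
    for prev, new, linear, area, gain in scaling_steps(NODES_NM, KGATES_PER_MM2):
        print(f"{prev:>3} -> {new:>3} nm: linear x{linear:.2f}, "
              f"area x{area:.2f}, density x{gain:.2f}")
```

Each step gives a linear ratio close to 0.7 and an area ratio close to 0.5, while the tabulated density gain per generation varies between roughly 1.4 and 2.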


Fig. 1.2 The technology scale-down towards nano-scale devices

Fig. 1.3 Technology ramping every two years [3]

It can be seen that a new technology has appeared regularly every two years, with a ramp-up close to three years. The production peak has constantly increased, and similar trends are likely to be observed for novel technologies such as 65 nm (forecast peak in 2009). One very important trend associated with lithography scaling is the decrease in gate switching delay, as illustrated in Fig. 1.4. The IC speed is improved thanks to stronger currents capable of charging and discharging smaller parasitic capacitances. A constant increase in the device current is highly desirable but raises a number of important issues.

Fig. 1.4 The reduction in channel length leads to tremendous benefits in terms of gate switching delay

Let us recall a first-order approximation of the device current, given by Eq. 1.1:

$$I_{ds} = k \, \frac{\mu \, V_{DD}}{L \, t_{OX}} \qquad (1.1)$$

Fig. 1.5 The continuous decrease in supply voltages


As may be deduced from the expression, there are at least four efficient ways of increasing the transistor current capabilities:
• Increasing the supply voltage VDD (Fig. 1.5). Unfortunately, the supply voltage tends to follow the opposite trend, for low power consumption purposes. From 130 nm to 90 nm, the supply has been reduced from 1.5 to 1.2 V.
• Reducing the distance L between the drain and the source. Fortunately, the channel length is automatically scaled with the technology. In this first-order model, a scaling factor of 0.7 leads to an increase of about 43% in the absolute current (1/0.7 ≈ 1.43).
• Decreasing the oxide thickness tOX. The oxide thickness has been reduced from 1.8 nm (eight atoms) to 1.2 nm (five atoms). Unfortunately, the gate oxide leakage increases exponentially, which affects the parasitic leakage currents and consequently the standby consumption.
• Increasing the carrier mobility µ. This parameter was kept unchanged up to the 90 nm generation, which was the first to exploit the concept of strained silicon to enhance the carrier mobility. Finding mobility enhancement techniques is mandatory to maintain performance gain without deteriorating device leakage.
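The effect of each parameter in Eq. 1.1 can be explored numerically. The sketch below assumes the first-order form in which Ids is proportional to µ·VDD / (L·tOX) and works with ratios between two technology generations, so the constant k cancels. The function name and its arguments are illustrative assumptions; the example values (VDD from 1.5 to 1.2 V, tOX from 1.8 to 1.2 nm, L scaled by 0.7) are the ones quoted in the list above.

```python
def relative_current(vdd_ratio=1.0, mobility_ratio=1.0,
                     length_ratio=1.0, tox_ratio=1.0):
    """Ratio Ids_new / Ids_old under the first-order model of Eq. 1.1.

    Every argument is new_value / old_value for the corresponding parameter.
    """
    return (mobility_ratio * vdd_ratio) / (length_ratio * tox_ratio)


if __name__ == "__main__":
    # Channel length scaled by 0.7, everything else unchanged.
    print(f"L x0.7 only: x{relative_current(length_ratio=0.7):.2f}")

    # Combined move from 130 nm to 90 nm with the figures quoted in the text.
    combined = relative_current(vdd_ratio=1.2 / 1.5,
                                length_ratio=0.7,
                                tox_ratio=1.2 / 1.8)
    print(f"130 nm -> 90 nm (VDD, L, tOX): x{combined:.2f}")
```

This is a first-order estimate only: the lower supply voltage partly cancels the gain obtained from the shorter channel and thinner oxide, and leakage is not modeled at all.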

1.2 Introducing the 90 nm Technology

A complete industrial 90 nm process was introduced by Intel in 2003 [3]. With transistor channels around 50 nm in size (50 billionths of a meter), comparable to the smallest micro-organisms, this technology is truly a nanotechnology. The main novelty of the 90 nm technology is the introduction of strained silicon to enhance the carrier mobility. This boosts both the n-channel and p-channel transistor performances (Fig. 1.6). It has been known for decades that stretching the silicon lattice improves the carrier mobility, and consequently the device current.

Fig. 1.6 Strain generated by a silicon-nitride capping layer which increases the distance between atoms underneath the gate. This speeds up the electron mobility of n-channel MOS devices


Let us now focus on the silicon atoms forming a regular lattice structure inside which the electrons participating in the device current have to flow. In the case of electron carriers, stretching the lattice allows the charges to flow faster from the drain to the source, as depicted in Fig. 1.7. The mobility improvement exhibits a linear dependence with the tensile film thickness. An 80 nm film has resulted in a 10% saturation current improvement in Intel’s 90 nm technology [3]. The strain may also be applied from the bottom with a uniform layer of an alloy of silicon and germanium (SiGe).

Fig. 1.7 Compressive strain to reduce the distance between atoms underneath the gate, which speeds up the hole mobility of p-channel MOS devices

In a similar way, compressing the lattice slightly increases the speed of the p-type transistor, for which the current carriers are holes. The combination of reduced channel length, decreased oxide thickness and strained silicon yields a substantial gain in drive current for both nMOS and pMOS devices.

1.2.1 N-channel MOS Device Characteristics

Version 3.1 of the tool MICROWIND is configured in 90 nm technology by default. A cross-section of the n-channel and p-channel MOS devices is given in Fig. 1.8. The nMOS gate is capped with a specific silicon-nitride layer that induces lateral tensile channel strain for improved electron mobility. The I/V device characteristics of the low-leakage and high-speed MOS devices listed in Table 1.2 are obtained using the MOS model BSIM4 (see [1] for more information about this model). The device performances are close to those presented in [3]. The cross sections of the low-leakage and high-speed MOS devices (Fig. 1.8) do not reveal any major difference.

For the low-leakage MOS, the I/V characteristics reported in Fig. 1.9 demonstrate a drive current capability of around 0.6 mA for W = 0.5 µm, that is, 1.2 mA/µm at a supply voltage of 1.2 V. For the high-speed MOS, both the effective channel length and the threshold voltage are slightly reduced, to achieve an impressive drive current of around 1.5 mA/µm. The drawback of this high current drive is the leakage current, which rises from 60 nA/µm (low leakage) to 600 nA/µm (high speed), as seen in the Id/Vg curve for Vg = 0 V, Vb = 0 V (Fig. 1.10-b).

Table 1.2 nMOS parameters featured in the 90 nm CMOS technology provided in MICROWIND

Parameter            nMOS Low leakage   nMOS High speed
Drawn length         0.1 µm             0.1 µm
Effective length     60 nm              50 nm
Width                0.5 µm             0.5 µm
Threshold voltage    0.28 V             0.25 V
Ion (VDD = 1.2 V)    0.63 mA            0.74 mA
Ioff                 30 nA              300 nA

Fig. 1.8 Bird’s eye view and cross section of nMOS devices


Fig. 1.9 Id /Vd characteristics of the low-leakage and high-speed nMOS devices (W = 0.5 µm, L = 0.1 µm)

Fig. 1.10 Id /Vd characteristics (low scale) of the low-leakage and high-speed nMOS devices (W = 0.5 µm, L = 0.1 µm)


1.2.2 P-channel MOS Device Characteristics

Table 1.3 pMOS parameters featured in the 90 nm CMOS technology provided in MICROWIND

Parameter            pMOS Low leakage   pMOS High speed
Drawn length         0.1 µm             0.1 µm
Effective length     60 nm              50 nm
Width                0.5 µm             0.5 µm
Ion (VDD = 1.2 V)    0.35 mA            0.39 mA
Ioff                 21 nA              135 nA

Fig. 1.11 Cross section of the pMOS devices

The pMOS drive current in this 90 nm technology is as high as 700 µA/µm for low-leakage MOS and up to 800 µA/µm for high-speed MOS (Fig. 1.11). A novel silicon-germanium (SiGe) film induces compressive channel strain, which boosts the pMOS hole mobility. These values are particularly high, as the target applications for this technology at Intel are high-speed digital circuits such as microprocessors. The leakage current is around 40 nA/µm for low-leakage MOS and near 300 nA/µm for high-speed devices.
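The per-micron figures quoted in the text follow from the Ion and Ioff values of Tables 1.2 and 1.3 divided by the 0.5 µm device width. The short Python sketch below reproduces that normalization and also reports the Ion/Ioff ratio; the dictionary layout and function name are illustrative assumptions, and the numbers are copied from the two tables.

```python
WIDTH_UM = 0.5  # device width used in Tables 1.2 and 1.3

# (Ion in mA, Ioff in nA) at VDD = 1.2 V, copied from Tables 1.2 and 1.3.
DEVICES = {
    "nMOS low leakage": (0.63, 30),
    "nMOS high speed": (0.74, 300),
    "pMOS low leakage": (0.35, 21),
    "pMOS high speed": (0.39, 135),
}


def per_micron(ion_ma, ioff_na, width_um=WIDTH_UM):
    """Return (Ion in uA/um, Ioff in nA/um, Ion/Ioff ratio) for one device."""
    ion_ua_per_um = ion_ma * 1000.0 / width_um
    ioff_na_per_um = ioff_na / width_um
    on_off_ratio = (ion_ma * 1e-3) / (ioff_na * 1e-9)
    return ion_ua_per_um, ioff_na_per_um, on_off_ratio


if __name__ == "__main__":
    for name, (ion_ma, ioff_na) in DEVICES.items():
        ion, ioff, ratio = per_micron(ion_ma, ioff_na)
        print(f"{name:<17} Ion = {ion:6.0f} uA/um, "
              f"Ioff = {ioff:5.0f} nA/um, Ion/Ioff = {ratio:8.0f}")
```

The Ion/Ioff ratio makes the trade-off explicit: the high-speed devices gain a modest amount of drive current while giving up most of an order of magnitude in leakage.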


1.2.3 High Speed, General Purpose and Low Power Process Variants

The 90 nm process technology proposed in MICROWIND corresponds to the highest possible speed, at the cost of a very high leakage current. This technology variant is called “high speed” as it is dedicated to applications for which high speed is the primary objective: fast microprocessors, fast DSP, etc. The second technological option is “general purpose” (Fig. 1.12). This targets standard products where the speed factor is not critical. The leakage current is one order of magnitude lower than in the high-speed option, and the gate delay is increased by 50%, as seen in the parameters listed in Table 1.4. The “low power” variant concerns ICs for which the leakage must remain as low as possible, a criterion that ranks first in applications such as embedded devices, mobile phones and personal organizers. The gate delay is multiplied by three as compared to the high-speed variant, mainly due to thicker oxides and a larger gate length.

Fig. 1.12 Introducing three variants of the 90 nm technology

1.2.4 High-permittivity Dielectrics

The steady reduction in thickness of conventional oxides such as silicon dioxide (SiO2) results in reliability degradation and unacceptable current leakage. New dielectric materials (Table 1.5) with high permittivity (high-K) are needed to replace SiO2, both for the MOS device and the embedded capacitors. High-capacitance passive devices (known as Metal-Insulator-Metal, or MIM) are needed for various purposes including on-chip power supply decoupling, analog filtering for wireless applications and high-quality resonators for radio-frequency circuits. These capacitors should feature high reliability, low current leakage, low series resistance and low dielectric loss. They should also be fully compatible with the standard CMOS processes.


Table 1.4 The three classes of 90 nm CMOS technologies and comparative performances

Technology             High Speed          General Purpose                 Low Power
Typical applications   Fast µP, fast DSP   ASIC, microcontrollers, FPGA    Mobiles, embedded devices
VCC (V)                1.2                 1.0                             1.2
tox (nm)               1.2                 1.6                             2.2
Leff (nm)              50                  65                              80
VT (V)                 0.28                0.35                            0.50
Idsat_n (µA/µm)        1200                700                             500
Idsat_p (µA/µm)        700                 300                             200
Ioff (A/µm)            50n                 5n                              50p
Delay (ps/stage)       7                   12                              25

Table 1.5 New dielectric materials that may replace SiO2 in future technologies

Material   Description                  Relative Permittivity (εr)   Comments
HfO2       Hafnium oxide                20                           Proposed for 45 nm gate oxide
Ta2O5      Tantalum pentoxide           25                           High crystallization temperature. Reliability issues
NbxTa2O5   Niobium tantalum pentoxide   28                           Good candidate for MIM capacitor
SiOxNy     Silicon oxide nitride        5–7                          Used for 65 nm gate oxide
SiO2       Silicon dioxide              4                            Important ultra-thin film leakage

Both MOS devices and passives may benefit from high-K insulators. Concerning MOS devices, high-K dielectrics can be made thicker than SiO2 films to obtain the same equivalent channel effect, thereby reducing leakage. Concerning passives, the larger the permittivity, the larger the charge that can be stored in the memory capacitor, thus resulting in higher capacitance values. Alternatively, the same capacitance may require less silicon area with high-K insulators than with conventional SiO2. Typical values for the capacitance range from 2 to 20 fF/µm2.
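Both statements above follow from the parallel-plate relation C/A = ε0·εr / t. The sketch below is a hedged illustration, assuming an ideal parallel-plate capacitor with no interfacial layer or quantum effects; it uses ε0 = 8.85e–12 F/m and εr = 3.9 for SiO2 from the constants table, and εr = 20 for HfO2 from Table 1.5. The function names are hypothetical.

```python
EPS0_F_PER_M = 8.85e-12  # vacuum dielectric constant, from the constants table


def cap_per_area_ff_um2(eps_r, thickness_nm):
    """Parallel-plate capacitance per unit area, in fF/um2."""
    c_f_per_m2 = EPS0_F_PER_M * eps_r / (thickness_nm * 1e-9)
    return c_f_per_m2 * 1e3  # 1 F/m2 equals 1e3 fF/um2


def equivalent_thickness_nm(sio2_thickness_nm, eps_r_high_k, eps_r_sio2=3.9):
    """Physical high-K thickness giving the same capacitance as a SiO2 film."""
    return sio2_thickness_nm * eps_r_high_k / eps_r_sio2


if __name__ == "__main__":
    c_sio2 = cap_per_area_ff_um2(3.9, 1.2)
    t_hfo2 = equivalent_thickness_nm(1.2, 20)
    print(f"1.2 nm SiO2: {c_sio2:.1f} fF/um2")
    print(f"HfO2 film with the same capacitance: {t_hfo2:.2f} nm thick")
```

Under these assumptions, a 1.2 nm SiO2 gate oxide could be replaced by a HfO2 film roughly five times thicker with the same gate capacitance, which is why high-K materials reduce gate leakage.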


References

[1] E. Sicard and S. Bendhia, Basics of CMOS Cell Design, Tata McGraw-Hill, 2005, ISBN 0-07-059933-5.
[2] G.E. Moore, “Cramming more components onto integrated circuits”, Electronics, Volume 38, No. 8, 1965.
[3] T. Ghani et al., “A 90 nm high volume manufacturing logic technology featuring novel 45 nm gate length strained silicon CMOS transistors”, Proceedings of IEDM 2003.

2 Embedded Memories

2.1 The World of Memory

Semiconductor memories are vital components in modern ICs. Stand-alone memories represent roughly 30% of the global IC market. In a system-on-chip, memory circuits usually represent more than 75% of the total number of transistors.

Fig. 2.1 Major classes of CMOS compatible memories


There are two main classes of memories: volatile and non-volatile memories.
• In volatile circuits (Fig. 2.1 left), the data is stored only as long as the power is applied. The Dynamic Random Access Memory (DRAM) is the most common volatile memory.
• Non-volatile memories are capable of storing the information even if the power is turned off (Fig. 2.1 right). Read-only Memory (ROM) is the simplest type of non-volatile memory. One-time Programmable Memories (PROM) are an important family, but the most popular among non-volatile memories are erasable and programmable devices. These include the old Erasable Programmable ROM (EPROM), the more recent Electrically Erasable PROM (EEPROM, FLASH), and the new Magneto-resistive RAM (MRAM) and Ferroelectric RAM (FRAM) memories.

Fig. 2.2 Typical memory organization

Figure 2.2 shows a typical memory organization. It consists of a memory array, a row decoder, a column decoder and a read/write circuit. The row decoder selects one row out of 2^N by means of an N-bit row selection address. The column decoder selects one column out of 2^M by means of an M-bit column selection address. The memory array is based on 2^N rows and 2^M columns of a repeated pattern, the basic memory cell. A typical value for N and M is 10, resulting in 1024 rows and 1024 columns, which corresponds to 1,048,576 elementary memory cells (one megabit).
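The row/column addressing described above can be sketched in a few lines of Python. This is an illustration only: the text does not say which address bits drive the row decoder, so the sketch assumes the N upper bits select the row and the M lower bits select the column. The function names are hypothetical.

```python
def cell_count(n_row_bits=10, m_col_bits=10):
    """Number of elementary cells in an array of 2^N rows by 2^M columns."""
    return (1 << n_row_bits) * (1 << m_col_bits)


def split_address(address, n_row_bits=10, m_col_bits=10):
    """Split a flat cell address into (row, column).

    Assumption: upper N bits feed the row decoder, lower M bits the column decoder.
    """
    if not 0 <= address < cell_count(n_row_bits, m_col_bits):
        raise ValueError("address out of range for this array")
    row = address >> m_col_bits
    column = address & ((1 << m_col_bits) - 1)
    return row, column


if __name__ == "__main__":
    print(cell_count())            # cells in a 1024 x 1024 array
    print(split_address(1024))     # first cell of the second row
    print(split_address(1048575))  # last cell of the array
```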


2.2 RAM Memory

The basic cell for static memory design is based on six transistors, with two pass gates instead of one. The corresponding schematic diagram is given in Fig. 2.3. The circuit consists of two cross-coupled inverters (see [1], Chapter 6) and two additional pass transistors. The cell has been designed to be duplicated in X and Y in order to create a large array of cells. Usual sizes for megabit SRAM memories are 1024 columns × 1024 rows, or higher. A modest arrangement of 4 × 4 RAM cells is proposed in Fig. 2.4. The selection line WL is shared by all the cells of one row. The bit lines BL and ~BL are shared by all the cells of one column.

Fig. 2.3 The layout of the six-transistor static memory cell (RAM6T.SCH)

Fig. 2.4 An array of 6T memory cells, with four rows and four columns (RAM6T.SCH)


The RAM layout is given in Fig. 2.5. The BL and ~BL signals are made with metal2 and cross the cell from top to bottom. The supply lines are horizontal, made with metal3. This allows easy matrix-style duplication of the RAM cell.

Fig. 2.5 Layout of the SRAM cell (RAM6T.MSK)

WRITE CYCLE. The value one or zero must be placed on Bit Line, and the data’s inverted value on ~Bit Line. Then the selection Word Line goes to one. The two-inverter latch takes the Bit Line value. When the selection Word Line returns to zero, the RAM is in a memory state.

READ CYCLE. The selection signal Word Line must be asserted, but no information should be imposed on the bit lines. In that case, the stored data value propagates to Bit Line, and its inverted value ~Data propagates to ~Bit Line.

SIMULATION. The simulation parameters correspond to the read and write cycles in the RAM. The proposed simulation steps consist of writing a zero, writing a one, and then reading the one. In a second phase, we write a one, write a zero, and then read the zero. The Bit Line and ~Bit Line signals are controlled by pulses (Fig. 2.6). The floating state is obtained by inserting the letter “x” instead of one or zero in the description of the signal.

The simulation of the RAM cell is proposed in Fig. 2.7. At time 0.0, after an unstable period, Data settles to an unpredictable value, here one. Meanwhile, ~Data reaches zero. At time 0.5 ns, the memory cell is selected by a one on Word Line. As the Bit Line information is zero, the memory cell information Data goes down to zero. At time 1.5 ns, the memory cell is selected again. As the Bit Line information is now one, the memory cell information Data goes to one. During the read cycle, in which the Bit Line and ~Bit Line signals are floating, the memory sets these wires to one and zero respectively, corresponding to the stored values.
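The write and read cycles described above can be summarized by a purely behavioral model. The Python sketch below is an assumption-laden illustration, not the MICROWIND simulation: it ignores timing, transistor sizing and analog levels, and uses the string "x" for a floating bit line in the same way as the pulse description in the text. The class and method names are hypothetical.

```python
class SramCell:
    """Behavioral model of a six-transistor static RAM cell (logic level only)."""

    def __init__(self, data=0):
        self.data = data  # value held by the two cross-coupled inverters

    def access(self, word_line, bit_line="x", bit_line_n="x"):
        """Apply Word Line and the two bit lines; return the (BL, ~BL) levels.

        "x" means the bit line is left floating by the external driver.
        """
        if not word_line:
            # Pass transistors are off: the cell keeps its value (memory state).
            return bit_line, bit_line_n
        if bit_line == "x" and bit_line_n == "x":
            # Read cycle: the cell drives the floating bit lines.
            return self.data, 1 - self.data
        if bit_line in (0, 1) and bit_line_n == 1 - bit_line:
            # Write cycle: the latch takes the Bit Line value.
            self.data = bit_line
            return bit_line, bit_line_n
        raise ValueError("bit lines must be complementary or both floating")


if __name__ == "__main__":
    cell = SramCell(data=1)       # arbitrary power-up value
    cell.access(1, 0, 1)          # write a zero
    cell.access(1, 1, 0)          # write a one
    print(cell.access(1))         # read: the cell drives (BL, ~BL)
```

The sequence in the usage section mirrors the first simulation phase: write a zero, write a one, then read the one with both bit lines floating.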


Fig. 2.6 Bit Line pulse uses the “x” floating state to enable reading of the memory cell (RamStatic6T.MSK)

Fig. 2.7 Write cycle for SRAM cell (RamStatic6T.MSK)


2.3 RAM Array

You can duplicate the RAM cell into a 4 × 4 bit array using the command Edit → Duplicate XY. Select the whole RAM cell, and a new window appears. Enter the value “4” for X and “4” for Y in the menu, then click on “Generate”.

A very interesting approach to obtain a more compact memory cell consists of sharing all possible contacts: the supply contact, the ground contact and the bit line contacts. The consequence is that the effective cell size can be significantly reduced (Fig. 2.8).

Fig. 2.8 Sharing all possible contacts leads to a very compact cell design (Ram6Tcompact.MSK)

The layout is functionally identical to the previous layout. The only difference is in the placement of MOS devices and contacts. We duplicate the RAM cell into a 64-bit array. The multiplication cannot be done directly by the command Duplicate XY, as we need to flip one cell horizontally to share lateral contacts, and flip the resulting block vertically to share vertical contacts (Fig. 2.9).

2.3.1 Row Selection Circuit

The row selection circuit decodes the row address and activates one single row (Fig. 2.10). The activated word line is shared by all the memory cells of that row. The row selection circuit is based on a demultiplexer circuit: one line is asserted while all the other lines are at zero.


Fig. 2.9 Compact 16 × 4 array of memory cells with shared contacts (Ram16x4Compact.MSK)

Fig. 2.10 Row selection circuit


In the row selection circuit for the 16 × 4 array, we simply need to decode a two-bit address. Using AND gates is one simple solution. In Fig. 2.11, we present the schematic diagram of two-to-four and three-to-eight decoders. In the case of a very large number of address lines, the decoder is split into sub-decoders, which handle a reduced number of address lines.

Fig. 2.11 Row selection circuit in two-bit and three-bit configuration (RamWordline.SCH)
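The AND-gate decoding principle can be sketched in Python as follows. The function name row_decoder is hypothetical, and the address bits are assumed to be given least significant bit first; each word line is computed as the AND of all address bits, each taken either true or complemented.

```python
def row_decoder(address_bits):
    """Return the list of word lines for the given address bits (LSB first).

    Exactly one word line is at one; all the others are at zero.
    """
    n = len(address_bits)
    word_lines = []
    for row in range(2 ** n):
        line = 1
        for i, bit in enumerate(address_bits):
            wanted = (row >> i) & 1
            # AND of the address bit (if the row number has a one at this
            # position) or of its complement (if it has a zero).
            line &= bit if wanted else 1 - bit
        word_lines.append(line)
    return word_lines


two_to_four = row_decoder([0, 1])        # address 2 -> third word line
three_to_eight = row_decoder([1, 0, 1])  # address 5 -> sixth word line
```

The same function covers the two-to-four and three-to-eight cases; splitting a large decoder into sub-decoders is not modelled here.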

2.3.2 Column Selection Circuit

Fig. 2.12 Column selection circuit principles


The column decoder selects a particular column in the memory array to read the contents of the selected memory cell (Fig. 2.12) or to modify its contents. The column selector is based on the same principles as those of the row decoder. The major modification is that the data flows both ways, that is, either from the memory cell to the DataOut signal (read cycle), or from the DataIn signal to the cell (write cycle).

Figure 2.13 proposes an architecture based on n-channel MOS pass transistors. We consider here four columns of memory cells, which require two address signals, Address_Col[0] and Address_Col[1]. The n-channel MOS device is used as a switch controlled by the column selection. When the nMOS is on and Write is asserted, DataIn is amplified by the buffer, flows from the bottom to the top, and reaches the memory through BL and ~BL. If Write is off, the three-state inverter is in high impedance, which allows the information to be read on DataOut.

Fig. 2.13 Column selection and read/write circuit (RamColumn.SCH)
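The two-way data flow of the column selector can be illustrated with a small logic-level sketch. The function column_access and its argument names are hypothetical; it assumes that the word line of one row is already asserted, so that row_cells holds the four bits of that row, and that DataOut simply observes the selected bit line.

```python
def column_access(row_cells, address_col, write, data_in=None):
    """Read or write one of four columns of the currently selected row.

    row_cells:   the four stored bits of the selected row
    address_col: the pair (Address_Col[0], Address_Col[1])
    write:       True for a write cycle, False for a read cycle
    Returns (updated row cells, DataOut).
    """
    selected = address_col[0] + 2 * address_col[1]
    cells = list(row_cells)
    if write:
        # Write cycle: the three-state buffer drives the selected bit line.
        cells[selected] = data_in
    # Assumption: DataOut reflects the value present on the selected bit line.
    data_out = cells[selected]
    return cells, data_out


_, read_value = column_access([0, 1, 1, 0], (1, 0), write=False)
new_cells, _ = column_access([0, 1, 1, 0], (0, 1), write=True, data_in=0)
```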

2.3.3 A Complete 64-bit SRAM

The 64-bit Static RAM (SRAM) memory interface is shown in Fig. 2.14. The 64 bits of memory are organized in words of four bits, meaning that DataIn and DataOut have a four-bit width. Each data word D0..D15 occupies four contiguous memory cells in the array. Four address lines are necessary to decode one address among 16. The memory structure shown in Fig. 2.14 requires two address lines, A0 and A1, for the word lines WL[0]..WL[3], and two address lines, A2 and A3, for the bit line selection. The final layout of the 64-bit SRAM is proposed in Fig. 2.15.
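As an illustration of this organization, the sketch below maps a four-bit address onto a word line and a group of four bit lines. The function locate_word is hypothetical, and the numbering of the columns (group g occupying columns 4g to 4g + 3) is an assumption made for the example; the actual placement is the one of Fig. 2.14.

```python
def locate_word(address):
    """Map a word address (0..15) to (word line index, list of four columns)."""
    a = [(address >> i) & 1 for i in range(4)]   # a[0]..a[3] stand for A0..A3
    row = a[0] + 2 * a[1]                        # A0, A1 select WL[0]..WL[3]
    group = a[2] + 2 * a[3]                      # A2, A3 select the bit line group
    columns = [4 * group + k for k in range(4)]  # four contiguous cells per word
    return row, columns


row, columns = locate_word(0b0110)   # A3..A0 = 0110
```

With this mapping the 16 addresses cover the 64 cells exactly once, which is a quick sanity check of the four rows by sixteen columns organization.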


Fig. 2.14 Architecture of the 64-bit RAM (RAM64.MSK)

Fig. 2.15 The complete RAM layout (RAM64.MSK)


2.4 Dynamic RAM Memory

The Dynamic RAM (DRAM) memory cell has only one transistor, which improves the memory matrix density by almost one order of magnitude. The storage element is no longer the stable inverter loop, as for the SRAM, but only a capacitor Cs, also called the storage capacitor. The DRAM cell architecture is shown in Fig. 2.16.

Fig. 2.16 Simulation of the write cycle for one-transistor DRAM cell (RAM1T.SCH)

The write and hold operation for ‘1’ is shown in Fig. 2.16. The data is set on the bit line, the word line is then activated, and Cs is charged. As the pass transistor is n-type, the analog value reaches VDD-Vt. When WL is inactive, the storage capacitor Cs holds one.

Fig. 2.17 Simulation of the read cycle for one-transistor DRAM cell (RAM1T.SCH)


The read cycle is destructive for the stored information. Suppose that Cs holds a one (Fig. 2.17). The bit line is precharged to a voltage Vp (usually around VDD/2). When the word line is active, a communication is established between the bit line, loaded by capacitor CBL, and the memory, loaded by capacitor Cs. The charges are shared between these nodes, and the result is a small increase of the voltage Vp by ∆V, thanks to the injection of some charges from the memory.

Commercial DRAM memories use storage capacitors with a value between 10 fF and 50 fF. This is done by creating a specific capacitor for the storage node appearing in Fig. 2.18 left, thanks to the following technological advances: the use of specific metal layers to create the lower plate and external walls of the RAM capacitor, an enlarged height between the substrate surface and metal1, and the use of high-permittivity dielectric oxide. SiO2 has a relative permittivity εr of 3.9. Other oxides compatible with the CMOS process have a higher permittivity (higher ‘K’): Si3N4 with εr close to 7.0, and Ta2O5 with εr equal to 23.
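The size of ∆V follows from charge conservation between Cs and CBL when the word line opens the pass transistor. The sketch below computes it; the function name and the numerical values (VDD = 1.2 V, Vt = 0.4 V, CBL = 200 fF) are assumptions chosen for illustration, while Cs = 20 fF lies within the 10 fF to 50 fF range quoted above.

```python
def read_delta_v(v_cell, v_precharge, c_s, c_bl):
    """Bit line voltage change after charge sharing between Cs and CBL."""
    # Total charge is conserved when the two capacitors are connected.
    v_final = (c_s * v_cell + c_bl * v_precharge) / (c_s + c_bl)
    return v_final - v_precharge


VDD = 1.2          # assumed supply voltage (V)
VT = 0.4           # assumed nMOS threshold voltage (V)
C_S = 20e-15       # storage capacitor (F)
C_BL = 200e-15     # assumed bit line capacitance (F)

# A stored one only reaches VDD - Vt through the n-type pass transistor.
delta_v = read_delta_v(VDD - VT, VDD / 2, C_S, C_BL)
```

With these assumed values ∆V is roughly 18 mV, which suggests why a larger Cs relative to CBL makes the stored value easier to detect.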

Fig. 2.18 Increasing the storage capacitance (left: junction capacitor, right: embedded capacitor)

A cross section of the DRAM layout is given in Fig. 2.19. The bit line is routed in metal2, and is connected to the cell through a metal1 and diffusion contact. The word line is the polysilicon gate. On the right side, the storage capacitor is a sandwich of conductor material connected to the diffusion contact, a thin oxide (SiO2 in this case) and a second conductor that fills the capacitor and is connected to the ground by a contact to the first level of metal. The capacitance is around 20 fF in this design. Higher capacitance values may be obtained using larger capacitor areas, at the price of a lower cell density.

Fig. 2.19 The stacked capacitor cell and the diffusion capacitor cell (DramEdram.MSK)

2.5 EEPROM

The basic element of an Electrically Erasable PROM (EEPROM) memory is the floating-gate transistor. The concept was introduced several years ago for the Erasable PROM (EPROM). It is based on the possibility of trapping electrons in an isolated polysilicon layer placed between the channel and the control gate. The charges have a direct impact on the threshold voltage of a double-gate device. When there is no charge in the floating gate (Fig. 2.20, upper part), the threshold voltage is low. This means that a significant current may flow between the source and the drain if a high voltage is applied on the gate. However, the channel is small as compared to a regular MOS, and the ION current is three to five times lower, for the same channel size.

Fig. 2.20 The two states of double-gate MOS (EepromExplain.SCH)

When charges are trapped in the floating polysilicon layer (Fig. 2.20, lower part), the threshold voltage is high, and almost no current flows through the device, independent of the gate value. As a matter of fact, the electrons trapped in the floating gate prevent the creation of the channel by repelling channel electrons.

Data retention is a key feature of EEPROM, as it must be guaranteed for a wide range of temperatures and operating conditions. Optimum electrical properties of the ultra-thin gate oxide and inter-gate oxide are critical for data retention. The typical data retention of an EEPROM is 10 years.

Fig. 2.21 The double-gate MOS generated by Microwind3 (Eeprom.MSK)

The double-gate MOS layout is shown in Fig. 2.21. The structure is very similar to the n-channel MOS device, except for the supplementary poly2 layer on top of the polysilicon. The lower polysilicon is unconnected, resulting in a floating node. Only the poly2 upper gate is connected to a metal layer through a poly2/metal contact situated at the top. The cross section of Fig. 2.21 reveals the stacked poly/poly2 structure, with a thin oxide in between.

2.5.1 Double-gate MOS Charge

The programming of a double-poly transistor involves the transfer of electrons from the source to the floating gate through the thin oxide (Fig. 2.22). Notice the high drain voltage (3 V), which is necessary to transfer enough energy to some electrons to make them “hot”, and the very high gate control voltage needed to attract some of these hot electrons to the floating poly through the ultra-thin gate oxide. The very high voltage varies from 7 V to 12 V, depending on the technology. In MICROWIND, the “++” symbols attached to the signal properties indicate that a voltage higher than the nominal supply is used.

Fig. 2.22 Double-gate MOS characteristics (a) without and (b) with charges (EepromCharge.MSK)

At initialization (Fig. 2.22a), no charge exists in the floating gate, resulting in the possibility of current when the poly2 gate voltage is high. However, the device is much less efficient than the standard n-channel MOS, due to an indirect control of the channel. The maximum current is small but significant.

The programming operation is performed using a very high gate voltage on poly2, here 8 V. The mechanism for electron transfer from the grounded source to the floating polysilicon gate, called tunneling, is a slow process. In MICROWIND3, around 1000 ns are required. With a sufficiently positive voltage on the poly2 gate, the voltage difference between poly and source is high enough to enable electrons to pass through the thin oxide. The electrons trapped on the floating gate increase the threshold voltage of the device, thus rapidly decreasing the channel current. When the gate is completely charged, no more current appears in the Id/Vd characteristics (Fig. 2.22b).

2.5.2 Double-gate MOS Discharge

The floating gate may be discharged by ultra-violet light exposure or by electrical erasure. The UV technique is a heritage of the EPROM, which requires a specific package with a window to expose the memory bank to the specific light. The process is very slow (around 20 minutes). After the UV exposure, the threshold voltage of the double-gate MOS returns to its low value, which enables the current to flow again. In MICROWIND3, the command Simulate → UV exposure to discharge floating gates simulates the exposure of all double-gate MOS to an ultra-violet light source. Alternatively, the charge can be accessed individually using the command Simulate → MOS characteristics. Changing the Charge cursor position dynamically modifies the MOS characteristics.

For the electrical erase operation, the poly2 gate is grounded and a high voltage (around 8 V) is applied to the source. Electrons are pulled off the floating gate, thanks to the high electrical field between the source and the floating gate. This charge transfer is called Fowler-Nordheim electron tunneling (Fig. 2.23).

Fig. 2.23 Discharging the double-gate MOS device (EepromDischarge.MSK)

The basic structure for reading the EEPROM information is shown schematically in Fig. 2.24. After a precharge to VDD, and once WL is asserted, the bit line may either drop to VSS if the floating gate is empty of charges, or remain at a high voltage if the gate is charged, because the trapped charge disables the path between BL and the ground through the EEPROM device. In the case of Fig. 2.24 left, the floating gate has no charge, so BL is tied to ground after the precharge, meaning that DataOut is one.

The write operation involves applying a very high voltage on the gate (8 V), and injecting a high or low state on BL. A zero on DataIn is equivalent to a high voltage on BL, which provokes the hot electron effect and charges the floating gate. In contrast, a one on DataIn keeps BL low, and no current flows in the EEPROM channel. In that case, the floating gate remains discharged.
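The read and write rules above can be summarised in a short logic-level model. The class EepromCell and its method names are hypothetical, voltages are reduced to zero and one, and DataOut is taken as the inverse of the bit line level, as implied by the example of Fig. 2.24 left.

```python
class EepromCell:
    """Logic-level model of one EEPROM memory point."""

    def __init__(self):
        self.charged = False   # floating gate initially empty of charges

    def write(self, data_in):
        # A zero on DataIn puts BL high, which charges the floating gate.
        # A one on DataIn keeps BL low: the floating gate is left unchanged.
        if data_in == 0:
            self.charged = True

    def erase(self):
        # Electrical erase or UV exposure removes the trapped electrons.
        self.charged = False

    def read(self):
        bit_line = 1               # precharge to VDD
        if not self.charged:
            bit_line = 0           # conducting device pulls BL down to VSS
        return 1 - bit_line        # DataOut is one when BL is tied to ground


cell = EepromCell()
erased_value = cell.read()
cell.write(0)
programmed_value = cell.read()
```

Note that writing a one does not discharge a cell that was previously programmed; in this model an explicit erase is needed first.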


Fig. 2.24 Reading and writing in EEPROM (Eeprom.MSK)

2.6 Flash Memories

Flash memories are a variation of EEPROM memories. Flash arrays can be programmed electrically bit-by-bit but can only be erased by blocks. Flash memories are based on a single double-poly MOS device, without any selection transistor (Fig. 2.25).

Fig. 2.25 Flash memory point and principles for charge/discharge (FlashMemory.SCH)


The immediate consequence of such a simple design is a more compact memory array and denser structures. Flash memories are commonly used in micro-controllers for the storage of application code, which gives the advantage of non-volatile memories and the possibility of reconfiguring and updating the code many times. The flash memory point usually has a ‘T-shape’, due to increased size of the source for optimum tunneling effect [1]. The horizontal polysilicon2 is the bit line, and the vertical metal2 is the word line, which links all drain regions together. The horizontal metal line links all sources together. It is common practice to violate usual design rules in order to achieve a more compact layout. In the case of Fig. 2.26, the poly extension is reduced from three lambda to two lambda.

Fig. 2.26 Flash memory point and associated cross section (Flash8x8.MSK)


2.7 Ferroelectric RAM Memories

Ferroelectric RAM (FRAM) memories are the most advanced of the flash memory challengers [2]. The FRAM is similar to the DRAM, except that the FRAM memory point is based on a two-state ferroelectric insulator, while the DRAM relies on a silicon dioxide capacitor. Megabit FRAM memories are already available as stand-alone products. Embedded FRAM memories, however, have only been made compatible since the 90 nm CMOS technology. The MICROWIND3 software should first be configured in 90 nm to access the FRAM properties, using the command File → Select Foundry. One FRAM cell layout example is shown in Fig. 2.27.

Fig. 2.27 Bird’s view of FRAM cells showing the distinction between two domains

The 2D cross section (Fig. 2.27) shows the ferroelectric crystalline material made from a compound of lead, zirconium and titanium (PZT). The chemical formulation of PZT is an exotic PbZr(1-x)Ti(x)O3. Adjusting the proportion of zirconium and titanium changes the electrical properties of the material.

The PbZrTiO3 molecular structure is given in Fig. 2.29. It is equivalent to a cube, where each of the eight corners is an atom of lead (Pb). In the center of the cube is a titanium atom, which is a class IVb element, with oxygen atoms at its ends, shared with neighbors. The two stable states of the molecule are shown in Fig. 2.29. The titanium atom may be moved inside the cell by applying an electrical field. The remarkable properties of this insulator material are: the stable state of the titanium atom even without any electrical field, the low electrical field required to move the atom, and its very high dielectric constant (around 100).

The PZT capacitor behavior is usually represented by a hysteresis curve, as shown in Fig. 2.30. The X-axis displays the electrical field applied to the electrodes. The Y-axis represents the dipole orientation for each molecule. It can be seen that if a minimum field is applied on the capacitor, the polarization changes. An inverted electrical field is required to change the state of the material.


Fig. 2.28 Two domains of FRAM memory (FramCell.MSK)

Fig. 2.29 Two domains of the structure which change the orientation of the equivalent dipole

Consequently, the write cycle for a one simply consists of applying a large positive step which orients the dipoles north, and for a zero in applying a negative voltage step, which orients the dipoles south (Fig. 2.31).


Fig. 2.30 Hysteresis curve of the PZT insulator

Fig. 2.31 FRAM circuit principles and architecture (Fram4x4.SCH)

To read the domain information, an electrical field is applied to the PZT capacitor, through a voltage pulse. If the electric field is oriented in the opposite direction of the elementary dipole and is strong enough, the inner atom orientation is changed. This creates a significant current, which is amplified and considered as a one. If the electric field is oriented in the same direction as the elementary dipole, only a small current pulse is observed. This is considered as a zero. Reading the logical information is equivalent to observing the current peak and deciding whether the current peak is small (zero) or large (one). Notice that the read operation destroys the data stored in the PZT material, as for the DRAM cell. Just after the memory information is read, the logical information must be written back to the memory cell.
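The destructive read followed by a write-back can be sketched as below. The class FramCell and its method names are hypothetical, and the model assumes that the read pulse has the same polarity as the one used to write a zero, so that only a stored one is flipped and produces the large current peak.

```python
class FramCell:
    """Logic-level model of one FRAM memory point."""

    def __init__(self, bit=0):
        self.dipole = bit          # 1 = dipoles north, 0 = dipoles south

    def write(self, bit):
        self.dipole = bit

    def read_pulse(self):
        """Apply the read pulse only; returns True for a large current peak."""
        large_peak = self.dipole == 1   # field opposite to the dipole flips it
        self.dipole = 0                 # after the pulse, dipoles follow the field
        return large_peak

    def read(self):
        """Complete read: sense the current peak, then write the data back."""
        bit = 1 if self.read_pulse() else 0
        self.write(bit)
        return bit


cell = FramCell(bit=1)
peak = cell.read_pulse()      # large peak, but the stored one is now lost
state_after_pulse = cell.dipole
cell.write(1)
value = cell.read()           # read with write-back keeps the data
```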


2.8 Memory Interface

All inputs and outputs of the RAM are synchronized with the rising edge of the clock, and more than one word can be read or written in sequence. The typical chronograms of a synchronous RAM are shown in Fig. 2.32. The active edge of the clock is usually the rising edge. One read cycle includes three active clock edges in the example shown in Fig. 2.32. The row address selection is active at the first rising edge, followed by the column address selection. The data is valid at the third falling edge of the system clock.

Fig. 2.32 Synchronous RAM timing diagram

Double-data-rate memories involve both the rising and falling edges of the clock [1]. Furthermore, a series of data from adjacent memory locations may be sent on the data bus. Two contiguous data are sent, one on the rising edge of the clock, and the other on the falling edge of the clock. This technique is called “burst-of-two”. An example of double-data-rate and burst-of-two data in/out is proposed in Fig. 2.33. Notice that DataIn and DataOut work almost independently.
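The burst-of-two principle amounts to scheduling two contiguous data per clock cycle, one on each edge. The sketch below builds such a schedule; the function name and the tuple layout (clock cycle, edge, data) are hypothetical and only illustrate the ordering, not the timing of Fig. 2.33.

```python
def ddr_burst_of_two(words):
    """Return a list of (clock cycle, edge, data) for a double-data-rate bus."""
    schedule = []
    for index, word in enumerate(words):
        cycle = index // 2                             # two data per clock cycle
        edge = "rising" if index % 2 == 0 else "falling"
        schedule.append((cycle, edge, word))
    return schedule


schedule = ddr_burst_of_two(["D0", "D1", "D2", "D3"])
```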

References
[1] A. K. Sharma, Semiconductor Memories, Technology, Testing and Reliability, IEEE Press, 1997, ISBN 0-7803-1000-4.
[2] L. Geppert, “The New Indelible Memories,” IEEE Spectrum, Mar. 2003, Vol. 40, No. 3, pp. 49–54.


Fig. 2.33 Double-data-rate diagram

EXERCISES
1. Compare the leakage current on a DRAM cell for the following technologies: 0.35 µm, 0.12 µm and 90 nm.
2. Given a 4 × 4 EEPROM memory array, create the chronograms to write the words 0001, 0010, 0100 and 1000, and then to read these values.
3. Modify the ROM array proposed in file ROM8×8.SCH to write the word “Welcome”.


3 A Very-Simple-Microprocessor

(This chapter has been written in cooperation with Dr. Mahfuz Aziz, Senior Lecturer at the School of Electrical and Information Engineering, University of South Australia)

This chapter gives an introduction to microprocessor architecture. The goal here is to build a four-bit processor at logic level and then simulate its internal structure step-by-step.

3.1 Introduction

The Very-Simple-Microprocessor (VSM) is an updated version of the very popular Simple-As-Possible (SAP) computer architecture proposed by Albert P. Malvino [1] in 1993 in his famous book “Digital Computer Electronics”. The VSM computer introduces the basic concepts of microprocessor architecture in the simplest possible way. The VSM is very primitive, yet quite complex, as shown in Fig. 3.1.

Fig. 3.1 VSM basic architecture

Copyright © 2007 by The McGraw-Hill Companies, Inc.


The function of each block is described in Table 3.1.

Table 3.1 Main blocks of VSM architecture

| Block | Description | Size |
|---|---|---|
| Program Counter | The program counter counts from 0000 to 1111. It monitors the address of the active instruction. Initially, the program counter is set to 0000, so the microprocessor starts with the instruction at the first memory location. | four bits |
| Program Memory | The program memory stores the program. Each program line has an eight-bit format: the four most significant bits represent the instruction itself, and the four least significant bits represent the data attached to the instruction, if necessary. | 8 × 8 bits |
| Accumulator A | The accumulator is a four-bit register. It is used to store one of the operands for an arithmetic operation. It also stores the intermediate results computed by the microprocessor. Upon request (EnableA), the accumulator result is placed on the internal bus. | four bits |
| Accumulator B | The accumulator B is also a four-bit register. It is used to store the second operand for an arithmetic operation. For addition, this operand is added to accumulator A, and for subtraction, accumulator A is subtracted from this operand. | four bits |
| Arithmetic Unit | The Arithmetic Unit performs the operation S = A + B (Addition) or S = B + ~A + 1 (Subtraction). | four bits |
| Input Register | The Input Register gives the opportunity to transfer data from the outside world to the microprocessor. | four bits |
| Output Register | The Output Register transfers the contents of the internal bus to the outside world. Usually, this instruction is executed at the end of a program to display the final result. The output register stores the output data on the falling edge of the clock. The output register is usually connected to a circuit, which transfers or displays the result to the user. | four bits |

The operation of the VSM is based on a bus called “Internal Bus” (IB). Each block shown in Fig. 3.2 may take control of the bus using a specific enable signal. For example, accumulator A uses an enable signal called EnableA. When EnableA is high, the content of accumulator A is placed on the internal bus.


All the enable signals used in the VSM are shown in Fig. 3.2. Table 3.2 summarizes their functions. The control of these enable signals is provided by the MicroInstruction block, which plays a fundamental role in the operation of the microprocessor.

Fig. 3.2 Controller generates ‘Enable’ signals that allow one block to take control of the bus

Table 3.2 Four blocks may take control of the internal bus, thanks to Enable signals

| Enable Signal | Description |
|---|---|
| EnableA | Authorizes A to take control of the bus. |
| EnableAlu | Places the result of the arithmetic operation (ADD or SUB) on the bus. |
| EnableInstr | Places the data part of the instruction (four least significant bits) on the bus. |
| EnableIn | Transfers the contents of the external input to the internal bus. |

3.2 Instructions

Each instruction of the VSM is eight bits long. However, only the four most significant bits represent the instruction itself. The remaining four bits contain the data. Therefore, only 16 different instructions are possible.


3.2.1 No Operation (NOP = 0000)

The No Operation instruction has no effect. It does not modify the content of any register. However, this instruction is very important to understand how the basic clock controls work.

3.2.2 Addition (ADD = 0001)

The content of accumulator A is added to the data given as a parameter with the instruction. The result updates the accumulator A. The addition is performed on four bits. The carry is ignored. For example, considering that A = 2, the instruction “ADD 3” corresponds to A = A + 3, that is A = 2 + 3. The final value of A is 5.

3.2.3 Subtraction (SUB = 0010)

The content of accumulator A is subtracted from the data given as a parameter, and the result updates the accumulator A. The subtraction is performed on four bits. The carry is ignored.

3.2.4 Get Input (IN = 0100)

The content of the input port is transferred to accumulator A.

3.2.5 Give Output (OUT = 0011)

The content of accumulator A is stored on the output port. The output port is a four-bit register that memorizes the output value and makes it available to external devices until its content is refreshed by a new “Give Output” instruction.

3.2.6 Load Accumulator A (LDA = 0101)

This instruction loads the accumulator A with the value given as a parameter. For example, the instruction LDA 9 transfers the value 9 (1001 in binary format) to accumulator A.
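The six instructions can be exercised with a small instruction-level interpreter. This is a sketch written for this text, not the DSCH model: the function run and the dictionary OPCODES are hypothetical names, the program is given as a list of eight-bit values, and the internal phases of the processor are ignored.

```python
OPCODES = {"NOP": 0b0000, "ADD": 0b0001, "SUB": 0b0010,
           "OUT": 0b0011, "IN": 0b0100, "LDA": 0b0101}


def run(program, input_port=0):
    """Execute a list of eight-bit instructions; return (A, list of outputs)."""
    a = 0
    outputs = []
    for byte in program:
        opcode = byte >> 4      # four most significant bits: instruction
        data = byte & 0xF       # four least significant bits: data
        if opcode == OPCODES["ADD"]:
            a = (a + data) & 0xF        # four-bit result, carry ignored
        elif opcode == OPCODES["SUB"]:
            a = (data - a) & 0xF        # A is subtracted from the data
        elif opcode == OPCODES["OUT"]:
            outputs.append(a)
        elif opcode == OPCODES["IN"]:
            a = input_port & 0xF
        elif opcode == OPCODES["LDA"]:
            a = data
        # NOP and unassigned codes leave every register unchanged.
    return a, outputs


# LDA 2, ADD 3, OUT: the example of Section 3.2.2, followed by an output.
final_a, outputs = run([0x52, 0x13, 0x30])
```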

3.3 Program Memory

The program memory contains up to eight bytes, where we store the instructions to be executed. Each instruction is eight bits long. As shown in Fig. 3.3, each instruction is split into two parts: the four most significant bits represent the instruction code, while the four least significant bits represent the data. The program given in Table 3.3 loads accumulator A with the value ‘2’, then adds ‘1’, and places the result in the output register.

Fig. 3.3 Each instruction is split into four-bit microinstruction code and four-bit data fields


Table 3.3 A simple program for adding two four-bit numbers

| Mnemonic | OpCode (binary) | OpCode (hexa) |
|---|---|---|
| LDA 2 | 0101 0010 | 0x52 |
| ADD 1 | 0001 0001 | 0x11 |
| OUT | 0011 0000 | 0x30 |
| NOP | 0000 0000 | 0x00 |
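The encoding of Table 3.3 can be reproduced with a minimal assembler sketch. The function assemble and the dictionary OPCODES are hypothetical names; the sketch assumes that an instruction without a parameter carries a data field of 0000, as in the OUT and NOP lines of the table.

```python
OPCODES = {"NOP": 0b0000, "ADD": 0b0001, "SUB": 0b0010,
           "OUT": 0b0011, "IN": 0b0100, "LDA": 0b0101}


def assemble(line):
    """Translate one mnemonic line such as "LDA 2" into an eight-bit value."""
    parts = line.split()
    opcode = OPCODES[parts[0].upper()]
    # Assumption: a missing parameter is encoded as 0000.
    data = int(parts[1]) if len(parts) > 1 else 0
    return (opcode << 4) | (data & 0xF)


program = [assemble(line) for line in ["LDA 2", "ADD 1", "OUT", "NOP"]]
hex_codes = [f"0x{byte:02X}" for byte in program]
```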

Figure 3.4 shows the memory symbol along with the corresponding schematic diagram depicting the contents of all the eight memory locations. The memory has eight registers, each register having eight elementary memory cells. You can change the contents of the memory by clicking on the desired memory cells. When you save the schematic diagram, you also save the memory contents. The memory symbol may be found in the basic symbol palette in DSCH.

Fig. 3.4 Storing program in memory (VSM-mem8×8macro.SCH)

3.4 Executing Instructions

3.4.1 Introducing Microinstructions

Each VSM instruction is executed as a sequence of four internal micro-operations, also called microinstructions. Therefore the period of execution of each instruction can be divided into four time phases (T1–T4), each for one microinstruction, as shown in Fig. 3.5. The reader should note the distinction between the microprocessor instruction itself, such as “LDA 2”, and the four internal microinstructions needed to complete the “LDA 2” instruction, called phase one, two, three and four. The first two phases are called the fetch sequence. The corresponding microinstructions are independent of the user’s instruction. The last two phases are called the execute sequence. Table 3.4 summarizes the microinstructions.

Fig. 3.5 Execution of one VSM instruction involves execution of four microinstructions in four separate time phases

Table 3.4 Execution of one instruction is based on four time phases

| Phase | Name | Description |
|---|---|---|
| Phase one | Address state | The content of the desired memory location is loaded into the instruction register. |
| Phase two | Increment state | The program counter address is incremented. The instruction register provides the microinstruction decoder with the instruction. |
| Phase three | Execute step one | Depending on the instruction, the microprocessor performs the first step of the execution phase. |
| Phase four | Execute step two | The microprocessor performs the second step of the execution phase. |

3.4.2 No Operation (NOP = 0000)

The control flow for the “No Operation” instruction is shown in Fig. 3.6. The fetch sequence corresponds to access to the memory (ReadMem = 1), and the loading of the corresponding instruction (LoadInstr = 1) during phase one. During phase two, the stored instruction is sent to the microinstruction controller (EnableInstr = 1), while the counter is incremented (ProgCount = 1). As the “No Operation” instruction does not affect any internal register, the execution phases (phase three and phase four) do not correspond to any specific activity.

Fig. 3.6 Execution of microinstructions corresponding to NOP instruction

3.4.3 Addition (ADD = 0001)

Addition is performed between the content of accumulator A and the four-bit data given as a parameter of the ADD instruction. Consequently, the addition is executed by storing the data in accumulator B (phase three), then asking the arithmetic unit to produce the addition between accumulator A and accumulator B (phase four), and finally by transferring the result back to accumulator A on the rising edge of the clock during phase four, as illustrated in Fig. 3.7.

Fig. 3.7 Execution of microinstructions corresponding to the ADD instruction


3.4.4 Subtraction (SUB = 0010)

The execution phase of the subtraction instruction is identical to that of the addition instruction. The only difference is that the AddSub signal is set to one, which means “Subtract”.

3.4.5 Get Input (IN = 0100)

The content of the input port is transferred to accumulator A during phase three (Fig. 3.8). There is nothing to do in phase four, when all registers remain inactive.

Fig. 3.8 Execution of microinstructions corresponding to the IN instruction


3.4.6 Give Output (OUT = 0011)

The content of accumulator A is transferred to the output port via the internal bus during phase three. The output port memorizes the accumulator value and makes it available to external devices, thanks to its four registers. The processor is inactive during phase four.

Fig. 3.9 Execution of microinstructions corresponding to the OUT instruction

3.4.7 Load Instruction (LDA = 0101)

The load instruction transfers the four-bit data given as a parameter of the LDA instruction to accumulator A. For example, the instruction “LDA 9” transfers the value 9 (1001 in binary format) to accumulator A. In Fig. 3.10, the four least significant bits of the instruction register are placed on the internal bus and then transferred to accumulator A. As a result, the updated value of A is 1001. There is no activity during phase four.
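The four-phase execution of each instruction can be summarised as a table of control signals, sketched here in Python. The fetch signals (ReadMem, LoadInstr, EnableInstr, ProgCount) and the names EnableA, EnableAlu, EnableIn, AddSub, LatchA and LatchB come from the text; the signal LatchOut, and the exact set of signals asserted in each execute phase, are assumptions inferred from the descriptions rather than read from the schematic.

```python
# Phases one and two are identical for every instruction (fetch sequence).
FETCH = [
    {"ReadMem": 1, "LoadInstr": 1},        # phase one
    {"EnableInstr": 1, "ProgCount": 1},    # phase two
]

# Phases three and four depend on the instruction (execute sequence).
EXECUTE = {
    "NOP": [{}, {}],
    "ADD": [{"EnableInstr": 1, "LatchB": 1},
            {"EnableAlu": 1, "LatchA": 1, "AddSub": 0}],
    "SUB": [{"EnableInstr": 1, "LatchB": 1},
            {"EnableAlu": 1, "LatchA": 1, "AddSub": 1}],
    "IN":  [{"EnableIn": 1, "LatchA": 1}, {}],
    "OUT": [{"EnableA": 1, "LatchOut": 1}, {}],   # LatchOut is a hypothetical name
    "LDA": [{"EnableInstr": 1, "LatchA": 1}, {}],
}


def microinstructions(mnemonic):
    """Return the four sets of asserted control signals, phase one to four."""
    return FETCH + EXECUTE[mnemonic]


add_phases = microinstructions("ADD")
```

Only one Enable signal appears in each phase, which reflects the rule that a single block takes control of the internal bus at a time.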

3.5 Basic Block Design

The structure of each sub-block of the microprocessor is presented in detail here.

3.5.1 Accumulator A

The accumulator is composed of four edge-sensitive D flip-flops, as shown in Fig. 3.11. The register output is available through AluA0..AluA3 for the ADD and SUB operations. The content of A is transferred to the internal bus when EnableA is asserted. We use tri-state inverters to facilitate access to the internal bus. The latchA signal authorizes the transfer of input data (here, through a keyboard) to accumulator A at the falling edge of the main clock.


Fig. 3.10 Microinstruction during phase three executes the load operation. During phase four, the processor is inactive


Fig. 3.11 Structure of accumulator A showing its connections to the internal bus and arithmetic unit (VsmAccumulatorA.SCH)

3.5.2 Accumulator B

Like accumulator A, the accumulator B is composed of four edge-sensitive D flip-flops, as shown in Fig. 3.12. The register output is available through AluB0..AluB3 for the ADD and SUB operations. The “latchB” signal authorizes the transfer of input data (here, through a keyboard) to accumulator B at the falling edge of the main clock.

Fig. 3.12 Structure of accumulator B showing its connections to the arithmetic unit (Vsm-AccumulatorB.SCH)

3.5.3 Add/Subtract Block

The addition is based on the full-adder sub-circuit that has been described in Chapter seven of the book “Basic CMOS cell design” by Sicard and Bendhia [2]. The full-adder consists of a set of XOR gates for generating the Sum output and a complex gate for generating the Carry output, as shown in Fig. 3.13.

Fig. 3.13 Internal structure of full-adder (Vsm-fullAdder.SCH)

Adding two four-bit numbers requires four cascaded full-adders as illustrated in Fig. 3.14. The carry signal propagates from the lower stage to the upper stage in order to perform the complete add operation. To subtract two numbers (B-A in this case) using the same full-adders, we need two supplementary circuits:
• A circuit that produces the one’s complement of A
• A small circuit that sets the initial carry to one.
Together, these form the two’s complement of A, since B - A = B + ~A + 1. One approach consists of using multiplexer circuits, which may be found in Symbol palette → Advanced Symbol menu → sub-menu Switches. When Sel equals zero, the input i0 is transferred to the output. Otherwise, i1 is transferred to the output. Consequently, AddSub = 0 corresponds to the transfer of A to the adder chain (Add operation), while AddSub = 1 corresponds to the transfer of ~A to the adder chain (Subtract operation).

At this point, it is instructive to connect the accumulators and the arithmetic unit in order to perform manually what the microprocessor will later do with its internal sequencer. The circuit made of the accumulators A and B, and the arithmetic unit, is shown in Fig. 3.15. The two keyboards serve as inputs A and B. The displays are placed on the internal buses between the arithmetic units and the accumulators as well as on the output bus of the arithmetic unit.
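The add/subtract scheme can be sketched as a small behavioural model in Python. This is only an illustration under stated assumptions, not the DSCH schematic: the function names `full_adder` and `add_sub` are hypothetical, and the model assumes that the AddSub signal both drives the multiplexers and supplies the initial carry.

```python
def full_adder(a, b, cin):
    """One-bit full adder: XOR gates for Sum, a majority function for Carry."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def add_sub(a, b, addsub):
    """Return (result, carry_out) for the four-bit values a and b.

    addsub = 0 computes B + A; addsub = 1 computes B + ~A + 1,
    which is B - A modulo 16.
    Assumption: the initial carry is tied to addsub.
    """
    carry = addsub
    result = 0
    for i in range(4):                            # four cascaded full-adders
        a_bit = (a >> i) & 1
        b_bit = (b >> i) & 1
        a_sel = a_bit ^ 1 if addsub else a_bit    # multiplexer: i0 = A, i1 = ~A
        s, carry = full_adder(a_sel, b_bit, carry)
        result |= s << i
    return result, carry


print(add_sub(3, 2, 0))   # ADD: A = 3, B = 2
print(add_sub(2, 5, 1))   # SUB: B - A with A = 2, B = 5
```

Because the result is limited to four bits, a subtraction with A larger than B wraps around modulo 16.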

Fig. 3.14 Structure of the arithmetic unit, which performs the ADD and SUB operations (Vsm-ArithmeticUnit.SCH)

Operating this simple circuit is a useful introduction to the microprocessor’s operation. Below is the set of actions we need to perform sequentially in order to add two numbers:
• Deactivate the main Reset. Initially the Reset pin is set to zero (default value at the start), which corresponds to an active Reset. Both registers A and B are cleared (A = 0, B = 0). Nothing will work until you set the button ~MainReset to one.
• Load the desired value on A. Click on a digit on the lower keyboard named “A”, for example “3”. Click LatchA and wait for at least one complete cycle of the main clock. The accumulator A stores “3” at the falling edge of the clock.
• Load the desired value on B. Click on a digit on the upper keyboard named “B”, for example “2”. Click LatchB and wait for at least one complete cycle of the main clock. The accumulator B stores “2” at the falling edge of the clock. The arithmetic unit computes the sum A + B as AddSub is set to zero by default. This corresponds to the ADD instruction. However, the result is not displayed, as EnableAlu is zero.
• Set EnableAlu to one to display the result “5”, as shown in Fig. 3.15.

Fig. 3.15 The connection between accumulators A and B, and the arithmetic unit to test ADD and SUB instructions (Vsm-RegARegBAlu.SCH)

3.5.4 The Input Register
The input register is a simple set of three-state buffers as shown in Fig. 3.16. There is no need for D-registers as the input will be directly transferred to accumulator A.

Fig. 3.16 Input register (Vsm-InRegister.SCH)


3.5.5 The Output Register
The output register is composed of D-register cells as shown in Fig. 3.17. On the positive edge of the clock, the data is saved in the registers. It is very important that the data is stored on the positive edge of the clock during phase three, and not on the negative edge. The latter would give rise to synchronization conflicts. Therefore, a NAND gate is used to make the circuit sensitive to the rising edge of the main clock, as shown in Fig. 3.18.

Fig. 3.17 Internal structure of output register (Vsm-OutRegister.SCH)

Fig. 3.18 The output register must store data at the rising edge of the clock in phase three


3.5.6 A Manual Microprocessor
In this section, we propose to build a manually-controlled microprocessor which consists of accumulators A and B, and the input and output registers. The goal of the simulation reported in Fig. 3.19 is to transfer the input information (DataIn) to the output port (DataOut). To perform this transfer, we need to enable the input port (EnableIn = 1) and then enable the output port (EnableOut = 1). At the next rising edge of the main clock, the contents of the input keyboard (“5” in this case) will appear on the display connected to the output register. Several other transfers may be performed:
• Input register to accumulator A
• Input register to accumulator B
• Result of the addition of A and B to the output port

The arrow symbol (Symbol menu Advanced → Symbol → Arrow) is used to ease electrical connections for the clock and reset signals. In the example shown in Fig. 3.19, connections are made automatically among all arrows having the same name. Double-click the Clk arrow symbol in Fig. 3.19 to access the arrow name, which identifies the electrical net. In the example shown in Fig. 3.20, we build two different electrical connections, one called Clk and the other called Rst. Note that the electrical node names are not case sensitive.

Fig. 3.19 A manually-controlled microprocessor (Vsm-RegARegBAluInOut.SCH)


Fig. 3.20 Building arrow connections to ease the electrical wiring of the main signals (Vsm_arrow.SCH)

3.5.7 The Phase Generator
In order to transform the previous “manual” microprocessor into a fully-programmable microprocessor, we need to build several circuits to generate the appropriate control signals. First, the phase counter must produce the four phase signals Phase0 to Phase3 at the negative edge of the clock. The counter must be reset by an active low Clear signal. The design of the phase counter is based on edge-sensitive latches and XOR gates as shown in Fig. 3.21.
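The behaviour expected from the phase generator can be sketched in Python. This is a behavioural model only, not the latch-and-XOR circuit of the schematic; the class name `PhaseCounter` is hypothetical, and the model assumes that the cleared state corresponds to Phase0 being active.

```python
class PhaseCounter:
    """Behavioural four-phase generator with an active-low Clear."""

    def __init__(self):
        self.count = 0

    def falling_edge(self, clear_n):
        """Advance on a falling clock edge; stay cleared while Clear is low."""
        self.count = (self.count + 1) % 4 if clear_n else 0
        return self.phases()

    def phases(self):
        """One-hot list [Phase0, Phase1, Phase2, Phase3]."""
        return [int(self.count == k) for k in range(4)]


counter = PhaseCounter()
print(counter.falling_edge(clear_n=0))       # Clear active (low)
for _ in range(4):
    print(counter.falling_edge(clear_n=1))   # phases appear one after another
```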

Fig. 3.21 Phase counter structure (Vsm-RingCounter4.SCH)


Fig. 3.22 Simulation of phase counter (Vsm-RingCounter4.SCH)

When the Clear signal becomes inactive (logic high) the phases appear sequentially (Fig. 3.22).

3.5.8 Program Counter 0-to-15
The program counter plays a very important role in the microprocessor as it supplies the main program memory with the address of the active instruction (Fig. 3.23). At the start, the program counter is zero. At the end of each instruction the program counter is incremented in order to select the next instruction.

Fig. 3.23 The program counter supplies program memory with the address of active instruction


One simple way to build a 0-to-15 counter is to use a cascaded chain of edge-sensitive D flip-flops, as shown in Fig. 3.24. The circuit is very simple, but works asynchronously. This means that due to propagation delays between stages, some intermediate results appear on the display for a very short period of time. These glitches have no impact on the microprocessor operation as the counter is incremented during phase two of the microinstruction sequence, and is only exploited during phase one of the next instruction to load the instruction register.
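The intermediate values produced by an asynchronous counter can be illustrated with a short Python sketch. It assumes a ripple structure in which each stage toggles on the falling edge of the previous stage's output; the function name `ripple_increment` is hypothetical and the model ignores actual delay values.

```python
def ripple_increment(state):
    """Return the successive counter values seen while one clock edge
    ripples through a four-bit asynchronous counter.

    Assumption: stage i toggles when the output of stage i-1 falls.
    """
    bits = [(state >> i) & 1 for i in range(4)]
    seen = []
    for i in range(4):
        bits[i] ^= 1                                  # stage i toggles
        seen.append(sum(b << k for k, b in enumerate(bits)))
        if bits[i] == 1:                              # output rose: ripple stops
            break
    return seen


print(ripple_increment(7))   # [6, 4, 0, 8]: three transient values, then 8
print(ripple_increment(0))   # [1]: no transient value
```

The last element of the list is the settled count; the preceding elements are the short-lived glitches described above.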

Fig. 3.24 Program counter at work. Counting is enabled only during phase two, at the falling edge of the main clock (Vsm-Counter16.SCH)

3.5.9 The Instruction Register
The instruction register stores the instruction being executed. The eight-bit information is split into two parts: the four most significant bits correspond to the instruction code, while the four least significant bits are the data. The instruction code is stored in the four D-registers situated at the bottom of Fig. 3.25, in order to be available for the microinstruction decoder. The data is stored in four separate D-register cells and can be made available on the internal bus. The instruction register keeps a copy of the current instruction and releases the main memory, which can be accessed later for either read or write operations.

Fig. 3.25 Instruction register stores contents of the memory and separates code part (lower registers) from data part (upper registers) (Vsm-InstructionReg.SCH)

3.5.10 The Microinstruction Controller
The microinstruction controller is the ‘heart’ of the microprocessor. It generates the most important signals for controlling the operation of the processor, for example, the Enable and latch signals. The design of the microinstruction controller is shown in Fig. 3.26. The input to the microinstruction controller is the instruction code from the instruction register plus the phase information from the phase counter. The four-input AND gates serve as instruction decoders. For example, the instruction 0000 turns on the upper AND gate, which corresponds to the NOP instruction. Notice that phase0 and phase1 are not connected to the instruction decoder. This is because the first two phases are not dependent on the instruction itself. Then, depending on the type of instruction, the desired control signals are set to one if active, or kept at zero to be inactive.
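The decoding principle, one four-input AND gate per instruction code, can be sketched in Python. Only the codes that appear in this chapter's text and tables are listed (NOP = 0000, ADD = 0001, OUT = 0011, LDA = 0101); the function names `and4` and `decode` are hypothetical, and the control signals that each decoded line drives are not modelled.

```python
OPCODES = {"NOP": 0b0000, "ADD": 0b0001, "OUT": 0b0011, "LDA": 0b0101}


def and4(a, b, c, d):
    """Four-input AND gate."""
    return a & b & c & d


def decode(instruction):
    """Split an eight-bit instruction into code and data, then decode the code.

    Returns (decoder outputs, data), where exactly one decoder output is 1
    when the code matches a listed instruction.
    """
    code, data = instruction >> 4, instruction & 0xF
    bits = [(code >> i) & 1 for i in range(4)]
    lines = {}
    for name, pattern in OPCODES.items():
        # Each AND input is a code bit, inverted where the pattern holds a 0.
        inputs = [bits[i] if (pattern >> i) & 1 else bits[i] ^ 1
                  for i in range(4)]
        lines[name] = and4(*inputs)
    return lines, data


print(decode(0x51))   # code 0101 selects LDA, data is 1
```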

Fig. 3.26 Control signals activated by the microinstruction controller: they are the same for all instructions during the first two time phases, and depend on the instruction code during the last two phases

3.5.11 The Complete Microprocessor
It is time now to connect all the sub-circuits together and test the entire microprocessor. Each of these sub-circuits has been embedded into a symbol where only the input and output pins appear. The complete circuit is shown in Fig. 3.27. We should keep in mind that this is only a very simple and very low complexity microprocessor. Before starting the simulation, we must load the program into the memory. The program shown in Table 3.5 has been written into the microprocessor’s memory.

Table 3.5 The code stored into program memory

Mnemonic    OpCode (binary)    OpCode (hex)
LDA 1       0101 | 0001        0x51
ADD 2       0001 | 0010        0x12
OUT         0011 | 0000        0x30
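The encoding used in Table 3.5 (instruction code in the upper four bits, data in the lower four bits) can be checked with a short Python sketch. The helper names `assemble` and `run` are hypothetical. The tiny interpreter is a simplification that assumes ADD leaves the four-bit sum in accumulator A and OUT copies A to the output register; it does not model phases, buses or accumulator B.

```python
OPCODES = {"NOP": 0x0, "ADD": 0x1, "OUT": 0x3, "LDA": 0x5}


def assemble(line):
    """Encode one mnemonic line, e.g. 'LDA 1', as an eight-bit word."""
    parts = line.split()
    operand = int(parts[1]) if len(parts) > 1 else 0
    return (OPCODES[parts[0]] << 4) | (operand & 0xF)


def run(program):
    """Execute the words in order and return the last value sent to OUT."""
    a, out = 0, None
    for word in program:
        code, data = word >> 4, word & 0xF
        if code == OPCODES["LDA"]:
            a = data
        elif code == OPCODES["ADD"]:
            a = (a + data) & 0xF      # assumption: result is kept in A
        elif code == OPCODES["OUT"]:
            out = a
    return out


program = [assemble(line) for line in ("LDA 1", "ADD 2", "OUT")]
print([hex(word) for word in program])   # ['0x51', '0x12', '0x30']
print(run(program))                      # 3
```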


Fig. 3.27 Microprocessor circuit ready for simulation (Vsm-Microprocessor.SCH)

Once simulation starts, there are several things to do in order to run the code:
• Deactivate the reset signal MainClear (1).
• Click on the main clock (2).
• At each active edge of the clock, observe the phase counter shifting from phase0 to phase1, phase2, phase3 and back to phase0 (3).
• Starting in phase two, the instruction is loaded into the microinstruction controller. The active instruction appears as shown in (4), which corresponds here to “Load (0101)”.
• You can monitor the memory contents and the active memory location (5).
• Also worth monitoring is the internal bus (6).
• If required by the program, you can enter data through the keyboard named DataIn (7).
• If the “OUT” instruction is running, the result should appear on the output display (8).
At the end of the addition program, the screen appears as reported in Fig. 3.28.

3.5.12 Memory Move
One important feature NOT handled by the very simple microprocessor is the memory move (MOVE). This instruction transfers the contents of a memory location to accumulator A or vice versa. Why did we not build this functionality into the first version of our processor? The reason is that the structure of the memory control and access would have to be deeply modified, which would require a significant amount of supplementary hardware.


Fig. 3.28 Final result of the addition of “1” and “2” using the program proposed in Table 3.5 (Vsm-Microprocessor.SCH)

Assuming that the MOVE operation transfers the contents of one memory location to A, we need to perform the following sequence of operations: during phase three, we need to have access to a new memory location, whose address is not the one currently stored in the program counter. This means that a new type of access must be provided in the processor from the internal bus to the memory, without altering the contents of the instruction register. The differences between the two structures are displayed in Fig. 3.29.

Fig. 3.29 Modifying the microprocessor to handle MOVE instruction


In practice, the MOVE instruction can be incorporated by adding the following:
• A direct path from memory to the internal bus (with its appropriate Enable control)
• A four-bit address bus from the instruction register to the memory
• A multiplexer for selecting a memory address either from the Program Counter or from the Instruction Register.

3.5.13 Physical Implementation

Description of the Design Flow
The VSM processor has been described and simulated at logic level using DSCH, and saved under the name vsm-microprocessor.SCH. It can be converted automatically into layout using MICROWIND. The design flow is detailed in Fig. 3.30. First we create a VERILOG description of the VSM processor using the command File → Make Verilog File. The resulting text file vsm-microprocessor.TXT contains a VERILOG description of the processor. This file can be compiled in MICROWIND using the command Compile → Compile Verilog File in order to automatically generate the layout of the processor.

Fig. 3.30 Automatically generating the layout of VSM processor from logic circuit


VERILOG Translation
In its basic version, the microprocessor includes 312 primitives. This relatively small number of devices is due to the fact that the memory symbol is ignored during the translation to VERILOG. This is because the memory macro-cell used in the microprocessor design is not a real memory, as it does not contain any real memory element such as flip-flops. The warning generated by DSCH during the VERILOG translation is shown in Fig. 3.31. A partial view of the VERILOG description of the VSM (vsm-microprocessor.TXT) is shown in Fig. 3.32.

Fig. 3.31 Warning concerning the memory macro that has not been translated into a standard VERILOG description

Fig. 3.32 A partial view of VERILOG description of the four-bit microprocessor (vsm-microprocessor.TXT)


3.5.14 Creating the Layout of the Complete Microprocessor
To generate a complete layout of the microprocessor, we need to design a cell-based 8 × 8 bit memory that works exactly as the memory macro-cell. This can be done by constructing an array of 8 × 8 register cells based on very simple ring inverters as shown in Fig. 3.33. Data can be written to the memory cell via nMOS N1 when the Write control is high. Data is read from the cell when the Read control is high.

Fig. 3.33 Design of a very simple memory cell based on two ring inverters (Vsm-memorycell.SCH)

The design of an 8 × 8-bit memory array is shown in Fig. 3.34. There are eight memory cells in each row for storing the eight bits of an instruction. At any one time only one memory location (one row) can be accessed by asserting one of the signals MemLocn0-MemLocn7. These eight signals are generated by the three-to-eight decoder shown in Fig. 3.35 using the three-bit address information (Addr2-Addr0). In Fig. 3.34, either the Read or the Write signal is asserted for a Read or a Write operation. The complete 8 × 8 memory including the address decoder is shown in Fig. 3.36.

In order to generate a layout of the microprocessor we replace the memory macro (Vsm-Mem8×8 Macro.sch) used in the microprocessor of Fig. 3.27 with the real 8 × 8 memory block presented in Fig. 3.36. The new microprocessor containing this real memory block is shown in Fig. 3.37. Note that the three-bit address information can be supplied to the memory (Vsm-Mem8×8) either from the top keypad titled Addr or from the program counter, using a set of three multiplexers controlled by the WriteMem signal. During a memory write operation (WriteMem is high) the address comes from the top keypad, so the user is able to specify the memory address at which each instruction is stored. When the processor executes instructions (WriteMem is low), it reads them from memory one after the other according to the addresses supplied by the program counter.
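The address path described above can be sketched in Python. The function names are hypothetical, and the model assumes that only the three least significant bits of the program counter reach the eight-location memory.

```python
def decoder_3to8(addr):
    """Return [MemLocn0, ..., MemLocn7]; only the addressed row is asserted."""
    return [int(addr == k) for k in range(8)]


def select_address(write_mem, keypad_addr, program_counter):
    """Three multiplexers controlled by WriteMem choose the address source.

    Assumption: both sources are truncated to three bits (Addr2-Addr0).
    """
    source = keypad_addr if write_mem else program_counter
    return source & 0b111


# Program entry: WriteMem is high, so the keypad address selects the row.
print(decoder_3to8(select_address(1, keypad_addr=5, program_counter=0)))
# Program execution: WriteMem is low, so the program counter selects the row.
print(decoder_3to8(select_address(0, keypad_addr=5, program_counter=2)))
```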


Fig. 3.34 An 8 × 8-bit memory array (Vsm-Mem8×8Array.SCH)

Fig. 3.35 A three-to-eight decoder for memory addressing (Vsm-3to8Decoder.SCH)


Fig. 3.36 The complete 8 × 8 memory including address decoder (Vsm-Mem8×8.SCH)

Fig. 3.37 Complete microprocessor containing real memory (Vsm-ProcessorRealMem.SCH)


Follow the steps below for entering a program into the memory and then simulating the operation of the processor with the loaded program.

Program Entry
Enter the program given in Table 3.5 into the processor memory as follows:
• Start simulation in Dsch3.
• The processor should be disabled by default. In any case it can be disabled by making sure that MainClear is active (low).
• Assert the Memory Write signal by clicking the WriteMem button (high).
• Enter address (0) using the top keypad titled Addr. The first memory location is now selected.
• Enter the first instruction using the two bottom keypads titled Inst and Data.
• Change addresses sequentially and enter the corresponding instructions.
• No clocking is necessary for the program entry operation.
• When all instructions are entered into the memory, click the WriteMem button in order to disable the memory write operation.

Program Execution
• Enable the processor by deactivating the MainClear (high).
• Cycle through various phases of processor operation by repeatedly clicking on the MainClock button until all instructions are executed by the processor.
• At each active edge of the clock observe the phase counter shifting from phase0 to phase1, then phase2, then phase3, and back to phase0 for the next instruction.
• You can observe the intermediate results in the top display (attached to the Arithmetic Unit) as each instruction is read and executed by the processor.
• When the OUT instruction is executed, the final result appears on the output display (attached to the Output Register).
Figure 3.38 shows the simulation results from execution of the program given in Table 3.5. More details about the implementation of the VSM microprocessor, in particular the interfacing of the microprocessor to the external world, may be found on the web site of MICROWIND [3].

3.6 Conclusion
In this chapter, the design of a very simple four-bit microprocessor has been presented. The basic processor implements five instructions. This gives the foundations for building more complex processors with extended instruction sets, more sophisticated exchanges between the main memory and the accumulators, and more powerful arithmetic units, in order to build a more attractive microprocessor.


Fig. 3.38 Simulation results for addition of ‘1’ and ‘2’ by a processor using the program given in Table 3.5

References
[1] A. P. Malvino, J. A. Brown, Digital Computer Electronics, Third Edition, Glencoe-Macmillan, 1992, ISBN 0-02-800594-5, USA.
[2] E. Sicard, S. Bendhia, Basics of CMOS Cell Design, Tata McGraw-Hill, 2005, ISBN 0-07-059933-5.
[3] The MICROWIND web site is www.microwind.org

EXERCISES
1. Modify the microprocessor in order to handle the MOVE operation from the memory to the accumulator A, according to the recommendations of Fig. 3.29. The instruction code can be 0110, with the mnemonic MOVE.
2. List the necessary hardware and supplementary control signals in order to perform the STORE operation from accumulator A to a desired memory location.
3. Modify the arithmetic unit in order to perform the Shift Right one bit (SHR, coded 1000) and Shift Left one bit (SHL, coded 1001) operations. What new input do you need to add? In order to reduce the number of ALU controls, how can you handle the ADD, SUB, SHR and SHL signals?
4. Modify the microinstruction controller to handle the ADD, SUB, SHR and SHL operations.
5. Test the new microprocessor with these enhanced functions.


4 Field-Programmable Gate Array
This chapter introduces the principles, implementation and programming of configurable logic circuits, from the point-of-view of cell design and interconnection strategy.

4.1 Introduction
Field-Programmable Gate Arrays (FPGA) are specific ICs that can be user-programmed easily. The FPGA contains versatile functions, configurable interconnects and an input/output interface to adapt to the user specification. FPGAs allow rapid prototyping using custom logic structures, and are very popular for limited production products. Modern FPGAs are extremely dense, with a complexity of several million gates, which enables the emulation of very complex hardware such as parallel microprocessors, mixtures of processors and signal processing, and so on. One key advantage of FPGAs is their ability to be reprogrammed, in order to create completely different hardware by modifying the logic gate array. The usual structure of an FPGA is given in Fig. 4.1.

One example of a very simple function (three-input XOR) implemented in an FPGA is given in Fig. 4.2. Three pads on the left are configured as inputs, one logic block is used to create the three-input XOR and one pad on the right is used as output. The propagation of signals is handled by interconnect lines, connected together at specific programmable interconnect points. The three input pads represent the logical information A, B and C (Fig. 4.3). An internal routing path is created to establish an electrical link between the I/O region and the logic block. Internally, the logic block may be configured in any combination of sequential basic functions. Each logic block usually supports three to eight logic inputs. In our example, the block is configured as a three-input XOR. Then, other internal routing wires are configured in order to carry the signal to an I/O pad configured as an output. The global propagation delay of such an architecture is clearly very high compared to that of a three-input XOR gate that may be found in the cell library. This is usually the price to pay for configurable logic circuits.

Copyright © 2007 by The McGraw-Hill Companies, Inc.

Fig. 4.1 Basic structure of an FPGA (diagram labels: configured I/O pads, programmable logic blocks, programmable interconnect points)

Fig. 4.2 Using an FPGA to build a three-input XOR gate

Notice that FPGAs not only exist as simple components, but also as macro-blocks in system-on-chip designs (Fig. 4.4). In the case of communication systems, the configurable logic may be dynamically changed to adapt to improved communication protocols. In the case of very-low-power systems, the configurable logic may handle several different tasks in series, rather than embedding all corresponding hardware that never works in parallel.


Fig. 4.3 Equivalent circuit for FPGA configured in XOR3 gate

Fig. 4.4 FPGAs exist as stand-alone ICs or blocks within a system-on-chip

4.2 Configurable Logic Circuits
The programmable logic block must be able to implement all basic logic functions, that is INV, AND, NAND, OR, NOR, XOR, XNOR, and so on. Several approaches are used in the FPGA industry to achieve this goal. The first approach consists in the use of multiplexors, the second one in the use of look-up tables.


Multiplexors
Surprisingly, a two-input multiplexor can be used as a programmable function generator, as illustrated in Table 4.1. Remember that the multiplexor output f is equal to i0 if en = 0, and i1 if en = 1. For example, the inverter is created if the multiplexor input i0 is equal to one, i1 is equal to zero, and en is connected to A. In that case, the output f is ~A. Figure 4.5 describes the use of multiplexors to produce the OR, AND, NOT and BUF functions.

Table 4.1 Use of multiplexor to build logic functions

Function    Boolean expression for output f    i0    i1    en
BUF(A)      f=A                                0     A     1
NOT(A)      f=~(A)                             1     0     A
AND(A,B)    f=A&B                              0     B     A
OR(A,B)     f=A|B                              B     1     A
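The entries of Table 4.1 can be verified with a few lines of Python. The function names are hypothetical; `mux` follows the convention stated in the text (f = i0 if en = 0, f = i1 if en = 1). The XOR shown here uses two multiplexors, one of them acting as an inverter, which is one possible arrangement and not necessarily the one drawn in the schematic.

```python
def mux(i0, i1, en):
    """Two-input multiplexor: output is i0 when en = 0, i1 when en = 1."""
    return i1 if en else i0


def buf(a):
    return mux(0, a, 1)


def not_(a):
    return mux(1, 0, a)


def and_(a, b):
    return mux(0, b, a)


def or_(a, b):
    return mux(b, 1, a)


def xor_(a, b):
    """XOR from two multiplexors: the second one produces ~B."""
    return mux(b, not_(b), a)


for a in (0, 1):
    for b in (0, 1):
        print(a, b, and_(a, b), or_(a, b), xor_(a, b))
```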

Fig. 4.5 Use of multiplexors to build logic functions (fpgaMux.SCH)

Although NOT, AND and OR are directly available, other functions such as NAND, NOR and XOR cannot be built directly using a single two-input multiplexor, but need at least two multiplexor circuits. The XOR function is shown in Fig. 4.6. The four-input XOR gate would require six multiplexor cells. Remember that each multiplexor cell consists of a minimum of six transistors for a buffered output, and has three delay stages (two inverters and the pass transistor). The XOR4 implementation would comprise a total of 18 delay stages, which is far too many. Therefore, the multiplexor approach is not very efficient for many logical functions.

Fig. 4.6 The XOR gate built from two multiplexor circuits (fpgaMux.SCH)

Look-Up Table
The Look-Up Table (LUT) is by far the most versatile circuit to create a configurable logic function [1]. The LUT shown in Fig. 4.7 has three main inputs F0, F1 and F2. The main output is Fout, which is a logical function of F0, F1 and F2. The output Fout is defined by the values given to Value[0]..Value[7]. The three values F0, F1, F2 create a three-bit address i between zero and seven, so that Fout gets the value of Value[i]. In the example given in Fig. 4.7, the input creates the number “5”, so Value[5] is routed to Fout. Table 4.2 gives Value[i] for the most common logical functions of F0, F1 and F2.

Fig. 4.7 The 3-bit address i selects one of the 8 values


Table 4.2 Link between basic logic functions and the information stored in Value[0]..Value[7]

Function    Value[0]  Value[1]  Value[2]  Value[3]  Value[4]  Value[5]  Value[6]  Value[7]
~F0         1         0         1         0         1         0         1         0
~F1         1         1         0         0         1         1         0         0
~F2         1         1         1         1         0         0         0         0
F0&F1       0         0         0         1         0         0         0         1
F0|F1|F2    0         1         1         1         1         1         1         1
F0^F1^F2    0         1         1         0         1         0         0         1

In the case of the three-input XOR (F0^F1^F2), the set of values of Fout given in the truth-table of Table 4.3 must be assigned to Value[0]..Value[7]. In the schematic diagram shown in Fig. 4.8 we must manually assign the Fout truth-table to each of the eight buttons. Then Fout produces the XOR function of inputs F0, F1 and F2.

Table 4.3 Truth-table of the three-input XOR gate for its implementation in an LUT

F2    F1    F0    Fout = F0^F1^F2    Assigned to
0     0     0     0                  Value[0]
0     0     1     1                  Value[1]
0     1     0     1                  Value[2]
0     1     1     0                  Value[3]
1     0     0     1                  Value[4]
1     0     1     0                  Value[5]
1     1     0     0                  Value[6]
1     1     1     1                  Value[7]
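The relation between a logic function and the contents of the LUT can be sketched in Python. The helper names `build_lut` and `lut_out` are hypothetical; the address convention (F2 is the most significant bit, F0 the least significant) is the one used in Table 4.3.

```python
def build_lut(func):
    """Return [Value[0], ..., Value[7]] for a function of (f0, f1, f2)."""
    return [func(i & 1, (i >> 1) & 1, (i >> 2) & 1) & 1 for i in range(8)]


def lut_out(values, f0, f1, f2):
    """Fout is the stored value at address i = F2 F1 F0."""
    return values[(f2 << 2) | (f1 << 1) | f0]


xor3 = build_lut(lambda f0, f1, f2: f0 ^ f1 ^ f2)
and2 = build_lut(lambda f0, f1, f2: f0 & f1)

print(xor3)                      # [0, 1, 1, 0, 1, 0, 0, 1], as in Table 4.3
print(and2)                      # [0, 0, 0, 1, 0, 0, 0, 1]
print(lut_out(xor3, 1, 0, 1))    # address 5, so Value[5] is routed to Fout
```

Any three-input function can be stored this way, which is why the LUT is more versatile than the multiplexor approach.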

Memory Points
Memory points are essential components of the configurable logic blocks. The memory point is used to store one logical value, corresponding to the logic truth-table. For a three-input function (F0, F1, F2 in the previous LUT), we need an array of eight memory points to store the information Value[0]..Value[7]. Here again, several approaches exist to store one single bit of information. The one illustrated in Fig. 4.9 consists of D-reg cells. Each register stores one logical value Value[i]. The D-reg cells are chained in order to limit the control signals to one clock ClockProg and one data signal DataProg. The logical data Value[i] is fully programmed by a word of eight bits sent in series to the signal DataProg.

Fig. 4.8 The output f produces a logical function Fout according to an LUT stored in memory point Value[i] (FpgaLutStructure.SCH)

The configuration of the three-input LUT into a three-input XOR gate follows a strict protocol described in Fig. 4.10. A series of eight active edges is generated by the ClockProg signal (the D-reg is active on falling edges). This is done by configuring a pulse generator with a series of zeros and ones, as shown in Fig. 4.10. At each active edge, the shift register is fed by a new value presented sequentially at input DataProg (Fig. 4.11). As the D-reg is active on the falling edge, data may be changed on each rising edge. Notice that the last register corresponds to Value[7]. Therefore, Value[7] must be inserted first, and Value[0] last. This means that the DataProg pulse must describe the truth-table in reverse order. Most FPGA designs use D-reg cells to store the LUT configuration. Notice that the configuration is lost when the power supply is down.

Fuse and Antifuse
To retain the configuration even without power supply, non-volatile memories must be used. A one-time programmable non-volatile memory is the fuse [1][2]. Usually, a contact between metal layers is used as a fuse, as an over-current would blow its structure, as illustrated in Fig. 4.12. Although this technique induces severe damage close to the contact, it requires no specific technological layer, which makes it a CMOS-compatible approach.


Fig. 4.9 The look-up information is given by a shift register based on D-reg cells (FpgaLutDreg.SCH)

Fig. 4.10 Programming the ClockProg pulse to generate eight active edges (FpgaLutDreg.SCH)


Fig. 4.11 At the end of the eighth clock period, the LUT is configured as a three-input XOR (FpgaLutDreg.SCH)

Fig. 4.12 Contact fuse

A driver with large channel width (several µm), supplied by the highest available voltage (VDDH), generates a very strong current pulse. The schematic diagram of the fuse circuit is shown in Fig. 4.13. When the command BlowFuse is active, both nMOS and pMOS devices are on, leading to a short-circuit current. This current must be higher than 15 mA to destroy the contact.

In contrast to the fuse, the normal state of the antifuse is open. In the example shown in Fig. 4.14, a thin insulator interrupts the contact between metal1 and metal2. A very high voltage applied between metal1 and metal2 (typically 10 V) breaks the oxide and creates a conductive path between the metal layers. The use of very high voltage on the chip requires a careful use of high-voltage MOS, and of specific I/O pads, to ensure that no part of the circuit is damaged. Another popular structure, called ONO (Oxide, Nitride, Oxide), leads to a resistive path when programmed. The typical value of the resistance is 500 Ω. Statistically, the spread of the resistance is much larger for the SiO2 antifuse than for the ONO antifuse [1]. This makes the ONO antifuse more attractive, at the price of supplementary process steps.


Fig. 4.13 Fuse circuit programming (FuseCircuits.SCH)

Fig. 4.14 The antifuse principles and the comparative resistance spread for ONO and SiO2

Other types of non-volatile memories are also used for hardware programming of FPGA arrays: EEPROM and FRAM memories. These memories are not altered when the power supply is down, and can be reprogrammed a large number of times. These types of memory cells are detailed in Chapter Nine.

Implementation in DSCH
In DSCH, an LUT symbol is proposed in the symbol menu (Fig. 4.15). It is equivalent to the schematic diagram of Fig. 4.8. An important property of the LUT symbol is its ability to retain the internal programming as a non-volatile memory would do. The user interface of the LUT symbol is given in Fig. 4.15. There are three ways of filling the LUT. The first consists in defining each array element with a zero or a one. The element number corresponds to the logic combination of inputs F2, F1, F0. For example, n°4 is coded 100 in binary, corresponding to F2 = 1, F1 = 0 and F0 = 0. The second consists in choosing the function description in the list. The logic information Fout assigned to each combination of inputs updates the LUT. The third is to enter a description based on inputs F0, F1 and F2, and the logic operators “~” (Not), “&” (And), “|” (Or) and “^” (Xor), then click the button Fill LUT to transfer the result of the expression to the table.

Fig. 4.15 The LUT symbol

4.3 Programmable Logic Block
The programmable logic block consists of an LUT, a D-register and some multiplexors. There exist numerous possible structures for logic blocks. We present in Fig. 4.16 a simple structure which has some similarities with the Xilinx XC5200 series (see [1] for detailed information on its internal structure). The configurable block contains two active structures, the LUT and the D-register, that may work independently or be mixed together. The output of the LUT is directly connected to the block output Fout. The output can also serve as the input data for the D-register, thanks to the multiplexor controlled by DataIn_Fout. The DataOut net can simply pass the signal DataIn. In that case the cell is transparent. The DataOut signal can also pass the signal nQ, depending on the multiplexor status controlled by DataIn_nQ.

78

Advanced CMOS Cell Design

Fig. 4.16 Simple configurable logic block including the LUT and a D-register (FpgaCell.SCH)

The block now consists of the LUT and the D-register. We chain the information DataIn_Fout and DataIn_nQ on the path of the shift register by adding two supplementary Dreg cells. Each Dreg still uses the same clock ClockProg and chained input data DataProg. The complete circuit is shown in Fig. 4.17.

Fig. 4.17 LUT, D-register and shift register, including the two multiplexor cells (FpgaBlockStructure.SCH)

The block is configured by 10 active clock edges on ClockProg and 10 serial data bits on DataProg (Table 4.4). The chain of Dreg cells starts at Dreg0 (upper Dreg in Fig. 4.17, which produces Value[0]) and stops at Dreg9 (right side of Fig. 4.17, which produces DataIn/nQ). The information that flows to the far end of the register chain is defined at the first cycle, while the closest register is configured by the data present at the last active clock edge.

Table 4.4 Serial data information used to program LUT memory points

Clock cycle  1          2            3       4       5       6       7       8       9       10
DataProg     DataIn/nQ  DataIn/Fout  Val[7]  Val[6]  Val[5]  Val[4]  Val[3]  Val[2]  Val[1]  Val[0]
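The ordering of Table 4.4 follows from the behavior of a shift register, which the following minimal sketch reproduces. The function name shift_in and the bit pattern are assumptions chosen for illustration; index 0 of the list stands for Dreg0, the register closest to DataProg.

```python
def shift_in(serial_bits, length=10):
    """Shift bits into a chain of D-registers; index 0 is the closest register."""
    chain = [0] * length
    for bit in serial_bits:          # one active ClockProg edge per bit
        chain = [bit] + chain[:-1]   # each register takes its neighbour's value
    return chain


names = ["DataIn/nQ", "DataIn/Fout"] + [f"Val[{i}]" for i in range(7, -1, -1)]

# Hypothetical bit pattern, sent in the order of Table 4.4 (cycle 1 first)
bits = [0, 1, 1, 0, 0, 1, 0, 1, 1, 0]
chain = shift_in(bits)

for cycle, (name, bit) in enumerate(zip(names, bits), start=1):
    print(f"cycle {cycle:2d}: {name:12s} = {bit}  held by Dreg{10 - cycle}")
```

The bit sent at the first cycle ends in Dreg9 (DataIn/nQ), and the bit sent at the tenth cycle stays in Dreg0 (Val[0]).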

4.4 Interconnection Between Blocks

The interconnection strategy between logic blocks is detailed in this paragraph. We shall focus on the programmable interconnect point and the programmable switching matrix. Then, we will discuss the global implementation of the structure.

Programmable Interconnect Point

The elementary programmable interconnect point (PIP) may be found in the Advanced set of Switches symbols (Fig. 4.18). It consists of a configurable bridge between two interconnects. The PIP may have two states: ‘On’ and ‘Off’. You may switch from ‘On’ to ‘Off’ by a double-click on the symbol (screen shown in Fig. 4.19) and a click on the button On/off. The bridge can be built from a transmission gate, controlled once again by a D-reg cell (Fig. 4.20). When the register information contains a zero, the transmission gate is off and no link exists between Interco1 and Interco2. When the information held by the register is one, the transmission gate establishes a resistive link between Interco1 and Interco2. The resistance value is around 100 Ω. The regrouping of programmable interconnect points into a matrix is of key importance to ensure the largest routing flexibility. Examples of three × three and three × two PIP matrices are shown in Fig. 4.21.

Fig. 4.18 The PIP in the palette of symbols

Fig. 4.19 Changing the state of PIP (FpgaPip.SCH)

Fig. 4.20 Internal structure of PIP and illustration of its behavior when (a) Off and (b) On (FpgaPip.SCH)

The link between In1 and Out1, In2 and Out2, In3 and Out3 is achieved by turning some PIP on. A specific routing tool usually handles this task, but the manual re-arrangement is not rare in some complex situations. In DSCH, just press the key “O” to switch the PIP On and Off.

Fig. 4.21 Matrix of PIPs (FpgaPip.SCH)

Switching Matrix

The switching matrix is a sophisticated programmable interconnect point, which enables a wide range of routing combinations within a single interconnect crossing. The aspect of the switching matrix is given in Fig. 4.22. The matrix includes six configurable bridges between the two main interconnects. The switching matrix symbol may be found in the Advanced set of Switches symbols. By a double-click on the matrix symbol, you can access the six On/Off switches. To ease the programming of the matrix, short-cuts exist in DSCH. You can change the state of the matrix by placing the cursor on the desired symbol and pressing the following keys:

• To switch off the matrix, press the key “O”.
• To switch on the matrix, press the key “O”.
• To enable a horizontal link, press the key “-”.
• To enable a vertical link, press the key “|”.

Fig. 4.22 Changing the state of matrix (FpgaMatrix.SCH)

Examples of three × two and three × three switching matrices are given in Fig. 4.23. The routing possibilities are numerous, which improves the configurability of the logic blocks.

Implementation of the Switching Matrix

From a practical point of view, the switching matrix can be built from a regrouping of six transmission gates (Fig. 4.24). Each transmission gate is controlled by an associated Dreg cell, which memorizes the desired configuration. The D-reg cells are chained so that one single input DataIn and one clock LoadClock are enough to configure the matrix.

Array of Blocks

The configurable blocks are associated with programmable interconnect points and switching matrices to create a complete configurable core. An example of a double configurable block and its associated configurable routing is proposed in Fig. 4.25.


Fig. 4.23 Three × two switching matrix and example of routing strategy between six inputs and outputs (fpgaMatrix.SCH)

Full-Adder Example

The truth-table and logical expression for the full-adder are recalled in Table 4.5. The implementation of the CARRY and SUM functions is achieved by programming two LUTs according to the truth-tables reported in Table 4.5.

Fig. 4.24 Transmission gates placed on routing lines to build the matrix (FpgaMatrix3.SCH)


Fig. 4.25 Configurable blocks, switching matrix, configurable I/Os and arrays of PIP (fpga2blocks.SCH)

Table 4.5 Full-adder truth-table

A  B  C  SUM  CARRY  RESULT
0  0  0  0    0      0
0  0  1  1    0      1
0  1  0  1    0      1
0  1  1  0    1      2
1  0  0  1    0      1
1  0  1  0    1      2
1  1  0  0    1      2
1  1  1  1    1      3
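The truth-table can be regenerated programmatically, which also gives the eight values to store in each LUT. The sketch below is an illustration only; it assumes that A, B and C drive the LUT inputs F2, F1 and F0 respectively, so that entry n of each list corresponds to the input code n.

```python
def full_adder_rows():
    """Return one (A, B, C, SUM, CARRY, RESULT) tuple per input combination."""
    rows = []
    for n in range(8):
        a, b, c = (n >> 2) & 1, (n >> 1) & 1, n & 1
        total = a + b + c                      # RESULT column of Table 4.5
        rows.append((a, b, c, total & 1, total >> 1, total))
    return rows


rows = full_adder_rows()
sum_lut = [r[3] for r in rows]     # entry n of the LUT programmed as SUM
carry_lut = [r[4] for r in rows]   # entry n of the LUT programmed as CARRY

for a, b, c, s, carry, result in rows:
    print(a, b, c, s, carry, result)
```

SUM is the three-input XOR of the inputs and CARRY is their majority function, so RESULT = 2 × CARRY + SUM on every row.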

The general diagram of the full-adder implementation is given in Fig. 4.26. One programmable logic block Block1 supports the generation of the sum for given logic values of the inputs A, B and C. The information needed to configure Block1 as a SUM function (three-input XOR) is given in Table 4.6. Notice that we only use the LUT in this programmable logic block. The Dreg is not active, and we only exploit the output of the LUT Fout, which is configured as the SUM. The signal SUM propagates outside the block to the output interface region by exploiting the interconnect resources and switching matrix. The other programmable logic block Block2 supports the generation of the signal CARRY, from the same inputs A, B and C. The programming of Block2 is also given in Table 4.6. The result CARRY is exported to the output interface region as for the SUM signal. Again, in this block, only the LUT is active.

Fig. 4.26 SUM and CARRY functions to realize full-adder in FPGA (fpgaFullAdder.SCH)

Table 4.6 Serial data used to configure the logic blocks 1 & 2 as SUM and CARRY

Block 1 (Sum of F0, F1 and F2)
Cycle         1          2            3       4       5       6       7       8       9       10
Memory point  DataIn/nQ  DataIn/Fout  Val[7]  Val[6]  Val[5]  Val[4]  Val[3]  Val[2]  Val[1]  Val[0]
DataProg      0          1            0       0       1       0       1       1       0       0

Block 2 (Carry of F0, F1 and F2)
Cycle         1          2            3       4       5       6       7       8       9       10
Memory point  DataIn/nQ  DataIn/Fout  Val[7]  Val[6]  Val[5]  Val[4]  Val[3]  Val[2]  Val[1]  Val[0]
DataProg      0          1            1       1       0       1       1       0       0       0

The programming sequence is contained in the piece-wise-linear symbols ProgBlock1 and ProgBlock2. As seen in the chronograms of Fig. 4.28, the program clock ClockPgm is only active at the initialization phase, to shift the logic information to the memory points inside the blocks which configure each multiplexor. The routing of the signals A, B and C as well as Sum and Carry has been done manually in the circuit shown in Fig. 4.27. In reality, specific placement/routing tools are provided to generate the electrical structure automatically from the initial schematic diagram, which avoids manual errors and limits conflicts or omissions.


Fig. 4.27 Simulation of the full-adder implemented in two configurable blocks (fpgaFullAdder.SCH)

Fig. 4.28 Chronograms of the full-adder FPGA (fpgaFullAdder.SCH)

Clock Divider Example

A second example is proposed as an application of the FPGA circuits. It concerns clock division. We recall in Fig. 4.29 the general structure and the typical chronograms of the clock division by four, which requires two Dreg cells, with a feedback from the output ~Q to the input D.

Fig. 4.29 Diagram and typical simulation of the clock divider by four (ClockDiv4.SCH)
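The principle of the division by four can be checked with a small behavioral model. The sketch below is an assumption-laden illustration, not the DSCH simulation: each stage toggles on a rising edge because its input D is fed by ~Q, and the second stage is assumed to be clocked by the output Q of the first stage.

```python
def divide_by_four(clock):
    """Return (clk, q1, q2) samples for two cascaded toggling D-registers."""
    q1 = q2 = 0
    prev_clk = prev_q1 = 0
    trace = []
    for clk in clock:
        if prev_clk == 0 and clk == 1:   # rising edge of the input clock
            q1 ^= 1                      # D = ~Q, so the first stage toggles
        if prev_q1 == 0 and q1 == 1:     # rising edge of the first stage
            q2 ^= 1                      # second stage toggles (assumed wiring)
        prev_clk, prev_q1 = clk, q1
        trace.append((clk, q1, q2))
    return trace


def rising_edges(samples):
    """Count the 0 -> 1 transitions in a list of samples, starting from 0."""
    return sum(1 for before, after in zip([0] + samples, samples)
               if before == 0 and after == 1)


clock = [0, 1] * 8                        # 8 periods of the input clock
trace = divide_by_four(clock)
q1_samples = [q1 for _, q1, _ in trace]
q2_samples = [q2 for _, _, q2 in trace]
print(rising_edges(clock), rising_edges(q1_samples), rising_edges(q2_samples))
```

Over eight input periods the first stage produces four rising edges and the second stage two, which corresponds to a division by two and by four.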

The general diagram of the clock divider implementation is given in Fig. 4.30. Each programmable logic block is configured as a single-stage clock divider. The information needed to configure Block1 as a simple Dreg function is given in Table 4.7. This serial data information creates a direct path from DataIn to input D of the Dreg cell, while nQ propagates to DataOut, as detailed in Fig. 4.31.

Fig. 4.30 Implementation of the clock divider in two configurable blocks (FpgaDiv4.SCH)

Fig. 4.31 Use of the configurable block as a DReg (FpgaDiv4.SCH)

Table 4.7 Serial data used to configure the logic blocks 1 & 2 as clock dividers (FpgaDiv4.SCH)

Block 1 (DataOut=nQ, D=DataIn)
Cycle         1          2            3       4       5       6       7       8       9       10
Memory point  DataIn/nQ  DataIn/Fout  Val[7]  Val[6]  Val[5]  Val[4]  Val[3]  Val[2]  Val[1]  Val[0]
DataProg      0          0            0       0       0       0       0       0       0       1

Block 2 (DataOut=nQ, D=DataIn)
Cycle         1          2            3       4       5       6       7       8       9       10
Memory point  DataIn/nQ  DataIn/Fout  Val[7]  Val[6]  Val[5]  Val[4]  Val[3]  Val[2]  Val[1]  Val[0]
DataProg      0          0            0       0       0       0       0       0       0       1

Outside the programmable block, the signal nQ propagates to the input DataIn. Notice that the LUT is inactive in this configuration. The other programmable logic block Block2 is also programmed as a Dreg circuit with a feedback from nQ to DataIn (Fig. 4.31). The simulation of the counter is proposed in Fig. 4.33. The first nanoseconds are dedicated to the programming of the blocks. Once properly configured, the counter starts to work according to the specifications of Fig. 4.29. Notice the significant delay in responding to the active edges. This is due to the intrinsic complexity of the configuration block, and to the long interconnect delay through the connection points and switching matrix.

Fig. 4.32 Routing of the clock divider in two configurable blocks (FpgaDiv4.SCH)

Fig. 4.33 Chronograms of the clock divider circuit (ClockDiv4.SCH)

4.5 Conclusion

In this chapter, we have given a brief introduction to field programmable gate arrays, from the point of view of cell design. Firstly, the use of multiplexors and look-up tables for building configurable logic circuits has been illustrated. Secondly, the programming of memory points using chained registers and fuses has been described. Thirdly, we have described the programmable interconnect points and switching matrix, with their implementation in DSCH. Finally, the implementation of a full adder and a clock divider has been performed using two configurable logic blocks, programmable interconnect points and switching matrix.

References

[1] Michael J.S. Smith, Application Specific Integrated Circuits, Addison Wesley, ISBN 0-201-50022-1.
[2] A.K. Sharma, Semiconductor Memories, Technology, Testing and Reliability, IEEE Press, 1997, ISBN 0-7803-1000-4.
[3] John P. Uyemura, Chip Design for Submicron VLSI: CMOS Layout and Simulation, 2006, ISBN 0-534-46629-X.

EXERCISES

4.1 Using DSCH, configure the 16 switching matrices in order to connect: switch1 to lights L3 and L5, switch2 to lights L1 and L6, switch3 to lights L2 and L4, switch4 to lights L7 and L8.

Fig. 4.34 Routing exercise

4.2 Store the following eight bits (01110111) (reading from left to right) in the LUT, as in Fig. 4.4. How many active edges on ClockProg do you need to configure the LUT? Which logical function have you realized?

Answer: (a) 8 (b) Fout = F 2 + F1 ⋅ F 0

4.3 Store the eight bits (00000111) (reading from left to right) in the LUT of Fig. 4.8. Demonstrate that you have realized the following logical function: $\overline{F2 + (F1 \cdot F0)}$. Using two LUTs and one inverter, create the D_Latch shown below. Give the serial data sequence for DataProg.

Fig. 4.35 Implementing a D_latch in FPGA

Answer: (a) $\overline{F0} \cdot \overline{F1} \cdot \overline{F2} + F0 \cdot \overline{F1} \cdot \overline{F2} + \overline{F0} \cdot F1 \cdot \overline{F2} = \overline{F2 + (F1 \cdot F0)}$ (b) LUT N°1: F0 = Data, F1 = Clock, F2 = nQ, DataProg = 00000111; LUT N°2: F0 = Data, F1 = Clock, F2 = Q, DataProg = 00000111

4.4 How many programmable logic blocks do you need to create the following asynchronous counter (Fig. 4.36)? Give the programmable sequences for each block.

Fig. 4.36 Implementing an asynchronous counter in FPGA

Answer: Three blocks


4.5 Using programmable logic blocks create a one-bit comparator. The truth-table is given below.

Table 4.8 The comparator

A  B  A>B  A<B  A=B
0  0  0    0    1
0  1  0    1    0
1  0  1    0    0
1  1  0    0    1

5 Radio-Frequency Circuits

This chapter describes the general context of radio-frequency circuit design, integrated LC resonators, power amplifiers, high performance oscillators and frequency up/down converters.

5.1 Target Radio-Frequencies

Wireless communication systems such as those described in Table 5.1 require specific radio-frequency ICs with optimum performances. The radio-frequency ICs have to deal with traditional requirements such as low power consumption or high speed, and also with low process-variation influence, power efficiency, linearity, low temperature influence, and low noise sensitivity.

Table 5.1 Selected applications using radio-frequency ICs

Application    Description                          Frequency (MHz)  Data rate    Output power
GSM            Mobile phone, 1st generation         890–915          12 Kb/s      1–2 Watts
DCS            Mobile phone, 2nd generation         1800–1900        100 Kb/s     1 Watt
UMTS           Mobile phone, 3rd generation         1910–2200        0.1–2 Mb/s   1 Watt
Bluetooth      Wireless network                     2450             36–723 Kb/s  100 mW
IEEE 802.11a   Very high rate wireless networking   5200             6–18 Mb/s    0.1–1 Watt
IEEE 802.11b   High rate wireless networking        2450             1–5 Mb/s     0.1–1 Watt

Copyright © 2007 by The McGraw-Hill Companies, Inc.

Modern radio-frequency equipment operates in frequency ranges officially called ultra-high frequencies (UHF), ranging from 300 MHz to 3 GHz, and super-high frequencies (SHF), ranging from 3 GHz to 30 GHz. The “HF” band designates the range 3–30 MHz. Mobile phones and wireless networking have been the driving applications of radio-frequency ICs, as described in Fig. 5.1.
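As a quick check of where the applications of Table 5.1 fall, the band limits quoted above can be written as a small lookup. This is a minimal sketch; the names BANDS and band_of are assumptions, and only the three bands named in the text are included.

```python
# Band limits in Hz, as quoted in the text
BANDS = [("HF", 3e6, 30e6), ("UHF", 300e6, 3e9), ("SHF", 3e9, 30e9)]


def band_of(frequency_hz):
    """Return the name of the band containing the frequency, or None."""
    for name, low, high in BANDS:
        if low <= frequency_hz < high:
            return name
    return None  # outside the bands named in the text


for label, f_mhz in [("GSM", 900), ("Bluetooth", 2450), ("IEEE 802.11a", 5200)]:
    print(label, f_mhz, "MHz:", band_of(f_mhz * 1e6))
```

The mobile phone and Bluetooth frequencies fall in the UHF range, while the 5200 MHz wireless network falls in the SHF range.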

Fig. 5.1 Some key radio-frequency applications

The general diagram of a mobile phone is given in Fig. 5.2. The circuits detailed in this chapter refer mainly to the oscillators (VCO), the amplifiers and the filters.

5.2 Inductor

Inductors are commonly used for filtering, amplifying, or for creating resonant circuits used in radio-frequency applications. The inductance symbol in DSCH and MICROWIND is shown in Fig. 5.3. The layout of an on-chip inductor is typically a square spiral, since standard CMOS processes constrain all angles to be 90° (Fig. 5.4). When possible, a polygon spiral using 45° tracks is used to improve the electrical performance of the inductor. There exist a huge number of inductance calculation techniques, as detailed in the review from [6]. A very interesting discussion about square planar spiral inductors may be found in [4]. The inductance formula used in MICROWIND (Equation 5.1) is one of the most widely known approximations, proposed as early as 1928 by [5], and is said to be still accurate for the evaluation of on-chip inductors. With five turns, a conductor width of 20 µm, a spacing of 5 µm and a hollow of 100 µm, we get L = 11.6 nH.

Fig. 5.2 Generic diagram of the mobile phone structure

Fig. 5.3 The inductance symbol

Fig. 5.4 An integrated inductor

$$L = 37.5\,\mu_0\,\frac{n^2\,a^2}{22\,r - 14\,a}$$ (Eq. 5.1)

with
r = n (w + s)
$\mu_0 = 4\pi \cdot 10^{-7}$
n = number of turns
w = conductor width (m)
s = conductor spacing (m)
r = radius of the coil (m)
a = square spiral’s mean radius (m)

The quality factor Q is a very important metric to quantify the resonance effect. A high quality factor Q means low parasitic effects compared to the inductor effect. The formulation of the quality factor is not as easy as it appears. An extensive discussion about the formulation of Q depending on the coil model is given in [4]. We consider the coil as a serial inductor L1, a parasitic serial resistor R1, and two parasitic capacitors C1 and C2 to the ground, as shown in Fig. 5.5. Consequently, the Q factor is approximately given by Equation 5.2.

$$Q = \frac{\sqrt{\dfrac{L_1}{C_1 + C_2}}}{R_1}$$ (Eq. 5.2)
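Equations 5.1 and 5.2 can be evaluated with a short script. The sketch below is illustrative and rests on several assumptions: Equation 5.2 is read as the square root of L1/(C1 + C2) divided by R1, the mean radius a = 110 µm is a guessed value (the text gives the hollow, not a), and R1, C1 and C2 are hypothetical. The result therefore need not match the values reported by MICROWIND.

```python
import math

MU0 = 4 * math.pi * 1e-7   # permeability of free space (H/m)


def spiral_inductance(n, w, s, a):
    """Equation 5.1: n turns, width w, spacing s, mean radius a (SI units)."""
    r = n * (w + s)        # radius of the coil
    return 37.5 * MU0 * n**2 * a**2 / (22 * r - 14 * a)


def quality_factor(l1, r1, c1, c2):
    """Equation 5.2, read as sqrt(L1 / (C1 + C2)) / R1."""
    return math.sqrt(l1 / (c1 + c2)) / r1


# Five turns, 20 um width, 5 um spacing; a = 110 um is an assumption
inductance = spiral_inductance(n=5, w=20e-6, s=5e-6, a=110e-6)

# Hypothetical parasitic values, chosen only to exercise the formula
q = quality_factor(l1=inductance, r1=20.0, c1=0.1e-12, c2=0.1e-12)

print(f"L = {inductance * 1e9:.2f} nH, Q = {q:.1f}")
```

The script shows how the inductance grows with the square of the number of turns, and how Q drops when either the serial resistance or the parasitic capacitance increases.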

Fig. 5.5 Equivalent model of 12 nH default coil and approximation of quality factor Q

5.2.1 Inductor Design in MICROWIND

We investigate here the design of a rectangular on-chip inductor, the layout options and the consequences on the inductor quality factor. The inductor can be generated automatically by MICROWIND using the command Edit → Generate → Inductor. The inductance value as well as the parasitic resistance and the resulting quality factor Q appear at the bottom of the window. Using the default parameters, the coil inductance approaches 12 nH, with a quality factor of 1.15. The corresponding layout is shown in Fig. 5.6. Notice the virtual inductance (L1) and resistance (R1) symbols placed in the layout. These symbols indicate to the extractor that three separate electrical nodes are requested (A, B and C), with a serial inductor between A and B, and a serial resistance between B and C. If these symbols were omitted, the whole inductor would be considered as a single electrical node. Only the capacitances (C1, C2) would be properly extracted.

Fig. 5.6 The inductor generated by default (inductor12nH.MSK)

5.2.2 Inductor Impedance

On-chip inductance has a typical value ranging from 0.1 nH to 100 nH, which gives an equivalent impedance between 10 and 1000 Ω within the radio-frequency range 300 MHz–3 GHz (Fig. 5.7), by applying the formula of impedance versus frequency (Equation 5.3).

$$Z_L = j L \omega$$ (Eq. 5.3)
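The magnitude of Equation 5.3 is easily tabulated. The following sketch, in which the function name is an assumption, prints |ZL| = 2πfL for a few inductance values across the radio-frequency range.

```python
import math


def inductor_impedance(l_henry, f_hz):
    """Magnitude of Z_L = j.L.w, with w = 2.pi.f (result in ohms)."""
    return 2 * math.pi * f_hz * l_henry


for l_nh in (0.1, 1, 10, 100):
    values = [inductor_impedance(l_nh * 1e-9, f) for f in (300e6, 1e9, 3e9)]
    print(f"{l_nh:5} nH:", "  ".join(f"{z:8.2f} ohm" for z in values))
```

A 10 nH inductor at 1 GHz gives about 63 Ω, which is of the same order as the usual 50 Ω reference impedance.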

At frequencies lower than 100 kHz, discrete off-chip inductors are used because of the high inductor values (from 1 µH to 100 µH) needed to keep the impedance between 10 and 1000 Ω. Such high inductances cannot be integrated in a reasonable silicon area. Around 1 GHz, a 10 nH on-chip inductor matches the standard 50 Ω impedance of most input and output stages in very high frequency applications.

5.2.3 High-Quality Inductor

A high quality factor Q is attractive because it permits high voltage gain, and high selectivity in the frequency domain. The usual value for Q is between three and 30. The main limiting factors for Q are the serial resistance R1 of the wire and the substrate-coupling capacitors C1 and C2. From Equation 5.2, it is clear that R1, C1 and C2 should be kept as low as possible to increase Q. There are several ways to improve the coil quality factor. The first one consists of using the upper metal layer (metal6 in 0.12 µm), which features a smaller sheet resistance together with a smaller capacitance. Unfortunately, the quality factor is only increased to two.

Fig. 5.7 The inductor impedance versus frequency

A significant improvement consists in using metal layers in parallel (Fig. 5.8). The selection of metal2, metal3, up to metal6 reduces the parasitic resistance R1 by a significant factor, while the capacitances C1 and C2 are not changed significantly. The result is a quality factor near 13, for a 3 nH inductor. Even when the conductor width is increased to further reduce R1, or if the number of turns and the coil shape are changed, the maximum Q is almost invariably below 20.

Fig. 5.8 A 3D view of a high Q inductor using metal layers in parallel (Inductor3nHighQ.MSK)

5.2.4 Resonance

The coil can be considered as an RLC resonant circuit. At very low frequencies, the inductor is a short circuit, and the capacitor is an open circuit (Fig. 5.9 left). This means that the voltage at node C is almost equal to A, if no load is connected to node C, as almost no current flows through R1. At very high frequencies, the inductor is an open circuit, and the capacitor a short circuit (Fig. 5.9 right). Consequently, the link between C and A tends towards an open circuit.

Fig. 5.9 The behavior of an RLC circuit at low and high frequencies (Inductor.SCH)

At a very specific frequency the LC circuit features a resonance effect. The theoretical formulation of this frequency is given by Equation 5.4.

$$f_r = \frac{1}{2\pi\sqrt{L_1\,(C_1 + C_2)}}$$ (Eq. 5.4)

The variation of the resonant frequency with the capacitor and inductor is proposed in Fig. 5.10. On-chip coil inductances are within the range of 0.1 nH to 100 nH. As the capacitance may vary from 1 pF to 1 nF, the range of the resonant frequency is around 100 MHz to 10 GHz, which includes most of the radio-frequency designs.
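The resonant frequency of Equation 5.4 can be evaluated directly. The sketch below is a simple illustration in which the function name is an assumption; the capacitance argument stands for the total C1 + C2, and the two numerical cases use the 3 nH inductor discussed in this section.

```python
import math


def resonant_frequency(l1, c_total):
    """Equation 5.4, with c_total = C1 + C2 (SI units, result in Hz)."""
    return 1 / (2 * math.pi * math.sqrt(l1 * c_total))


f_a = resonant_frequency(3e-9, 1.4e-12)   # 3 nH with 1.4 pF
f_b = resonant_frequency(3e-9, 7e-12)     # 3 nH with 7 pF

print(f"3 nH, 1.4 pF: {f_a / 1e6:.0f} MHz")
print(f"3 nH, 7 pF:   {f_b / 1e6:.0f} MHz")
```

The first case lands close to 2.45 GHz and the second close to 1.1 GHz; since the frequency varies as the inverse square root of the capacitance, multiplying the capacitance by five lowers the frequency by a factor of about 2.2.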

Fig. 5.10 The resonant frequency depending on capacitance and inductance

In the Analysis menu, the command Resonant Frequency includes a resonant frequency calculator, as shown in Fig. 5.11. For a given value of inductance and capacitance (3 nH and 1.4 pF in this example), the resonant frequency is directly computed in megahertz (MHz). For a target frequency of 2.45 GHz, and a given inductance value of 3.0 nH, we must choose a capacitance close to 1.4 pF.

5.2.5 Simulation of the Coil

In the case of L1 = 3 nH (design corresponding to Fig. 5.12), the total capacitance is around 7 pF. From the abacus given in Fig. 5.10, we obtain a resonant frequency around 1 GHz. We may see the resonance effect of the coil and an illustration of the quality factor using the following procedure. The node A is controlled by a sinusoidal waveform with increasing frequency (also called “chirp” signal). We specify a very small amplitude (0.1 V), and a zero offset. The resonance can be observed when the voltage at nodes B and C is higher than the input voltage A. The ratio between B and A is equal to the quality factor Q.

Fig. 5.11 Microwind can compute the resonant frequency corresponding to user-defined L and C values

Fig. 5.12 Using a sinusoidal waveform with increased frequency (Inductor3nHighQ.MSK)


Fig. 5.13 The behavior of an RLC circuit near resonance (Inductor3nHighQ.MSK)

The frequency corresponding to the resonance is around 2.4 GHz (Fig. 5.13), as predicted by the theoretical formulation. However, some mismatch between the prediction and the simulation may appear: first of all, the sinusoidal generator forces node A to a given voltage, which inhibits the role of capacitor C1. The resonance is only based on L1, R1 and C2, which shifts the frequency to higher levels. Secondly, the simulation of the inductor effect requires a significant amount of computation, with a high precision, otherwise the simulation becomes unstable. In 0.12 µm, the simulation step is fixed to 0.3 ps, which is a good compromise between accuracy and speed. However, when dealing with an inductor, this step should be reduced. If we increase the step to one ps (Fig. 5.14a), an important parasitic instability effect appears and the output tends to oscillate. With a small simulation step (0.1 ps in the case of Fig. 5.14b), the simulation converges but the computation is significantly slowed down.

5.3 Power Amplifier

The power amplifier (PA) is part of the radio-frequency transmitter, and is used to amplify the signal being transmitted to an antenna so that it can be received at the desired distance (Fig. 5.15). The PA is at the end of a signal processing chain made of several blocks. Numerical or analog information is processed at low frequency, and converted to an appropriate sinusoidal waveform combined with a modulation. The high-frequency converter transforms the low-frequency signal f_low to a high-frequency signal f_high. The shape of f_high is identical to f_low except that the frequency is one or two orders of magnitude higher. Details about this circuit are provided later in this chapter. The amplitude of f_high is usually small (10–100 mV). A power amplifier is required to multiply the amplitude of the signal in order to transmit enough power to the emitting antenna.

Fig. 5.14 Numerical instability appears in inductor simulation when using a large integration interval (one ps) which is removed when lowering this interval to 0.1 ps

Fig. 5.15 Power amplifier in a typical radio-frequency system

5.3.1 Antenna Model

In a first order approximation, an antenna can be considered as a load equivalent to a simple resistance. The antenna resistance Ra accounts for the power absorbed by the antenna. This power is mainly radiated by the antenna. Most mobile phone antennas are resonant monopoles [1] for which the antenna resistance Ra varies from 20 Ω (ground plane width w = 0) to 40 Ω (infinite ground plane width). The monopole radiates mainly in the X and Y directions (Fig. 5.16). The length of the antenna is often chosen close to λ/4, where λ is the wavelength of the emitted signal. That length corresponds to the first maximum in the sinusoidal wave. From an electrical point of view, we can consider the antenna as a pure resistive load. The value of 50 Ω is commonly used for Ra in simulations, as most equipment is “50 Ω adapted”.
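The λ/4 rule gives an immediate estimate of the monopole length. The sketch below assumes free-space propagation (λ = c/f), which ignores the effect of the antenna environment; the function name is an assumption.

```python
SPEED_OF_LIGHT = 299_792_458.0   # m/s


def quarter_wave_length(frequency_hz):
    """Length of a quarter-wave monopole, assuming the free-space wavelength."""
    wavelength = SPEED_OF_LIGHT / frequency_hz
    return wavelength / 4


for label, f_hz in [("900 MHz", 900e6), ("2.45 GHz", 2.45e9)]:
    print(f"{label}: lambda/4 = {quarter_wave_length(f_hz) * 100:.1f} cm")
```

The length is around 8 cm at 900 MHz and around 3 cm at 2.45 GHz, which is consistent with the size of handheld equipment.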

Fig. 5.16 In first approximation, the antenna can be approximated as a load resistance 20–40 Ω.

5.3.2 Power Amplifier Principles

Most CMOS power amplifiers are based on a single MOS device, loaded with a “Radio-Frequency Choke” inductor LRFC, as shown in Fig. 5.17. The inductor serves as a load for the MOS device (at a given frequency f, the inductor is equivalent to a resistance 2π·f·LRFC), with two significant advantages as compared to the resistor: the inductor does not consume DC power, and the combination of the inductor and the load capacitor CL creates a resonance. The power is delivered to the load RL, which is often represented by a 50 Ω resistance. This load is, for example, the antenna monopole, which can be assimilated to a radiation resistance, as described in the previous section. The resonance effect is obtained between LRFC and CL. The formula for resonance (Equation 5.5) is given below.

$$f_{resonance} = \frac{1}{2\pi\sqrt{L_{RFC}\,C_L}}$$ (Eq. 5.5)

For example, a power amplifier designed for Bluetooth operation should resonate around 2.4 GHz. If we assume that the inductance has a value of 3 nH, the corresponding capacitor is around 1.5 pF.
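The capacitor value quoted above can be recovered by inverting Equation 5.5. The following sketch is an illustration in which the function names are assumptions.

```python
import math


def resonance_capacitor(f_target, l_rfc):
    """Invert Equation 5.5: C_L = 1 / ((2.pi.f)^2 . L_RFC)."""
    return 1 / ((2 * math.pi * f_target) ** 2 * l_rfc)


def resonance_frequency(l_rfc, c_l):
    """Equation 5.5."""
    return 1 / (2 * math.pi * math.sqrt(l_rfc * c_l))


c_l = resonance_capacitor(2.4e9, 3e-9)
print(f"C_L = {c_l * 1e12:.2f} pF")
print(f"check: f = {resonance_frequency(3e-9, c_l) / 1e9:.3f} GHz")
```

With a 3 nH choke and a 2.4 GHz target, the capacitance comes out slightly below 1.5 pF, in agreement with the value given in the text.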


Fig. 5.17 The basic diagram of a power amplifier (PowerAmp.SCH)

5.3.3 Power Amplifier MOS

The MOS devices used in power amplifier designs must have very large current capabilities to be able to deliver strong power to the load. This leads to very unusual constraints on the width of the transistor, so that devices with a width larger than 1000 µm are commonly implemented [7]. The radio-frequency choke inductor has a resonant effect which induces an important voltage swing of node Vout. Consequently, high voltage MOS devices are used to handle large overvoltages. A MOS device with a very large width is not drawn directly, but is obtained by connecting medium-sized MOS devices in parallel. In MICROWIND, we generate multiple-finger MOS devices easily, thanks to the MOS generator command (Fig. 5.18). The high voltage option is selected, and the number of fingers is fixed to 10.

Fig. 5.18 Generating a transistor with large current capabilities (PowerAmplifier.MSK)


The layout generated by MICROWIND is completed by adding a polarization ring to VSS, and metal2 contacts to the gate (Signal VRF_In) and the drain (Signal Vout). The result is shown in Fig. 5.19. The maximum current is close to 40 mA. A convenient way of generating the polarization ring consists in using the Path generator command, and in selecting the option Metal and p-diffusion. Then the location for the polarization contacts must be drawn in order to complete the ring.

Fig. 5.19 The layout of power MOS also includes a polarization ring, and the contacts to metal2 connections to VRF_in and Vout (PowerAmplifier.MSK)

An example of 160 mA power device is shown in Fig. 5.20. Four devices are connected in parallel. The output node drives a large current and must be designed as wide as possible, with a short connection to the output pad to limit the serial resistance and parasitic capacitor to ground. The ESD protection is removed in some cases to enhance the power amplifier performances [7]. In the characteristics Id/Vd, the maximum Ion current is close to 170 mA (Fig. 5.21). The ground connection also drives a strong current and must be carefully connected to the ground supply.

Fig. 5.20 The layout of a 160 mA power MOS using four large MOS in parallel (PowerAmplifierBig.MSK)

Fig. 5.21 Static characteristics of the 160 mA power MOS (PowerAmplifierBig.MSK)

5.3.4 Power Amplifier Efficiency

One of the most important characteristics of the power amplifier is the power efficiency, also called ‘drain efficiency’ [4]. The definition of drain efficiency is given by Equation 5.6. The power efficiency is the ratio between the power delivered to the load and the supply power. The power efficiency (PE) is usually given in %. Typically, PE ranges from 25 to 50%. The PE of CMOS power amplifiers is usually lower than 30%. Higher efficiency is obtained with bipolar or gallium arsenide (GaAs) semiconductors.

$$PE = \eta = \frac{P_{RF\_out}}{P_{DC}}$$ (Eq. 5.6)

where
PRF_out is the RF output power (in Watt)
PDC is the total power delivered from the supply (in Watt)

We may evaluate the power efficiency of the power amplifier with MICROWIND, using the following simulation procedure. The power amplifier is designed with a virtual load (RL = 50 Ω in the case of Fig. 5.22). Notice that the connection of the RL virtual load is unusual: one end of the resistor is connected to VDD rather than VSS. Connecting RL to ground would add a very important standby DC current, flowing through RL even without RF input. In reality, the RL resistor represents the antenna radiation resistance, which has no direct path to ground. To avoid the parasitic DC contribution, we connect one end of RL to VDD.
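The ratio of Equation 5.6 can be estimated from sampled waveforms. The sketch below only illustrates the principle and is not the MICROWIND algorithm: the function names, the 1.2 V supply, the 40 mA supply current and the 10 mA sinusoidal load current are all hypothetical values.

```python
import math


def mean(values):
    return sum(values) / len(values)


def power_efficiency(i_load, r_load, vdd, i_supply):
    """Equation 5.6 from sampled currents (A), load (ohm) and supply (V)."""
    p_rf_out = r_load * mean([i * i for i in i_load])   # mean of R.i^2
    p_dc = vdd * mean(i_supply)                         # mean supply power
    return p_rf_out / p_dc


n = 1000   # samples over one period of the RF signal
i_load = [0.010 * math.sin(2 * math.pi * k / n) for k in range(n)]  # 10 mA peak
i_supply = [0.040] * n                                              # 40 mA DC

pe = power_efficiency(i_load, r_load=50.0, vdd=1.2, i_supply=i_supply)
print(f"PE = {100 * pe:.1f} %")
```

A sinusoidal current of peak value I in a resistance R dissipates R·I²/2 on average, so with these assumed values the load receives 2.5 mW out of the 48 mW drawn from the supply.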

Fig. 5.22 Evaluation of the power amplifier efficiency (PowerAmp.SCH)

The layout corresponding to the power amplifier is shown in Fig. 5.23. The inductor is virtual, as well as the 50 Ω load. By default, the power PDC is computed at each simulation and appears at the right-lower corner of the simulation window of MICROWIND. In the simulation window corresponding to the mode Current And Voltage vs. Time, we select the current flowing in R (50 Ω). At the end of the simulation (Fig. 5.24), the evaluation of the power efficiency is also displayed.


Fig. 5.23 Evaluation of the power amplifier efficiency (PowerAmplifierEfficiency.MSK)

Fig. 5.24 Evaluation of the power amplifier efficiency is accessible in ‘Voltage and Current vs. Time’ mode, by selecting virtual load (PowerAmplifierEfficiency.MSK)

110 Advanced CMOS Cell Design

From the simulation of the simple power amplifier, we obtain a power efficiency of 2.5%, which is extremely low (Fig. 5.24). In other words, 97.5% of the supply energy is dissipated and lost in the circuit, with only 2.5% delivered to the load. There are several techniques to improve the power efficiency: increasing the MOS size, modifying the amplitude of the input sinusoidal wave, and modifying the DC offset of the input sinusoidal wave.

Another metric for the power amplifier efficiency is the power-added efficiency or PAE [7]. The PAE is very similar to Equation 5.6. It includes the input power PRF_in as given in Equation 5.7. MICROWIND does not evaluate this parameter directly.

$$\mathrm{PAE} = \frac{P_{RF\_out} - P_{RF\_in}}{P_{DC}} \qquad \text{(Eq. 5.7)}$$

where
PRF_out is the RF output power (in watts)
PRF_in is the RF input power (in watts)
PDC is the total power delivered from the supply (in watts)

5.3.5 Class A Power Amplifier

The distinction between Class A, B, AB and other amplifier classes is mainly determined by the biasing (DC offset) of the input signal. A Class A amplifier is biased in such a way that the transistor is always conducting. The MOS device operates almost linearly. An example of a power amplifier biased in Class A is shown in Fig. 5.25. The power MOS is designed to be very large in order to improve the power efficiency.

Fig. 5.25 The Class A amplifier design with a very large MOS device (PowerAmplifierClassA.MSK)
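Since power efficiency is the figure of merit used to compare the amplifier classes in this section, the two metrics can be illustrated with a short numerical sketch. It assumes that Equation 5.6 is the plain ratio PRF_out / PDC, as suggested by its list of symbols, and it uses a hypothetical operating point that is not taken from any MICROWIND simulation.

```python
def power_efficiency(p_rf_out, p_dc):
    """Ratio of RF output power to DC supply power (both in watts).

    Assumed form of Eq. 5.6: P_RF_out / P_DC.
    """
    if p_dc <= 0:
        raise ValueError("p_dc must be positive")
    return p_rf_out / p_dc


def power_added_efficiency(p_rf_out, p_rf_in, p_dc):
    """Power-added efficiency as in Eq. 5.7: (P_RF_out - P_RF_in) / P_DC."""
    if p_dc <= 0:
        raise ValueError("p_dc must be positive")
    return (p_rf_out - p_rf_in) / p_dc


# Hypothetical operating point, chosen only for illustration.
P_DC = 100e-3     # 100 mW drawn from the supply
P_RF_OUT = 11e-3  # 11 mW delivered to the 50-ohm load
P_RF_IN = 1e-3    # 1 mW of RF drive power at the input

eta = power_efficiency(P_RF_OUT, P_DC)
pae = power_added_efficiency(P_RF_OUT, P_RF_IN, P_DC)
print(f"efficiency = {eta:.1%}, PAE = {pae:.1%}")
```

The PAE is always lower than the plain efficiency, because the RF drive power is subtracted from the output power before dividing by the supply power.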

The sinusoidal input offset is 1.3 V and the amplitude is 0.4 V. The power MOS functional-point trajectory is plotted in Fig. 5.26 and is obtained using the command Simulate on Layout. We see the evolution of the functional point with the voltage parameters: as Vgs varies from 0.9 V to 1.7 V, Ids fluctuates between 20 mA and 70 mA. The MOS device is always conducting, which corresponds to Class A operation.

Fig. 5.26 The Class A amplifier has a sinusoidal input (PowerAmplifierClassA.MSK)

The main drawback of Class A amplifiers is the high bias current, leading to poor efficiency. In other words, most of the power delivered by the supply is dissipated in the circuit rather than in the load. The power efficiency is around 11% in this layout. The main advantage is the amplifier linearity, which is illustrated by a quasi-sinusoidal output Vout, as seen in Fig. 5.27.

5.3.6 Class B Amplifier

In Class B, the MOS device only conducts for half a cycle. Monitoring the current flowing in the power MOS shows a peak of current over the first half of the input period. Over the other half, the power MOS is off, and the LC resonator transmits the power to the 50 Ω load. The power efficiency rises to 20%. The main drawback is the severe distortion of the output voltage (Fig. 5.28), which was much less visible with the Class A biasing. The intermediate class, called AB, corresponds to a conduction between half the cycle and the full cycle.

An evaluation of the spectral content of the output node may be performed by the Fourier transform, with a plot in logarithmic scale. The Fourier transform is accessible in the simulation menu, through the FFT button. The fast Fourier transform translates the voltage waveform of the selected node into an evaluation of the energy of the signal versus its frequency. The energy plot shown in Fig. 5.29 reveals a peak near 2.5 GHz. Noticeable energy is found at the second harmonic (2f0 = 4900 MHz) and third harmonic (3f0). This is the consequence of the non-linear behavior of the amplifier.

Fig. 5.27 The Class A amplifier simulation (PowerAmplifierClassA.MSK)

Fig. 5.28 The class distinction for power amplifier is linked to DC value of the input signal
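The link between the DC value of the input and the amplifier class can be sketched with a simplified threshold model: the MOS is assumed to conduct whenever Vgs = offset + amplitude·sin(ωt) exceeds the threshold voltage Vt. The function names and the value Vt = 0.4 V are assumptions made for this illustration; a real device turns on gradually, so the class boundaries are less sharp than in this sketch.

```python
import math


def conduction_fraction(offset, amplitude, vt):
    """Fraction of the period during which offset + amplitude*sin(wt) > vt."""
    if amplitude <= 0:
        raise ValueError("amplitude must be positive")
    x = (vt - offset) / amplitude  # normalized threshold level
    if x <= -1.0:
        return 1.0  # Vgs is always above the threshold
    if x >= 1.0:
        return 0.0  # Vgs never reaches the threshold
    # sin(theta) > x holds over a fraction 0.5 - asin(x)/pi of the cycle
    return 0.5 - math.asin(x) / math.pi


def amplifier_class(offset, amplitude, vt, tol=1e-9):
    """Classify the bias point from the conduction fraction (simplified model)."""
    frac = conduction_fraction(offset, amplitude, vt)
    if frac >= 1.0 - tol:
        return "A"   # conducts over the full cycle
    if frac > 0.5 + tol:
        return "AB"  # between half and full cycle
    if frac >= 0.5 - tol:
        return "B"   # exactly half a cycle
    return "C"       # less than half a cycle


VT = 0.4  # assumed threshold voltage (V)
for offset in (1.3, 0.6, 0.4, 0.2):
    frac = conduction_fraction(offset, 0.4, VT)
    print(f"offset={offset:.1f} V -> {frac:.0%} conduction, "
          f"Class {amplifier_class(offset, 0.4, VT)}")
```

With the 1.3 V offset and 0.4 V amplitude used for the Class A layout, Vgs never falls below 0.9 V, so the model reports conduction over the full cycle.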

Fig. 5.29 The Class B amplifier is less linear than the Class A amplifier (PowerAmplifierClassB.MSK)

5.3.7 Other Classes

In Class C, the conduction occurs for less than half the cycle. The increase in efficiency obtained by reducing the conduction period is achieved at the expense of a reduced output power delivered to the load.

The Class E amplifier schematic diagram is shown in Fig. 5.30. A band-pass filter (LHF, CHF) is added to the output stage and tuned to the VRFin input frequency. The effect of this passive circuit is to decrease the amplitude of the parasitic harmonics due to the non-linear nature of the amplifier, and to pass the desired frequency contribution. In some particular cases, the 3rd or even 5th harmonic is the desired one (such as in 77 GHz automotive radars for example, where the amplifier also serves as a frequency shifter).

Fig. 5.30 Class E amplifier (PowerAmpl.SCH)

The power stage is coupled to the resonator through a coupling capacitor Cc. The role of Cc is to transfer the energy to the load, without any DC path between the supply and the load. The MOS drain voltage can reach very high values when the switch is OFF. Consequently, a transistor with a high breakdown voltage is required. The theoretical efficiency of the Class E amplifier is higher than 50%.

5.3.8 Self Heating

Self heating refers to the temperature rise that can occur in power devices, due to excessive heat energy accumulated before being dissipated through the substrate, the package and ultimately through the air. The thermal time constant is in the order of one microsecond. Simulations usually consider a typical temperature of 25°C. This is realistic in the case of low power dissipation (a few milliwatts). In the case of hundreds of milliwatts, the simulation should take into account a significant temperature rise near the device. For example, a temperature of 80°C is commonly considered in medium-power devices (below 1 W). In some cases, the IC may operate up to 250°C. In MICROWIND, the operating temperature may be changed in the menu Simulator Parameters of the simulation menu. In the window shown in Fig. 5.31, the temperature is fixed to 85°C.

Fig. 5.31 Setting up a high temperature for analog simulation

5.4 Oscillators

The role of oscillators is to create a periodic logic or analog signal with a stable and predictable frequency. Oscillators are required to generate the carrier signals for radio-frequency transmission, and also the main clocks of processors.

5.4.1 Ring Oscillator

The ring oscillator is a very simple oscillator circuit, based on the switching delay existing between the input and output of an inverter. If we connect an odd number of inverters in a loop, we obtain a natural oscillation, with a period which corresponds roughly to twice the number of stages multiplied by the elementary gate delay. The fastest oscillation is obtained with three inverters (one single inverter connected to itself does not oscillate). The usual implementation consists of a series of five to one hundred chained inverters. Usually, one inverter in the chain is replaced by a NAND gate to enable the oscillation (Fig. 5.32).

Fig. 5.32 A ring oscillator is based on an odd number of inverters (Inv3.SCH)
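The dependence of the oscillation period on the number of stages can be sketched with the usual first-order approximation T ≈ 2·N·td, where N is the (odd) number of inverters and td the delay of one stage. The 20 ps gate delay below is a hypothetical value chosen for illustration, not a figure extracted from the Inv3 layout.

```python
def ring_oscillator_frequency(n_stages, gate_delay_s):
    """First-order oscillation frequency of an N-stage ring oscillator.

    A transition must travel twice around the loop to complete one period,
    hence T ~ 2 * N * td.
    """
    if n_stages < 3 or n_stages % 2 == 0:
        raise ValueError("n_stages must be an odd number >= 3")
    if gate_delay_s <= 0:
        raise ValueError("gate_delay_s must be positive")
    return 1.0 / (2 * n_stages * gate_delay_s)


GATE_DELAY = 20e-12  # hypothetical stage delay: 20 ps

for n in (3, 5, 11, 101):
    f = ring_oscillator_frequency(n, GATE_DELAY)
    print(f"{n:3d} stages -> {f / 1e9:.2f} GHz")
```

Because td itself depends on the supply voltage, the temperature and the process, any variation of these conditions translates directly into a frequency shift in this model.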

Fig. 5.33 The implementation of a three-inverter oscillator (Inv3.MSK)

The three-inverter ring oscillator layout is shown in Fig. 5.33. The right-most inverter output is connected to the left-most inverter input by a metal bridge to create the desired feedback. Notice that no clock is assigned in this layout, as the oscillation appears naturally because of an intrinsic instability. The simulation of Fig. 5.34 shows the “warm-up” of the inverter circuit followed by a stable frequency oscillation.

Fig. 5.34 Simulation of the three-inverter ring oscillator (Inv3.MSK)

The main problem of this type of oscillator is the very strong dependence of the output frequency on virtually all process parameters and operating conditions. As an example, the power supply voltage VDD has a very significant influence on the oscillating frequency (Fig. 5.35). This dependency can be analyzed using the Parametric Analysis in the Analysis menu. Several simulations are performed with VDD varying from 0.8 V to 1.4 V with a 50 mV step. We clearly observe a very significant increase in the output frequency with VDD (almost a factor of two between the lower and upper bounds). This means that any supply fluctuation has a significant impact on the oscillator frequency.

Fig. 5.35 Oscillator frequency variation with the power supply (Inv3.MSK)

The oscillation frequency of the ring oscillator is neither stable, nor controllable, and not even precisely predictable, as it is based on the switching characteristics of logic gates, which may fluctuate by ±20%. A Monte-Carlo analysis is performed in Fig. 5.36 to observe the influence of technology variations on the oscillator frequency. The basic principle of this analysis is to draw a set of technological parameters at random, and to conduct the complete analog simulation for each random set. Each point on the X-axis corresponds to one simulation, with a specific set of parameters.

Fig. 5.36 The process variations also have a direct impact on the switching frequency (Inv3.MSK)

There is no correlation between adjacent points, because of the random nature of the different conditions. We again observe the significant fluctuation of the oscillator frequency. In conclusion, ring oscillators have poor performance, and may only be used in low-performance clocking systems, or for a dynamic characterization of the technology. Several ring oscillators were also designed on CMOS test chips to tune MICROWIND simulations against real-case ring oscillator measurements, and a good correlation between measured and simulated oscillator frequency was obtained.

5.4.2 Random Simulation

In MICROWIND, the threshold and mobility parameters vary with a Gaussian distribution, with a typical variation of 10%. The normal distribution of the threshold voltage Vt corresponds to a probability density following Equation 5.8. The shape of f versus Vt is given in Fig. 5.37.

$$f(V_t) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(V_t - V_{t0})^2}{2\sigma^2}} \qquad \text{(Eq. 5.8)}$$

where
f is the density of probability for a given value of the threshold voltage Vt (zero to one)
σ = 0.1 (equivalent to 10% typical fluctuation of the parameter)
Vt0 = typical threshold voltage (0.4 V)
Vt = variable threshold value (V)

Fig. 5.37 The normal distribution of Vt, with a typical variation of 10%
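The Gaussian draw behind such a random simulation can be sketched in a few lines. The sketch assumes the standard normal density and treats the spread 0.1 as an absolute standard deviation in volts; the text does not state whether the value is absolute or relative to Vt0, so this is only one possible reading. The function names are illustrative and are not part of MICROWIND.

```python
import math
import random


def normal_density(vt, vt0=0.4, sigma=0.1):
    """Standard normal probability density centered on vt0."""
    norm = sigma * math.sqrt(2.0 * math.pi)
    return math.exp(-((vt - vt0) ** 2) / (2.0 * sigma ** 2)) / norm


def draw_thresholds(n, vt0=0.4, sigma=0.1, seed=1):
    """Draw n random threshold voltages, one per Monte-Carlo run."""
    rng = random.Random(seed)  # fixed seed keeps the run reproducible
    return [rng.gauss(vt0, sigma) for _ in range(n)]


samples = draw_thresholds(10000)
mean_vt = sum(samples) / len(samples)
print(f"density at Vt0      : {normal_density(0.4):.3f}")
print(f"density at Vt0 + 0.1: {normal_density(0.5):.3f}")
print(f"mean of the samples : {mean_vt:.3f} V")
```

Each drawn value would then replace the nominal threshold in one complete analog simulation, which is why adjacent points of a Monte-Carlo plot are uncorrelated.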

5.4.3 LC Oscillator

The LC oscillator proposed in this paragraph is not based on logic delay, as with the ring oscillator, but on the resonant effect of a passive inductor and capacitor circuit. In the schematic diagram of Fig. 5.38, the inductor L1 resonates with the capacitor C1 connected to S1, combined with C2 connected to S2.

Fig. 5.38 A differential oscillator using an inductor and companion capacitor (OscillatorDiff.SCH)

The layout implementation uses a 3 nH virtual inductor and two 1 pF capacitors (Fig. 5.39). Notice the large width of the active devices, which ensures a sufficient current to charge and discharge the large capacitance of the output node at the desired frequency. Using virtual elements instead of on-chip physical coils and capacitors is recommended during the development phase. It allows an easy tuning of the inductor and capacitor values in order to achieve the correct behavior. Once the circuit has been validated, the L and C symbols can be replaced by physical components.

The time-domain simulation (Fig. 5.40) shows a warm-up period of around 1 ns during which the DC supply rises to its nominal value; the oscillation reaches a steady state after a few nanoseconds. The measured frequency approaches 3.75 GHz with a 3 nH inductor L1 and 1 pF capacitors C1 and C2. The Fourier transform of the output s1 reveals a main sinusoidal contribution at f0 = 3.725 GHz as expected, and some harmonics at 2×f0 and 3×f0 (Fig. 5.41). The remarkable property of this circuit is its ability to keep a stable frequency even if we change the supply voltage or the temperature, which is a significant improvement as compared to the ring oscillator. Furthermore, the variations of the MOS model parameters have almost no effect on the frequency.

Fig. 5.39 A differential oscillator using a 3 nH inductor (OscillatorDiff.MSK)

Fig. 5.40 Simulation of the differential oscillator (OscillatorDiff.MSK)
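The order of magnitude of the oscillation frequency can be checked against the ideal resonance formula f0 = 1 / (2π√(LC)). The sketch below assumes that, seen differentially across the inductor, C1 and C2 appear in series (0.5 pF in total); this reading of the schematic is an assumption. The last step estimates how much extra capacitance would be needed to bring the ideal value down to the simulated 3.75 GHz. This is only one plausible explanation of the difference, since MOS and routing parasitics are not quantified in the text.

```python
import math


def lc_resonance_hz(inductance_h, capacitance_f):
    """Ideal resonance frequency of a lossless LC tank."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))


def series_capacitance(c1, c2):
    """Equivalent capacitance of two capacitors in series."""
    return c1 * c2 / (c1 + c2)


L1 = 3e-9         # 3 nH virtual inductor
C1 = C2 = 1e-12   # two 1 pF virtual capacitors

# Assumption: C1 and C2 are in series across L1 (differential view).
c_eq = series_capacitance(C1, C2)
f_ideal = lc_resonance_hz(L1, c_eq)

# Capacitance that would give the simulated 3.75 GHz with the same inductor.
f_simulated = 3.75e9
c_total = 1.0 / ((2.0 * math.pi * f_simulated) ** 2 * L1)
c_extra = c_total - c_eq

print(f"ideal resonance   : {f_ideal / 1e9:.2f} GHz")
print(f"extra capacitance : {c_extra * 1e12:.2f} pF to reach 3.75 GHz")
```

Under these assumptions the ideal resonance is slightly above 4 GHz, and a parasitic contribution of roughly 0.1 pF across the tank would account for the lower simulated value.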

For example, we may investigate the effect of VDD on the resonating frequency by manually lowering VDD from 1.2 V down to 0.9 V in the menu Simulate → Simulation parameters. The result is a significant increase of the warm-up phase, while the final oscillation frequency remains unchanged. A parametric analysis on VDD, from 0.7 to 1.4 V, confirms that the LC oscillator performs much better than the ring oscillator, as it turns out to be almost immune to supply voltage fluctuations. Unfortunately, the inductance of an on-chip coil is not perfectly constant, as the material resistance, conductor width and oxide thickness may vary by several percent. The capacitance of a poly/poly2 structure, used for implementing the passive capacitor, may also vary because process fluctuations affect the inter-layer oxide. The temperature also has an influence on the capacitance value [9].

Fig. 5.41 Frequency spectrum of the oscillator (OscillatorDiff.MSK)

In MICROWIND, the Monte-Carlo simulation mode also affects the value of all virtual elements in a similar way as for the threshold voltage and the mobility: before the simulation starts, the L and C elements are assigned a value that fluctuates by ±10% with a normal distribution around the user-defined value. The result is a significant variation of the oscillator frequency with the process parameters (Fig. 5.42).

Fig. 5.42 Frequency of the LC oscillator varies with process parameters, mainly due to capacitor and inductor process dependence (OscillatorDiff.MSK)
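The effect of such a random draw on the resonance frequency can be reproduced outside the simulator with a minimal Monte-Carlo sketch. It assumes that the 10% figure is one standard deviation of a normal distribution, that L and C vary independently, and that the tank is an ideal 3 nH / 0.5 pF resonator; none of these assumptions is confirmed by the text, and the function names are illustrative only.

```python
import math
import random
import statistics


def lc_resonance_hz(inductance_h, capacitance_f):
    """Ideal resonance frequency of a lossless LC tank."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))


def draw_positive(rng, nominal, rel_sigma):
    """Gaussian draw around nominal, redrawn in the unlikely case it is <= 0."""
    while True:
        value = rng.gauss(nominal, rel_sigma * nominal)
        if value > 0:
            return value


def monte_carlo_frequencies(l_nom, c_nom, rel_sigma=0.10, runs=500, seed=0):
    """One resonance frequency per random (L, C) pair."""
    rng = random.Random(seed)
    return [
        lc_resonance_hz(draw_positive(rng, l_nom, rel_sigma),
                        draw_positive(rng, c_nom, rel_sigma))
        for _ in range(runs)
    ]


L_NOM, C_NOM = 3e-9, 0.5e-12  # assumed nominal tank values
F_NOMINAL = lc_resonance_hz(L_NOM, C_NOM)
freqs = monte_carlo_frequencies(L_NOM, C_NOM)
spread = statistics.stdev(freqs) / statistics.mean(freqs)

print(f"nominal: {F_NOMINAL / 1e9:.2f} GHz")
print(f"min/max: {min(freqs) / 1e9:.2f} / {max(freqs) / 1e9:.2f} GHz")
print(f"relative standard deviation: {spread:.1%}")
```

Since the frequency varies as the inverse square root of the L·C product, a 10% spread on each element gives a relative frequency spread of several percent, far more than the sensitivity to supply voltage discussed above.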

It can be concluded that a predictable and stable frequency oscillation is very hard to obtain on-chip without an external high-precision component. In radio-frequency applications, a base frequency is always delivered by a quartz crystal, which is the best discrete device to create an almost-perfect oscillation circuit.

5.4.4 Voltage-Controlled Oscillator

The voltage-controlled oscillator (VCO) generates a clock with a controllable frequency. The VCO is commonly used for clock generation in phase-lock loop circuits, as described later in this chapter. The clock may typically vary by ±50% of its central frequency. A current-starved VCO is shown in Fig. 5.43 [2]. The current-starved inverter chain uses a control voltage Vcontrol to modify the current that flows in the N1, P1 branch. The current through N1 is mirrored by N2, N3 and N4. The same current flows in P1. The current through P1 is mirrored by P2, P3, and P4. Consequently, a change in Vcontrol induces a global change in the inverter currents, and acts directly on the delay. Usually more than three inverters are in the loop. A higher odd number of stages is commonly implemented, depending on the target oscillating frequency and consumption constraints.

Fig. 5.43 Schematic diagram of a VCO (VCOMos.SCH)

The implementation of the current-starved VCO for a five-inverter chain is given in Fig. 5.44. The current mirror is situated on the left. Five inverters have been designed to create the basic ring oscillator. A buffer inverter is situated on the right side of the layout. The variation of the VCO frequency with Vcontrol is obtained by simulating the layout shown in Fig. 5.44. A convenient simulation mode is directly accessible (Fig. 5.45), which displays the frequency variations versus time together with the voltage variations. The frequency is evaluated on the selected node, which is the output node Vhigh in this case. No oscillation is observed for an input voltage Vcontrol lower than 0.5 V. Then the VCO starts to oscillate, but the frequency variation is clearly not linear. The maximum frequency is obtained for the highest value of Vcontrol, around 8.4 GHz (Fig. 5.46). By increasing the number of inverters and altering the size of the MOS current sources, we may modify the oscillating frequency easily.

Fig. 5.44 A VCO implementation using five chained inverters (VCO.MSK)

Fig. 5.45 The access to Frequency vs. Time simulation mode

Fig. 5.46 Frequency variations versus control voltage shows a non-linear dependence (VCO.MSK)

5.4.5 High-Performance VCO

A VCO with good linearity is shown in Fig. 5.47. This circuit has been implemented in several test chips with successful results in 0.8, 0.35, and down to 0.18 µm technologies. The principle of this VCO is a delay-cell with linear delay dependence on the control voltage [8]. The delay-cell consists of a p-channel MOS in series, controlled by Vcontrol, and a pull-down n-channel MOS, controlled by Vplage. The delay dependence on Vcontrol is almost linear for the falling edge. The key point is to design an inverter just after the delay-cell with a very low commutation point Vc. The rising edge is almost unchanged. To delay both the rising and falling edges of the oscillator, two delay-cells are connected, as shown in the schematic diagram.

Fig. 5.47 The layout implementation of a high-performance VCO circuit (VCOLinear.MSK)

The layout of the VCO is a little unusual due to the need for a very low commutation point for the inverter situated immediately after the delay-cells. This is done by implementing a large n-channel MOS (N3 in Fig. 5.48) with high drive capabilities and a tiny p-channel MOS with low drive capabilities (P3 in Fig. 5.48).

Fig. 5.48 Layout implementation of a high-performance VCO circuit (VCOLinear.MSK)

The simulation of a high-performance VCO circuit is given in Fig. 5.49. A quasi-linear dependence of the oscillating frequency on the input control voltage is observed within the range 0–0.6 V. Although not displayed in the simulation, the voltage of Vplage has a strong influence on the oscillating frequency range. A high value of Vplage (close to VDD) corresponds to a high-frequency oscillation, while a low value (close to the threshold voltage Vt) corresponds to a low-frequency oscillation.

Fig. 5.49 Simulation of a high-performance VCO circuit (VCOLinear.MSK)

The main drawback of this type of oscillator is the great influence of temperature and VDD supply on the stability of the oscillation. If we change the temperature, the device current changes, and consequently the oscillation frequency is modified. Such oscillators are rarely used for high-stability frequency generators.

5.5 Phase-Lock Loop

The phase-lock loop (PLL) is commonly used in microprocessors to generate a clock at high frequency (Fout = 2 GHz for example) from an external clock at low frequency (Fref = 100 MHz for example). Clock signals in the range of 1 GHz are very difficult to import from outside the IC because of the low-pass effect of the printed circuit board tracks and package leads. The PLL is also used as a clock-recovery circuit to generate a clock signal from a series of bits transmitted serially without a synchronization clock (Fig. 5.50). The PLL may also be found in frequency-demodulation circuits, to transform a frequency-varying waveform into a voltage.

Fig. 5.50 Principles of PLL

The PLL uses a high-frequency oscillator with varying speed, a counter, a phase detector and a filter (Fig. 5.51). The PLL includes a feedback loop which lines up the output clock ClkOut with the input clock ClkIn through a phase-locking stabilization process. When locked, the high output frequency fout is exactly N × fin. A variation of the input frequency fin is transformed by the phase detector into a pulse signal which is converted into a variation of the analog signal Vc. This signal changes the VCO frequency, which is divided by the counter and changes clkDiv according to fin.

Fig. 5.51 Block diagram of a typical PLL

5.5.1 Phase Detector

The simplest phase detector is the XOR gate. The XOR gate output produces a regular square oscillation VPD when the input clkIn and the signal divIn have one quarter of period shift (90° or π/2). For other angles, the output is no longer regular. In Fig. 5.53, two clocks with slightly different periods are used in DSCH to illustrate the phase detection. At initialization (Fig. 5.52), the average value of the XOR output VPD is close to zero. When the phase between clkDiv and clkIn is around π/2, VPD is VDD/2. Then it increases up to VDD. Consequently, VPD and the phase difference are linked by Equation 5.10. For example, when Δφ = π/2, VPD is VDD/2.

$$V_{PD} = V_{DD}\,\frac{\Delta\phi}{\pi} \qquad \text{(Eq. 5.10)}$$

ClkIn   DivIn   Xor
0       0       0
0       1       1
1       0       1
1       1       0

Fig. 5.52 XOR symbol and truth-table

Fig. 5.53 XOR phase detector at work (PhaseDetectXor.SCH)

The gain of the phase detector is the ratio between VPD and Δφ. The gain is often written as KPD, with an expression derived from Equation 5.10, which is valid for Δφ between zero and π, as drawn in Fig. 5.54.

$$K_{PD} = \frac{V_{DD}}{\pi} \qquad \text{(Eq. 5.11)}$$
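The characteristic of the XOR phase detector can be verified numerically by averaging the XOR of two ideal 50% duty-cycle clocks over one period. The sketch below is a behavioural model only: the clocks are ideal, the supply value VDD = 1.2 V is an assumption, and the function names are illustrative.

```python
import math


def square(phase):
    """Ideal 50% duty-cycle clock: 1 during the first half period, else 0."""
    return 1 if (phase % (2.0 * math.pi)) < math.pi else 0


def xor_average(delta_phi, vdd=1.2, samples=3600):
    """Average XOR output over one period for a phase shift delta_phi (rad)."""
    total = 0
    for k in range(samples):
        theta = 2.0 * math.pi * (k + 0.5) / samples  # mid-point sampling
        total += square(theta) ^ square(theta - delta_phi)
    return vdd * total / samples


VDD = 1.2  # assumed supply voltage (V)
for fraction in (0.0, 0.25, 0.5, 0.75, 1.0, 1.5):
    phi = fraction * math.pi
    print(f"delta_phi = {fraction:4.2f} pi -> "
          f"VPD average = {xor_average(phi, VDD):.3f} V")
```

Between 0 and π the average follows VDD·Δφ/π, which gives the slope VDD/π of Equation 5.11; beyond π the curve folds back, so the characteristic is triangular.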

When the phase difference is larger than π, the slope sign is negative until 2π. When locked, the phase difference should be close to π/2.

Fig. 5.54 XOR phase detector at work (PhaseDetectXor.SCH)

5.5.2 Filter

The filter is used to transform the instantaneous phase detector output VPD into an analog voltage Vc which is equivalent to the average value of VPD. The rapid variations of the phase detector output are converted into a slowly varying signal Vc which will later control the VCO. Without filtering, the VCO control would have very rapid changes which would lead to instability. The filter may simply be a large capacitor C, charged and discharged through the Ron resistance of the switch. The Ron·C time constant creates a low-pass filter. Figure 5.55 shows an XOR gate with the output loaded by a large poly/poly2 capacitor and a series resistance to create the desired analog control voltage Vc.

Fig. 5.55 Large load capacitance and weak XOR output stage to act as a filter (phaseDetectAndFilter.SCH)

In Fig. 5.56, the filtered version of the XOR gate output VPD is shown. It can be seen that VPD is around VDD/2 when the phase difference is π/2 or –π/2. The duty cycle of the phase detector output should be as close as possible to 50%, so that Vc is very close to VDD/2 when the inputs are in phase. If this is not the case, the PLL would have problems locking or would not produce a stable output clock. The XOR gate layout has been modified so that the output voltage Vc is very close to VDD/2 when one input is fixed to ground and the other input is a regular clock.

Fig. 5.56 Response of the phase detector to slightly different input clocks (phaseDetect.MSK)

5.5.3 Voltage-Controlled Oscillator for PLL

Important characteristics of the VCO used in the PLL are:
• The oscillating frequency should be restricted to the required bandwidth. For example, in European mobile phone applications, the VCO frequency should vary between flow = 1700 MHz and fhigh = 1800 MHz (Fig. 5.57).
• Due to process variations, the VCO frequency range should be extended to fmin, fmax, typically 10% lower and higher than the requested range (Fig. 5.57).
• When the control voltage Vc is equal to VDD/2, the VCO clock should be centered in the middle of the desired frequency range.
• The duty cycle of the VCO clock output should be as close as possible to 50% [9]. If this is not the case, the PLL would have problems locking, or would not produce a stable output clock.

Fig. 5.57 Requirements for VCO used in PLL
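The frequency requirements listed above reduce to simple arithmetic, sketched below. The sketch assumes that the 10% margin is applied to each end of the requested band (fmin = 0.9·flow, fmax = 1.1·fhigh) and that the center frequency is the arithmetic middle of the band; the function name is illustrative.

```python
def vco_design_range(f_low_mhz, f_high_mhz, margin=0.10):
    """Return (f_min, f_center, f_max) in MHz for a requested band.

    Assumption: the process margin is applied to each band edge.
    """
    if f_low_mhz <= 0 or f_high_mhz <= f_low_mhz:
        raise ValueError("expected 0 < f_low_mhz < f_high_mhz")
    f_min = f_low_mhz * (1.0 - margin)
    f_max = f_high_mhz * (1.0 + margin)
    f_center = (f_low_mhz + f_high_mhz) / 2.0  # target frequency at Vc = VDD/2
    return f_min, f_center, f_max


f_min, f_center, f_max = vco_design_range(1700.0, 1800.0)
print(f"fmin = {f_min:.0f} MHz, center = {f_center:.0f} MHz, fmax = {f_max:.0f} MHz")
```

For the 1700 to 1800 MHz band of the example, the VCO would therefore have to cover roughly 1530 to 1980 MHz, with 1750 MHz produced at Vc = VDD/2.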

The current-starved oscillator can be used as a VCO for the PLL, with a modification of its voltage control circuit so that the center frequency is 2450 MHz at Vc = VDD/2, and the frequency range does not exceed 2800 MHz and does not drop lower than 1800 MHz. The modification consists in providing a permanent current path through Rvdd2 to VDD/2 (Fig. 5.58), which helps in keeping Vc around VDD/2. When VPD reaches VDD, Vc is increased and the VCO frequency is close to fmax. When VPD is 0, Vc is lowered and the VCO frequency is close to fmin.

Fig. 5.58 Connecting current-starved VCO to phase detector (Pll.SCH)

A second important sub-circuit added in the PLL is the precharge to VDD/2. The nMOS device controlled by Vc_Prech helps the large capacitor Cfilter to reach VDD/2 during the first nanoseconds. This precharge circuit speeds up the locking of the PLL.

5.5.4 Complete Phase Lock Loop

The implementation of the PLL shown in Fig. 5.59 is a direct copy of the schematic diagram of Fig. 5.51. Notice that the resistors Rfilter (1000 Ω) and Rvdd2 (5000 Ω) have been implemented using virtual elements and not physical resistors. The same can be said for the capacitor Cfilter (0.3 pF). However, these resistance and capacitance values are easy to integrate on-chip.

Fig. 5.59 Connecting current-starved VCO to phase detector (VCOPll.SCH)
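The order of magnitude of the loop-filter time constant can be estimated from the values quoted for Rfilter and Cfilter. The sketch treats the filter as an isolated first-order RC low-pass, which is a simplification: the output resistance of the XOR gate and the Rvdd2 path to VDD/2 are ignored. The 2.44 GHz evaluation frequency is the input clock frequency used in the PLL simulation.

```python
import math


def rc_time_constant(r_ohm, c_farad):
    """Time constant of a first-order RC filter (seconds)."""
    return r_ohm * c_farad


def rc_cutoff_hz(r_ohm, c_farad):
    """-3 dB cutoff frequency of a first-order RC low-pass."""
    return 1.0 / (2.0 * math.pi * r_ohm * c_farad)


def lowpass_gain(freq_hz, r_ohm, c_farad):
    """Magnitude of the first-order low-pass transfer function at freq_hz."""
    ratio = freq_hz / rc_cutoff_hz(r_ohm, c_farad)
    return 1.0 / math.sqrt(1.0 + ratio ** 2)


R_FILTER = 1000.0    # ohms
C_FILTER = 0.3e-12   # farads

tau = rc_time_constant(R_FILTER, C_FILTER)
fc = rc_cutoff_hz(R_FILTER, C_FILTER)
gain = lowpass_gain(2.44e9, R_FILTER, C_FILTER)

print(f"tau = {tau * 1e9:.2f} ns, cutoff = {fc / 1e6:.0f} MHz")
print(f"gain at 2.44 GHz = {gain:.2f}")
```

Under this simplification, the ripple at the clock frequency is attenuated to about one fifth of its amplitude, while slower variations of the average phase-detector output pass through to Vc.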

The input frequency is fixed to 2.44 GHz. During the initialization phase (simulation of Fig. 5.60), the precharge is active. This rapidly pushes the voltage of Vc to around VDD/2. The VCO oscillation starts and the phase detector begins to operate erratically. The output Xnor is an interesting indication of what happens inside the phase detector. We see that the phase difference is very large during the first 10 nanoseconds. Then, the VCO output starts to converge to the reference clock. In terms of control voltage, Vc tends to oscillate and then converges to a stable state where the PLL is locked and stable. The output frequency is equal to the input frequency, and the phase difference is equal to one fourth of the period (π/2), according to the phase detector principles.

5.5.5 Frequency Demodulation

The PLL may be used to transform a frequency into a voltage. When clocks with a small frequency difference are applied serially to the input of the PLL using a multiplexor, the Vc voltage changes accordingly. A fast clock leads to a high Vc voltage, and a slow clock to a low Vc voltage (Fig. 5.61).

Fig. 5.60 Simulation of PLL showing locking time (VCOPll.SCH)

Fig. 5.61 Using PLL for frequency demodulation (Pllm.SCH)

Multiplexing clocks is done using a CMOS multiplexor, switching from one clock to the next once every 25 ns. The simulation of Fig. 5.62 shows that Vc decreases when the input frequency decreases. Over a certain range of input frequencies, the circuit is able to convert a frequency into a voltage in a linear way, which is the basis of frequency demodulation.

Fig. 5.62 Frequency demodulation using PLL (PllFm.MSK)

5.5.6 Frequency Synthesis

One very important application of the PLL in microprocessors and controllers consists of generating a fast on-chip clock from a slow external clock, usually set by a quartz crystal. The fast clock signal is synthesized by the VCO and its stability is controlled by the PLL. The new feature is a clock divider circuit on the path of the feedback loop, as shown in Fig. 5.63. The fast clock is divided by N, which can be programmed by the user. For example, a 100 MHz external clock may be used to create a 500, 600, 700 or 800 MHz internal clock. In that case, N is 5, 6, 7 or 8. The VCO should cover the range 500–800 MHz with sufficient margin due to process variations.

Fig. 5.63 Frequency synthesis to generate a fast clock (PllDigital.SCH)
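The relation between the reference clock, the target clock and the programmed division ratio can be sketched as follows. The helper names are illustrative, and the 10% margin applied to the VCO range is an assumption carried over from the VCO requirements of Section 5.5.3.

```python
def divider_ratio(f_out_hz, f_ref_hz):
    """Integer N such that f_out = N * f_ref; raises if no such N exists."""
    if f_ref_hz <= 0 or f_out_hz <= 0:
        raise ValueError("frequencies must be positive")
    n, remainder = divmod(f_out_hz, f_ref_hz)
    if remainder != 0:
        raise ValueError("f_out is not an integer multiple of f_ref")
    return n


def vco_range_with_margin(f_targets_hz, margin=0.10):
    """VCO range covering all targets, widened by the process margin."""
    return min(f_targets_hz) * (1.0 - margin), max(f_targets_hz) * (1.0 + margin)


F_REF = 100_000_000  # 100 MHz external reference
targets = [500_000_000, 600_000_000, 700_000_000, 800_000_000]

ratios = [divider_ratio(f, F_REF) for f in targets]
f_lo, f_hi = vco_range_with_margin(targets)

print("N values:", ratios)
print(f"VCO range with margin: {f_lo / 1e6:.0f} to {f_hi / 1e6:.0f} MHz")
```

When the loop is locked, the divided VCO clock matches the reference, so the on-chip clock is exactly N times the external clock.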

Fig. 5.64 Programmable clock divider for Digital PLL (PllDivn.SCH)

A schematic diagram for the frequency division by N is given in Fig. 5.64. The circuit performs the division of the input clock Clock1 by N, with a result appearing on ClkOut. In the logic simulation, once the Reset is inactive, the number N is fixed on the keyboard. When the asynchronous counter attains the desired value N, an Equal pulse appears in the loop thanks to the XNOR comparators and the AND gate. This provokes an asynchronous Reset and restarts the counter (Fig. 5.65).

Fig. 5.65 Simulation of clock divider for various values of N (PllDivn.SCH)

An implementation of the frequency synthesizer into layout is proposed for N fixed to 8. The reference clock is 100 MHz, and the VCO target clock is 800 MHz. A three-stage clock divider is implemented on the layout (Fig. 5.66) to divide the VCO clock by eight before entering the phase comparator. The VCO transistor sizing has been rearranged to produce an 800 MHz oscillation around VDD/2, which eases the locking. Furthermore, the filter capacitance has been increased to 2 pF to avoid instability in the VCO.

Fig. 5.66 A 100 MHz external reference clock used to control an 800 MHz on-chip clock (PllDigital.MSK)
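The divide-by-N principle (count input clock edges, compare with N, then reset) can be mimicked with a short behavioural model. It is not a gate-level description of the divider: the asynchronous reset is treated as instantaneous, and the function name is illustrative.

```python
def divide_by_n(n, input_cycles):
    """Return the input-cycle indices at which the Equal pulse fires.

    The counter is incremented on every input clock cycle; when it reaches n,
    the comparator output resets it to zero.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    count = 0
    pulses = []
    for cycle in range(1, input_cycles + 1):
        count += 1
        if count == n:        # XNOR comparators and AND gate detect count == N
            pulses.append(cycle)
            count = 0         # reset restarts the counter
    return pulses


F_VCO_MHZ = 800.0
N = 8
pulses = divide_by_n(N, 32)

print("Equal pulses at input cycles:", pulses)
print(f"divided clock: {F_VCO_MHZ / N:.0f} MHz")
```

With N = 8, one pulse is produced every eight cycles of the 800 MHz VCO clock, which corresponds to the 100 MHz signal compared with the reference clock.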

In the simulation of Fig. 5.67, the first 20 ns correspond to the initialization phase. The VCO is warmed up and the clock divider starts to produce the signal ClockDivN, which is equal to Vhigh divided by eight. Around 80 ns are required to lock the VCO to the desired frequency (100 × 8 = 800 MHz). What we observe in the output signal is a phenomenon called jitter: the output frequency is not stable, as Vc fluctuates around 0.6 V. The VCO output exhibits a spread of frequency around 800 MHz. The requirements in terms of jitter are very severe in most PLL designs. For example, the PLL used in mobile phones should produce a very stable frequency around 800 MHz with less than 100 kHz of jitter. In some cases, however, an intentional jitter is added to the PLL, which creates a small fluctuation in the synthesized clock (Fig. 5.68). This technique is found in some clock synthesis circuits in micro-controllers, to transform the perfectly synchronous clocking into a slightly asynchronous clocking, which lowers the peaks of parasitic interference.

Fig. 5.67 Simulation of the 800 MHz PLL controlled by a 100 MHz external clock (PllDigital.MSK)

Fig. 5.68 The VCO frequency jitters around 800 MHz (PllDigital.MSK)

Several techniques exist to lower the jitter, as described in [9]:
• Lowering the gain of the VCO. The effect of Vc on the VCO frequency is less important. The drawback is the reduction of the frequency lock range.
• Increasing Rfilter and Cfilter. The effect of a VPD change on the VCO control is less important. The drawback is a longer locking time, and a larger silicon area required to implement the capacitance. A very high value for the on-chip resistance is not recommended, as it generates a parasitic noise proportional to the resistance.
• Reducing the gain of the phase detector. The sizing of the XOR gate may be changed to produce less current. However, the MOS parasitic noise is increased and the design is more sensitive to supply and substrate noise, which tends to cancel the benefits of a lower phase-detector gain.

5.6 Frequency Converter

5.6.1 Principles

In many situations, for radio-frequency emitters and receivers, there is a need for shifting an input waveform into a lower or higher frequency waveform. From an emission point of view, most of the signal processing is done within the range 10–100 MHz. However, the emission bandwidth may be significantly higher (900 MHz, 1.8 GHz for mobile, and 2.4 GHz, 5 GHz for wireless local area networks). A direct generation of the desired signal at such a high frequency would consume too much power. A low-power frequency translator circuit is preferred. In the case of Fig. 5.69, the frequency converter shifts the original signal (say fin = 100 MHz) to the desired emission frequency fhigh = 900 MHz.

The operation which translates a high-frequency signal into a low-frequency signal is called down-conversion. In the frequency domain, it consists of shifting high-frequency information contained in frequency fin to a lower frequency flow, as illustrated in Fig. 5.70. The information contained in the original signal fin (which may include an amplitude, frequency or phase variation) is preserved in the resulting signal fout.

Fig. 5.69 The principles of frequency conversion

5.6.2 Adding Sinusoidal Waves

Adding sinusoidal waves is very easy. A simple circuit containing three resistors produces the addition of two sinusoidal waves, as shown in Fig. 5.71. The formulation is easily demonstrated using the superposition theorem.

$$V_{out} = \frac{1}{3}\left[\cos \omega_1 t + \cos \omega_2 t\right] \qquad \text{(Eq. 5.12)}$$

The Fourier transform of the signal s1 + s2 reveals two harmonics (Fig. 5.73), one at the frequency of signal1, the other at the frequency of signal2, as suggested in Equation 5.12. Clearly, no frequency shift may be obtained using sinusoidal addition.


Fig. 5.70 Illustration of up and down-frequency conversion

Fig. 5.71 Adding sinusoidal waves is easy. A set of three resistors is sufficient to build the sum (AddSinus.MSK)
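The superposition argument behind Equation 5.12 can be sketched as follows. This derivation assumes, which the text does not state explicitly, that the three resistors of Fig. 5.71 are equal, with one resistor R from each source (s1, s2) to the output node and a third resistor R from the output node to ground. With s2 set to zero, s1 sees R in series with two resistors R in parallel:

$$V_{out}\big|_{s_2 = 0} = s_1\,\frac{R \parallel R}{R + R \parallel R} = s_1\,\frac{R/2}{3R/2} = \frac{s_1}{3}$$

By symmetry the contribution of s2 alone is s2/3, and adding the two contributions gives

$$V_{out} = \frac{s_1 + s_2}{3}$$

which matches the factor 1/3 of Equation 5.12 under this assumption.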

5.6.3 Multiplying Sinusoidal Waves At the core of up/down frequency-conversion is the multiplication of two sinusoidal waves in the time domain [4]. The result of that multiplication is the generation of two new frequencies: one at the sum of the frequencies and one at the difference (Equation 5.13).


Fig. 5.72 Simulation of the sinusoidal wave adder (AddSinus.MSK)

Fig. 5.73 The Fourier transform of s1 + s2 reveals two harmonics: one at 100 MHz and the other at 1 GHz (AddSinus.MSK)

$$\sin(\omega_1 t)\,\sin(\omega_2 t) = \frac{1}{2}\left[\cos(\omega_1 - \omega_2)\,t - \cos(\omega_1 + \omega_2)\,t\right] \qquad \text{(Eq. 5.13)}$$


where
ω1 = 2π·f1
ω2 = 2π·f2
f1 = frequency of signal 1 (Hz)
f2 = frequency of signal 2 (Hz)

If we consider a low frequency fin and a high frequency fosc, and only consider absolute values, the multiplication of these two sinusoidal signals creates two new sinusoidal contributions: one at fosc – fin, one at fosc + fin (Fig. 5.72). Using an LC resonant circuit, we keep only the desired frequency contribution. In the case of Fig. 5.74, the L and C values are tuned to select the fosc + fin contribution, which fits with the emission bandwidth. The LC resonator also serves as a filter of undesired contributions, such as fosc – fin and fosc.
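A minimal numerical check of the product-to-sum relation can make the sum and difference frequencies concrete. The sketch below compares the direct product of two sine waves with half the difference of cosines at f1 – f2 and f1 + f2 (the standard trigonometric identity). The frequencies (800 MHz and 100 MHz, giving a 900 MHz sum) and the time grid are assumptions chosen for illustration, not values taken from the simulation files.

```python
import math

def product_of_sines(f1, f2, t):
    """Direct multiplication of two sine waves at time t."""
    return math.sin(2 * math.pi * f1 * t) * math.sin(2 * math.pi * f2 * t)

def sum_and_difference(f1, f2, t):
    """Half the difference of cosines at (f1 - f2) and (f1 + f2)."""
    return 0.5 * (math.cos(2 * math.pi * (f1 - f2) * t)
                  - math.cos(2 * math.pi * (f1 + f2) * t))

# Assumed example values: oscillator at 800 MHz, input at 100 MHz.
F_OSC, F_IN = 800e6, 100e6
times = [n * 10e-12 for n in range(1000)]  # 10 ns window, 10 ps step

# Largest gap between the two expressions over the window.
max_error = max(abs(product_of_sines(F_OSC, F_IN, t)
                    - sum_and_difference(F_OSC, F_IN, t)) for t in times)
print(f"largest difference between both forms: {max_error:.2e}")
```

The two forms agree to within floating-point rounding, so the product only contains energy at 700 MHz and 900 MHz in this example.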

Fig. 5.74 The multiplication of two frequencies creates new frequency components

5.6.4 Using a MOS for Sinus Multiplication The process for multiplying signals with CMOS devices is far from being simple. The nMOS and pMOS are non-linear devices. The best example is the long-channel nMOS which gives approximately a square law dependence between Vgs – Vt and IDS, as illustrated in Fig. 5.75. A linear device would give a linear dependence between IDS and VGS, which is almost the case for short-channel devices. See Chapter three of [11] for more details about device modeling. The idea is as follows (Fig. 5.76): the two sinusoidal inputs fin and fosc are added on the gate VGS. The current IDS is a non linear function of VGS. The static characteristics of the device (W = 50 µm, L = 0.5 µm) show a ‘quadratic’ dependence: each VGS step induces a square increase of IDS. This can be written as:


Fig. 5.75 Long-channel MOS characteristics exhibit a square dependence of IDS vs. VGS (MixerMos.MSK)

$$I_{DS} \approx k\,(V_{GS} - V_t)^2 \qquad \text{(Eq. 5.14)}$$

where
k depends on the design and technology
Vt is the threshold voltage (around 0.35 V)

Fig. 5.76 The IDS current exhibits several harmonics, including the desired high frequency fosc + fin (MixerMos.MSK)

If VGS is a sum of sinusoidal waveforms, as we saw in the previous section, the current may be written as in Equation 5.15, which can be arranged as Equation 5.16.

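$$I_{DS} \approx k\left[V_{bias} + v_{in}\sin(\omega_{in} t) + v_{osc}\sin(\omega_{osc} t) - V_t\right]^2 \qquad \text{(Eq. 5.15)}$$

$$I_{DS} \approx I_{DS0} + k_1\left[v_{in}\,v_{osc}\,\sin(\omega_{osc} t)\,\sin(\omega_{in} t)\right] \qquad \text{(Eq. 5.16)}$$

$$I_{DS} \approx I_{DS0} + \frac{k_1}{2}\,v_{in}\,v_{osc}\left[\cos(\omega_{osc} - \omega_{in})\,t - \cos(\omega_{osc} + \omega_{in})\,t\right] \qquad \text{(Eq. 5.17)}$$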

The most important result of the approximation in Equation 5.17 is that the input signal and the oscillator signal are indeed multiplied and create the desired harmonics. In other words, passing a sum of sinusoidal waveforms through a non-linear device creates several harmonics, of which fosc + fin and fosc – fin are the most important. The desired harmonic is made explicit in Equation 5.17 by rearranging the product of sinusoids into a sum of sinusoids. The term IDS0 also contains the original input signal, the oscillator signal and all their respective harmonics, which leads to quite a complex output. A band-pass filter is mandatory to eliminate undesired harmonics and amplify the desired signal. The circuit is called a single-balanced mixer.

5.6.5 Layout Implementation

The n-channel MOS implemented in the mixer layout must have a large length to eliminate short-channel effects and exhibit a square-law dependence between VGS and IDS. This is the case for MOS devices with a length larger than 0.5 µm. A resistive load is mandatory to perform amplification. The resistor is matched to the Ron resistance of the nMOS device. The input is the sum of two sinusoidal components, obtained through a resistor bridge (Fig. 5.77).
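The square-law mixing of Equations 5.14 and 5.15 can be checked numerically with a short sketch. It applies an ideal square-law device to the sum of a 450 MHz and a 2 GHz sine wave (the frequencies used in the simulations of this section) and measures the amplitude of a few spectral components with a single-bin discrete Fourier transform. The values of k, Vbias and the signal amplitudes are assumptions chosen for illustration; they are not extracted from MixerMos.MSK.

```python
import math

def tone_amplitude(samples, freq, dt):
    """Amplitude of the component at `freq`, obtained by correlating the
    samples with a cosine and a sine (single-bin discrete Fourier transform)."""
    n = len(samples)
    re = sum(s * math.cos(2 * math.pi * freq * k * dt) for k, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * freq * k * dt) for k, s in enumerate(samples))
    return 2 * math.hypot(re, im) / n

K, VT, VBIAS = 1e-3, 0.35, 0.8   # k (A/V^2) and Vbias are assumed values
V_IN, V_OSC = 0.1, 0.1           # assumed signal amplitudes (V)
F_IN, F_OSC = 450e6, 2e9         # frequencies used in the text
DT, N = 5e-12, 4000              # 20 ns window: whole number of periods of every tone

def ids(t):
    """Ideal square-law drain current for the summed gate voltage."""
    vgs = (VBIAS + V_IN * math.sin(2 * math.pi * F_IN * t)
           + V_OSC * math.sin(2 * math.pi * F_OSC * t))
    return K * (vgs - VT) ** 2

current = [ids(k * DT) for k in range(N)]
for f in (F_IN, F_OSC, F_OSC - F_IN, F_OSC + F_IN):
    print(f"{f / 1e9:.2f} GHz: {tone_amplitude(current, f, DT) * 1e6:.2f} uA")
```

Expanding the square predicts an amplitude of 2k(Vbias – Vt)vin at fin and of k·vin·vosc at fosc ± fin, so in this idealised model the mixing products are much weaker than the feed-through of the original tones, which is consistent with the need for a band-pass filter.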

Fig. 5.77 Building a single-balanced mixer with an n-channel MOS device (MixerMos.SCH)


Fig. 5.78 Design of a mixer using a large-width, large-length nMOS device, with a sum of sinusoidal waves at the input (MixerMos.MSK)

Notice the unusual aspect of the MOS device in the layout shown in Fig. 5.78. Five gates are connected in parallel, which is equivalent to one single MOS with the sum of channel widths, but at the same time, the length is enlarged to obtain a sufficient quadratic dependence between voltage and current, which is the main origin of harmonics. According to the theory, the time-domain simulation of the mixer reveals that the signal Vout has a very complex aspect (Fig. 5.79). The Fourier transform is obtained by a click on ‘FFT’ in the simulation window (Fig. 5.80). The 450 MHz input signal, the 2 GHz oscillator signal, as well as the harmonics and products are present in the spectrum. The only desired harmonic is the 2.45 GHz contribution, corresponding to fin + fosc. 5.6.6 Mixer with LC Resonator The mixer shown in Fig. 5.81 has two important features: the serial resistor is replaced by an inductor LHF of 3 nH, and a capacitor CHF = 1.2 pF is added to the output. The LC resonator formed by the


Fig. 5.79 Simulation of the mixer with 450 MHz and 2 GHz added inputs (MixerMos.MSK)

Fig. 5.80 The output voltage includes fin, fosc and their corresponding harmonics. The desired signal is at fosc + fin (MixerMos.MSK)


inductor LHF and the capacitor CHF matches the target frequency of 2.45 GHz (use the resonant frequency evaluator in the Analysis menu to confirm). The serial resistor RL accounts for the finite quality of the inductor, and corresponds to the long metal-wire resistance of the physical inductor. Removing this serial resistor would create overestimated oscillations, possibly numerical instability, and the results could not be exploited.
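As a rough cross-check of the resonator tuning, the ideal LC resonance formula f0 = 1/(2π√(LC)) can be evaluated for the quoted component values. This is only a sketch: it ignores the series resistance and any parasitic capacitance at the output node, and the function names are hypothetical (they are not part of the MICROWIND tool).

```python
import math

def resonant_frequency(l_henry, c_farad):
    """Ideal LC resonance frequency in Hz (no losses, no parasitics)."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_henry * c_farad))

def capacitance_for(f_hz, l_henry):
    """Total capacitance (F) that resonates with l_henry at f_hz."""
    return 1.0 / ((2.0 * math.pi * f_hz) ** 2 * l_henry)

L_HF, C_HF = 3e-9, 1.2e-12      # values quoted in the text
F_TARGET = 2.45e9               # target frequency quoted in the text

f0 = resonant_frequency(L_HF, C_HF)
c_total = capacitance_for(F_TARGET, L_HF)
print(f"ideal resonance of 3 nH with 1.2 pF: {f0 / 1e9:.2f} GHz")
print(f"capacitance needed for 2.45 GHz:     {c_total * 1e12:.2f} pF")
```

With these ideal formulas, 3 nH and 1.2 pF alone resonate somewhat above the target, and a total of roughly 1.4 pF would be needed for 2.45 GHz. One plausible reading, offered here as an assumption, is that the parasitic capacitance of the output node supplies the difference.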

Fig. 5.81 The schematic diagram of a mixer with an LC resonator tuned to 2.45 GHz

The mixer implementation has not been completed in the layout of Fig. 5.82. For simplicity’s sake we used virtual L and C rather than a physical inductor. The 3 nH inductor is placed in series with a parasitic resistance which accounts for the physical serial-resistance of the on-chip inductor, and limits the LC resonance effect. The capacitor 1.2 pF is also virtual, and is placed near Vout.

Fig. 5.82 The mixer with a tuned LC resonator targeted to 2.45 GHz (MixerLC.MSK)


The Fourier transform of the time-domain simulation is proposed in Fig. 5.83, and corresponds to the output node Vout. The desired signal at 2.45 GHz appears much more clearly than in Fig. 5.80, because of the pass-band nature of the passive resonator centered around that frequency. Unfortunately, the selectivity of the LC circuit is not high enough to erase the oscillator frequency at 2 GHz. Residues of other harmonics also appear in the Fourier transform: 1.6 GHz and 4 GHz.

Fig. 5.83 The Fourier transform of Vout shows a main contribution at the desired frequency and residues of other harmonics (MixerLC.MSK)

An increase in the input frequency fin is translated into a corresponding increase in the output frequency. For example, a slow increase in fin shifts the main peak to the right by a corresponding amount. Also, an increase in the amplitude of fin induces a corresponding increase of the 2.45 GHz harmonic. This property is illustrated in Fig. 5.84 by applying a sinusoidal input whose frequency increases regularly (parameter Increase f). The evolution of the FFT of Vout shows a shift in the peak, due to the fact that the input sinusoidal wave has also shifted toward high frequencies (Fig. 5.85). This illustrates an important property of mixers: they preserve amplitude and frequency variations, except that the output frequency is situated at a fixed distance from the input frequency.

5.6.7 Double-balanced Mixer

The main drawback of the LC mixer is the significant amount of parasitic signals added to the desired signal at its output. The undesired signals at 1.55 GHz (fosc – fin), 2 GHz (fosc), 2.9 GHz (fosc + 2.fin) and 4 GHz (2.fosc) appear in the spectrum and should be eliminated. An elegant idea is to create two signals in which all harmonics are in opposite phase except the desired harmonics, which are in phase. Adding these two signals then leaves a signal that contains only fosc + fin and fosc – fin.


Fig. 5.84 Input SinusIn starts from 400 MHz and slowly rises to 500 MHz (MixerLC.MSK)

Fig. 5.85 A small increase in input frequency is translated into an increase in the main harmonic (MixerLC.MSK)


Fig. 5.86 Implementation of the double-balanced mixer (MixerDoubleLC.SCH)

A circuit that realizes this function is proposed in Fig. 5.86. The signals Vin and Vosc are combined as seen previously, in the left branch of the mixer, on the gate of the nMOS device. The current that flows through the nMOS device situated on the left is IDS1, which can be approximated by Equation 5.18. In the right branch of the mixer, the signals ~Vin and ~Vosc, representing the same signals as Vin and Vosc but with an opposite phase, are combined on the gate of the second nMOS device. The current that flows through the right nMOS device is IDS2, which can be approximated by Equation 5.19.

$$I_{DS1} \approx k\left[V_{bias} + v_{in}\sin(\omega_{in} t) + v_{osc}\sin(\omega_{osc} t) - V_t\right]^2 \qquad \text{(Eq. 5.18)}$$

$$I_{DS2} \approx k\left[V_{bias} - v_{in}\sin(\omega_{in} t) - v_{osc}\sin(\omega_{osc} t) - V_t\right]^2 \qquad \text{(Eq. 5.19)}$$

Developing Equations 5.18 and 5.19, the sum can be arranged as:

$$I_{DS1} + I_{DS2} = k\left[I_{DS0} + 2 v_{in}\sin(\omega_{osc} + \omega_{in})\,t + 2 v_{in}\sin(\omega_{osc} - \omega_{in})\,t\right] \qquad \text{(Eq. 5.20)}$$

The remarkable point in Equation 5.20 is that the sum of the currents IDS1 + IDS2 flowing in the 50 Ω load resistor RL mainly includes a constant value IDS0 and the mixer products at frequencies fosc + fin and fosc – fin, which was exactly the goal of the mixer. The layout implementation (Fig. 5.87) makes extensive use of virtual R, L, C elements. This technique is recommended for the tuning of the circuit, but one should remember that the final goal is a complete layout implementation. The simulation reported in Fig. 5.88 confirms the theoretical assumption: the Fourier transform clearly includes the two main contributions near 1500 MHz and 2500 MHz, without fosc in between. Filtering out the undesired 1500 MHz contribution, in order to keep the desired 2500 MHz contribution, is then quite easy.
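The cancellation obtained by summing the two branch currents can be illustrated with an idealised numerical sketch. Two square-law devices are driven with opposite-phase inputs, their currents are added, and the amplitudes at fin, fosc and fosc ± fin are measured with a single-bin discrete Fourier transform. The values of k, Vbias and the amplitudes are assumptions for illustration only, and the model ignores the load, the resonator and all device parasitics.

```python
import math

def tone_amplitude(samples, freq, dt):
    """Amplitude of the component at `freq` (single-bin discrete Fourier transform)."""
    n = len(samples)
    re = sum(s * math.cos(2 * math.pi * freq * k * dt) for k, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * freq * k * dt) for k, s in enumerate(samples))
    return 2 * math.hypot(re, im) / n

K, VT, VBIAS = 1e-3, 0.35, 0.8   # assumed device and bias values
V_IN, V_OSC = 0.1, 0.1           # assumed signal amplitudes (V)
F_IN, F_OSC = 450e6, 2e9         # frequencies used in the text
DT, N = 5e-12, 4000              # 20 ns window, whole number of periods

def branch_current(sign, t):
    """Square-law current of one branch; sign = +1 (left) or -1 (right)."""
    x = (V_IN * math.sin(2 * math.pi * F_IN * t)
         + V_OSC * math.sin(2 * math.pi * F_OSC * t))
    return K * (VBIAS + sign * x - VT) ** 2

total = [branch_current(+1, k * DT) + branch_current(-1, k * DT) for k in range(N)]
for f in (F_IN, F_OSC, F_OSC - F_IN, F_OSC + F_IN):
    print(f"{f / 1e9:.2f} GHz: {tone_amplitude(total, f, DT) * 1e6:.3f} uA")
```

In this model the components at fin and fosc vanish from the summed current while the products at fosc ± fin double. The second harmonics at 2·fin and 2·fosc, which also come from the squared term, are not cancelled by the summation and would still have to be filtered.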


Fig. 5.87 Layout of the double-balanced mixer (MixerDoubleLC.MSK)

Fig. 5.88 Fourier transform of the double-balanced mixer output (MixerDoubleLC.MSK)

Gilbert Mixer

In practice, the double-balanced mixer is not implemented using a resistor-based voltage adder, as suggested in the schematic diagram shown previously (Fig. 5.86). Most mixers use the Gilbert cell [10], which consists of only six transistors and performs a high-quality multiplication of the sinusoidal waves [4]. The schematic diagram shown in Fig. 5.89 uses the tuned inductor as load, so that Vout and ~Vout oscillate around the supply VDD. The implementation shown in Fig. 5.90 again makes extensive use of virtual R, L and C elements. The 3 nH inductor is in series with a parasitic 5 Ω resistance, on both branches (Fig. 5.89). The time-domain simulation reveals a transient period from 0.0 to 8 ns during which the inductor and capacitor


Fig. 5.89 The Gilbert mixer (MixerGilbert.SCH)

Fig. 5.90 The Gilbert mixer implementation with virtual R, L and C (MixerGilbert.MSK)


Fig. 5.91 Time-domain simulation of the Gilbert mixer (MixerGilbert.MSK)

warm up. This initialization period is not of key interest. The most interesting part starts from 8 ns, when the outputs Vout and Vout2 are stabilized and oscillate in opposite phase around 2.5 V. The Fourier transforms of nodes Vout and Vout2 are almost identical (Fig. 5.92). We present the Fourier transform in logarithmic scale to reveal the small harmonic contributions. As expected, the 2 GHz fosc signal and the 450 MHz fin signal have disappeared, thanks to the cancellation of contributions. The two major contributors are fosc + fin and fosc – fin. Notice that the simulation time has an influence on the Fourier transform result: a short simulation (5 ns) would lead to poor precision in our frequency range of interest, but good coverage of very high frequencies (above 10 GHz). In our case, it is preferable to perform the time-domain simulation over a long time (50 ns). This gives a high precision at low frequencies (from DC to 5 GHz), but limits the Fourier spectrum to around 10 GHz. As the target frequency is around 2.5 GHz, a 50 ns simulation gives the best results. In Fig. 5.93, a complete implementation of the Gilbert mixer has been realized, so that the virtual R, L and C components are replaced by physical elements. The coils have a target 3 nH inductance, and their associated parasitic resistance approaches 6 Ω when a combination of metal6, metal5 and metal4 is used. The tuning capacitor is added to the parasitic coil capacitance to obtain the best resonance at the desired 2.5 GHz frequency. The design relies on good models for the inductor and capacitor, which is


Fig. 5.92 Fourier transform of the Gilbert mixer output (MixerGilbert.MSK)

Fig. 5.93 Complete implementation of a Gilbert mixer circuit (MixerGilbert2.MSK)


not the case in the MICROWIND software which uses first order approximations of parasitic resistance, capacitance and coil inductance. In a real case implementation, we may expect significant differences between measurements and simulations. Having accurate predictions of such circuits is quite challenging.
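Returning to the earlier remark on simulation duration, the trade-off between frequency precision and spectrum extent follows from two standard relations for a uniformly sampled record: the spacing between spectral lines is 1/T, and the highest representable frequency is N/(2T). The sketch below evaluates both for the 5 ns and 50 ns cases. The number of stored points (1000) is an assumption chosen so that the 50 ns case reproduces the quoted limit of about 10 GHz; the text does not give the actual value used by the simulator.

```python
def fft_resolution(duration_s):
    """Spacing in Hz between adjacent spectral lines for a record of this length."""
    return 1.0 / duration_s

def fft_max_frequency(duration_s, n_points):
    """Nyquist limit in Hz for n_points uniform samples over the record."""
    return n_points / (2.0 * duration_s)

N_POINTS = 1000  # assumed number of stored simulation points (not from the text)

for duration in (5e-9, 50e-9):
    df = fft_resolution(duration)
    fmax = fft_max_frequency(duration, N_POINTS)
    print(f"T = {duration * 1e9:.0f} ns: line spacing {df / 1e6:.0f} MHz, "
          f"spectrum up to {fmax / 1e9:.0f} GHz")
```

Under this assumption a 5 ns record resolves the spectrum only in 200 MHz steps, which is too coarse to separate the components near 2 GHz and 2.45 GHz cleanly, whereas a 50 ns record gives 20 MHz steps.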

5.7 Sub-sampling Frequency Converter Let us recall that the frequency down-conversion consists of shifting the input signal with a frequency fin down to a lower frequency fout, without altering its amplitude or modulated information. One interesting solution consists in using a transmission gate with a very accurate tuning of the gate clock.

Fig. 5.94 Principles of down-conversion

As an illustration, we use a 1.9 GHz sinusoidal wave (DataIn) and a 1.818 GHz sampling signal (Enable). The expected output frequency is therefore 1.9 – 1.818 = 0.082 GHz, that is 82 MHz. The layout of the sample circuit is a simple transmission-gate followed by an RC filter (Fig. 5.95). The sampling signal Enable operates at a rate slightly lower than the input frequency. This leads to a low-frequency signal at the output of the transmission-gate (Fig. 5.96). With a simple RC filtering, the output signal becomes a sinusoidal wave with a frequency equal to the difference between fDataIn and the gate frequency fEnable. When simulating over a 20 ns time interval (Fig. 5.96), and viewing an evaluation of the frequency of the output subSample, we obtain 82 MHz, as expected.
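The sub-sampling principle can be sketched numerically with ideal samples, leaving aside the transmission-gate and the RC filter. Sampling a tone at a rate slightly below its own frequency produces exactly the same sample values as a slow tone at the difference frequency. The function name `alias_frequency` is a hypothetical helper introduced for this illustration.

```python
import math

def alias_frequency(f_in, f_sample):
    """Apparent frequency (Hz) of a tone at f_in when sampled at f_sample."""
    return abs(f_in - round(f_in / f_sample) * f_sample)

F_DATA_IN, F_ENABLE = 1.9e9, 1.818e9   # values quoted in the text
f_out = alias_frequency(F_DATA_IN, F_ENABLE)

# Samples of the fast input taken once per Enable period coincide with
# samples of a slow sine wave at f_out taken at the same instants.
max_diff = max(
    abs(math.sin(2 * math.pi * F_DATA_IN * n / F_ENABLE)
        - math.sin(2 * math.pi * f_out * n / F_ENABLE))
    for n in range(200)
)
print(f"output frequency: {f_out / 1e6:.0f} MHz, sample mismatch: {max_diff:.1e}")
```

The mismatch stays at the level of floating-point rounding, which is why a simple low-pass filter after the sampling gate is enough to recover a clean low-frequency sine wave.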

5.8 Conclusion In the first part of this chapter, we described the role of an on-chip inductor for resonant circuits. Then, we detailed the power amplifiers and the associated notions of power efficiency and amplifier class. Thirdly, we presented some circuits used to generate oscillations, based on ring oscillation and passive LC networks. Next, we described the main parts of the PLL and illustrated three applications in the GHz range. The frequency conversion was presented through the addition of sinusoidal waveforms in non-


Fig. 5.95 Layout of the transmission-gate and RC filter used for down-conversion (DownConverter.MSK)

Fig. 5.96 Down-conversion of the 1.9 GHz input sinusoidal wave to a low frequency near 82 MHz


linear devices. We also presented the Gilbert mixer and looked at the frequency conversion performances in frequency-domain thanks to the Fourier transform. Finally, the sub-sampling principles were applied to frequency down-conversion through a simple example.

References

[1] T. Macnamara, Handbook of Antennas for EMC, Artech House Publishers, ISBN 0-89006-549-7.
[2] N.H.E. Weste and K. Eshraghian, Principles of CMOS VLSI Design, Addison-Wesley, 1993, ISBN 0-201-53376-6.
[3] A.M. Niknejad and R.G. Meyer, Design, Simulation and Applications of Inductors and Transformers for Si RF ICs, Kluwer, 2000, ISBN 0-7923-7986-1.
[4] T.H. Lee, The Design of Radio-Frequency Integrated Circuits, Cambridge University Press, 1998, ISBN 0-521-63061-4.
[5] H.A. Wheeler, “Simple Inductance Formulas for Radio Coils”, Proceedings of the IRE, Oct. 1928, pp. 1398–1400.
[6] M. Thompson, “Inductance Calculation Techniques – Part II: Approximations and Handbook Methods”, Power Control & Intelligent Motion, Dec. 1999.
[7] M.M. Hella and M. Ismail, RF CMOS Power Amplifiers: Theory, Design and Implementation, Kluwer Academic Publishers, 2002, ISBN 0-7923-7628-5.
[8] B. Vrignon, S. Bendhia, E. Lamoureux and E. Sicard, “Characterization and modeling of parasitic emission in deep submicron CMOS”, IEEE Transactions on EMC, Vol. 47, No. 2, pp. 382–387.
[9] R.J. Baker, H.W. Li and D.E. Boyce, CMOS Circuit Design, Layout and Simulation, New York: Wiley-IEEE Press, 2004, ISBN 047170055X.
[10] B. Gilbert, “A high-performance monolithic multiplier using active feedback”, IEEE J. Solid-State Circuits, Vol. SC-9, No. 6, pp. 364–373, Dec. 1974.
[11] E. Sicard and S. Bendhia, Basic CMOS Cell Design, Tata McGraw-Hill, 2005, ISBN 0-07-059933-5.

EXERCISES

5.1 Design a 10 mW power amplifier operating near 1.9 GHz (UMTS frequency range). Add a second power MOS device to be able to tune the output power from 10 mW to 30 mW, using logic controls.
5.2 Optimize the power amplifier for a maximum power efficiency delivered to a 30 Ω load, as the radiating resistance is often closer to that value than to the standard 50 Ω.
5.3 Design an LC oscillator targeted to 5.4 GHz, corresponding to the frequency used in wireless local area network protocols such as IEEE 802.11a.


5.4 Redesign the high-performance VCO to oscillate around 5.4 GHz, with a span of 0.5 GHz (corresponds to IEEE 802.11a).
5.5 The main problem with the XOR phase detector is that the ‘ideal’ position corresponds to a phase difference of π/2. In high-performance PLL applications, another type of phase detector is used, as shown in Fig. 5.97. Implement the phase detector and extract the effect of the phase difference between clkDiv and clkIn on the voltage Vc.

Fig. 5.97 The D-Latch phase detector at work (PhaseDetectD.SCH)


6 Converters and Sensors In this chapter, we shall discuss the basic principles of data converters and give an overview of their architecture. The design and implementation of data converters will also be discussed, with emphasis on basic building-blocks, the use of comparators, voltage reference and sampling structures. We shall also discuss the implementation of temperature and light sensors, compatible with the CMOS standard process.

6.1 Introduction

Our environment is full of analog signals that we need to monitor, capture, process, store, modify and transmit. The signal origin may be sound, temperature, humidity, light, radio-frequency waves or acceleration. A modern way of treating analog signals is to convert them into digital signals. The advantages of digital techniques, known as digital signal processing, are programmability, stability, repeatability, accuracy and noise immunity, as well as the ability to implement special functions such as linear-phase filters or error-correcting codes. A digital signal is a variable whose possible values are zero or one, which correspond to a low or a high voltage. In contrast, an analog signal is a continuous-time signal whose variation with respect to time is uninterrupted. The analog-to-digital converters (ADC) and digital-to-analog converters (DAC) are the main links between the analog signals and the digital world of signal processing. The ADC and DAC, viewed as black boxes, are shown in Fig. 6.1. On the right side, the ADC takes an analog input signal Vin and converts it into a digital output signal A. The digital signal A is a binary-coded representation of the analog signal using N bits: AN-1 … A0. The maximum number of codes for N bits is 2^N. The digital signal is usually processed by a microprocessor unit (MPU) or by a specific digital signal processor (DSP) before being delivered as an output B. In this example, B has the same number of bits as the signal A. Then the DAC, which performs the opposite function to the ADC, converts the digital signal to the final analog output signal Vout.

Copyright © 2007 by The McGraw-Hill Companies, Inc.


Fig. 6.1 Basic principle of N-bit analog-to-digital and digital-to-analog converters

A typical function of this signal-processing circuit is to filter out the high-frequency components of the input Vin. Consequently, Vout only contains the slowly varying portion of Vin. Figure 6.2 below shows some target applications of ADCs and DACs, with the frequency range on the X-axis and the converter resolution on the Y-axis [1][3]. Low-frequency, low-resolution data conversion mainly concerns low-quality voice, as found in phones. Mobile phones typically operate at 8000 Hz and 12 bits. Digital audio in CD players works with 20-bit converters at a frequency of 44 kHz (Fig. 6.2). Digital video and high-speed Internet have created a specific demand for high-speed data converters. Finally, sampling rates around 1 GHz are still specific to very-high-speed instrumentation such as GHz-bandwidth oscilloscopes.

Fig. 6.2 Speed and resolution requirements of ADCs

6.2 Digital-Analog Converter Architectures

Digital-to-analog converters (DAC) convert a digital number B into an analog signal Vout*. The output of the DAC is not as smooth as we would wish, due to the finite number of available analog levels. A low-pass filter eliminates the higher-order harmonics caused by the conversion on the signal Vout*, and returns an analog signal Vout (Fig. 6.3).

Fig. 6.3 A four-bit digital conversion including a filter module

A wide variety of DAC architectures exists, from very simple to complex ones. Each of them has its own merits and limits. The digital signal can be provided in many different codes, depending on the final application: binary, thermometer, Gray, two’s complement, offset binary, and so on. These concepts are presented in the following paragraphs.

6.2.1 Resistor-String Converter

The most basic DAC is based on a resistance ladder. This type of DAC consists of a simple resistor-string of 2^N identical resistors, and a binary switch-array whose input is a binary word. The analog output is the voltage obtained by division along the resistor string, selected through pass-switches. In the example shown in Fig. 6.4, the resistance ladder includes eight identical resistors which generate eight reference voltages equally distributed between zero and Vdac.

Fig. 6.4 Schematic diagram of the digital-analog converter (DAC3bit.SCH)


The digital-analog converter uses the three-bit input (B2, B1, B0) to control the transmission-gate network, which selects one of the voltage references (a fraction of Vdac). This voltage reference is transferred to the output Vout. Let us consider Vdac = 1.2 V, which corresponds to the core voltage of the CMOS 0.12 µm process. The voltage-step can be expressed as in Equation 6.1.

$$\Delta V = \frac{V_{dac}}{2^N} = \frac{1.2}{8} = 0.15\ \text{V} \qquad \text{(Eq. 6.1)}$$

The correspondence between the input B and the output Vout* is given in Fig. 6.5, considering Vdac = 1.2 V.

B2   B1   B0   Vout*       Analog output Vout* (V) with Vdac = 1.2 V
0    0    0    0/8 Vdac    0.0
0    0    1    1/8 Vdac    0.15
0    1    0    2/8 Vdac    0.3
0    1    1    3/8 Vdac    0.45
1    0    0    4/8 Vdac    0.6
1    0    1    5/8 Vdac    0.75
1    1    0    6/8 Vdac    0.9
1    1    1    7/8 Vdac    1.05

Fig. 6.5 Specifications of a three-bit digital-to-analog converter
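The correspondence of Fig. 6.5 follows directly from Equation 6.1 and can be reproduced with a few lines of code. This is a sketch of the ideal transfer function only (no switch resistance, no loading); the function name is a hypothetical helper.

```python
def resistor_string_dac(code, n_bits=3, vdac=1.2):
    """Ideal output (V) of an N-bit resistor-string DAC: code * Vdac / 2^N."""
    if not 0 <= code < 2 ** n_bits:
        raise ValueError("code out of range for the given resolution")
    return vdac * code / 2 ** n_bits

# Print the ideal output for every three-bit input code B2 B1 B0.
for code in range(8):
    print(f"{code:03b} -> {resistor_string_dac(code):.2f} V")
```

Note that the highest code gives 7/8 of Vdac, one voltage-step below the reference itself.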

Layout Considerations

A long path of polysilicon between VDD and VSS may give the intermediate voltage references required for the DAC circuit. Unfortunately, the polysilicon has a low resistance due to a surface deposit of metal, a process called salicidation. The resistance is counted in a very convenient unit called ‘ohm per square’, noted Ω/□. The value 4 Ω/□ means 4 Ω per square area. The resistance per square is quite small (around 4 Ω per square) due to this thin metal coat, as seen in the cross-section of Fig. 6.6. In order to increase the sheet-resistance value, the polysilicon resistor must be surrounded by the specific ‘Option’ layer that may be found in the upper part of the layers palette. The salicide is removed, and the sheet-resistance is increased to 40 Ω per square (Fig. 6.6 right).

Fig. 6.6 Removing salicidation to increase sheet-resistance

Following a double-click in this layer, we activate the property ‘remove salicide to increase resistance’ (Fig. 6.7). Consequently, the resistor value is multiplied by 10 and can be used to design an area-efficient resistor network.

Fig. 6.7 Sheet-resistance is increased by removing the salicide deposit, thanks to an option layer (ADC.MSK). Figure annotations: high values of polysilicon resistance are obtained; the option layer removes salicide in all boxes included in the option box, which increases their resistance.


The resistor-ladder generates intermediate voltage references used by the voltage comparators as input signals. By default, MICROWIND does not take into account any serial resistor. This means that the resistor-ladder layout on the left of Fig. 6.8 is considered as a short circuit between VDD and VSS.

Fig. 6.8 Virtual resistor symbols split the polysilicon path into separate electrical regions (ADCRes.MSK)

To account for the serial resistance distributed along the polysilicon path, a virtual resistance symbol must be added, which forces MICROWIND to split the ladder into separate electrical nodes and to extract the corresponding polysilicon resistance (Fig. 6.8, middle and right). The virtual resistor may be found in the upper part of the palette. Once inserted, the menu shown in Fig. 6.9 appears. It is recommended in this case to select the option Poly resistance. At extraction, MICROWIND will evaluate the equivalent resistance on both sides of the virtual symbol and update the resistance automatically according to the design and technological options.

Fig. 6.9 Adding a virtual resistor symbol to extract the polysilicon resistance


The resistance symbol is inserted in the layout to indicate to the simulator that an equivalent resistance must be taken into account for the next analog simulation. The layout of a three-bit digital-to-analog converter is shown in Fig. 6.10. The three-inverter circuit generates the signals ~B2, ~B1 and ~B0 from signals B2, B1 and B0. The transmission gates use the minimum MOS device size. The total resistance approaches 24 kΩ, which means a stand-by current near 50 µA on a 1.2 V power supply. Lower DC currents may be obtained by increasing the length of the polysilicon path.
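As a quick check of the stand-by current quoted above, Ohm's law applied to the whole ladder gives the following. The split into eight equal segments of about 3 kΩ is an inference from the eight-resistor ladder of Fig. 6.4, not a value given in the text.

$$I_{DC} = \frac{V_{DD}}{R_{total}} = \frac{1.2\ \text{V}}{24\ \text{k}\Omega} = 50\ \mu\text{A}, \qquad R_{segment} \approx \frac{24\ \text{k}\Omega}{8} = 3\ \text{k}\Omega$$

Under the same assumptions, doubling the length of the polysilicon path would double the total resistance and halve the stand-by current to about 25 µA.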

Fig. 6.10 Layout of the digital-analog converter (DAC.MSK)

The simulation of the R-ladder DAC (Fig. 6.11) shows a regular increase in the output voltage Vout with the input combinations, from 000 (0 V) to 111 (1.05 V). Each input change provokes a charge and discharge of the capacitance network. Notice also the fluctuation of the reference voltage Vref5 (one of the eight reference voltages). This is due to the weak link to VDD and VSS through a highly resistive path. The analog level Vout increases regularly with increasing digital input B: the converter is monotonic. However, it must be noticed that near t = 2.0 ns, for a very short period of time, the internal node


Fig. 6.11 Simulation of the digital-analog converter (DAC.MSK)

discharge leads to a voltage overshoot close to one voltage step ∆V. Also notice that, according to the schematic diagram of Fig. 6.4, the output is connected to N On switches and N Off switches.

6.2.2 Converter Non-linearity

Due to the non-ideal behavior of switches, process fluctuations, leakages and various gradient effects, there is a small difference between the ideal analog output Vout_ideal and the actual analog output Vout. The deviation of Vout from the ideal value Vout_ideal is called the integral non-linearity (INL). The normalized INL, according to [3], can be expressed using Eq. 6.2. The INL is illustrated in Fig. 6.12: when Vout is exactly equal to the ideal output, the INL is equal to zero. However, for several values of B, a small difference is usually observed.

$$INL_i = \frac{V_{out\_i} - V_{out\_ideal}}{\Delta V} \qquad \text{(Eq. 6.2)}$$

where
INLi = the integral non-linearity for input i (relative error between –1 and 1)
Vout_i = the real DAC output for input i (V)
Vout_ideal = the ideal DAC output for input i (V)
∆V = ideal voltage-step (V)

The difference between two adjacent analog outputs may be significantly different from the theoretical voltage-step. This deviation is called the differential non-linearity (DNL). In Fig. 6.12, the DNL is the


Fig. 6.12 Illustration of integral and differential non-linearity

vertical mismatch between the ideal voltage-step ∆V and the measured step between input i and input i+1. The normalized DNL includes the voltage-step ∆V to get the relative error, and can be described by Equation 6.3.

$$DNL_i = \frac{V_{out\_i+1} - V_{out\_i} - \Delta V}{\Delta V} \qquad \text{(Eq. 6.3)}$$

where
DNLi = the differential non-linearity for input i (relative error, usually between –1 and 1)
Vout_i+1 = the real DAC output for input i + 1 (V)
Vout_i = the real DAC output for input i (V)
∆V = ideal voltage-step (V)

The illustration of INL is given in the simulation of the three-bit DAC. The ideal reference voltages are placed separately in the layout, as shown in Fig. 6.13. In the simulation mode Voltage, Current vs. Time, all voltage values are placed in the same window. The ladder of reference voltages appears, as well as the DAC output Vout. From the simulation shown in Fig. 6.14, it appears clearly that an INL exists, which corresponds to the difference between the ideal and actual value of the output for some values of B. The origin of this non-linearity is the resistor-ladder design, which does not create perfectly regular resistance values. Remember that process fluctuations may also affect the value of the resistance, which is another source of non-linearity.
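The definitions of Equations 6.2 and 6.3 translate directly into code. The sketch below computes the normalized INL and DNL of a three-bit DAC from a list of output voltages. The `measured` values are hypothetical numbers invented for illustration; they are not taken from the DacNonLinearity.MSK simulation.

```python
def inl(measured, ideal, delta_v):
    """Normalized integral non-linearity for each input code (Eq. 6.2)."""
    return [(m - i) / delta_v for m, i in zip(measured, ideal)]

def dnl(measured, delta_v):
    """Normalized differential non-linearity between adjacent codes (Eq. 6.3)."""
    return [(measured[i + 1] - measured[i] - delta_v) / delta_v
            for i in range(len(measured) - 1)]

VDAC, N_BITS = 1.2, 3
DELTA_V = VDAC / 2 ** N_BITS                      # ideal step, 0.15 V
ideal = [code * DELTA_V for code in range(2 ** N_BITS)]

# Hypothetical measured outputs (V), invented for illustration only.
measured = [0.0, 0.16, 0.30, 0.47, 0.60, 0.74, 0.91, 1.05]

inl_values = inl(measured, ideal, DELTA_V)
dnl_values = dnl(measured, DELTA_V)
print("INL:", [round(v, 3) for v in inl_values])
print("DNL:", [round(v, 3) for v in dnl_values])
```

Both quantities are expressed as fractions of one ideal voltage-step, so a value of 0.1 means an error of one tenth of ∆V.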


Fig. 6.13 The illustration of integral and differential non-linearity (DacNonLinearity.MSK). The layout carries ideal voltage references labeled V0.15+ to V1.05+ in steps of 0.15 V; these voltage references are used to display the non-linearities.

Fig. 6.14 A zoom at the analog voltage Vout reveals non-negligible INL (DacNonLinearity.MSK)

6.2.3 R-2R Ladder Converter

It is not easy to construct a resistor-based DAC with a high resolution, due to the resistance spread and the need for 2^N serial resistors. A more compact choice is the R-2R ladder [3]. Its configuration consists of a network of resistors alternating between R and 2R. For an N-bit DAC, only N cells based on two resistors R and 2R in series are required. The four-bit and eight-bit implementations of this circuit are reported in Fig. 6.15. At the right end of the network is the output voltage Vout. Seven resistors were used for the four-bit implementation of the R-2R DAC. This is about half the number required in the previous R-ladder. The difference is even more significant in the eight-bit circuit, with only 15 resistors, while the simple ladder would require 256 resistors in series. In the four-bit implementation of the DAC, the digital inputs (B3, B2, B1, B0) determine whether each cell is switched to ground or tied to Vdac. Each cell’s output voltage is a fraction of Vdac because of the voltage division in the ladder network. The final output voltage Vout depends on the value of B (0 to 15), following Eq. 6.4.

$$V_{out} = V_{dac} \times \frac{2^N - B}{2^N} \qquad \text{(Eq. 6.4)}$$


Fig. 6.15 Four-bit and eight-bit DAC converter using the R-2R ladder (DACR2R.SCH)

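On this principle, Table 6.1 gives the value of Vout versus the input code, with Vdac equal to 1.2 V.

Table 6.1 Output voltage produced by the four-bit R-2R DAC versus input code B

| B3 | B2 | B1 | B0 | Vout (V) |
|----|----|----|----|----------|
| 0 | 0 | 0 | 0 | 1.2 |
| 0 | 0 | 0 | 1 | 1.125 |
| 0 | 0 | 1 | 0 | 1.05 |
| 0 | 0 | 1 | 1 | 0.975 |
| 0 | 1 | 0 | 0 | 0.9 |
| 0 | 1 | 0 | 1 | 0.825 |
| 0 | 1 | 1 | 0 | 0.75 |
| 0 | 1 | 1 | 1 | 0.675 |
| 1 | 0 | 0 | 0 | 0.6 |
| 1 | 0 | 0 | 1 | 0.525 |
| 1 | 0 | 1 | 0 | 0.45 |
| 1 | 0 | 1 | 1 | 0.375 |
| 1 | 1 | 0 | 0 | 0.3 |
| 1 | 1 | 0 | 1 | 0.225 |
| 1 | 1 | 1 | 0 | 0.15 |
| 1 | 1 | 1 | 1 | 0.075 |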
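The entries of Table 6.1 follow directly from Eq. 6.4. As a minimal sketch, the ideal values can be regenerated as follows; the function name `r2r_vout` is an assumption made for this illustration, and the model ignores the switch resistance Ron.

```python
def r2r_vout(code, n_bits=4, vdac=1.2):
    """Ideal output of the R-2R DAC following Eq. 6.4: Vdac * (2^N - B) / 2^N."""
    full_scale = 1 << n_bits  # 2^N
    if not 0 <= code < full_scale:
        raise ValueError("code out of range")
    return vdac * (full_scale - code) / full_scale


# Print the ideal conversion table for the four-bit converter.
for code in range(16):
    print(f"{code:04b}  {r2r_vout(code):.3f} V")
```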


Layout Considerations

The resolution of the R-2R DAC depends on the accuracy of the resistors and on the resistance of the switches, which must be negligible to avoid a voltage drop and the associated non-linearity. It is important to implement a low-Ron switch (large width, minimum length), together with large resistors. The design of a four-bit digital-to-analog converter is reported in Fig. 6.16. The elementary resistor pattern has a fixed value of 500 Ω.

Fig. 6.16 A four-bit R-2R DAC (DacR2R4Bit.MSK)

The simulation of the four-bit R-2R DAC (Fig. 6.17) shows a regular decrease in the output voltage Vout. As the Ron of the MOS devices is not negligible, the final value of Vout (B = 1111) is higher than the predicted 0.075 V (Table 6.1). This non-linearity may be corrected by enlarging the MOS switch and increasing the length of the serpentine resistor. Alternatively, a dummy switch, whose pass resistance is half Ron, may be inserted inside each cell in series with R.

6.2.4 Switched Capacitors

A very popular DAC architecture used in CMOS technology is based on the switched capacitor [1]. An array of capacitors is connected to switches, as described in Fig. 6.18. The capacitors are connected in parallel and share one common electrode, which is connected to a follower-amplifier. Notice that the capacitors are binary-weighted, which means that C, 2C and 4C capacitances are implemented. The capacitor array totals 2^N C. Figure 6.18 gives an example of a three-bit (B0, B1, B2) charge-scaling DAC. The first step is to discharge all capacitors using the Reset switch connected to the ground. Then the Reset switch is opened. After initialization, the digital switches B0, B1 and B2 connect each capacitor to


Fig. 6.17 Simulation of the four-bit R-2R DAC (DacR2R4Bit.MSK)

Fig. 6.18 The charge-scaling DAC (DacCapacitor.SCH)

VDD or to VSS, according to the logic value. The output voltage Vout is then a function of the voltage division between capacitors. As an example, if the number to convert is B = 011, B2 is connected to VSS, while B1 and B0 are connected to VDD, as shown in Fig. 6.19. The capacitors 2C and C tied to VDD face a total array capacitance of 8C, so the equivalent capacitor-divider gives an output equal to 3/8 VDD, in agreement with the conversion table, which gives the value of Vout versus the input code.

Layout Considerations

The most important problem is the design of precise capacitors, with values from C to 2^N × C. As the number of bits increases, the ratio of the MSB to LSB capacitors becomes difficult to control. Moreover, high-value capacitors occupy a significant area on the chip. Using metal plates to create them is not realistic, as the capacitance value per µm² is very low. The solution is to use passive double-polysilicon capacitors if available in the CMOS process, as they have good matching accuracy and a high capacitance value per µm². In 0.12 µm CMOS technology, the capacitance between metals is around 50 aF/µm² and rises to 2000 aF/µm² between poly and poly2.


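| B2 | B1 | B0 | Vout/VDD |
|----|----|----|----------|
| 0 | 0 | 0 | 0 |
| 0 | 0 | 1 | 1/8 |
| 0 | 1 | 0 | 2/8 |
| 0 | 1 | 1 | 3/8 |
| 1 | 0 | 0 | 4/8 |
| 1 | 0 | 1 | 5/8 |
| 1 | 1 | 0 | 6/8 |
| 1 | 1 | 1 | 7/8 |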

Fig. 6.19 The DAC at work for B = 011 (DacCapacitor.SCH)
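The capacitor-divider behind this conversion table can be sketched in a few lines. This is an illustrative model only: it assumes ideal capacitors, an array of binary-weighted capacitors plus one extra unit capacitor tied to ground (2^N C in total), and a hypothetical function name `charge_scaling_vout`.

```python
def charge_scaling_vout(bits, vdd=1.2):
    """Ideal output of a binary-weighted capacitor DAC; bits are given MSB first.

    Assumption: the array holds C, 2C, 4C, ... plus one extra C tied to ground,
    so the total capacitance is 2^N * C.
    """
    n = len(bits)
    total_c = 1 << n  # total capacitance in units of C
    # Capacitance whose bottom plate is switched to VDD (in units of C).
    c_to_vdd = sum(bit << (n - 1 - i) for i, bit in enumerate(bits))
    return vdd * c_to_vdd / total_c


# B = 011: the capacitors 2C and C are tied to VDD, 4C and the extra C to ground.
ratio = charge_scaling_vout([0, 1, 1], vdd=1.0)
print(f"Vout/VDD = {ratio}")
```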

The implementation of the three-bit DAC requires eight unit capacitors, grouped as 4 × C, 2 × C, and two separate C. In MICROWIND, the command Edit → Generate → Capacitor gives access to a specific menu for generating capacitors (Fig. 6.20). The default capacitor is made with poly/poly2. The typical capacitance value for C is around 1 pF. As may be seen in the layout shown in Fig. 6.21, a 100 fF value for C already leads to a large layout.

6.3 Sample and Hold Circuits

Sample-and-Hold (S/H) circuits are critical in converting analog signals into digital signals. The main function of the sample-and-hold circuit is to capture the signal value at a given instant and hold it until the ADC has processed the information (Fig. 6.22). The operation is repeated in time with a regular sampling period. During the sampling period, the S/H circuit operates alternately in dynamic mode (sample) and in static mode (hold), as shown in Fig. 6.23. In dynamic mode, the switch lets the input signal flow through the pass transistor and settle Vin* within the required accuracy. Several parasitic effects may be observed. When the switch is turned off, a parasitic offset may appear due to capacitive coupling, which modifies the voltage Vin*. Also, after some nanoseconds, the stored voltage may be altered by parasitic discharge, appearing as an unpredictable droop.


Fig. 6.20 The generator menu handles the design of poly/poly2 capacitor and inter-metal capacitors

Fig. 6.21 Implementation of an array of 100 fF capacitor for the three-bit DAC (DacCapacitor.MSK)


Fig. 6.22 The sample-and-hold circuit

Fig. 6.23 Sampling of an analog voltage (sample and hold modes)

The transmission gate can be used as an S/H circuit. The schematic diagram of the S/H circuit is proposed in Fig. 6.24. It corresponds to the classical transmission gate. The only important supplement is the storage capacitor, called Cstore, appearing at the output Vin*, the sampled version of Vin. The capacitor retains the voltage information during the conversion phase. By default, a parasitic capacitance always exists due to diffusion areas of the p-channel MOS and n-channel MOS devices. However, Cstore includes a supplementary capacitor connected to the node Vin*, with a capacitance value sufficiently high to counterbalance the effects of leakage currents.

Fig. 6.24 Schematic diagram of the S/H circuit (SampleHold.SCH)


The layout of the transmission gate is reported in Fig. 6.25. The sample/hold command is situated on the left, and controls the transmission gate. The inverter is required to drive the pMOS device. The Vin* signal is connected to a 10 fF capacitor made of poly/poly2. The effect of sample and hold is illustrated in Fig. 6.26. The voltage curves have been superimposed by using the simulation mode Current and Voltage vs. Time. When sampling, the transmission gate is turned on so that the sampled data Vin* reaches the value of the sinusoidal wave Vin.

Fig. 6.25 The transmission gate used to sample analog signals (SampleHold.MSK)

Fig. 6.26 Effect of sampling (SampleHold.MSK)

When the gate is off, the value of the sampled data Vin* remains constant. This is mainly due to the parasitic capacitance of the node which has a value of 10 fF as extracted in a CMOS 0.12 µm process (Fig. 6.27). This is sufficient to retain the information for several nanoseconds, even in the presence of leakage currents. Higher values of storage capacitance are required if the duration of the analog-to-digital


conversion is of the order of a microsecond, with a high voltage precision. In all cases, the sampled voltage must not fluctuate by more than 30% of the least significant bit over the whole sampling cycle.

Fig. 6.27 The hold effect is related to parasitic capacitance of node Vin*
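The 30%-of-LSB criterion can be turned into a rough sizing rule for the storage capacitor. The sketch below assumes a constant leakage current during the hold phase, so that the droop is ∆V = I × t / C. The function name and the numerical values (1 nA leakage, 1 µs hold time, eight-bit resolution, 1.2 V reference) are hypothetical choices for illustration, not figures taken from the text.

```python
def min_store_capacitance(i_leak, t_hold, vref, n_bits, lsb_fraction=0.3):
    """Smallest Cstore (F) keeping the droop below lsb_fraction of one LSB."""
    lsb = vref / (1 << n_bits)       # ADC step size (V)
    max_droop = lsb_fraction * lsb   # allowed voltage change during hold (V)
    # Constant-current discharge: dV = I * t / C  =>  C = I * t / dV
    return i_leak * t_hold / max_droop


# Hypothetical operating point: 1 nA leakage, 1 us hold, 8-bit ADC, 1.2 V reference.
c_min = min_store_capacitance(i_leak=1e-9, t_hold=1e-6, vref=1.2, n_bits=8)
print(f"Cstore >= {c_min * 1e15:.0f} fF")
```

The result scales linearly with the hold time and doubles with each extra bit of resolution, which is consistent with the need for larger storage capacitors when the conversion takes longer.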

Layout Considerations

If the width and length of the nMOS and pMOS devices are not identical, an error appears between the sampled Vin* and the original voltage Vin. This voltage difference is a parasitic offset, which depends on the value of Vin in a non-linear way, as shown in the measured transfer characteristics of Fig. 6.28 (standard sample circuit). The offset is minimized if the nMOS and pMOS sizes are identical, as it is strongly dependent on the parasitic capacitance between the gate and the drain. With identical channel size, CGD of the pMOS is equal to CGD of the nMOS. However, as the size of the nMOS is identical to the pMOS, the pMOS switching is slower. The sampling circuit can be improved by switching the pMOS before the nMOS device, in contrast to the circuit proposed in Fig. 6.27, and by adding so-called ‘dummy transistors’ on the storage node. The curves shown in Fig. 6.28 are compiled from measured characteristics on a 0.18 µm test chip [5], and exhibit a significant reduction of the offset. We often experienced a negative offset (10–20 mV) for VinVdd /2. Unfortunately, MICROWIND does not model these characteristics accurately, due to a simplification of the coupling capacitance models for MOS devices.


Fig. 6.28 Offset reduction techniques (SampleHold.SCH)

6.3.1 Shannon’s Sampling Theorem

The critical element when capturing the analog input voltage accurately is the number of sampled data in the considered time window, or equivalently the sampling frequency compared to the input voltage frequency. Shannon’s sampling theorem gives the minimum frequency required to represent the

Fig. 6.29 The sampling frequency is fast enough to comply with Shannon’s theorem (SampleHoldShannon.MSK)


analog input voltage accurately. The sampling frequency fsample must be greater than twice the highest frequency component fsignal of the original signal (Eq. 6.5).

$$f_{sample} > 2 \cdot f_{signal} \qquad \text{(Eq. 6.5)}$$

Figure 6.29 shows the sampling of a 500 MHz sinusoidal input wave (fsignal) with a sampling frequency fsample of 2.5 GHz, which largely complies with Shannon’s theorem. In Fig. 6.30 the sampling frequency fsample is too low (600 MHz), and consequently the sampled output Vin* is significantly different from Vin.

Fig. 6.30 The sampling frequency is too slow: Vin* differs from Vin (SampleHoldShannon.MSK)
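The two cases above can be checked numerically. The sketch below tests the Shannon condition and, when it is violated, computes the apparent (aliased) frequency of an ideally sampled sine wave by folding fsignal into the band 0 to fsample/2. The function names are assumptions made for this illustration.

```python
def satisfies_shannon(f_sample, f_signal):
    """True when the sampling frequency exceeds twice the signal frequency."""
    return f_sample > 2 * f_signal


def apparent_frequency(f_sample, f_signal):
    """Frequency seen after ideal sampling: fold f_signal into [0, f_sample / 2]."""
    f = f_signal % f_sample
    return min(f, f_sample - f)


for f_sample in (2.5e9, 600e6):
    ok = satisfies_shannon(f_sample, 500e6)
    alias = apparent_frequency(f_sample, 500e6)
    print(f"fsample = {f_sample / 1e6:.0f} MHz: Shannon ok = {ok}, "
          f"apparent frequency = {alias / 1e6:.0f} MHz")
```

With a 600 MHz sampling clock, the 500 MHz input appears as a 100 MHz wave, which illustrates why Vin* no longer resembles Vin.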

6.4 Analog-Digital Converter Architectures

The analog-to-digital converter is considered as an encoding device, where an analog sample is converted into a digital quantity with N bits. Figure 6.31 shows the complete chain from the analog signal to the digital data using a sample-and-hold module and a four-bit ADC. ADCs can be implemented by employing a variety of architectures. In the following sections, we describe the flash converter and the successive-approximation converter.

Fig. 6.31 A four-bit digital conversion of a sampled analog voltage

6.4.1 The Flash Converter Principles

The two-bit analog-digital converter converts an analog value Vin into a digital value A coded on two bits A1, A0. The flash converter uses three amplifiers, which produce results C0, C1 and C2, connected


to a coding logic to produce A1 and A0 in a very short delay (Fig. 6.32). Flash converters are widely used for very high sampling rates, at the cost of a very high power dissipation.

Table 6.3 The specifications for a two-bit flash ADC converter

| Analog input Vin | C2 | C1 | C0 | A1 | A0 |
|------------------|----|----|----|----|----|
| Vin < Vref0 | 0 | 0 | 0 | 0 | 0 |
| Vref0 < Vin < Vref1 | 0 | 0 | 1 | 0 | 1 |
| Vref1 < Vin < Vref2 | 0 | 1 | 1 | 1 | 0 |
| Vin > Vref2 | 1 | 1 | 1 | 1 | 1 |

Fig. 6.32 Schematic diagram of the two-bit flash ADC converter (AdcFlash2bits.SCH)

A schematic diagram for the two-bit flash converter is proposed in Fig. 6.32. The resistor scale produces reference voltages Vref0, Vref1 and Vref2. Three comparator circuits compute the difference between Vin and the reference voltage. Their outputs C2, C1 and C0 are almost logical signals as the comparators are connected in open-loop. The main problem of the comparator-based architecture is that the output A1, A0 is not directly available from C2, C1 and C0. The comparator outputs represent the ‘thermometer coding’ of the input. The ones propagate from C0 to C2 as the input Vin rises, as specified in Table 6.3. A conversion circuit from thermometer code to binary code is needed. In the case of a two-bit flash converter, the circuit is quite


Fig. 6.33 The thermometer-to-binary coder (AdcFlash2bits_coder.SCH)

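simple (Fig. 6.33), and can be efficiently implemented using one inverter (A1) and a complex gate (A0). For a three-bit flash converter, the logic circuit grows in complexity. The thermometer code is described in Table 6.4.

Table 6.4 Specifications for a three-bit flash ADC converter

| Analog input Vin | C6 | C5 | C4 | C3 | C2 | C1 | C0 | A2 | A1 | A0 |
|------------------|----|----|----|----|----|----|----|----|----|----|
| Vin < Vref0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Vref0 < Vin < Vref1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 |
| Vref1 < Vin < Vref2 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 0 |
| Vref2 < Vin < Vref3 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 1 | 1 |
| Vref3 < Vin < Vref4 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 0 |
| Vref4 < Vin < Vref5 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 1 |
| Vref5 < Vin < Vref6 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 |
| Vin > Vref6 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |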

A three-bit flash converter requires seven comparators and a complex logic circuit which converts the thermometer code into a binary code, as specified in Table 6.4. An eight-bit flash converter would require 255 comparators and a very complex logic decoder. An interesting approach for the encoding consists in using a small memory array, taking into account the specific condition of the thermometer


coder. For example, a zero on C4 and a one on C3 means that Vref3 < Vin < Vref4, which corresponds to the output code 100 in Table 6.4.
Fig. 6.34 The three-bit thermometer coder using a logic array (AdcFlash3bits_coder.SCH)
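The behavior specified by the thermometer tables can be modeled in software. The sketch below is a behavioral illustration only: it counts the comparators at one rather than reproducing the gate-level or memory-array coder of the figures, and the function names `thermometer_to_binary` and `flash_adc` are hypothetical.

```python
def thermometer_to_binary(comparators):
    """Convert comparator outputs [C0, C1, ...] into the binary output code.

    In a valid thermometer code the ones fill up from C0, so the code is
    simply the number of comparators at one.
    """
    bits = list(comparators)
    code = sum(bits)
    # Reject 'bubbles' such as C0=0, C1=1, which are not thermometer codes.
    if bits != [1] * code + [0] * (len(bits) - code):
        raise ValueError("not a valid thermometer code")
    return code


def flash_adc(vin, vrefs):
    """Ideal flash ADC: one comparator per reference voltage, then the coder."""
    return thermometer_to_binary([1 if vin > vref else 0 for vref in vrefs])


# Three-bit example with seven equally spaced references (assumed 0.15 V step).
vrefs = [0.15 * k for k in range(1, 8)]
print(f"{flash_adc(0.5, vrefs):03b}")
```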

6.4.2 Flash Converter Implementation

The resistor-ladder generates the intermediate voltage references used by the voltage comparators located in the middle of the layout (Fig. 6.35). An unsalicide option layer multiplies the sheet resistance of the polysilicon ladder for an area-efficient implementation. The resistance symbol R(poly) is inserted in the layout to indicate to the simulator that an equivalent resistance must be taken into account for the analog simulation. Open-loop amplifiers are used as voltage comparators. The comparators drive the decoding logic situated to the right, which provides the correct A0 and A1 coding. In the simulation shown in Fig. 6.36, the comparators C1 and C2 work well, but the comparator C0 is used at the lower limit of the voltage input range. The combinations ‘01’, ‘10’ and ‘11’ are generated rapidly, but the generation of ‘00’ is slow. The comparator C0 may be modified to provide a faster response when comparing low voltages, by changing the biasing conditions. An alternative is to reduce the input voltage range, which means that the resistance scale would be supplied by Vdac– larger than VSS and Vdac+ smaller than VDD.


Fig. 6.35 Design of the ADC (ADC.MSK)

The main drawback of flash converters is the silicon area and the power consumption: every one-bit increase in resolution almost doubles the size of the ADC circuit and significantly increases the power consumption.

6.4.3 Low-speed ADC Converters

The most common low-speed analog-to-digital converter is the iterative converter. As shown in Fig. 6.37, it consists of a digital-to-analog converter, a counter and an analog comparator. Starting with the lowest voltage, the counter is incremented until the DAC voltage Vdac is higher than the input voltage Vin. In the particular example shown in the figure, we suppose that Vin is a little higher than Vref/2. The counter has reached the value 5 (101), which corresponds to the transfer of the reference voltage Vref × 5/8 to Vdac. As Vin is lower than this reference, the comparator produces a zero, which stops the counter clock CountClk.


Fig. 6.36 Simulation of the ADC (ADC.MSK)

Fig. 6.37 Iterative converter using a DAC (ADCIterative.SCH)
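The counting procedure of the iterative converter can be sketched as follows. This is a behavioral model under stated assumptions: the DAC is taken as an ideal ramp Vdac = Vref × count / 2^N, and the function name `iterative_adc` as well as the numerical values are hypothetical.

```python
def iterative_adc(vin, vref, n_bits=3):
    """Count up until the ideal DAC output exceeds vin; return (code, clock_cycles)."""
    levels = 1 << n_bits
    count = 0
    cycles = 0
    # The comparator stops the counter clock as soon as Vdac > Vin.
    while count < levels - 1 and vref * count / levels <= vin:
        count += 1
        cycles += 1
    return count, cycles


# Vin a little higher than Vref/2, as in the example of the text.
code, cycles = iterative_adc(vin=0.66, vref=1.2)
print(f"code = {code:03b}, clock cycles = {cycles}")
```

The number of clock cycles grows with the input voltage, up to 2^N − 1 in this model, which is the source of the low conversion rate.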


This converter is very simple to design but slow. Up to 2^N clock cycles are necessary to complete the conversion, where N is the resolution of the DAC and of the ADC converter. For example, with a 16-bit data converter and a 100 MHz clock frequency, a full-scale conversion takes 2^16 = 65,536 cycles, so the conversion rate is as low as about 1.5 kHz. The implementation of Fig. 6.37 corresponds to a three-bit converter. With a high-resolution DAC and a high-precision amplifier, high-resolution ADC converters may be constructed.

A better solution consists of examining the most significant bit a(n–1) first, by determining whether Vin is larger or smaller than VDD/2. The comparator gives the value of that bit directly. Then, the comparison is performed for the next bit, and so on until all bits are extracted, finishing with the least-significant bit a0. This type of converter is called a successive-approximation converter. The complete process is faster than the iterative converter, as only N comparisons are necessary. The algorithm is detailed in Fig. 6.38.

Fig. 6.38 Iterative converter algorithm and typical implementation for a 10-bit converter

The schematic diagram of a successive-approximation converter is given at the right of Fig. 6.38. The analog input signal Vin is held by an S/H circuit during the data conversion process. In the first clock cycle, the most significant bit ai of the successive-approximation register (SAR) is set to one. The DAC converts the SAR value to an analog voltage Vdac that is compared to Vin. If Vdac is smaller than Vin, the bit ai is validated at one, and the SAR register is unchanged. Conversely, if Vdac > Vin, ai is set to zero. The DAC generation and comparison processes are repeated for N clock cycles to complete the conversion. In many industrial 32-bit micro-controllers, eight-bit to 16-bit successive-approximation converters are implemented with four-channel to 16-channel multiplexers. In the particular case of a 10-bit converter supplied by a voltage reference Vref, as many as 2^10 reference values (1024) are used. An example of conversion


is given in Fig. 6.39. For the first bit, Vdac is set to Vref/2. As Vdac > Vin, a9 is set to zero. Then Vdac is set to Vref/4. As Vdac < Vin, a8 is set to one.

Fig. 6.39 The successive-approach converter at work
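The successive-approximation procedure can be written as a short loop, one decision per bit. The sketch below assumes an ideal DAC (Vdac = Vref × code / 2^N) and treats the case Vdac = Vin as "keep the bit", a choice the text does not specify. The function name `sar_adc` and the input value are hypothetical.

```python
def sar_adc(vin, vref, n_bits=10):
    """Successive approximation: one DAC/comparator decision per bit, MSB first."""
    code = 0
    for bit in range(n_bits - 1, -1, -1):
        trial = code | (1 << bit)            # tentatively set bit a_i to one
        vdac = vref * trial / (1 << n_bits)  # ideal DAC output for the trial code
        if vdac <= vin:                      # Vdac not above Vin: keep the bit
            code = trial
        # otherwise Vdac > Vin and the bit stays at zero
    return code


# Vin between Vref/4 and Vref/2, as in the example: a9 = 0, then a8 = 1.
code = sar_adc(vin=0.36, vref=1.2)
print(f"{code:010b}")
```

Only N iterations are needed whatever the input, against up to 2^N clock cycles for the iterative converter.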

6.4.4 Pipeline ADC Converters

The pipelined analog-to-digital converter consists of two or more stages connected in series, each containing a low-resolution ADC and DAC converter. The schematic diagram of Fig. 6.40 corresponds to a four-bit ADC, based on a two-bit flash ADC. The role of the first stage is to generate the most significant bits A3 and A2. Then the difference between Vdac and Vin is computed. In order to convert the residue, the voltage difference is amplified and sent to a second two-bit ADC, which calculates the least-significant bits A1 and A0. The pipelining approach is very powerful and may be applied to a large

Fig. 6.40 Pipeline ADC converter


number of stages. This enables the conversion of analog signals with a high resolution without the need for designing high-resolution DAC or ADC blocks. An extensive study of pipeline ADCs may be found in [3]. The layout area of the successive-approximation or pipeline converter increases almost linearly with the converter resolution N, as does the power consumption, which represents a key advantage over flash converters.
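The two-stage principle can be illustrated with a behavioral model. The sketch below assumes ideal two-bit quantizers, an ideal two-bit DAC and a residue gain of 4 (so that the residue spans the full input range of the second stage); these values and the function names are assumptions made for illustration, not taken from the schematic.

```python
def flash_2bit(v, vref):
    """Ideal two-bit quantizer: returns 0..3 for v in [0, vref)."""
    return max(0, min(3, int(4 * v / vref)))


def pipeline_adc_4bit(vin, vref):
    """Four-bit conversion built from two ideal two-bit stages."""
    msb = flash_2bit(vin, vref)      # first stage: A3, A2
    vdac = msb * vref / 4            # two-bit DAC rebuilds the coarse estimate
    residue = (vin - vdac) * 4       # amplify the difference (assumed gain of 4)
    lsb = flash_2bit(residue, vref)  # second stage: A1, A0
    return (msb << 2) | lsb


print(f"{pipeline_adc_4bit(0.7, 1.2):04b}")
```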

6.5 Temperature Sensor

One of the simplest temperature-sensing elements is the pn diode [4]. The classical model of the diode is given by Eq. 6.7. The expression of the current includes two strongly temperature-dependent parameters, the exponential term and the reverse saturation current.

$$I_{ak} = I_{sat}\, S \left( \exp\left[ \frac{q\, V_{ak}}{kT} \right] - 1 \right) \qquad \text{(Eq. 6.7)}$$

with
Isat = reverse saturation current per µm² (A/µm²)
S = surface of the diode (µm²)
q = electron charge
k = Boltzmann’s constant
T = absolute temperature (K)
Vak = diode voltage (V)

For temperature sensing, the pn diode is forward-biased by a small constant current, and the diode voltage Vak serves as a measure of the temperature. The proposed circuit is given in Fig. 6.41. The pMOS device serves as a load while the P+/Nwell diode serves as a temperature-sensor. The pMOS device itself is sensitive to temperature, but its dependence is negligible as compared to the diode. In this circuit, Vak is equal to Vref. The diode cannot be implemented in forward-biasing condition with an N+/P-substrate diode, as the substrate is connected to ground, which would imply a negative biasing of the N+ diffusion. The only remaining solution consists in using an Nwell region connected to ground, and a P+ diffusion, which creates a P+/Nwell diode.

Fig. 6.41 Principles of temperature-sensor based on junction diode (SensorTemperature.SCH)


The implementation of the temperature-sensor is reported in Fig. 6.42. The pMOS channel length is large so as to reduce the DC current and to avoid short-channel limiting effects. The temperature influence is simulated using the parametric analysis, in order to plot the diode voltage Vref versus the temperature in °C directly. Invoke the command Analysis → Parametric Analysis, click inside the layout on the node Vref, and the screen shown in Fig. 6.43 appears. Select the item ‘Temp.’, and the measurement ‘Final voltage Vref’. Click Start Analysis to perform the iterative simulation from

Fig. 6.42 Implementing the temperature-sensor (SensorTemperature.MSK)

Fig. 6.43 Simulation of the voltage dependence on temperature (SensorTemperature.MSK)


–40°C to 120°C with a step of 20°C. At the end of each simulation, the final value of Vref is added to the data array. It can be seen from the result of Fig. 6.43 that Vref decreases nearly linearly with temperature, with a slope of around –1.2 mV/°C. Measured results presented in [4] give around –2.4 mV/°C, using a stable current reference of 10 µA instead of the diode-connected on-chip pMOS device. The main problem of this type of sensor is its strong dependence on process variations, which requires a calibration procedure to obtain an exact value for the temperature.
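The near-linear decrease of the diode voltage can be reproduced with a first-order model. The sketch below solves Eq. 6.7 for Vak at a constant bias current. The temperature law used for the saturation current, the value of Isat × S at 300 K (1e-14 A) and the silicon bandgap of 1.12 eV are assumptions of this illustration, not values from the text or from MICROWIND.

```python
import math

Q = 1.602176634e-19  # electron charge (C)
K = 1.380649e-23     # Boltzmann's constant (J/K)


def diode_voltage(temp_k, i_bias=10e-6, i_sat_300=1e-14, eg_ev=1.12):
    """Forward voltage from Eq. 6.7 solved for Vak at a constant bias current."""
    # Assumed temperature law for the saturation current (not from the text):
    # Isat(T) = Isat(300 K) * (T/300)^3 * exp(-(Eg/k) * (1/T - 1/300))
    i_sat = (i_sat_300 * (temp_k / 300.0) ** 3
             * math.exp(-(eg_ev * Q / K) * (1.0 / temp_k - 1.0 / 300.0)))
    return (K * temp_k / Q) * math.log(i_bias / i_sat + 1.0)


# Same sweep as the parametric analysis: -40 to 120 degrees C in 20 degree steps.
for temp_c in range(-40, 121, 20):
    print(f"{temp_c:4d} C  Vak = {diode_voltage(temp_c + 273.15):.3f} V")
```

With these assumed parameters the slope comes out at roughly –2 mV/°C, the same order of magnitude as the measured value quoted above. The exact figure depends strongly on the assumed saturation current, which is why calibration is needed in practice.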

6.6 Image Sensors

Recently, much attention has been paid to CMOS image sensors, due to the proliferation of low-cost video cameras such as webcams and video in mobile phones. Most high-quality cameras use charge-coupled-device (CCD) image sensors, which feature superior light-sensing characteristics over CMOS sensors. However, the use of standard CMOS technology for image-sensing offers two key advantages over CCD technology: the analog signal processing can be realized on the same chip, and the price of CMOS imagers is significantly lower than that of their CCD counterparts, thanks to the availability of deep sub-micron CMOS foundries all over the world.

6.6.1 The Diode Detector

In CMOS technology, one of the simplest light-detection devices is the PN junction. The diode is photoresistive, which means that its I/V characteristic is sensitive to light. The photodiode can be considered as a variable resistor. The most common photodiode consists of an N+ diffusion area in the P-substrate. By default, with no light, the diode is polarized in reverse mode, so that almost no current flows from the grounded P-substrate to the N+ diffusion region (Fig. 6.44). The light photons are converted in the

Fig. 6.44 The diode as a light detector


P region into electrons which are attracted by the N+ diffusion and generate a photocurrent. This current is almost linearly proportional to the incident light. Consequently, the resistance of the diode decreases as the intensity of the light increases. Silicon photo-resistors are sensitive to the spectral band corresponding to wavelengths from 350 to 1100 nm, which covers the visible spectrum (Fig. 6.45). Our eyes are sensitive to light with wavelengths ranging from 400 nm (blue) to 700 nm (red). The efficiency of converting photons into electrons is limited by reflection on the dielectric materials which cover the surface of the IC, and by the loss and recombination of electrons during their travel from the substrate to the electrode. The efficiency of the light-sensor is maximum near 700 nm, with a peak value around 20%. The current that flows through the photodiode is very small, approximately 1 nA for a 10 × 10 µm diode area.

Fig. 6.45 Typical performance of photo-diode versus incident wavelength

Notice that the photo-receptor is subject to a significant time delay before reaching equilibrium after a change in illumination level. This time constant is in the order of microseconds.

6.6.2 Diode Detector Setup

The basic setup for the passive light-sensor is shown in Fig. 6.46. The circuit consists of a diode and a capacitor. First, the capacitor Cstore is charged to a high voltage VDD, using the precharge switch. Second, the precharge is deactivated and the light-sensor starts to discharge the capacitor, thanks to the reverse photo-current Inp. Without light, the current Inp is equivalent to the reverse-mode current, which is in the range of pico-amperes (10^-12 A) for a small PN junction. With a strong light, the current Inp enters the nano-ampere range (10^-9 A).


The basic diagram for an array of passive light-sensors is shown in Fig. 6.47. Each passive pixel-sensor (PPS) converts the photons into an electrical charge which is carried off by the sensor through pass-transistors which are laid out as in a memory array. The charges flow through the vertical lines to a voltage amplifier with a significant gain. The main problem with the passive sensor is the noise that appears in the resulting image and limits its use to low-quality image-sensing. Furthermore, a large pixel matrix induces significant leakage in the vertical bit lines and limits the use of the PPS to very low-resolution devices.

The goal of active pixel-sensors (APS) is to reduce the noise associated with passive sensors, and to amplify the light-induced charges at each pixel location. The APS approach improves the light-sensor performance significantly, allowing the design of large pixel arrays, with fast read access, while keeping power dissipation low. The transistor N1 sets the photodiode voltage Vdiode to a high value when Set = 1. Once N1 is cut off, the photodiode discharges Vdiode at a rate that depends on the light intensity. The source follower N2 buffers the photodiode voltage to the bus when the row-select transistor N3 is active (Sel = 1), as illustrated in Fig. 6.48.

In MICROWIND, the photodiode effect is modeled by a virtual resistor added between Vdiode and the ground. Values ranging from 1 MΩ to 9 MΩ are added in the 2 × 2 pixel array to account for variable photocurrent (Fig. 6.49). However, the real-case photocurrent is very small (around 10–100 pA/µm² in the best case). When a 10 × 10 µm pixel is designed, the maximum current is around 10 nA, which discharges the pixel capacitance within several microseconds, with an equivalent leakage resistance of hundreds of mega-ohms. Using a mega-ohm resistor speeds up the simulation but accelerates the real-case chronograms by three orders of magnitude.
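The orders of magnitude quoted for the pixel can be checked with a constant-current discharge model, t = C × ∆V / I. In the sketch below, the pixel capacitance (20 fF) and the voltage swing (1 V) are hypothetical values chosen for illustration, and the function names are assumptions.

```python
def photocurrent(density_pa_per_um2, width_um, height_um):
    """Photocurrent (A) of a rectangular photodiode from a current density in pA/um^2."""
    return density_pa_per_um2 * 1e-12 * width_um * height_um


def discharge_time(c_pixel, delta_v, i_photo):
    """Time (s) for a constant photocurrent to change the pixel voltage by delta_v."""
    return c_pixel * delta_v / i_photo


# Best case quoted in the text: 100 pA/um^2 on a 10 x 10 um pixel.
i_max = photocurrent(100, 10, 10)
# Hypothetical pixel: 20 fF storage node discharged by 1 V.
t = discharge_time(c_pixel=20e-15, delta_v=1.0, i_photo=i_max)
print(f"Imax = {i_max * 1e9:.1f} nA, discharge time = {t * 1e6:.1f} us")
```

At the lower current density of 10 pA/µm², the same model gives a discharge ten times slower, in line with the much longer exposure times needed under weak illumination.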

Fig. 6.46 Light-detector setup including a precharge circuit (ImageSensor.SCH)

Fig. 6.47 A passive pixel array made with photodiode and pass-transistors (ImageSensor.SCH)


Fig. 6.48 The APS with a photodiode (ImageSensorActive.SCH)

Fig. 6.49 A matrix of four active pixels (ImageSensor2×2.MSK)

In the simulation reported in Fig. 6.50, the role of photocurrents is clearly demonstrated: the diode voltage of each pixel is set to around VDD – Vt, then each photocurrent discharges the pixel voltage with a slope, depending on the light intensity. The final voltage may be sampled at time 50 ns. In real-case light sensors, 100 µs up to 1 ms are required before sampling the pixel voltage and converting it into image information. Color filter materials are used to selectively capture the blue, green and red components of the incident image. The transmission of light has a general shape as shown in Fig. 6.51. The blue-filter passes


Fig. 6.50 Simulation of the active pixels (ImageSensor 2 × 2. MSK)

electromagnetic waves around a 475 nm wavelength, the green-filter passes mainly 510 nm waves, and the red-filter passes 650 nm waves. Color filters may be placed mechanically on top of the pixel array, in order to assign one color to one pixel, according to a regular assignment pattern such as the one shown in Fig. 6.51.

Fig. 6.51 Filters used to assign one color to one pixel


6.7 Conclusion

In this chapter we have introduced the basics of digital-to-analog signal conversion. The implementation of resistor ladders has been detailed, with an illustration of non-linearity effects. The principles of the analog-to-digital converter have also been described, with the examples of the flash converter and the successive-approximation converter. Finally, temperature and light sensors have been briefly introduced.

References
[1] R.J. Baker, H.W. Li and D.E. Boyce, CMOS Circuit Design, Layout, and Simulation, IEEE Press, 1998, www.ieee.org
[2] B. Razavi, Design of Analog CMOS Integrated Circuits, McGraw-Hill, 2001, ISBN 0-07-238032-2, www.mhhe.com
[3] M. Gustavsson, J.J. Wikner and N.N. Tan, CMOS Data Converters for Communications, Kluwer Academic Publishers, 2000, ISBN 0-7923-7780-X.
[4] L. Ristic, Sensor Technology and Devices, Artech House, 1994, ISBN 0-89006-532-2.
[5] S. Ben Dhia, F. Caignet and E. Sicard, “A New Method for Measuring Signal Integrity in CMOS ICs”, Microelectronics International, Vol. 17, No. 1, January 2000, pp 17–21.

EXERCISES
6.1 Design a three-bit thermometer-to-binary coder according to the schematic diagram shown in Fig. 6.34.
6.2 Design a three-bit flash converter using the thermometer coder and a set of seven comparators.
6.3 Design an iterative converter using a four-bit counter and the R-2R four-bit DAC.
6.4 Design a successive-approximation converter using a specific register and the R-2R four-bit DAC.


7 Input/Output Interfacing

This chapter is dedicated to the interface between the IC and the external world. After a brief justification of the power supply decrease, the input/output (I/O) pads used to import and export signals are dealt with. Then, the input pad protections against electrostatic discharge and voltage overstress are described. The design of output buffers is also presented, with a focus on current drive. Specific aspects of I/O floorplan, supply clamp and interfacing with packages are also introduced, followed by a short description of the IBIS standard for modeling I/O behavior, and of the signal transport between ICs.

7.1 Power Supply

The power supply of ICs has continuously decreased with progress in process integration. Figure 7.1 shows the evolution of the supply voltage with technology scale-down. A distinction is made between the external supply and the internal supply. The external supply, usually 5 V, 3.3 V or 2.5 V, concerns the I/O interface. For compatibility reasons, the chip interface is kept at these high standard voltages, which eases the exchanges with other ICs. The low internal supply concerns the core logic. Using a low voltage is attractive for low-power applications and to prevent the very thin gate

Fig. 7.1 Power supply decrease with technology scale-down

Copyright © 2007 by The McGraw-Hill Companies, Inc.


oxide from overstress and possible destruction. Classically, for reliable operation, the maximum electric field handled by the gate oxide is 0.7 V/nm. In 0.12 µm CMOS technology, the external supply VDDH is 2.5 V and the core supply VDD is 1.2 V. The I/O structures work at high voltage by making extensive use of specific MOS devices with thick oxide, while the internal devices work at low voltage with optimum performance. Remember that the oxide thickness is 2 nm for the core MOS devices in 0.12 µm, and that the breakdown field for high-quality SiO2 may reach 1.0 V/nm, which means that the device can handle up to 2 V. The voltage translator ensures bi-directional conversion between high-voltage and low-voltage signals (Fig. 7.2). In the die, several functions are supplied at high voltage, such as on-chip regulators, PLLs, ADC and DAC converters, radio-frequency circuits, and power amplifiers.

Fig. 7.2 Multiple power supply in deep submicron technology

7.2 The Bonding Pad The bonding pad is the interface between the IC die and the package. The pad has a very large surface (almost gigantic compared to the size of logic cells) because it is the place where a connection wire or a solder ball is attached to bond the electrical link to the outside world, as shown in Fig. 7.3. The pad size is approximately 80 µm × 80 µm. The basic design rules for the pad are shown in Fig. 7.4. The pad-to-pad spacing (rule Rp02) is also around 80 µm (Table 7.1). New technologies such as 90 nm enable the implementation of pad structures with 50 × 50 µm openings. The cross section shown in Fig. 7.4 illustrates the opening in passivation and the associated design rule Rp04 on top of the metal and via stack. The thick oxide used for passivation is removed so that a bonding wire or a bonding ball can be connected by molding it to the package.


Fig. 7.3 The bonding pad is used to connect a wire (called the bonding) which builds the electrical connection between the die and the package

Fig. 7.4 Bonding pad design rules

The pad can be generated by MICROWIND using the command Edit → Generate → I/O pads. The menu shown in Fig. 7.5 gives access to a single pad, with a default size given by the technology (around 80 µm in this case), or to a complete pad ring, as detailed later. Select the item Single pad and click Generate. Then give the location in the layout for the pad. As the pad is gigantic compared to the usual design scale, click View All to see the global pad layout.


Table 7.1 The bonding pad design rules

Design rule   Description                              Value in 0.12 µm
rp01          Pad width                                80 µm
rp02          Between two pads                         80 µm
rp03          Border of via vs. metal                  2 µm
rp04          Opening in passivation vs. last metal    5 µm
rp05          Between pad and unrelated active area    20 µm

Fig. 7.5 The bonding pad generated by Microwind and its cross-section. On top of metal6, the passivation has been removed (IOPad.MSK)

The cross section of the pad is reported in Fig. 7.5. We see the stack of metal layers, the vias and the passivation opening for the bonding to the package. In some CMOS technologies, there may exist a constraint on the via size that forces the designer to split the large via surface into an array of elementary


vias with a size of two lambda and four lambda spacing. This type of design rule has not been implemented in MICROWIND.

7.3 The Pad Ring The pad ring consists of several pads on each of the four sides of the IC, to interface with the outside world. The default menu for automatic generation of a pad ring is shown in Fig. 7.6. The proposed architecture is based on five pads on each side, that is, a total of 20 pads.

Fig. 7.6 Menu for generating the pad ring and corresponding architecture

The layout of the default pad ring generated in 0.12 µm is shown in Fig. 7.7. Two pairs of VDD/VSS supply pads are automatically added to the pad structure. The first pair is fixed on the west side, and the second pair on the east side. More VDD/VSS pairs may be generated. Usually one VDD/VSS pair is needed for eight to ten active I/O pads. Each I/O pad includes an over-voltage protection circuit which appears near the inner supply ring. These structures are justified and described later in this chapter. The way the supply pads are connected to the internal rings is detailed in Fig. 7.8. All VSS pads are connected to the outer ring, while the VDD pads are connected to the inner ring. The more power the circuit needs, the larger the number of supply pads. In an 800-I/O IC, nearly 100 pads are dedicated to the voltage supply, split equally between the VSS and VDD supply pads.

7.3.1 The Supply Rails The supply voltage may be 5 V, 3.3 V, 2.5 V, 1.8 V or 1.2 V. Most designs in 0.12 µm use 1.2 V for the internal core supply and 2.5 V for the interfacing. This is because the logic circuits of the core operate at


Fig. 7.7 The default pad ring generated in 0.12 µm, with 20 pads, including two pairs of VDD/VSS supply pins (padRing.MSK)

Fig. 7.8 Connecting to the VSS and VDD internal ring (PadRing.MSK)


low voltage to reduce power consumption, and the I/O structures operate at high voltage for external compatibility and higher immunity to external perturbations. Usually, an on-chip voltage regulator converts the high voltage into an internal low voltage. In most cases, the IC uses two separate supply pads, one for the high voltage and one for the low voltage. Consequently, the IC has four supply rings: VSS for I/Os (0.0 V), VDD for I/Os (2.5 V), VSS for the core (0.0 V), and VDD for the core (1.2 V). An example of a four-ring design is shown in Fig. 7.9.

Fig. 7.9 Zoom-in view of the supply rings VSS and VDD for I/Os and the core (PadRingDoubleGrid)

7.3.2 Supply Rail Design A metal wire cannot carry an unlimited amount of current. When the average current density is higher than 2 × 10⁹ A/m² [2], the grains of the polycrystalline aluminum interconnect start to migrate (the phenomenon is called electro-migration) and the conductor ultimately melts. To handle very high current flows, the supply metal lines must be enlarged. A typical rule of thumb is 2 mA per µm of width for aluminum supply lines, and 5 mA per µm for copper, which means that a copper interconnect is superior to aluminum in sustaining large currents. A complex logic core may consume amperes of current. In that case, the supply lines must be enlarged in order to handle very large currents properly. The usual design approach consists of creating a regular grid structure, as illustrated in Fig. 7.10, which provides the supply current at all points of the IC. In that test circuit, the VDD supply is assigned to metal5, and VSS to metal6. These upper layers are thicker than


Fig. 7.10 The supply rails are routed in metal5 and metal6 with a regular grid to provide power supply in all regions of the IC


the lower metal layers and consequently can drive larger currents. The grid spacing is around 100 µm, and each supply conductor width is around 5 µm. Enlarging the supply width would reduce the routing area. Reducing the supply width would limit the current drive.

7.3.3 Metal Slit Certain precautions should be taken when widening metal lines beyond 20 µm. The coefficients of thermal expansion of SiO2, silicon and metal are significantly different: 0.5 ppm/K for SiO2, 2.8 ppm/K for silicon and 23 ppm/K for aluminum. The stress accumulated during significant temperature variations may break the large metal traces and crack the oxides [2]. Metal tracks less than 20 µm wide do not suffer from such problems. The automatic layout generation tools in MICROWIND do not take this effect into account. The correct design in the case of large metal tracks is shown in Fig. 7.11. Holes are inserted regularly in the metal to split the box into parallel conductors. This discourages delamination and oxide cracks.
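The sizing rule of Section 7.3.2 and the 20 µm slit threshold of Section 7.3.3 can be combined in a short calculation. This is a rough sketch only: the helper name and the 100 mA example load are assumptions, not values from the text.

```python
# Rule-of-thumb current capacity per micron of line width (values from the text).
MAX_MA_PER_UM = {"aluminum": 2.0, "copper": 5.0}
SLIT_THRESHOLD_UM = 20.0   # lines wider than this should be drawn with slits

def supply_line(current_ma, metal):
    """Return (minimum width in um, needs_slits) for an average current in mA."""
    width_um = current_ma / MAX_MA_PER_UM[metal]
    return width_um, width_um > SLIT_THRESHOLD_UM

for metal in ("aluminum", "copper"):
    width, slits = supply_line(100.0, metal)   # 100 mA is an assumed example load
    print(f"{metal}: at least {width:.0f} um wide, slits needed: {slits}")
```

For the assumed load, the aluminum line comes out wide enough to require slits, while the copper line sits exactly at the 20 µm limit.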

Fig. 7.11 Very large metal lines are designed with slits to prevent damage caused by mechanical thermal stress

7.3.4 Power Line Connection The layout design of power line connections must be handled with care because of the strong current flow. An example of poor design is shown in Fig. 7.12 (a). Metal5 is connected to metal6 using via5. In (a), an unreliable connection is formed between metal5 and metal6 because the number of vias is too small for the amount of current. Remember that each via can only carry approximately 5 mA on average. A better approach (b) consists of placing contacts on the border of the intersection area. The best approach (c) is to fill the crossing area with contacts. An in-depth study of power line design may be found in Clein’s book on IC layout [6].
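The 5 mA per via figure gives a quick way to judge whether a crossing such as those of Fig. 7.12 is adequately connected. The sketch below uses hypothetical helper names and assumed example values (a 100 mA branch and a 4 × 4 via array).

```python
import math

VIA_CURRENT_MA = 5.0   # approximate average current per via, as quoted in the text

def vias_needed(current_ma):
    """Smallest number of vias that keeps each via at or below VIA_CURRENT_MA."""
    return math.ceil(current_ma / VIA_CURRENT_MA)

def crossing_capacity_ma(n_rows, n_cols):
    """Current capacity of a crossing filled with an n_rows x n_cols via array."""
    return n_rows * n_cols * VIA_CURRENT_MA

print(vias_needed(100.0))           # vias for an assumed 100 mA supply branch
print(crossing_capacity_ma(4, 4))   # capacity in mA of an assumed 4 x 4 via array
```

Filling the crossing area, as in case (c), maximizes the via count and therefore the current the connection can carry.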


Fig. 7.12 The interlayer connection using vias (ConnectLayers.MSK)

7.4 Input Structures The input pad includes over-voltage and under-voltage protections against external voltage stress, electrostatic discharge (ESD), coupling with external electromagnetic sources, and so on. Such protections are required as the oxide of the gate connected to the input can easily be destroyed by over-voltage. Electrostatic discharges may reach 1000 to 5000 volts, with a general shape similar to that shown in Fig. 7.13. The design of I/Os which can handle such high voltages and dissipate such high energy safely requires specific techniques, which are introduced in the following sections.

Fig. 7.13 Example of electrostatic discharge on the I/O pin of an IC

One of the simplest ESD protections is made up of one resistor and two diodes (Fig. 7.14). The resistor helps to dissipate the parasitic energy and reduces the amplitude of the voltage overstress. One diode (N+/Psubstrate) handles the negative voltage entering the circuit, and the other diode (P+/Nwell) handles the positive voltage. The combination of the series resistor and the diode bridge represents an acceptable protection circuit against transient voltage overstress around +/–50 V.

7.4.1 Resistor Design Two types of resistors are available for ESD protection: polysilicon and Nwell. The polysilicon resistor is completely embedded in oxide, and can therefore handle very high voltage stress. However, the polysilicon is thin and its current capability is limited. The default salicidation of the polysilicon layer must be removed to obtain a high resistivity. The design of an ESD resistor is different from usual


Fig. 7.14 Input protection circuit (IOPadIn.SCH)

polysilicon resistors, mainly because of the high current, requiring an enlarged layer-width and multiple contacts (Fig. 7.15). The usual ESD resistance value ranges between 50 and 2000 Ω. In MICROWIND, the polysilicon resistor is generated by fixing the target resistance value, and enlarging the width to several µm to handle strong currents.

Fig. 7.15 Polysilicon resistor used in an input protection circuit (ResPoly.MSK)

The major drawback of the polysilicon resistor is the poor heat dissipation. The oxide has a positive role in isolating the polysilicon conductor from the rest of the circuit for over-voltages expressed in kilovolts. However, the same oxide plays a negative role by thermally isolating the polysilicon material which would ultimately melt because of overheating. An alternative solution consists in using the Nwell as a resistor. In that case, N+ diffusion contacts are used for the electrical connection between the upper metal and the lower well area. The thermal conduction


is excellent but the voltage isolation is limited by the breakdown of the junction between the Nwell resistor and the Psubstrate (on the order of 10–20 V).

7.4.2 Diode Design Diodes are essential parts of the ESD protection. Used since the infancy of microelectronics, diodes are still widely used because of their efficiency and simplicity [1]. The native diodes in CMOS technology consist of an N+ diffusion in the Psubstrate and a P+ diffusion in the Nwell, as shown in Fig. 7.16.

Fig. 7.16 The simplest diodes in a CMOS process (Diode.MSK)
[Figure labels: N+ diffusion over Psubstrate (0 V); P+ diffusion over Nwell]

As the substrate is at 0 V, the Psubstrate/N+ diode is turned on only if the N+ voltage is significantly negative (below about –0.5 V). If the N+ voltage is higher than 0 V (which is the usual case), the diode has no effect, and can be considered as a parasitic capacitance. As the Nwell is usually connected to the high voltage VDDH, the P+/Nwell diode is turned on only if the P+ voltage is higher than the I/O supply voltage VDDH by about 0.5 V (VDDH + 0.5 V, that is around 3 V in 0.12 µm).
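The two turn-on conditions define a window of pad voltages in which neither diode conducts. A minimal sketch of that window, using the 0.5 V turn-on and 2.5 V supply from the text and a hypothetical function name, could look as follows.

```python
VDDH = 2.5         # I/O supply in 0.12 um CMOS (from the text)
V_DIODE_ON = 0.5   # approximate forward turn-on voltage used in the text

def active_diode(v_pad):
    """Name the protection diode that conducts for a given pad voltage, or None."""
    if v_pad < -V_DIODE_ON:
        return "N+/Psubstrate"
    if v_pad > VDDH + V_DIODE_ON:
        return "P+/Nwell"
    return None   # both diodes are off and act only as parasitic capacitances

for v in (-10.0, 0.0, 2.5, 10.0):
    print(v, active_diode(v))
```

Any normal logic level between 0 and VDDH falls inside the window, so the diodes stay off during regular operation.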


Fig. 7.17 The diode-generating menu in Microwind (by default a P+/Nwell diode)

The command used to generate a protection diode in Microwind is Edit → Generate → Diode. Click either the P+/Nwell diode or the N+/Psubstrate diode. By default, the diode is quite large, and connected to the upper metal by a row of 10 contacts. The N+ diode region is surrounded by a polarization ring made of P+ diffusion, as shown in Fig. 7.18. The large number of rows ensures a large current capability, which is very important in the case of ESD protection devices.

Fig. 7.18 Generating a default protection diode (IODiode.MSK)
[Figure labels: P+/Nwell diode; N+/Psubstrate diode; P+ and N+ polarization rings; details on diffusions]


A protection circuit example is given in Fig. 7.19. It consists of a 50 × 50 µm pad, a series resistor of around 200 Ω, and two diodes. When a large-amplitude sinusoidal waveform (+/–10 V), which corresponds to an electrical overstress, is injected, the diodes exhibit a clamping effect for both the positive and negative voltages. The best simulation mode is Voltage and Currents. The voltage scale may be changed using the arrows on the left side of the lower-voltage window. The internal voltage remains within the voltage range 0 to VDDH while the voltage near the pad is between –10 and +10 V. Notice that the current flowing in the diodes is around 1 mA (Fig. 7.20).

Fig. 7.19 A test-case to evaluate the role of diodes (IoPadIN.MSK)

Considering a real-case electrostatic discharge, the voltage may rise to 1000 V–5000 V, which corresponds to a diode current more than 100 times larger, that is 100–500 mA. Around 100 contacts would be needed for minimum reliability. In industrial ESD protections, the diode length is approximately 50 µm. Notice that the lateral surface of the diode is more important than its area, as the current flows mainly horizontally. The design shown in Fig. 7.21(b) should be preferred to the one of Fig. 7.21(a) because the lateral surface is larger, meaning a better current efficiency.
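The orders of magnitude quoted above follow from a linear extrapolation of the simulated current. The sketch below reproduces that estimate; the per-contact limit of 5 mA is an assumption borrowed from the via figure of Section 7.3.4, and the names are hypothetical.

```python
import math

I_SIM_MA = 1.0     # diode current seen in the +/-10 V simulation (from the text)
V_SIM = 10.0       # overstress amplitude used in that simulation
CONTACT_MA = 5.0   # ASSUMPTION: per-contact limit, taken equal to the 5 mA via figure

def esd_current_ma(v_esd):
    """Linearly scale the simulated diode current to an ESD voltage (crude estimate)."""
    return I_SIM_MA * v_esd / V_SIM

for v_esd in (1000.0, 5000.0):
    i_ma = esd_current_ma(v_esd)
    contacts = math.ceil(i_ma / CONTACT_MA)
    print(f"{v_esd:.0f} V -> about {i_ma:.0f} mA, about {contacts} contacts")
```

With this assumed per-contact limit, the upper end of the range lands on the figure of around 100 contacts mentioned in the text. A real ESD event is far from linear, so the result is only an order-of-magnitude guide.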


Fig. 7.20 The diodes clamp the positive and negative overstress so that the internal voltage stays close to the voltage range 0 to VDDH (IoPadIN.MSK)

Fig. 7.21 Lateral surface of the diode should be maximized for better efficiency (IODiodes2.MSK)


7.4.3 Clamp MOS Devices An interesting device for electrostatic discharge protection, called the gate-coupled NMOS, is described in [1]. The schematic diagram of the circuit is shown in Fig. 7.22. It consists of two stages of protection. The first one handles the majority of the current, and the second one assists the first stage with relaxed stress constraints. Such protections can handle 5–7 kV ESD stress, after several design iterations and experimental testing. The C1-R1 circuit is a high-pass filter. By default the voltage of node Ng is zero, due to the weak tie to VSS. A very sharp over-voltage such as the one created by an electrostatic discharge induces a positive voltage on node Ng, which turns on the clamp MOS by capacitive coupling. A current path is established between the pad and the ground, until the voltage of Ng goes below the threshold voltage. Finally, the clamp is turned off and the ESD charges are eliminated. Note that a nominal rising edge from zero to VDD should not turn on the clamp. Therefore, C1 and R1 must be chosen to eliminate the ESD pulse while keeping quiet in the presence of logic edges.
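The trade-off on C1 and R1 can be explored with an idealized model: a linear voltage ramp applied to a series capacitor C1 with R1 tied to VSS, ignoring the gate capacitance of the clamp. For such a network the voltage on Ng at the end of the ramp is V·(τ/tr)·(1 − exp(−tr/τ)), with τ = R1·C1. All component values, edge times and the threshold below are assumptions chosen for illustration; none come from the text.

```python
import math

def peak_gate_voltage(v_step, t_rise, r1, c1):
    """Peak voltage on node Ng for a linear ramp of v_step volts in t_rise seconds,
    applied to an ideal series-C1, shunt-R1 high-pass network (gate loading ignored)."""
    tau = r1 * c1
    return v_step * (tau / t_rise) * (1.0 - math.exp(-t_rise / tau))

# ASSUMED values for illustration only.
R1, C1 = 10e3, 10e-15   # 10 kOhm and 10 fF, giving tau = 0.1 ns
VT_CLAMP = 0.5          # assumed threshold voltage of the clamp MOS, in volts

logic_peak = peak_gate_voltage(2.5, 1e-9, R1, C1)     # 0 to 2.5 V logic edge in 1 ns
esd_peak = peak_gate_voltage(1000.0, 10e-9, R1, C1)   # 1000 V ESD front in 10 ns

print(f"logic edge: Ng peaks at {logic_peak:.2f} V, clamp on: {logic_peak > VT_CLAMP}")
print(f"ESD pulse:  Ng peaks at {esd_peak:.2f} V, clamp on: {esd_peak > VT_CLAMP}")
```

With these assumed values the logic edge leaves Ng below the threshold while the ESD front drives it well above, which is the selectivity the paragraph asks of C1 and R1.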

Fig. 7.22 The ground-connected MOS turns on and shorts the over-voltage to ground at a very sharp rising edge of the input voltage, such as in an ESD pulse (IOPadIn.SCH)

A lot of subtle layout issues arise with the implementation of high-performance electrostatic discharge protections, as described in [4]. The clamp MOS is a good example of specific layout techniques for optimized behavior when faced with overstress. The key idea is to route the parasitic current flow straight from the input pad to the ground. A double-oxide MOS device (see next section) is used to handle strong voltage stress. The diffusion connected to the pad is enlarged to create a small serial resistance which is used as a ballast (Fig. 7.23). The salicidation of the drain and source is generally removed to increase this ballasting effect.


Fig. 7.23 Specific layout of the gate-grounded MOS used in advanced input pads (ggMos.MSK)

7.4.4 Zener Diode The Zener diode is equivalent to a normal diode, but behaves differently in reverse mode, as it turns on for a very negative VPN voltage, that is, a very high padIn voltage. The characteristics of the Zener diode are shown in Fig. 7.24. For positive VPN, the diode is in direct mode; for negative VPN, the diode is off. However, when VPN is strongly negative (less than VZ), the diode is turned on again, with a so-called Zener effect. The diode layout is also shown in Fig. 7.24. An option layer configured to extract the diode surrounds the diode area (dotted rectangle in the layout). The diode model parameters are derived from the BSIM4 model, and are not accessible to the user. In contrast, the surface of the diode has a direct impact on its characteristics. Turning back to the input protection circuit, a significant increase in the pad voltage corresponds to a negative increase in VPN. When passing the VZ limit, the Zener diode is turned on and the charges start to flow through the substrate to the ground. The simulation of the Zener diode as a protection circuit is proposed in the schematic diagrams of Fig. 7.25. The simulation setup proposed on the left is incorrect as the direct connection of the voltage source to the Zener diode also forces the output voltage so that no clamp effect can be seen. In contrast,


Fig. 7.24 The Zener diode (IOPadZener.MSK)

Fig. 7.25 The Zener effect can be seen for positive over-stress (IOPadZener.MSK)


the serial resistor Rdif in the right-side figure creates the required impedance between the voltage source and the output to enable the observation of the Zener effect. The diode model used in MICROWIND includes the Zener effect if the simulation is performed in BSIM4 mode. In the simulation of Fig. 7.25, the large positive voltage provokes the necessary conditions for a negative VPN and consequently a Zener clamp. The diode in direct mode is observed for negative input values, which corresponds to positive VPN. An input protection circuit which combines the Zener diode as a primary protection circuit and the ground-connected MOS as a secondary protection circuit is proposed in Fig. 7.26. Such structures may handle severe ESD stress, as well as other parasitic transient pulses found in industrial applications.

Fig. 7.26 An input pad protected with Zener diode and diffusion resistor (IOPadIn.SCH)

7.4.5 High-voltage MOS The general diagram of an input structure is given in Fig. 7.27. A high-voltage buffer is used to handle voltage overstress issued from electrostatic discharges. The logic signal is then converted into a low-voltage signal to be used in the core logic. For interfacing with I/O, specific MOS devices, called high-voltage MOS, are introduced. They use a double gate-oxide to handle the high voltage of the I/Os. The thin oxide used for internal logic devices would be damaged by the high I/O voltage. In DSCH, the high-voltage devices are drawn with a double line. The symbol Vdd_HV represents the I/O voltage, which is usually 2.5 V in 0.12 µm CMOS. The high-voltage MOS layout differs slightly from the normal MOS, as shown in the comparative layout view of Fig. 7.28. The high-voltage MOS uses a gate-width which is much larger than that of the regular MOS. Usually, the lateral drain diffusion (LDD), which aims at limiting the hot-carrier effect and boosting the device lifetime, is removed in high-voltage MOS devices. It has been shown that lateral drain diffusion degrades the ESD protection performance [4]. One reason is the lower efficiency of LDD devices in enabling strong currents to flow in the channel (Fig. 7.29). Consequently, LDD devices are slower to evacuate the parasitic energy. The gate oxide is roughly twice as thick as that of the core logic (Fig. 7.31). In 0.12 µm, the gate oxide of the high-voltage MOS is around 5 nm, while that of the core MOS is 2 nm.


Fig. 7.27 Basic principles of an input circuit, including ESD protection and voltage translator (IOPadIn.SCH)

Fig. 7.28 High-voltage MOS device versus Normal MOS (MOSHighVoltage.MSK)

A bird's-eye view of the layout (Fig. 7.31) reveals that the polysilicon gate does not have the usual two-lambda length. In the case of high-voltage MOS devices, the minimum length is four lambda. The two-lambda sizing is not compatible with the double gate-oxide and the high-voltage operation. The gate oxide is about twice as thick as that in the low-voltage MOS. The high-voltage device performance corresponds approximately to that of a 0.25 µm MOS device. To turn a normal MOS into a high-voltage MOS, the designer


Fig. 7.29 Static characteristics of high-voltage MOS (MOSHighVoltage.MSK)

Fig. 7.30 The particulars of MOS devices used in input pads: removed LDD and double gate-oxide (IOPadMos.MSK)

must add an option layer (the dotted rectangle in Fig. 7.31). Selecting the option High voltage MOS (Fig. 7.32) assigns high-voltage properties to the device: double oxide, removed LDD, different rules for minimum length, and different MOS model parameters.


Fig. 7.31 Layout of the input MOS device (IOPadMos.MSK)

Fig. 7.32 Handling high-voltage property through the option layer (IOPadMOS.MSK)


The simulation of the complete input pad is proposed in the schematic diagram of Fig. 7.33. A slow sinusoidal waveform DataIn (10 MHz) is generated between 0 and 2.5 V, with an additive noise. The noise is a random number mainly concentrated between –1 and +1 V, with a Gaussian distribution. The noise contains virtually all frequencies, spread uniformly from very low to very high frequencies. The noisy input passes through the series polysilicon resistor of around 330 Ω, then through the two diodes, and serves as the input for the high-voltage inverter. The output SinHV is connected to the low-voltage inverter.

Fig. 7.33 Simulation of an input structure with voltage translation from high to low voltage (IOPadIn.SCH)

Click Add Noise in the sinusoidal parameter window to activate the noise generation in addition to the desired signal. The Gaussian noise model gives a good approximation of ambient noise (Fig. 7.34). The RMS amplitude is derived from an evaluation of the Root-Mean-Square of the noise sampled data noise[n]. The RMS voltage is given by Eq. 7.1. It gives a good indication of the envelope of the noise, as the exact amplitude may not be determined due to the random nature of the signal.

Fig. 7.34 Adding random Gaussian noise to the sinusoidal voltage source


$$V_{rms} = \sqrt{\frac{\sum_{i=0}^{n} (noise[i])^{2}}{n}} \qquad \text{(Eq. 7.1)}$$
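As an illustration of Eq. 7.1, the short sketch below evaluates the RMS value of a set of Gaussian noise samples. The standard deviation, the sample count and the seed are assumptions made for the example, and the sum is simply divided by the number of samples supplied.

```python
import math
import random

def rms(samples):
    """Root-mean-square value of a list of noise samples (Eq. 7.1)."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))

random.seed(0)   # fixed seed so that the run is repeatable
SIGMA = 0.5      # ASSUMED standard deviation in volts, for illustration
noise = [random.gauss(0.0, SIGMA) for _ in range(10000)]

print(f"Vrms = {rms(noise):.3f} V")   # close to SIGMA for zero-mean noise
print(f"largest excursion = {max(abs(x) for x in noise):.2f} V")
```

For zero-mean Gaussian noise the RMS value approaches the standard deviation, while individual samples can stray several times further, which is why the RMS figure describes the envelope rather than an exact amplitude.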

The simulation shown in Fig. 7.35 gives an interesting insight into the signal propagation within the input pad. First, the noisy sinusoidal waveform is significantly filtered by the series resistance and the parasitic diode capacitance. The noise amplitude of signal Sin is greatly reduced. However, due to the slow rise and fall of the input signal, there is a risk of parasitic glitches. The signal Sin_HV gives a logic translation of the input voltage, which is converted into a low-swing signal Inv_buff by the low-voltage inverter.

Fig. 7.35 Simulation of a noisy input and response of the input buffer (IOPadInInv.MSK)

7.4.6 Input Pad with Schmitt Trigger Using a Schmitt trigger instead of an inverter helps to transform a very noisy input signal into a clean logic signal. The Schmitt trigger circuit switches at different thresholds, in order to increase the noise margin of the input buffer. The main difference between the inverter and the Schmitt trigger appears in the simulation shown in Fig. 7.36. While the inverter may transform a noisy input into several glitches at the output near the commutation point of the inverter, the Schmitt trigger produces a single commutation.


Fig. 7.36 The filtering effect of Schmitt trigger with a noisy input signal (TriggerCompInv.MSK)

The schematic diagram of the trigger is proposed in Fig. 7.37 [3]. A brilliant idea lies behind this circuit: the commutation point is modified by feedback MOS devices. The pMOS feedback device adds a path to ground when Trigger_Out is low. Consequently, the threshold voltage is lowered to a commutation point Vc_low, lower than the commutation point VC of a regular inverter. The nMOS feedback device adds a path to VDD_HV when Trigger_Out is high. Consequently, the threshold voltage is raised to Vc_high, higher than the commutation point VC. The layout of the trigger is shown in Fig. 7.38. The feedback MOS devices are situated on the right of the trigger core. An inverter is added for comparison. The most demonstrative simulation is probably the comparison of the static characteristics of the inverter and the trigger (Fig. 7.39). The static simulation is available in Voltage vs. voltage mode. First, the X-scale must be adjusted to cover zero to VDD_HV. Second, the hysteresis mode must be activated: at each simulation, the input signal is either decreased or increased. Finally, the trigger characteristics may be added to those of the inverter by changing the selected output.
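The effect of the two commutation points can be reproduced with a purely behavioral model, independent of the transistor-level circuit. In the sketch below the threshold values, the test signal and the function names are all assumptions chosen for illustration.

```python
def count_transitions(bits):
    """Number of changes between consecutive logic values."""
    return sum(1 for a, b in zip(bits, bits[1:]) if a != b)

def inverter(samples, vc):
    """Ideal inverter: the output is high whenever the input is below the threshold vc."""
    return [v < vc for v in samples]

def schmitt_inverter(samples, vc_low, vc_high):
    """Ideal inverting Schmitt trigger: the threshold is vc_high while the output is high,
    and vc_low while the output is low."""
    out, state = [], True   # the input starts low, so the output starts high
    for v in samples:
        if state and v > vc_high:
            state = False
        elif not state and v < vc_low:
            state = True
        out.append(state)
    return out

# ASSUMED test signal: slow 0 to 2.5 V ramp with a +/-0.2 V alternating disturbance.
signal = [2.5 * k / 100 + (0.2 if k % 2 == 0 else -0.2) for k in range(101)]

print(count_transitions(inverter(signal, 1.25)))               # several glitches
print(count_transitions(schmitt_inverter(signal, 0.9, 1.6)))   # a single commutation
```

As long as the disturbance is smaller than the gap between Vc_low and Vc_high, the trigger output changes only once per input edge.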

7.5 Digital Output Structures The role of the output buffer is to ensure that the signal coming out of the IC (IC1 in Fig. 7.40) is propagated safely to the receiver which is usually the input of a second IC (IC2). The emitter signal comes from a low-voltage inverter Inv_out1. In 0.12 µm, the voltage range is zero to 1.2 V. Most I/O interfaces operate at high voltage (2.5 or 3.3 V) for compatibility and speed reasons, as well as robustness to parasitic interference. The signal is transformed into a high-voltage signal through the inverter Inv_Out2


Fig. 7.37 Schematic diagram of the trigger (Trigger.SCH)

[zero to 2.5 V], which is directly connected to the pad and to the outside world. The signal goes through the package, the printed circuit board, and the other package, a path which is represented by a transmission line. At the far end of the transmission line, we find the input structure of the receiving IC, IC2, consisting of an inverter Inv_In1, working at high voltage, and of Inv_In2, working at low voltage.


Fig. 7.38 Layout of the trigger and inverter, using high-voltage MOS devices (TriggerCompInv.MSK)

Fig. 7.39 Static characteristics of the trigger compared to the inverter (Trigger.MSK)


Fig. 7.40 The signal propagation between ICs, IC1 and IC2

7.5.1 Output Buffer The schematic diagram of the basic output buffer is given in Fig. 7.41. A very simple structure is used to protect the output buffer from electrostatic discharge, and more generally from any over-voltage or under-voltage. A Zener diode may be used, or a set of two diodes, as for the input pad.

Fig. 7.41 The output buffer design including protection against electrostatic discharge (IOPadOut.SCH)

7.5.2 Level Shifter The role of the level shifter is to translate the low-voltage logic signal Data_Out into a high-voltage logic signal which controls the buffer devices. An immediate solution would consist of using a high-voltage inverter as a level shifter. The signal Data_Out has a 1.2 V voltage amplitude, as shown in Fig. 7.42. In the simulation shown in Fig. 7.43, the output signal PadOut is almost correct, except that the low level is not exactly zero. This is due to the input voltage being limited to 1.2 V. In the inverter characteristics


Fig. 7.42 The inverter used as a level shifter (LevelShiftBad.MSK)

(Fig. 7.43 left), the input voltage 1.2 V is not sufficient to obtain a ‘good’ zero on the output. A notable consequence of this incomplete switching is the large DC dissipation when DataOut is at a high level, in the range of 200 microwatts. Since this parasitic consumption appears whenever the output is at a low logic level, the sum of DC currents in the case of a 1000-pin IC would approach a fraction of an ampere, which is not acceptable.
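The order of magnitude can be verified with a short calculation. The sketch assumes that the 200 µW is drawn from the 2.5 V I/O supply and that every pad is simultaneously in the dissipating state, which is a worst case; the variable names are hypothetical.

```python
P_DC_PER_PAD_W = 200e-6   # DC dissipation per pad, order of magnitude from the text
VDDH = 2.5                # I/O supply in 0.12 um CMOS (from the text)
N_PADS = 1000             # pin count used in the text's example

i_per_pad = P_DC_PER_PAD_W / VDDH   # amperes drawn by one partially switched inverter
i_total = i_per_pad * N_PADS        # worst case: every pad in the dissipating state

print(f"per pad: {i_per_pad * 1e6:.0f} uA, total: {i_total * 1e3:.0f} mA")
```

The total is a static current of several tens of milliamperes that flows even when no signal is switching.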

Fig. 7.43 Analog simulation of the inverter showing the parasitic DC consumption (LevelShiftBad.MSK)


Figure 7.44 gives the schematic diagram of a level shifter circuit which solves the problem of parasitic DC power dissipation. The circuit consists of a low-voltage inverter, the level shifter itself, and the buffer. The circuit has two power supplies: a low-voltage VDD for the left-most inverter, and a high-voltage VddHV for the rest of the circuit.

Fig. 7.44 Schematic diagram of a level shifter (IOPadOut.SCH)

The layout of the level shifter is shown in Fig. 7.45. The left part works at a low voltage of 1.2 V, and the right part works with high-voltage MOS devices, at a supply of 2.5 V (VddHigh). The data signal Data_Out has a 0–1.2 V voltage swing. The output Vout has a 0–2.5 V voltage swing. In this case, no DC consumption appears except during transitions of the logic signals, as shown in the simulation of Fig. 7.45.

Fig. 7.45 Layout and simulation of the level shifter (LevelShift.MSK)
[Figure labels: VddHigh, VddLow, Data_out, Do_Inv, S1, Vout, Vss; the level shifter translates a low-voltage clock into a high-voltage clock]

7.5.3 Output MOS Devices The role of the output buffer is to amplify the logic signal generated by the level shifter in order to switch at the appropriate speed, which depends on the target application. Usually, the buffer stage is built from several MOS devices in parallel, in order to achieve maximum Ion current of 2, 4, 8 or 16 mA.

Fig. 7.46 Schematic diagram of the buffer stage (IOPadOut.SCH)


The buffer stage is connected to the output of the high-voltage inverter, called Out_HV in Fig. 7.46. Usually, several MOS devices are connected in parallel to achieve a large current flow in order to efficiently drive the large capacitance of the output node. The MOS devices with parallel fingers can be generated by MICROWIND, using the MOS generator shown in Fig. 7.47. Assuming that the target current is 4 mA, what we need are two fingers, the high-voltage MOS option, the minimum length of 0.24 µm for this type of device, and a width adjusted to 2.5 µm. Notice that MICROWIND also generates a set of polarization contacts, appearing on the left-side of the MOS device, for a good connection to ground. The maximum current of typical MOS devices (high-voltage option) is listed in Table 7.2.

Fig. 7.47 The MOS device generated for output pads is high-voltage, has multiple fingers and a large width to produce the desired maximum current

Table 7.2 Maximum current of basic nMOS and pMOS I/O devices (0.12 µm CMOS)

MOS width   MOS length   ION_n (mA)   ION_p (mA)
1.2 µm      0.24 µm      0.85         0.65
2 µm        0.24 µm      1.3          1.0
10 µm       0.24 µm      7.0          5.0
2 µm        0.5 µm       1.1          0.6
10 µm       0.5 µm       5.5          3.0


Fig. 7.48 Generating multiple-finger MOS devices for high current generation (IOPadMos.MSK)

The usual current drive of an output pad is 4 mA. The nMOS and pMOS devices that can switch 4 mA are shown in Fig. 7.48. A high current drive is mandatory to ensure rapid charge and discharge of the parasitic output node capacitance.

7.5.4 Output Buffer Simulation Let us assume that the output pad structure drives a 5 pF load, which is quite low. The DataOut signal is shifted again by the level shifter to a zero to 2.5 V signal, and serves as a command for the MOS devices in parallel, which ensure the charge and discharge of the load (Fig. 7.49). The virtual capacitor of 5 pF is added to the layout using the capacitor icon in the palette. The simulation reported in Fig. 7.50 shows a charge and discharge of this capacitor within around 3 ns, which is sufficient in the case of medium-speed applications. In the case of high-speed signal transport, as in a memory bus, the current drive must be increased.

7.5.5 Output MOS Protection to ESD To improve the robustness of the output structure to electrostatic discharge and voltage overstress, the MOS layout can be improved. Firstly, the salicidation of the drain should be removed to increase the sheet resistance from the channel to the output pad and to enhance the ballasting resistance effect, which has a positive influence on ESD protection performance. The ballasting resistance is mainly used to dissipate over-voltage. The salicidation of the gate and source has no impact on ESD protection. The only region that should be resistive is the drain diffusion area connected to the pad. A specific option


Fig. 7.49 The schematic diagram of the output pad with the buffer stage, using MOS devices in parallel (LevelShiftBuff.MSK)

layer is added to the MOS design (see Fig. 7.51) to increase the series resistance, which protects the output MOS device from parasitic stress. The default salicidation reduces the series drain resistance from the pad contact to the gate by a factor of 10, which significantly degrades the protection. An option layer is added to the layout


Fig. 7.50 The analog simulation of the output pad loaded with 5 pF (LevelShiftBuff.MSK)

Fig. 7.51 Cross section of the MOS devices used in I/O pads (IOPadMos.MSK)

covering the ballasting region, with the property Remove Salicide activated; this blocks the titanium salicidation and keeps the parasitic series resistance intact (Fig. 7.52).

Fig. 7.52 MOS devices used in I/O pads: no LDD, double gate-oxide, and a ballast region (IOPadMos.MSK)

7.5.6 Three-state and Programmable Drive Buffer

The programmable drive buffer is used to adapt the current drive to the load, through a programming logic circuit. The circuit shown in Fig. 7.53 is capable of switching the output signal Pad_Out with a 2 mA, 4 mA or 6 mA current. The nMOS and pMOS drivers are controlled independently. If all Enable signals are inactive (Fig. 7.54, top-left), all switches are off and the pad is in the high-impedance state. This is convenient to realize a three-state buffer. If Enable_2mA is asserted, the 2 mA buffer is activated (top-right and bottom-left). When both Enable_2mA and Enable_4mA are active, both buffers work in parallel, adding their currents to a total of 6 mA. These specifications are summarized in Table 7.3.

Fig. 7.53 The programmable drive buffer (IOOutProgDrive.SCH)

Fig. 7.54 The simulation of the programmable drive buffer (IOOutProgDrive.SCH)

Table 7.3 The output current depends on the enable signals (IOOutProgDrive.SCH)

Enable_4mA   Enable_2mA   PadOut current
0            0            none (3-state)
0            1            2 mA
1            0            4 mA
1            1            6 mA
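The behaviour summarized in Table 7.3 can be sketched in a few lines, together with a first-order estimate of the time needed to charge the 5 pF load of Section 7.5.4 over a 2.5 V swing. The estimate assumes a constant drive current (t = C × ΔV / I), which is a simplification of the real MOS behaviour, and the function names are illustrative.

```python
def pad_drive_ma(enable_4ma, enable_2ma):
    """Total drive current (mA) for the two enable bits; 0 means three-state."""
    return 4.0 * bool(enable_4ma) + 2.0 * bool(enable_2ma)


def charge_time_ns(drive_ma, load_pf=5.0, swing_v=2.5):
    """Constant-current estimate t = C * dV / I (pF * V / mA gives ns)."""
    if drive_ma == 0:
        return None  # high-impedance state: the pad does not drive the load
    return load_pf * swing_v / drive_ma


for en4 in (0, 1):
    for en2 in (0, 1):
        drive = pad_drive_ma(en4, en2)
        t = charge_time_ns(drive)
        label = "three-state" if t is None else f"{drive:.0f} mA, about {t:.2f} ns"
        print(f"Enable_4mA={en4} Enable_2mA={en2}: {label}")
```

With the 4 mA setting this simple estimate gives about 3.1 ns, which is consistent with the switching time of around 3 ns reported for Fig. 7.50.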

7.6 Pull-up, Pull-down

It can be useful to add a weak tie to a defined voltage, particularly in the case of shared data buses. The role of the pull-up resistor shown in the schematic design of Fig. 7.55 is to pull


Fig. 7.55 The three-state output pad with a 10 kΩ pull-up (IOPadOut.SCH)

the output node to VDD_HV when the connection is floating. Using a MOS device as a 10 kΩ switched resistance is efficient in terms of silicon area. The only drawback of this pull-up resistance is that the active logic level ‘0’ is a little higher than 0.0 V because of the current flowing from VDD_HV, which leads to a non-negligible DC current consumption. Similarly, a high-resistance nMOS device may create a weak tie to ground, which is equivalent to a pull-down device.

7.6.1 I/O Pad

The I/O pad structure is a combination of input and output pad structures. The I/O pad contains one input stage together with one output stage, usually with extensive programmable functionality. In Fig. 7.56, the output stage can be turned off when the signal Out_Enable is inactive. The pull-up and pull-down devices can be activated through the PullUp_Enable and PullDown_Enable signals. More complex I/O pads may include programmable drives, as described previously.
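Returning to the pull-up of Section 7.6, the DC penalty can be estimated with a simple resistive divider. The sketch below assumes VDD_HV = 2.5 V and models the conducting nMOS driver as a fixed on-resistance of 625 Ω (a crude value derived from 2.5 V / 4 mA); both figures are assumptions made for illustration and not data from the text.

```python
def pullup_low_state(vdd_hv=2.5, r_pullup=10e3, r_on=625.0):
    """Return (low-level voltage in V, static current in A) when the pad drives '0'.

    The pull-up and the driver on-resistance form a divider between
    VDD_HV and ground. r_on is an assumed, illustrative value.
    """
    i_dc = vdd_hv / (r_pullup + r_on)
    v_low = i_dc * r_on
    return v_low, i_dc


v_low, i_dc = pullup_low_state()
print(f"Low level: {v_low * 1e3:.0f} mV, static current: {i_dc * 1e6:.0f} uA")
```

With these assumed values the low level sits roughly 150 mV above ground and the pad draws about a quarter of a milliampere as long as the output stays at ‘0’.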

Fig. 7.56 Design of an I/O pad (IOPad.MSK)


7.7 Low Voltage Differential Swing

The main speed limitation in signal propagation is the time required to reach the logic state “1” or “0”. Working at a low supply voltage reduces the voltage swing and thus decreases the time required for the signal to reach its final logic level. Unfortunately, decreasing the logic cell supply also reduces the Ion current of the MOS devices which drive the output line. The differential swing logic circuit [7] uses two information signals, vin and ~vin, instead of one, where ~vin represents the logical complement of vin (Fig. 7.57). The receiver works in differential mode, with signals that have a low swing amplitude.

Fig. 7.57 Low-voltage differential swing logic to improve signal transport

Low-voltage differential swing (LVDS) circuits operate at much higher frequencies than conventional CMOS drivers. However, LVDS circuits dissipate a significant amount of DC current, even when there is no switching activity. In the simulation of Fig. 7.58, a biasing current of 100 µA appears in the upper window. Combined with the fact that LVDS circuits require two interconnects instead of one, differential circuits are limited to very high speed functions such as fast data buses. From a layout design point of view (Fig. 7.59), the sizing of nMOS and pMOS devices has a strong influence on the LVDS buffer operation. The specification of the LVDS signal has a direct impact on the width ratio between nMOS and pMOS devices. In the case of a voltage swing of 400 mV, from 0.4 V to 0.8 V, the pMOS should be significantly smaller than the nMOS to ensure reliable working conditions.


Fig. 7.58 Low voltage differential swing simulation (Lvds.MSK)

Fig. 7.59 Low-voltage differential swing layout (Lvds.MSK)


7.8 Power Clamp

The power clamp is an efficient circuit that protects the logic core from oxide destruction caused by an electrostatic discharge appearing on the supply line. A simple circuit for this power clamp is proposed in Fig. 7.60; it has clear similarities with the grounded-gate MOS placed in input pads. The nMOS clamp is normally off, as R1 ties the gate to zero voltage and C1 is charged. When an ESD pulse appears on the VDD supply, the sharp rise of VDD induces a positive voltage peak on Vg by capacitive coupling through C1. Consequently, the MOS clamp is turned on and the over-voltage is limited. The MOS device width and the values of R1 and C1 are optimized to keep the core supply over-voltage below the gate-oxide breakdown, without being sensitive to small VDD fluctuations. In the 0.12 µm technology, the oxide breakdown of the 2.5 V (high-voltage) devices is around 6 V, but decreases to 3.0 V for the core 0.12 µm devices.

Fig. 7.60 Design of a power clamp (PowerClamp.SCH)
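The selectivity of the clamp between a fast ESD edge and a slow supply ramp can be illustrated with the response of an ideal C1-R1 high-pass network to a linear VDD ramp. The sketch below ignores the gate capacitance of the clamp and the clamping action itself, and the values R1 = 10 kΩ and C1 = 1 pF are assumptions chosen for illustration, not values taken from Fig. 7.60.

```python
import math


def gate_peak_voltage(v_step, t_rise, r1, c1):
    """Peak gate voltage of an ideal C1-R1 high-pass driven by a linear ramp.

    For a ramp of slope k = v_step / t_rise, the gate voltage follows
    Vg(t) = k * tau * (1 - exp(-t / tau)) with tau = R1 * C1, and it
    reaches its maximum at the end of the ramp (t = t_rise).
    """
    tau = r1 * c1
    return tau * (v_step / t_rise) * (1.0 - math.exp(-t_rise / tau))


R1, C1 = 10e3, 1e-12  # assumed values, giving tau = 10 ns

esd_peak = gate_peak_voltage(v_step=6.0, t_rise=1e-9, r1=R1, c1=C1)
power_up_peak = gate_peak_voltage(v_step=1.2, t_rise=1e-3, r1=R1, c1=C1)

print(f"6 V edge in 1 ns   -> Vg peak = {esd_peak:.2f} V")
print(f"1.2 V ramp in 1 ms -> Vg peak = {power_up_peak * 1e6:.1f} uV")
```

With these assumed values the fast edge couples almost entirely onto the gate and turns the clamp on, while a normal power-up leaves the gate within microvolts of ground.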

7.9 Core/Pad Limitation

When the active area of the chip is the main limiting factor, the pad structure may be designed in such a way that the width is large but the height is as small as possible. In that case, the excess size due to the pads is minimized. Protections are placed on both sides of the pad area. This situation is often called ‘Core Limited’, and corresponds to the design shown in Fig. 7.61. In most pad libraries, the core limited structures have a minimum height, because the protection circuits are placed on both sides of the pad.

When the number of pads of the chip is the main limiting factor, the situation is called ‘Pad Limited’, and corresponds to the design shown in Fig. 7.62. The pad structure may be designed in such a way that the width is small but the height is large. In that case, the excess size due to the pads is minimized. Protections are placed under the pad area.


Fig. 7.61 Chip-size fixed by the core

Fig. 7.62 Chip-size fixed by the number of pads

The wasted silicon area may be reduced by using a double ring of I/O pads, as illustrated in Fig. 7.63. This attractive feature has been available since the 0.25 µm technology. The example of Fig. 7.63 corresponds to a CMOS 0.18 µm test-chip with a double pad ring, fabricated by ST-Microelectronics for research purposes [8]. The pad pitch is significantly reduced thanks to the


double-row of bonding pads. The pad pitch for a single row is the sum of the minimum pad-width Rp01 and of the pad-distance Rp02. In the double-ring structure, the pad-pitch is divided by a factor of two.
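The pitch rule can be turned into a rough count of the pads that fit along one side of the chip. In the sketch below, the values used for Rp01 (80 µm) and Rp02 (20 µm) and the 2 mm chip side are hypothetical, since the design-rule values are not given here, and corner keep-out areas are ignored.

```python
def pads_per_side(side_um, rp01_um, rp02_um, rows=1):
    """Number of pads along one chip side.

    The single-row pitch is Rp01 + Rp02; with `rows` staggered rows the
    effective pitch is divided by `rows`, as stated for the double ring.
    """
    pitch_um = (rp01_um + rp02_um) / rows
    return int(side_um // pitch_um)


single = pads_per_side(2000, rp01_um=80, rp02_um=20, rows=1)
double = pads_per_side(2000, rp01_um=80, rp02_um=20, rows=2)
print(f"Single ring: {single} pads per side, double ring: {double} pads per side")
```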

Fig. 7.63 An example of a double-ring test-chip in 0.18 µm technology [8]

Three rows of bonding pads can be used in some circuits, such as state-of-the-art processors and microcontrollers. Future trends may include a matrix of bonding balls over the whole surface of the chip. This technique, called chip-scale packaging, is already in use for some high-performance ICs.

7.10 I/O Pad Description Using IBIS

IBIS is a standard for electronic behavioral specifications of IC input/output analog characteristics. In order to provide an industry-standard method to transport IBIS modeling data electronically between semiconductor vendors, simulation vendors, and end customers, a format has been proposed by the IBIS group. Version 3.2 of IBIS was finalized as an international standard by a wide group of industry experts representing various companies and interests. A complete archive of slides and meeting notes for the latest IBIS open forum is available on the IBIS web site [5].

Fig. 7.64 Controlling the I/O pin assignment by an IBIS description file


MICROWIND uses IBIS to drive the generation of I/O pads when compiling a VERILOG file (Fig. 7.64). Click the Load button in front of the check box Fixed I/Os, in the VERILOG menu. The default IBIS file is default.IBS. The screen shown in Fig. 7.65 appears.

Fig. 7.65 The IBIS description file loaded for controlling the pin assignment

It can be seen that an IBIS file is a text file, with a simple structure based on keywords. We only use a very reduced set of the available keywords, listed in Table 7.4.

Table 7.4 The IBIS keywords understood by MICROWIND

Keyword          Description
[IBIS Ver]       Specifies the IBIS template version. This keyword informs electronic parsers of the kinds of data types that are present in the file.
[File Rev]       Tracks the revision level of a particular .ibs file. Revision level is set at the discretion of the engineer defining the file.
[Component]      Marks the beginning of the IBIS description of the integrated circuit named after the keyword.
[Manufacturer]   Specifies the manufacturer’s name of the component. Each manufacturer must use a consistent name in all .ibs files.
[Package]        Defines a range of values for the default packaging resistance, inductance, and capacitance of the component pins. Sub-parameters are named R_pkg, L_pkg, C_pkg.
[Pin]            Associates the component’s I/O models to its various external pin names and signal names. Each line must contain either three or six columns. A pin line with three columns associates the pin’s signal and model. Six columns can be used to override the default package values. In that case headers R_pin, L_pin, and C_pin appear.
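To make the keyword structure of Table 7.4 concrete, the sketch below parses a small IBIS-like text and checks the three-or-six column rule of the [Pin] section. The sample file content (component name, pin names, model names and numerical values) is entirely hypothetical and is not the content of default.IBS; the parser is a minimal illustration rather than a complete IBIS reader.

```python
SAMPLE_IBS = """\
[IBIS Ver]      3.2
[File Rev]      1.0
[Component]     demo_chip
[Manufacturer]  demo_vendor
[Package]
R_pkg  0.2  0.1  0.3
L_pkg  5nH  3nH  7nH
C_pkg  1pF  0.5pF  2pF
[Pin]  signal_name  model_name  R_pin  L_pin  C_pin
1      clk          input_pad
2      data_out     output_pad  0.25   6nH    1.2pF
"""


def parse_ibis(text):
    """Split the text into {keyword: {"arg": str, "rows": [[tokens], ...]}}."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.split("|", 1)[0].strip()  # '|' starts a comment in IBIS
        if not line:
            continue
        if line.startswith("["):
            name, _, rest = line[1:].partition("]")
            current = name.strip()
            sections[current] = {"arg": rest.strip(), "rows": []}
        elif current is not None:
            sections[current]["rows"].append(line.split())
    return sections


def pin_table(sections):
    """Return one dict per pin, enforcing the three-or-six column rule."""
    pins = []
    for row in sections["Pin"]["rows"]:
        if len(row) not in (3, 6):
            raise ValueError(f"[Pin] line needs 3 or 6 columns: {row}")
        pin = {"pin": row[0], "signal": row[1], "model": row[2]}
        if len(row) == 6:  # per-pin override of the package defaults
            pin.update(R_pin=row[3], L_pin=row[4], C_pin=row[5])
        pins.append(pin)
    return pins


sections = parse_ibis(SAMPLE_IBS)
pins = pin_table(sections)
for pin in pins:
    print(pin)
```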


A click on Generate Pad creates the layout of Fig. 7.66, which corresponds to the list of pins declared in the IBIS file, as seen in Fig. 7.65.

Fig. 7.66 The I/O pad generation constructed using the IBIS file default.IBS

7.11 Connecting to the Package

The IC is usually connected to the package by bonding wires or solder balls. In the first case, the bonding wires are made of gold, with a usual diameter of 25 µm. The wires build the link between the pads and the package leads. An example of package connection using bonding wires is shown in Fig. 7.67.

As the complexity of ICs constantly increased, a new type of link was invented, which creates all the connections between the die and the package in one single step. This technology, called ‘ball-grid array’, was introduced some years ago and is now commonly used for ICs with more than 100 pins. The cross-section of a ball-grid array is shown in Fig. 7.68. The die of the IC is flipped and connected using small solder balls to a specific package. The package serves as a routing matrix from the IC pads (pitch close to 100 µm) to the ball-grid array (pitch between 500 µm and 2 mm). The package is a complex network of very thin copper conductors embedded in an insulator. The BGA substrate may include from two to six metal layers to achieve the routing of general-purpose signals and the distribution of power supply.


Fig. 7.67 The structure of a Quad flat pack (QFP)

Fig. 7.68 The assembly of the die to the package using micro balls


7.11.1 Stacked Integrated Circuits

To minimize the surface of electronic systems, the trend is to stack ICs within a single package, also called System-in-Package (SiP). The result of this technique is a much more compact system, at the price of a more complex assembly and more difficult thermal dissipation. One example of stacked ICs is shown in Fig. 7.69(a). Stacked ICs are particularly attractive when mixing processors, memories, power management, actuators, sensors and radio-frequency elements. Due to cost and reliability issues, the stacking of heterogeneous ICs may be preferred to a single all-integrated die solution.

The Chip-Scale Packaging (CSP) shown in Fig. 7.69(b) consists of connecting the chip directly to the printed circuit board (PCB) without any intermediate package substrate. The die is flipped and electrically connected to the board via solder balls. The routing constraints in the PCB are very severe, as the ball pitch may be as low as 200 µm.

Fig. 7.69 The stacking of two different dies in the same package (a) and the chip-scale package (b)


7.12 Signal Propagation Between Integrated Circuits

The communication between two ICs raises a set of issues that are briefly introduced in this section. The emitter signal comes from a logic circuit IC1 (Fig. 7.70). To be exported outside the IC, the signal is buffered by an inverter. The path between two ICs is a conductor that can be a few millimeters to several meters long. Typically, the distance between two ICs in standard PCBs is of the order of some centimeters. Finally, the signal enters a buffer of the receiver situated in IC2.

Fig. 7.70 The signal propagation between ICs

The transmission line effect can be seen in the simulation of Fig. 7.71. The original signal is quite clean and looks like a standard square waveform. However, the transport of the signal outside the IC, through the package, all the way along the interconnect and then inside the other IC has a significant impact on the shape of the resulting signal: propagation delay, overshoot and ringing. These effects are illustrated in Fig. 7.71.

Fig. 7.71 The signal is modified by the propagation within the package and interconnects

Some limits are defined in Fig. 7.71 for the low and high logic levels. These levels differ for the emitter and the receiver. The delay finds its origin in the flight time, which is bounded by the speed of light. In a standard epoxy PCB (also called FR4, with a relative permittivity of 4.2), the signal propagates at the velocity given by Eq. 7.1:

$$ v = \frac{c}{\sqrt{\varepsilon_r}} \qquad \text{(Eq. 7.1)} $$

c = speed of light, 3 × 10^8 m/s
εr = relative permittivity (no unit)

The resulting propagation time per unit length is:

$$ t_{prop} = \frac{\sqrt{\varepsilon_r}}{c} = \frac{2.1}{300{,}000\ \text{km/s}} \cong 7\ \text{ps/mm} $$

which corresponds to a propagation velocity of approximately 140 mm/ns.
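Eq. 7.1 is easy to evaluate numerically. The sketch below computes the propagation velocity in FR4 and the flight time over a 5 cm trace, a length chosen to match the "some centimeters" mentioned earlier; the function names are illustrative.

```python
import math

C_MM_PER_NS = 299.792458  # speed of light in vacuum, in mm/ns


def velocity_mm_per_ns(eps_r):
    """Propagation velocity from Eq. 7.1: v = c / sqrt(eps_r)."""
    return C_MM_PER_NS / math.sqrt(eps_r)


def flight_time_ns(length_mm, eps_r=4.2):
    """Flight time over an interconnect of the given length."""
    return length_mm / velocity_mm_per_ns(eps_r)


v_fr4 = velocity_mm_per_ns(4.2)
t_5cm = flight_time_ns(50.0)
print(f"FR4: v = {v_fr4:.0f} mm/ns, flight time over 50 mm = {t_5cm * 1e3:.0f} ps")
```

Without rounding the square root to 2.1, the velocity comes out slightly above the 140 mm/ns quoted in the text, and a 5 cm trace costs roughly a third of a nanosecond.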

The overshoot and ringing are due to the inductive and capacitive behavior of the interconnection situated between the emitter and the receiver. The combination of C and L provokes resonance effects. As each portion of conductor has its own inductance and capacitance, several resonance effects may be observed at different frequencies.

There are various standards for I/O supply voltages, as shown in Table 7.5. The TTL standard and the low voltage TTL standard (LVTTL) work with non-symmetrical low and high levels. All CMOS standards are almost symmetrical.

Table 7.5 The format of some basic I/O standards in TTL and CMOS ICs

Standard    VSS (V)  VDD (V)  Vin_low  Vin_high  Vout_low  Vout_high
TTL         0.0      5.0      0.8      2.0       0.4       2.4
LVTTL       0.0      3.3      0.8      2.0       0.4       2.4
LVCMOS2V5   0.0      2.5      0.7      1.7       0.2       2.1
LVCMOS1V8   0.0      1.8      0.63     1.17      0.45      1.35
LVCMOS1V2   0.0      1.2      0.43     0.78      0.30      0.9
LVCMOS1V0   0.0      1.0      0.35     0.65      0.25      0.75
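The levels of Table 7.5 directly give the noise margins of each standard, using the usual definitions NM_low = Vin_low - Vout_low and NM_high = Vout_high - Vin_high. The short sketch below, added for illustration, tabulates them.

```python
# (Vin_low, Vin_high, Vout_low, Vout_high) in volts, from Table 7.5
IO_STANDARDS = {
    "TTL":       (0.8, 2.0, 0.4, 2.4),
    "LVTTL":     (0.8, 2.0, 0.4, 2.4),
    "LVCMOS2V5": (0.7, 1.7, 0.2, 2.1),
    "LVCMOS1V8": (0.63, 1.17, 0.45, 1.35),
    "LVCMOS1V2": (0.43, 0.78, 0.30, 0.9),
    "LVCMOS1V0": (0.35, 0.65, 0.25, 0.75),
}


def noise_margins(name):
    """Return (NM_low, NM_high) in volts for the named standard."""
    vin_low, vin_high, vout_low, vout_high = IO_STANDARDS[name]
    return round(vin_low - vout_low, 3), round(vout_high - vin_high, 3)


for name in IO_STANDARDS:
    nm_low, nm_high = noise_margins(name)
    print(f"{name:10s} NM_low = {nm_low:.2f} V, NM_high = {nm_high:.2f} V")
```

The margins shrink from 0.4 V for TTL to about 0.1 V for LVCMOS1V0, which illustrates why signal integrity becomes more critical as the I/O supply voltage decreases.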

The main limitation of a conventional I/O is the full swing of the output voltage, which causes a significant delay in the signal switching at the far end, on the receiver side. A very interesting idea consists of limiting the voltage swing of the signal from a few volts to only hundreds of mV. The flight time linked to the speed of light remains unchanged, but the charge and discharge time of the complete interconnect is significantly reduced. Details of small swing voltage standards (SSTL) are given in Table 7.6. The SSTL circuit at work is illustrated for a 2.5 V voltage supply (SSTL2) and a 1.25 V voltage reference. A high-speed bus using SSTL drivers is shown in Fig. 7.72. Notice the four SSTL drivers and receivers, plus the voltage reference Vref. The input data must be higher than Vout_high or lower than Vout_low at the near end, close to the emitter.


Table 7.6 The format of high-speed differential I/O standards used in recent CMOS ICs

Standard  VDD (V)  Vref (V)  Vin_low      Vin_high     Vout_low  Vout_high
SSTL3     3.3      1.65      Vref-0.2     Vref+0.2     0.90      2.10
SSTL2     2.5      1.25      Vref-0.15    Vref+0.15    0.65      1.85
SSTL18    1.8      0.9       Vref-0.125   Vref+0.125   0.40      1.30

Fig. 7.72 The SSTL bus used for double data-rate RAM interfaces with high-speed microprocessors (Iosstl.SCH)

The received signal is considered as a one if it is higher than Vin_high (at least 150 mV higher than Vref for SSTL2), and as a zero if it is lower than Vin_low (at least 150 mV lower than Vref). These margins appear on the right side of Fig. 7.73. The small voltage swing enables faster data rates: the SSTL2 I/Os are used for double data-rate memories with data rates up to 800 Mbit/s.
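The receiver decision rule can be sketched as a comparison against Vref plus or minus the input margin of Table 7.6. The code below is only a behavioral illustration (a real SSTL receiver is a differential comparator), and the function name and the use of None for the undefined band are choices made for this example.

```python
# Standard -> (Vref in V, input margin in V), from Table 7.6
SSTL_LEVELS = {
    "SSTL3": (1.65, 0.2),
    "SSTL2": (1.25, 0.15),
    "SSTL18": (0.9, 0.125),
}


def sstl_decode(v_in, standard="SSTL2"):
    """Return 1, 0, or None when v_in lies in the undefined band around Vref."""
    vref, margin = SSTL_LEVELS[standard]
    if v_in >= vref + margin:
        return 1
    if v_in <= vref - margin:
        return 0
    return None


for v in (1.05, 1.30, 1.45):
    print(f"SSTL2 receiver, Vin = {v:.2f} V -> {sstl_decode(v)}")
```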

7.13 Conclusion

In this chapter we have described the input/output interfacing of the IC. The power supply network has been described, with emphasis on metal grid strategy. Then, several aspects of the electrostatic discharge prevention have been addressed, and the basic elements for the input protection circuit have been detailed. With regard to the output structures, the buffer architecture, the three-state option and programmable


Fig. 7.73 Typical waveforms for the SSTL2 emitter and receiver structure

drive design principles have been presented. A brief presentation of IBIS has also been provided, followed by some insights into the connection between the IC and the external world. Quad flat pack, ball-grid array, chip-scale and stacked packages have also been described. Finally, the main standards for low-voltage and small-swing I/O signals have been listed.

References

[1] Sanjay Dabral and Timothy J. Maloney, Basic ESD and I/O Design, John Wiley and Sons, 1998, ISBN 0-471-25359-6.
[2] Alan Hastings, The Art of Analog Layout, Prentice Hall, 2001, ISBN 0-13-087061-7.
[3] Abdellatif Bellaouar and Mohamed I. Elmasry, Low-power Digital VLSI Design, Kluwer Academic Publishers, 1996, ISBN 0-7923-9587-5.
[4] Albert Z.H. Wang, On-Chip ESD Protection for Integrated Circuits: An IC Design Perspective, Kluwer Academic Publishers, 2002, ISBN 0-7923-7647-1.
[5] More information about the IBIS standard may be found at their web site www.eia.org/ibis
[6] C. Clein, CMOS IC Layout: Concepts, Methodologies and Tools, Newnes, 2000, ISBN 0-7506-7194-7.
[7] John P. Uyemura, Introduction to VLSI Circuits and Systems, Wiley, 2002, ISBN 0-471-12704-3.
[8] Bertrand Vrignon, Sonia Bendhia, Enrique Lamoureux, and Etienne Sicard, “Characterization and modeling of parasitic emission in deep submicron CMOS”, IEEE Transactions on EMC, Vol. 47, No. 2, May 2005, pp 382–387.


EXERCISES

7.1 Design a clamp circuit sensitive to a 100 V pulse. The capacitor should be a coupling-capacitor, and the simulation should be performed with the option ‘With crosstalk’. Compare the performances of the clamp circuit and the Zener diode protection circuit proposed in this chapter.
7.2 Design a programmable I/O pad, according to the schematic diagram of Fig. 7.56, with 2, 4 or 6 mA drive capabilities.
7.3 Build a differential emitter/receiver circuit with a 2 Gb/s bandwidth. What is the critical distance to perform the detection correctly?
7.4 Evaluate the I/O density per mm² for QFP, BGA, µBGA and CSP.
7.5 Build a SSTL bus transfer system for long interconnects on-chip. The interconnect may be routed in metal6, and correspond to 10 mm. Use the command Edit → Generate → Metal Bus to generate long bus lines automatically.


8 Silicon on Insulator

8.1 Introduction

The use of Silicon-On-Insulator (SOI) technology brings interesting new possibilities compared to conventional bulk technology. This chapter highlights the extra performance and advantages offered by SOI, as well as the limiting parasitic effects. Performance improvements concern the power consumption and the switching speed. In the best case, the SOI technology may cut the power consumption nearly by half, with speed improvements close to 30%. The speed improvement itself is equivalent to about two years of progress in bulk CMOS technology. The insulator material used in SOI is a buried SiO2 layer, illustrated in Fig. 8.1.

Fig. 8.1 3D view of SOI ring-inverter showing the SiO2 buried layer (inv3Soi.MSK)

Copyright © 2007 by The McGraw-Hill Companies, Inc.


In fact, the SOI technology has been available for more than 20 years, but its applications were mainly restricted to space and military systems, thanks to its very low sensitivity to radiation. The road to commercial use of SOI still faces several issues: one is the cost of the substrate, which is five to ten times the cost of a bulk wafer; another is the need to train designers in specific design techniques and rules, as the behavior of a SOI MOS device differs slightly from that of the bulk MOS device. Although the MOS device fabrication is slightly modified, the fabrication of the metal interconnects is identical to that of the bulk CMOS process.

8.1.1 The SOI Substrate

SOI refers to placing a thin layer of silicon on top of a silicon oxide insulator, as illustrated in Fig. 8.2. The transistors are built on top of this thin layer. A 0.12 µm SOI technology, namely soi012.rul, is available in MICROWIND. It enables a comparative simulation with the corresponding bulk technology (cmos012.rul, the standard CMOS 0.12 µm technology).

Fig. 8.2 3D view of SOI ring-inverter showing the SiO2 buried layer (inv3Soi.MSK)

The basic idea is that the SOI layer reduces the parasitic junction capacitance of the switch, so that it operates faster. Every time the transistor is turned on, it must first charge its entire internal capacitance before it can begin to switch. Among these parasitic capacitances are the junction capacitances Csb and Cdb, which are strongly reduced by the silicon dioxide, as described in the two-dimensional cross section shown in Fig. 8.3. The thicker the SiO2 oxide, the smaller the parasitic capacitance. The typical insulator thickness is between 200 and 500 nm. In the SOI CMOS 0.12 µm technology provided by MICROWIND, the dielectric thickness is 300 nm, and the silicon thickness on top of the insulator is 150 nm.

8.1.2 Low-Voltage Operation

An important feature of SOI devices is the steeper sub-threshold slope due to a reduction of the substrate body effect. Typical sub-threshold slope factors (NFACT in the BSIM4 menu) are close to 1.0 for SOI


Fig. 8.3 The junction capacitance between the source and bulk is almost eliminated in the case of SOI

devices, as compared to 1.5 for bulk devices. For a given Ioff current, the SOI circuit may have a much smaller threshold voltage, which means that the circuit can operate at a lower supply. Recall that the power is proportional to the total circuit capacitance and to the square of the supply voltage. This means that SOI circuits are very good candidates for low-power operation, as the parasitic capacitance is reduced and the supply voltage can be lowered. Considering the ring oscillator with three inverters, we obtain a 42 GHz oscillation at a supply voltage of 0.7 V in SOI technology, rather than 1.2 V in bulk technology (Fig. 8.5). The power gain approaches a factor of four.

Fig. 8.4 The steeper sub-threshold slope enables low-voltage and low-power operations
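The link between the slope factor and the threshold voltage can be illustrated with the classical sub-threshold swing expression S = n (kT/q) ln 10. The sketch below treats NFACT as the factor n of this expression and assumes that Ioff varies as 10^(-Vt/S); both are simplifications of the BSIM4 model, and the 0.3 V bulk threshold is a hypothetical value.

```python
import math

K_B_OVER_Q = 8.617333e-5  # Boltzmann constant over electron charge, in V/K


def swing_mv_per_decade(n, temp_k=300.0):
    """Sub-threshold swing S = n * (kT/q) * ln(10), in mV per decade."""
    return 1e3 * n * K_B_OVER_Q * temp_k * math.log(10)


def vt_for_same_ioff(vt_bulk, n_bulk=1.5, n_soi=1.0):
    """Threshold giving the same Ioff, assuming Ioff ~ 10 ** (-Vt / S)."""
    return vt_bulk * n_soi / n_bulk


s_soi = swing_mv_per_decade(1.0)
s_bulk = swing_mv_per_decade(1.5)
vt_soi = vt_for_same_ioff(0.3)  # 0.3 V is an assumed bulk threshold
print(f"S(SOI) = {s_soi:.1f} mV/dec, S(bulk) = {s_bulk:.1f} mV/dec")
print(f"Same Ioff: Vt(bulk) = 0.30 V -> Vt(SOI) = {vt_soi:.2f} V")
```

Under these assumptions, a swing of about 60 mV per decade instead of about 90 mV per decade allows the threshold voltage to be reduced by a third for the same off current.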


(a) Bulk oscillator at 40 GHz: 266 µW

(b) SOI oscillator at 40 GHz: 66 µW

Fig. 8.5 The lower Ioff current and steeper sub-threshold slope enable low-voltage and low-power operations
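The power gain of Fig. 8.5 can be checked against the rule that the dynamic power is proportional to C × VDD² at a fixed frequency. The sketch below separates the contribution of the supply voltage from the remaining factor, which is attributed here to the reduced switched capacitance; this attribution is an interpretation of the figures, not a result stated in the text.

```python
def supply_power_ratio(v_bulk, v_soi):
    """Power ratio due to the supply voltage alone (P ~ C * VDD^2 * f)."""
    return (v_bulk / v_soi) ** 2


measured_ratio = 266.0 / 66.0                 # bulk / SOI power from Fig. 8.5
voltage_ratio = supply_power_ratio(1.2, 0.7)  # 1.2 V bulk versus 0.7 V SOI
implied_cap_ratio = measured_ratio / voltage_ratio

print(f"Measured power ratio: {measured_ratio:.2f}")
print(f"Supply contribution:  {voltage_ratio:.2f}")
print(f"Implied capacitance ratio (bulk / SOI): {implied_cap_ratio:.2f}")
```

The supply reduction alone accounts for a factor close to three, and the remaining factor of about 1.4 is compatible with the reduced junction capacitance of the SOI devices.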

Furthermore, the lower threshold combined with a steeper sub-threshold slope is of key interest for analog circuits, which can provide the same functionality and approximately the same bandwidth performance with a lower power consumption.

8.1.3 Increased Density

One important feature of the SOI technology is the increase in CMOS cell density, thanks to relaxed design rule constraints between N+ and P+ diffusions. In CMOS bulk technology, the n-channel device is separated from the p-channel device by at least 12 lambda. In SOI technology, the design rule drops to only two lambda, as shown in Fig. 8.6.

Fig. 8.6 The increased density due to relaxed design rules between N+ and P+ diffusions (SOIDiffusion.MSK)


Consequently, the layout implementation of a CMOS cell is more compact as the nMOS and pMOS devices almost touch each other. As an example, the three-inverter ring oscillator in SOI technology is 20% more compact than the bulk version (Fig. 8.7), for an identical size of nMOS and pMOS devices.

Fig. 8.7 The ring-oscillator in SOI technology (Inv3SOI.MSK)

8.1.4 Increased Operating Frequency

The comparison between the SOI ring-inverter and the bulk ring-inverter is given in Fig. 8.8. We observe a very significant gain in terms of speed, more than 80% in this case. In bulk technology, the three-inverter oscillator (Inv3.MSK) operates near 19 GHz when using the BSIM4 model. In SOI technology, the same oscillator (Inv3Soi.MSK) runs at around 35 GHz. The significant frequency increase observed in Fig. 8.8 finds its origin mainly in the decreased parasitic capacitance of the drain junctions of the MOS devices. As no long interconnect is needed in this design, the reduction of capacitance has a very clear impact on the final frequency. Furthermore, the maximum current available in the SOI MOS is 20% higher than in the bulk version, due to a particular undesired effect (the kink effect) described later in this chapter.
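The oscillation frequencies can be converted into an average delay per inverter using the standard ring-oscillator relation f = 1 / (2 N td), where N is the number of stages. The sketch below applies it to the two values quoted above; it is added for illustration and assumes identical delays for all stages.

```python
def stage_delay_ps(freq_ghz, stages=3):
    """Average inverter delay from f = 1 / (2 * N * td), in picoseconds."""
    return 1e3 / (2 * stages * freq_ghz)


td_bulk = stage_delay_ps(19.0)
td_soi = stage_delay_ps(35.0)
speed_gain = 35.0 / 19.0 - 1.0

print(f"Bulk: {td_bulk:.2f} ps per stage, SOI: {td_soi:.2f} ps per stage")
print(f"Frequency gain: {speed_gain * 100:.0f}%")
```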



(a) Bulk technology

(b) SOI technology

Fig. 8.8 The simulation of the ring-oscillator in bulk and SOI technologies (Inv3.MSK, Inv3SOI.MSK)

8.1.5 Decreased Couplings

Oxide isolation has a positive impact on the noise immunity between blocks. One of the main contributors to noise is the substrate in bulk technologies. A high-power, high-frequency circuit such as a power amplifier may inject a fraction of its switched energy into the substrate, which may have a parasitic effect on sensitive parts such as amplifier inputs or ADCs. The insulator provided in the SOI technology has very efficient decoupling capabilities, which facilitate the embedding of incompatible functionalities within the same silicon substrate (Fig. 8.9).

8.1.6 High Temperature Leakage

The Ioff current, corresponding to a zero gate-voltage, determines the parasitic leakage-current of the MOS device. Low leakage is important for low-power operation. The behavior of SOI devices is better than that of the bulk device in terms of Ioff current at high temperatures [1]. In the comparative simulations


Fig. 8.9 Increased decoupling between noisy and sensitive circuits thanks to the insulator

shown in Fig. 8.10, the sub-threshold slope is steeper for SOI at nominal temperature, as presented earlier. When the temperature is increased up to 200°C, the leakage current in the bulk device rises rapidly to 10 µA, while in SOI technology the leakage is kept below 0.1 µA. Consequently, at high temperature, the SOI device has a standby current nearly 100 times lower than in bulk technology.

(a) Bulk technology


(b) SOI technology

Fig. 8.10 Temperature-dependence for bulk and SOI MOS devices (low leakage W = 10 µm, L = 0.12 µm)

8.2 SOI Technology Issues

8.2.1 Kink Effect

In SOI technology, when an n-channel MOS transistor passes a strong current between the drain and the source, a parasitic phenomenon called the kink effect appears [1]. The current Ids suddenly rises and provokes a conductance discontinuity, usually between 0.5 V and 1 V in the 0.12 µm CMOS process. The origin of this parasitic effect is the impact ionization of high-energy electrons entering the drain region, which creates supplementary positive and negative charges below the gate. While electrons participate in the Ids current, the underlying insulator prevents the positive charges from being evacuated to the substrate, as would happen in bulk technology thanks to the natural ground connection of the substrate. The positive charges accumulate below the gate (Fig. 8.11). The body potential of the SOI MOS device may rise significantly, without any direct control. The rise of the local voltage below the gate has an immediate impact on the threshold voltage, which is lowered. At a certain point, the bias of the PN junction between the P-doped body and the N+ source diffusion is high enough to turn on the junction, which leads to a sudden channel-current increase, as seen in the Id/Vd characteristics (Fig. 8.12). This effect is also called the floating-body effect (FBE). As the impact ionization is more severe for n-channel MOS devices than for p-channel MOS devices, the kink effect is more pronounced in the n-channel than in the p-channel.


Fig. 8.11 Impact ionization creates an accumulation of positive charges below the gate in the case of SOI

(a) n-channel MOS (b) p-channel MOS

Fig. 8.12 The drain current characteristics of the n-channel and p-channel SOI devices show a kink effect near saturation

8.2.2 Fully Depleted MOS

A possibility for reducing the FBE is to use a very thin diffusion for the channel, so that there is no more room for the accumulation of positive charges, and consequently almost no kink effect. The source and drain diffusions are usually manufactured with an increased thickness on top of the SiO2 insulator.


The fully-depleted MOS devices are much harder to manufacture and control. The process-controlled threshold adjustment required for low Vt, high-speed and ultra-high speed MOS devices is very complex due to the very thin diffusion area below the gate (Fig. 8.13). These drawbacks have made the fully-depleted MOS less attractive than the partially-depleted MOS. The SOI process parameters provided in MICROWIND correspond to a partially-depleted MOS technology.

Fig. 8.13 The fully-depleted MOS device has no more kink effect, but several manufacturing and design drawbacks

8.3 SOI Device Model

Bulk silicon models such as LEVEL 3 or BSIM4 typically do not include source/bulk diode currents because the junctions are usually reverse-biased, and can be considered as junction capacitors. This is not the case for SOI devices, where the source/bulk junctions can be significantly forward-biased due to the impact ionization which provokes the accumulation of positive charges below the gate.

8.3.1 Fully-depleted MOS Model

The kink effect is very weak in fully-depleted SOI MOS devices. Consequently, the BSIM4 model may be applied with reasonable accuracy, as the underlying physics and working principles are similar.

8.3.2 Partially-depleted MOS Model

In MICROWIND, the kink effect is modeled in the case of partially-depleted SOI devices, thanks to a new parameter ASOI. Details on the SOI model in SPICE are provided in [1], which considers the lateral bipolar device made of the source, the channel and the drain regions. The SOI MOS model includes a complete NPN device model in the case of an n-channel MOS, and a PNP device model in the case of a p-channel MOS. A simpler implementation proposed in MICROWIND consists in modifying the saturation current model directly, where the kink effect is the most important.


A new parameter called ASOI is introduced. The kink effect occurs when Vds is higher than the saturation voltage Vdsat. The parameter ASOI determines the amplitude of the kink. A new term is added to the BSIM4 current, as shown in Eq. 8.1. This approach is a simplified version of the model used in the BSIM3 SOI device model [2].

$$I_{ds} = I_{ds\_bsim4}\left(1 + \frac{ASOI}{L_{eff} \cdot V_t \cdot (V_{DS} - V_{dsat})}\right) \qquad \text{(Eq. 8.1)}$$

Leff = effective device channel length (m)
Vds = voltage difference between drain and source (V)
Vdsat = saturation voltage as defined in Chapter two (V)
Vt = threshold voltage of the MOS device (V)
ASOI = technological parameter for handling the kink effect (default 2 × 10^6 V/cm)

As the oxide thickness scales down to 2 nm and below, the direct-tunneling current through the gate oxide, a quantum-mechanical effect, rises exponentially. The gate current becomes large enough to compete with the channel current and consequently to affect the body potential. More complex models such as BSIMPD [3] have been developed for an accurate simulation of these nano-scale MOS devices (Fig. 8.14).

8.4 SOI Design

Assuming a partially-depleted SOI technology, the kink effect may be reduced by adding a polarization contact to ground, which helps evacuate the accumulated charges from the channel. The T-shaped and H-shaped MOS with body-tie to ground are shown in Fig. 8.15. The MOS device on the left has no body contact, and may suffer from the kink effect as soon as the VDS voltage is higher than 0.5 V. The T-shaped MOS device includes a supplementary P+ diffusion which is connected to the P-type channel region on one side and to the VSS ground contact on the other side. The body-tie is very efficient at the bottom of the T-shaped MOS but cannot rapidly evacuate charges accumulated on the upper part of the channel. An improved design (H-shape) consists of placing two body-ties, one at the bottom and one on the top, which almost eliminates the kink effect. The main disadvantages of the body-tie are the significant increase in device surface and the need for VSS connections at each MOS device. Important benefits of the SOI technology in terms of compact layout are lost, as the body-ties take up valuable silicon space.

8.4.1 The Memory Effect

Accounting for the FBE requires specific models which handle the ‘memory effect’ of accumulated charges below the channel. Without body-tie, the time constant for eliminating these charges is on the order of a millisecond, far larger than the switching delay within the logic gates. However, only a very small percentage of the transistors in a typical logic circuit are unable to work properly with a floating body and require a body-tie to ground. Examples of functional errors linked to the floating body effect are described in [1].


Fig. 8.14 The effect of the ASOI parameter on the Id/Vd characteristics (using soi012.RUL)

8.5 The Tera-Hertz MOS Device

The Tera-Hertz (10^12 Hertz) transistor is the key device for the development of 10-to-20 GHz processors. A MOS device with Tera-Hertz transit frequency is expected to be fabricated in phase with the 45 nm CMOS process. The Tera-Hertz MOS device combines an SOI substrate, a narrow gate length, new gate materials and a high-K dielectric insulator for the gate. Technological issues still to be solved concern the gate and transistor leakage currents and the reliability of the high-K dielectric. A comparison between standard and Tera-Hertz SOI MOS is outlined in Fig. 8.16.


Fig. 8.15 Adding a contact in partially-depleted MOS to avoid the kink effect (mosSoi.MSK)

Fig. 8.16 The Tera-Hertz transistor: (a) standard MOS device, (b) Tera-Hertz MOS device


8.6 Conclusion

This chapter has briefly described the SOI technology, its main advantages and its drawbacks. Using MICROWIND, bulk and SOI technologies can be compared in terms of MOS characteristics, switching speed and power consumption. The kink effect has been described and a simplified model has been proposed, together with the body-tie technique that limits its consequences. The adoption of SOI as a mainstream technology is not yet a reality, perhaps because of the steady progress in bulk CMOS technology and the drawbacks linked to the kink effect at device level. However, fully-functional microprocessors utilizing SOI have recently been introduced, with important gains in terms of speed and power consumption, which have convinced many skeptics that SOI is a serious candidate for the fabrication of the next generation of chips.

References

[1] James B. Kuo, Shih Chia Lin, Low Voltage SOI CMOS VLSI Devices and Circuits, Wiley Intersciences, ISBN 0-471-41777-7, 2001
[2] BSIM3v3 Manual, University of California at Berkeley, USA, http://www-device.eecs.berkeley.edu, 1998
[3] BSIMPD version 2.0 MOSFET model user’s manual, http://www-device.eecs.berkeley.edu

EXERCISES

8.1 Design a NAND gate in SOI technology. What is the switching-speed and standby-current improvement as compared to bulk technology?

8.2 Redesign a basic cell (XOR, AND, Dlatch, etc.) using the soi012.rul technology design rules, trying to build the most compact design. What is the surface improvement, as compared to the initial design?


9 Future and Conclusion

9.1 Predicting the Unpredictable

Will the semiconductor industry run out of process technology soon? The international technology roadmap for semiconductors sets technology targets and milestones for the next 15 years [1]. This roadmap is probably one of the most referenced documents in micro-electronics. It predicts transistor-gate lengths in microprocessors shrinking to 9 nanometers by 2020.

Fig. 9.1 The art of predicting the future of micro-electronics

Copyright © 2007 by The McGraw-Hill Companies, Inc.


9.1.1 The Technology Scale-down beyond 90 nm

The 90 nm CMOS technology was introduced in 2004 for system-on-chip fabrication. The next technological steps are presented in Fig. 9.2: 65 nm, 45 nm, 32 nm, 22 nm and 12 nm. By 2008, processors with one billion transistors are expected to be developed, running at 10 GHz. The power consumption of these processors could rise to around 100 Watts per square centimeter of die, which could represent the second limiting factor in CMOS design, after the cost of the foundry and the masks. By 2020, the ITRS roadmap [1] projects the minimum physical gate length of transistors to be close to 9 nm (0.009 micron), which is considered by most researchers to be the physical limit of silicon. Will the transistor-gate length scaling continue below 9 nm? This prospect is now causing the industry to consider post-CMOS technologies.

Fig. 9.2 CMOS technologies forecast until 2020

9.2 Conclusion

This book has described several aspects of CMOS circuit design, using the MICROWIND tool as an illustration. A very significant gap exists between the educational PC-based tool and the professional tools used in the industry for real-case designs. However, we hope that the readers have grasped the essential parts of MOS devices, logic circuits, memories, analog cells and interfacing, through the illustrations and numerous examples. No book, no teaching can replace practical experience. Although the simulations should never be trusted, access to microelectronic technology tends to be more and more costly, which justifies the relevance of simple tools such as MICROWIND.

The authors have dedicated around two years to building the technical contents of this book, and have tried their best to improve the MICROWIND and Dsch tools, trying to make attractive and simple something which tends to be more and more complicated. Still, some bugs need to be corrected, the user interface is continuously being improved, and important new features are regularly introduced. As the tools are in constant evolution thanks to users’ feedback and comments, we encourage the readers to download the updated versions of MICROWIND and Dsch from the web page [2]. The tools have benefited from the real-case experiments conducted in 0.35, 0.25 and 0.18 µm CMOS technologies in partnership with ST-Microelectronics, Grenoble, France, and Freescale Semiconductors, Toulouse, France.

We hope that the readers will find the contents of this book and the companion tools useful. It is our hope that the readers will design logic and analog circuits by themselves, understand the principles of CMOS VLSI design through a practical approach, and later contribute to innovative designs which will support the electronic systems of the future.

References

[1] The ITRS is devised and intended for technology assessment. See http://public.itrs.net/
[2] More information about MICROWIND may be found at http://www.microwind.org


Appendix A Design Rules This section gives information about the design rules used by MICROWIND. You will find all the design rule values common to all CMOS processes. All the rules, as well as process parameters and analog simulation parameters are detailed here.

Lambda Units

MICROWIND works on a lambda grid, not on a micron grid. Consequently, the same layout may be simulated in any CMOS technology. The value of lambda is half the minimum polysilicon gate length. Table A-1 gives the correspondence between lambda and micron for all CMOS technologies available in the companion CD-ROM.

Table A-1 Correspondence between technology and the value of lambda in µm

Technology file available in the CD-ROM | Minimum gate length | Value of lambda
Cmos12.rul | 1.2 µm | 0.6 µm
Cmos08.rul | 0.7 µm | 0.35 µm
Cmos06.rul | 0.5 µm | 0.25 µm
Cmos035.rul | 0.4 µm | 0.2 µm
Cmos025.rul | 0.25 µm | 0.125 µm
Cmos018.rul | 0.2 µm | 0.1 µm
Cmos012.rul | 0.12 µm | 0.06 µm
Cmos90n.rul | 0.1 µm | 0.05 µm
Cmos65n.rul | 0.07 µm | 0.035 µm
Cmos45n.rul | 0.05 µm | 0.025 µm
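The lambda convention of Table A-1 can be illustrated with a short Python sketch. The dictionary copies the lambda values of the table; the names `LAMBDA_UM`, `lambda_to_um` and `um_to_lambda` are illustrative helpers and are not part of MICROWIND.

```python
# Lambda values in microns, copied from Table A-1.
# The dictionary and helper names are illustrative, not a MICROWIND API.
LAMBDA_UM = {
    "cmos12.rul": 0.6,
    "cmos08.rul": 0.35,
    "cmos06.rul": 0.25,
    "cmos035.rul": 0.2,
    "cmos025.rul": 0.125,
    "cmos018.rul": 0.1,
    "cmos012.rul": 0.06,
    "cmos90n.rul": 0.05,
    "cmos65n.rul": 0.035,
    "cmos45n.rul": 0.025,
}


def lambda_to_um(n_lambda, rule_file):
    """Convert a dimension expressed in lambda into microns."""
    return n_lambda * LAMBDA_UM[rule_file.lower()]


def um_to_lambda(size_um, rule_file):
    """Convert a dimension expressed in microns into lambda."""
    return size_um / LAMBDA_UM[rule_file.lower()]


if __name__ == "__main__":
    # The minimum gate length is 2 lambda in every technology.
    for rule_file in ("cmos025.rul", "cmos012.rul", "cmos65n.rul"):
        gate_um = lambda_to_um(2, rule_file)
        print(f"{rule_file}: 2 lambda = {gate_um:.3f} um")
```

The same layout, drawn once in lambda, therefore maps to different physical sizes depending on the rule file that is selected.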


Layout Design Rules

The software can handle various technologies. The process parameters are stored in files with the extension ‘.RUL’. The default technology corresponds to a generic six-metal 0.12 µm CMOS process. The default file is CMOS012.RUL. To select a new foundry, click on File → Select Foundry and choose the appropriate technology in the list.

Nwell

Rule | Description | Value
r101 | Minimum well size | 12 λ
r102 | Between wells | 12 λ
r110 | Minimum well area | 144 λ²

Diffusion

Rule | Description | Value
r201 | Minimum N+ and P+ diffusion width | 4 λ
r202 | Between two P+ and N+ diffusions | 4 λ
r203 | Extra Nwell after P+ diffusion | 6 λ
r204 | Between N+ diffusion and Nwell | 6 λ
r205 | Border of well after N+ polarization | 2 λ
r206 | Between N+ and P+ polarization | 0 λ
r207 | Border of Nwell for P+ polarization | 6 λ
r210 | Minimum diffusion area | 24 λ²

Polysilicon

Rule | Description | Value
r301 | Polysilicon width | 2 λ
r302 | Polysilicon gate on diffusion | 2 λ
r303 | Polysilicon gate on diffusion for high-voltage MOS | 4 λ
r304 | Between two polysilicon boxes | 3 λ
r305 | Polysilicon vs. other diffusion | 2 λ
r306 | Diffusion after polysilicon | 4 λ
r307 | Extra gate after polysilicon | 3 λ
r310 | Minimum surface | 8 λ²
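As an illustration of how width and spacing rules such as r201/r202 and r301/r304 can be applied, the following Python sketch checks a list of rectangular boxes expressed in lambda. It is a simplified, hypothetical checker: the `Box` type, the `RULES` dictionary and `check_rules` are assumptions made for this example and do not describe MICROWIND's own design rule checker. Boxes that touch or overlap are assumed to form a single shape and are not tested for spacing.

```python
import math
from dataclasses import dataclass

# Minimum width and spacing in lambda, from rules r201/r202 and r301/r304.
# This table and the checker below are illustrative only.
RULES = {
    "diffusion": {"width": 4, "spacing": 4},
    "polysilicon": {"width": 2, "spacing": 3},
}


@dataclass
class Box:
    layer: str
    x: int  # lower-left corner, in lambda
    y: int
    w: int  # width and height, in lambda
    h: int


def gap(a, b):
    """Edge-to-edge distance between two boxes (0 if they touch or overlap)."""
    dx = max(b.x - (a.x + a.w), a.x - (b.x + b.w), 0)
    dy = max(b.y - (a.y + a.h), a.y - (b.y + b.h), 0)
    return math.hypot(dx, dy)


def check_rules(boxes):
    """Return a list of human-readable width and spacing violations."""
    errors = []
    for box in boxes:
        min_width = RULES[box.layer]["width"]
        if min(box.w, box.h) < min_width:
            errors.append(
                f"{box.layer} at ({box.x},{box.y}): "
                f"width {min(box.w, box.h)} < {min_width} lambda"
            )
    for i, a in enumerate(boxes):
        for b in boxes[i + 1:]:
            if a.layer != b.layer:
                continue
            distance = gap(a, b)
            if 0 < distance < RULES[a.layer]["spacing"]:
                errors.append(
                    f"{a.layer} at ({a.x},{a.y}) and ({b.x},{b.y}): "
                    f"spacing {distance:g} < {RULES[a.layer]['spacing']} lambda"
                )
    return errors


if __name__ == "__main__":
    layout = [
        Box("polysilicon", 0, 0, 2, 10),
        Box("polysilicon", 4, 0, 2, 10),  # only 2 lambda from the first box
        Box("diffusion", 0, 20, 3, 8),    # narrower than 4 lambda
    ]
    for message in check_rules(layout):
        print(message)
```

Because every dimension is in lambda, the same check applies unchanged to any rule file.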

Second Polysilicon Design Rules

Rule | Description | Value
r311 | Polysilicon2 width | 2 λ
r312 | Polysilicon2 gate on diffusion | 2 λ
r320 | Polysilicon2 minimum surface | 8 λ²

MOS Option

Rule | Description | Value
rOpt | Border of “option” layer over diff N+ and diff P+ | 7 λ

Contact

Rule | Description | Value
r401 | Contact width | 2 λ
r402 | Between two contacts | 5 λ
r403 | Extra diffusion over contact | 2 λ
r404 | Extra poly over contact | 2 λ
r405 | Extra metal over contact | 2 λ
r406 | Distance between contact and poly-gate | 3 λ
r407 | Extra poly2 over contact | 2 λ

Metal1

Rule | Description | Value
r501 | Metal width | 4 λ
r502 | Between two metals | 4 λ
r510 | Minimum surface | 16 λ²

Via

Rule | Description | Value
r601 | Via width | 2 λ
r602 | Between two vias | 5 λ
r603 | Between via and contact | 0 λ
r604 | Extra metal over via | 2 λ
r605 | Extra metal2 over via | 2 λ

(Figure note: stacked via over contact when r603 is 0.)

Metal2

Rule | Description | Value
r701 | Metal width | 4 λ
r702 | Between two metal2 | 4 λ
r710 | Minimum surface | 16 λ²

Via2

Rule | Description | Value
r801 | Via2 width | 2 λ
r802 | Between two via2 | 5 λ
r804 | Extra metal2 over via2 | 2 λ
r805 | Extra metal3 over via2 | 2 λ

Metal3

Rule | Description | Value
r901 | Metal3 width | 4 λ
r902 | Between two metal3 | 4 λ
r910 | Minimum surface | 32 λ²

Via3

Rule | Description | Value
ra01 | Via3 width | 2 λ
ra02 | Between two via3 | 5 λ
ra04 | Extra metal3 over via3 | 2 λ
ra05 | Extra metal4 over via3 | 2 λ

Metal4

Rule | Description | Value
rb01 | Metal4 width | 4 λ
rb02 | Between two metal4 | 4 λ
rb10 | Minimum surface | 32 λ²

Via4

Rule | Description | Value
rc01 | Via4 width | 2 λ
rc02 | Between two via4 | 5 λ
rc04 | Extra metal4 over via4 | 3 λ
rc05 | Extra metal5 over via4 | 3 λ

Metal5

Rule | Description | Value
rd01 | Metal5 width | 8 λ
rd02 | Between two metal5 | 8 λ
rd10 | Minimum surface | 100 λ²

Via5

Rule | Description | Value
re01 | Via5 width | 4 λ
re02 | Between two via5 | 6 λ
re04 | Extra metal5 over via5 | 3 λ
re05 | Extra metal6 over via5 | 3 λ

Metal6

Rule | Description | Value
rf01 | Metal6 width | 8 λ
rf02 | Between two metal6 | 15 λ
rf10 | Minimum surface | 300 λ²

Pads

The rules are presented below in µm. In the RUL files, the rules are given in lambda. As the pad size has an almost constant value in µm, each technology gives its own value in λ.

Rule | Description | Value
rp01 | Pad width | 100 µm
rp02 | Between two pads | 100 µm
rp03 | Opening in passivation vs. via | 5 µm
rp04 | Opening in passivation vs. metals | 5 µm
rp05 | Between pad and unrelated active area | 20 µm

Electrical Extraction Principles

MICROWIND includes a built-in extractor from layout to electrical circuit. The elements of interest are the MOS devices, capacitances and resistances. The flow is described in Fig. A-1.

Fig. A-1 Extraction of the electrical circuit from layout

The first step consists of cleaning the layout. Mainly, redundant boxes are removed, and overlapping boxes are transformed into non-overlapping boxes. In the case of complex circuits, MICROWIND may skip this cleaning step as it requires a significant amount of computational time.
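The first part of this cleaning step, the removal of redundant boxes, can be sketched as follows in Python. This is only an illustration under simple assumptions: boxes are tuples `(layer, x1, y1, x2, y2)`, a box is considered redundant when another box of the same layer fully covers it, and the splitting of overlapping boxes into non-overlapping ones is not shown. The function names are hypothetical and do not reflect MICROWIND's internal implementation.

```python
def contains(outer, inner):
    """True if 'outer' fully covers 'inner' (same layer assumed)."""
    _, ox1, oy1, ox2, oy2 = outer
    _, ix1, iy1, ix2, iy2 = inner
    return ox1 <= ix1 and oy1 <= iy1 and ox2 >= ix2 and oy2 >= iy2


def remove_redundant(boxes):
    """Drop boxes that are covered by another box of the same layer.

    Of several identical boxes, only the first occurrence is kept.
    """
    kept = []
    for i, box in enumerate(boxes):
        redundant = False
        for j, other in enumerate(boxes):
            if i == j or other[0] != box[0]:
                continue
            if contains(other, box):
                if other == box and j > i:
                    continue  # identical copy found later: keep this one
                redundant = True
                break
        if not redundant:
            kept.append(box)
    return kept


if __name__ == "__main__":
    layout = [
        ("metal1", 0, 0, 10, 10),
        ("metal1", 2, 2, 5, 5),    # inside the first metal1 box
        ("metal1", 0, 0, 10, 10),  # exact duplicate
        ("poly", 2, 2, 5, 5),      # other layer, kept
    ]
    print(remove_redundant(layout))
```

The pairwise comparison grows with the square of the number of boxes, which is consistent with the remark that cleaning can take significant time on complex circuits.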


Node Capacitance Extraction

Each deposited layer is separated from the substrate by an oxide and generates a parasitic capacitor. The unit is aF/µm² (atto = 10^-18). Basically, all layers generate parasitic capacitors. Diffused layers generate junction capacitors (N+/P–, P+/N). The list of capacitances handled by MICROWIND is given in Fig. A-3. The name corresponds to the code name used in CMOS012.RUL (CMOS 0.12 µm). Surface capacitance refers to the capacitance to the body. Vertical cross-talk capacitance refers to inter-layer coupling capacitance, while lateral cross-talk capacitance refers to the coupling between adjacent interconnects.

Fig. A-2 Capacitances

Fig. A-3 Cross-talk capacitance

Surface Capacitance

Table A-2 Capacitance parameters related to layer surface

Name | Description | Lineic (aF/µm) | Surface (aF/µm²)
CpoOxyde | Polysilicon/Thin oxide capacitance | n.c | 4600
CpoBody | Polysilicon to substrate capacitance | n.c | 80
CMEBody | Metal on thick oxide to substrate | 42 | 28
CM2Body | Metal2 on body | 36 | 13
CM3Body | Metal3 on body | 33 | 10
CM4Body | Metal4 on body | 30 | 6
CM5Body | Metal5 on body | 30 | 5
CM6Body | Metal6 on body | 30 | 4

Inter-layer Cross-talk Capacitance

Table A-3 Capacitance parameters related to inter-layer cross-talk coupling

Name | Description | Value (aF/µm²)
CM2Me | Metal2 on metal 1 | 50
CM3M2 | Metal3 on metal 2 | 50
CM4M3 | Metal4 on metal 3 | 50
CM5M4 | Metal5 on metal 4 | 50
CM6M5 | Metal6 on metal 5 | 50

The cross-talk capacitance value per unit length is given in the design rule file for a predefined interconnect width (w = 4λ) and spacing (d = 4λ). In MICROWIND, the computed cross-talk capacitance does not depend on the interconnect width w. It is proportional to 1/d, where d is the distance between interconnects.


Lateral Cross-talk Capacitance

Table A-4 Capacitance parameters related to lateral cross-talk coupling

Name | Description | Value (aF/µm)
CMeMe | Metal to metal (at 4 λ distance, 4 λ width) | 10
CM2M2 | Metal2 to metal 2 | 10
CM3M3 | Metal3 to metal 3 | 10
CM4M4 | Metal4 to metal 4 | 10
CM5M5 | Metal5 to metal 5 | 10
CM6M6 | Metal6 to metal 6 | 10
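The 1/d scaling of the lateral cross-talk capacitance can be illustrated with a small Python sketch. It assumes the 0.12 µm technology (lambda = 0.06 µm from Table A-1) and the CMeMe value of Table A-4, specified at a spacing of 4 λ. The function name and its arguments are illustrative; the exact computation performed by the extractor may include terms not shown here.

```python
LAMBDA_UM = 0.06                  # lambda for cmos012.RUL (Table A-1)
CMEME_AF_PER_UM = 10.0            # CMeMe from Table A-4, valid at d = 4 lambda
REFERENCE_SPACING_UM = 4 * LAMBDA_UM


def lateral_crosstalk_af(length_um, spacing_um,
                         c_ref=CMEME_AF_PER_UM,
                         d_ref=REFERENCE_SPACING_UM):
    """Coupling capacitance (aF) between two parallel wires.

    The reference value per unit length is scaled as 1/d and does not
    depend on the wire width, as stated in the text.
    """
    return c_ref * length_um * d_ref / spacing_um


if __name__ == "__main__":
    for spacing in (0.24, 0.48, 0.96):
        c = lateral_crosstalk_af(100.0, spacing)
        print(f"100 um long wires, d = {spacing} um: {c:.0f} aF")
```

Doubling the spacing halves the coupling capacitance, which is the usual motivation for spreading sensitive interconnects apart.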

Parameters for Vertical Aspect of the Technology

The vertical aspect of the layers for a given technology is described in the RUL file after the design rules, using the codes HE (height) and TH (thickness) for all layers. Fig. A-4 illustrates the altitude zero, which corresponds to the channel of the MOS. The height of diffused layers can be negative, for example for the P++ EPI layer.

Fig. A-4 Description of the 2D aspect of the CMOS technology

Table A-5 Parameters related to technological options

Layer | Description | Parameters
EPI | Buried layer made of P++ used to create a good ground reference underneath the active area | HEEPI for height (negative with respect to the origin), THEPI for thickness
STI | Shallow trench isolation used to separate the active areas | HESTI for height, THSTI for thickness
Passivation | Upper SiO2 oxide on the top of the last metal layer | HEPASS for height, THPASS for thickness
Nitride | Final oxide on the top of the passivation, usually Si3N4 | HENIT for height, THNIT for thickness
NISO | Buried N-layer to isolate the Pwell underneath the nMOS devices, to enable forward bias and back bias | HENBURRIED for height, THNBURRIED for thickness

Resistance Extraction

Table A-6 Parameters related to material resistance

Name | Description | Value (Ω)
RePo | Resistance per square for polysilicon | 4
RePu | Resistance per square for unsalicide polysilicon | 40
ReP2 | Resistance per square for polysilicon2 | 4
ReDn | Resistance per square for n-diffusion | 100
ReDp | Resistance per square for p-diffusion | 100
ReMe | Resistance per square for metal | 0.05
ReM2 | Resistance per square for metal 2 (up to 6) | 0.05
ReCo | Resistance for one contact | 20
ReVi | Resistance for one via (up to via5) | 2
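A resistance per square translates into a wire resistance through the usual relation R = R_square × length / width. The Python sketch below applies it to three of the values of Table A-6 (RePo, ReDn, ReMe); the dictionary keys and the function name are chosen for this example only, and the sketch ignores contacts, vias and any corner correction an extractor might apply.

```python
# Sheet resistances in ohm per square, from Table A-6 (RePo, ReDn, ReMe).
SHEET_RESISTANCE = {
    "polysilicon": 4.0,
    "n-diffusion": 100.0,
    "metal": 0.05,
}


def wire_resistance(layer, length_um, width_um):
    """Resistance (ohm) of a straight wire: R_square times number of squares."""
    squares = length_um / width_um
    return SHEET_RESISTANCE[layer] * squares


if __name__ == "__main__":
    # A polysilicon line of minimum width (0.12 um) and 12 um length
    # represents 100 squares.
    print(f"{wire_resistance('polysilicon', 12.0, 0.12):.1f} ohm")
    # A 1 mm metal wire drawn 0.24 um wide.
    print(f"{wire_resistance('metal', 1000.0, 0.24):.1f} ohm")
```

The comparison shows why long signal routes are drawn in metal rather than in polysilicon or diffusion.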

Dielectrics

Some options are built into MICROWIND to enable specific features of ultra-deep submicron technology. Details are provided in the table below.

Table A-7 Parameters related to oxide permittivity and thickness

Code | Description | Example Value
HIGHK | Oxide for interconnects (SiO2) | 4.1
GATEK | Gate oxide | 4.1
LOWK | Inter-metal oxide | 3.0
LK11 | Inter-metal1 oxide | 3.0
LK22 | Inter-metal2 oxide (up to LK66) | 3.0
LK21 | Metal2-Metal1 oxide | 3.0
LK32 | Metal3-Metal2 oxide (up to LK65) | 3.0
TOX | Normal MOS gate oxide thickness | 0.004 µm (40 Å)
HVTOX | High voltage gate oxide thickness | 0.007 µm (70 Å)

Fig. A-5 Illustration of the use of low-K, high-K dielectric constants (left figure) or detailed permittivity for each layer (right figure)

Simulation Parameters

The following list of parameters is used in MICROWIND to configure the simulation.

Table A-8 Parameters used to configure the simulation

Code | Description | Typical Value
VDD | Supply voltage of the chip | 1.2 V
HVDD | High voltage supply | 2.5 V
DELTAT | Simulator minimum time step to ensure convergence. You may increase this value to speed up the simulation, but instability problems may arise. | 0.5e-12 s
TEMPERATURE | Operating temperature of the chip | 25°C

Models Level1 and Level3 for Analog Simulation

Up to four types of MOS devices may be described. In the rule file, the keywords “MOS1”, “MOS2”, “MOS3” and “MOS4” are used to declare the device names appearing in menus. In 0.12 µm technology, three types of MOS devices are declared as follows. Also, the NMOS and PMOS keywords are used to select n-channel MOS or p-channel MOS device parameters.

Table A-9 Description of MOS options in 0.12 µm technology (cmos012.RUL)

Parameter | MOS1 | MOS2 | MOS3
Default name | High Speed | Low Leakage | High voltage
Vt (nmos) | 0.3 | 0.5 | 0.7
Vt (pmos) | -0.3 | -0.5 | -0.7
KP (nmos) | 300 | 300 | 200
KP (pmos) | 150 | 150 | 100

The list of parameters for Level one and Level three is given below:

Table A-10 MOS parameters related to Level 3

Parameter | Keyword | Definition | Typical value 0.25 µm, nMOS | Typical value 0.25 µm, pMOS
VTO | l3vto | Threshold voltage | 0.4 V | -0.4 V
U0 | l3u0 | Low field mobility | 0.06 m²/V·s | 0.025 m²/V·s
PHI | l3phi | Surface potential at strong inversion | 0.3 V | 0.3 V
LD | l3ld | Lateral diffusion into channel | 0.01 µm | 0.01 µm
GAMMA | l3gamma | Bulk threshold parameter | 0.4 V^0.5 | 0.4 V^0.5
KAPPA | l3kappa | Saturation field factor | 0.01 V^-1 | 0.01 V^-1
VMAX | l3vmax | Maximum drift velocity | 150 km/s | 100 km/s
THETA | l3theta | Mobility degradation factor | 0.3 V^-1 | 0.3 V^-1
NSS | l3nss | Sub-threshold factor | 0.07 V^-1 | 0.07 V^-1
TOX | l3tox | Gate oxide thickness | 3 nm | 3 nm
CGSO | L3cgs | Gate to source lineic capacitance | 100.0 pF/m | 100.0 pF/m
CGDO | L3cgd | Gate to drain overlap capacitance | 100.0 pF/m | 100.0 pF/m
CGBO | L3cb | Gate to bulk overlap capacitance | 1e-10 F/m | 1e-10 F/m
CJSW | L3cj | Side-wall source & drain capacitance | 1e-10 F/m | 1e-10 F/m

For MOS2, MOS3 and MOS4, only the threshold voltage, mobility and oxide thickness are user-accessible. All other parameters are identical to MOS1.

Table A-11 MOS parameters related to Level 3

Parameter | Keyword | Definition | Typical value 0.25 µm, nMOS | Typical value 0.25 µm, pMOS
VTO Mos2 | l3v2to | Threshold voltage for MOS2 | 0.5 V | -0.5 V
VTO Mos3 | l3v3to | Threshold voltage for MOS3 | 0.7 V | -0.7 V
U0 Mos2 | l3u2 | Mobility for MOS2 | 0.06 | 0.025
U0 Mos3 | l3u3 | Mobility for MOS3 | 0.06 | 0.025
TOX Mos 2 | l3t2ox | Thin oxide thickness for MOS2 | 3 nm | 3 nm
TOX Mos 3 | l3t3ox | Thin oxide thickness for MOS3 | 7 nm | 7 nm

BSIM4 Model for Analog Simulation

The list of parameters for BSIM4 is given below:

Table A-12 MOS parameters related to BSIM4

Parameter | Keyword | Description | nMOS value in 0.12 µm | pMOS value in 0.12 µm
VTHO | b4vtho | Long channel threshold voltage at Vbs = 0 V | 0.3 V | 0.3 V
VFB | b4vfb | Flat-band voltage | -0.9 | -0.9
K1 | b4k1 | First-order body bias coefficient | 0.45 V^1/2 | 0.45 V^1/2
K2 | b4k2 | Second-order body bias coefficient | 0.1 | 0.1
DVT0 | b4d0vt | First coefficient of short-channel effect on threshold voltage | 2.2 | 2.2
DVT1 | b4d1vt | Second coefficient of short-channel effect on Vth | 0.53 | 0.53
ETA0 | b4et | Drain induced barrier lowering coefficient | 0.08 | 0.08
NFACTOR | B4nf | Sub-threshold turn-on swing factor. Controls the exponential increase of current with Vgs. | 1 | 1
U0 | b4u0 | Low-field mobility | 0.060 m²/Vs | 0.025 m²/Vs
UA | b4ua | Coefficient of first-order mobility degradation due to vertical field | 11.0e-15 m/V | 11.0e-15 m/V
UC | b4uc | Coefficient of mobility degradation due to body-bias effect | -0.04650e-15 V^-1 | -0.04650e-15 V^-1
VSAT | b4vsat | Saturation velocity | 8.0e4 m/s | 8.0e4 m/s
WINT | b4wint | Channel-width offset parameter | 0.01e-6 µm | 0.01e-6 µm
LINT | b4lint | Channel-length offset parameter | 0.01e-6 µm | 0.01e-6 µm
PSCBE1 | b4pscbe1 | First substrate current induced body-effect mobility reduction | 4.24e8 V/m | 4.24e8 V/m
PSCBE2 | b4pscbe2 | Second substrate current induced body-effect mobility reduction | 4.24e8 V/m | 4.24e8 V/m
KT1 | b4kt1 | Temperature coefficient of the threshold voltage | -0.1 V | -0.1 V
UTE | b4ute | Temperature coefficient for the zero-field mobility U0 | -1.5 | -1.5
VOFF | b4voff | Offset voltage in subthreshold region | -0.08 V | -0.08 V
PCLM | b4pclm | Parameter for channel length modulation | 1.2 | 1.2
TOXE | b4toxe | Gate oxide thickness | 3.5e-9 m | 3.5e-9 m
NDEP | b4ndep | | 0.54 | 0.54
XJ | b4xj | Junction depth | 1.5e-7 | 1.5e-7

For MOS2, MOS3 and MOS4, only the threshold voltage, mobility and oxide thickness are user-accessible. All other parameters are identical to MOS1.

Technology Files for DSCH

The logic simulator includes a current evaluator. To run this evaluation, the following parameters are proposed in a TEC file (example: cmos012.TEC):

DSCH 2.0 - technology file
NAME "CMOS 0.12um"
VERSION 14.12.2001
* Time unit for simulation
TIMEUNIT = 0.01
* Supply voltage
VDD = 1.2
* Typical gate delay in ns
TDelay = 0.03
THsDelay = 0.02
THvDelay = 0.06
* Typical wire delay in ns
TWireDelay = 0.07
* Typical current in mA
TCurrent = 0.5
* Default MOS length and width
ML = "0.12u"
MHVL = "0.36u"
MNW = "1.0u"
MPW = "2.0u"
*
* End cmos012.tec
*
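The structure of the TEC file (comment lines starting with `*`, header lines, and `KEY = value` assignments) lends itself to a very small reader. The Python sketch below is one possible way to load these parameters for post-processing outside DSCH; `parse_tec` is a hypothetical helper written for this example, and the file contents are simply the cmos012.TEC listing given above.

```python
SAMPLE_TEC = '''DSCH 2.0 - technology file
NAME "CMOS 0.12um"
VERSION 14.12.2001
* Time unit for simulation
TIMEUNIT = 0.01
* Supply voltage
VDD = 1.2
* Typical gate delay in ns
TDelay = 0.03
THsDelay = 0.02
THvDelay = 0.06
* Typical wire delay in ns
TWireDelay = 0.07
* Typical current in mA
TCurrent = 0.5
* Default MOS length and width
ML = "0.12u"
MHVL = "0.36u"
MNW = "1.0u"
MPW = "2.0u"
*
* End cmos012.tec
*
'''


def parse_tec(text):
    """Return the KEY = value assignments of a TEC file as a dictionary.

    Comment lines (starting with '*'), blank lines and header lines
    without '=' are skipped. Numeric values become floats; quoted
    values such as "0.12u" are kept as strings.
    """
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("*") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        value = value.strip('"')
        try:
            params[key] = float(value)
        except ValueError:
            params[key] = value
    return params


if __name__ == "__main__":
    tec = parse_tec(SAMPLE_TEC)
    print("Supply voltage:", tec["VDD"], "V")
    print("Typical gate delay:", tec["TDelay"], "ns")
    print("Default MOS length:", tec["ML"])
```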


Appendix B MICROWIND31 Program Operation and Commands

Getting Started

To get your MICROWIND version 3.1 program started, use the following procedure:

• Connect to http://www.microwind.net.
• Follow the download procedure.
• Follow the install procedure.
• Double-click on the MICROWIND 3.1 icon to start the software.

The software runs on Windows 98, 2000, NT and XP operating systems.

Command Line Parameters

The command line may include two parameters:
• the first parameter is the default mask file loaded at initialization;
• the second parameter is the design rule file loaded at initialization.

For example, the command « Microwind31 test.MSK cmos65n.RUL » executes MICROWIND with the default mask file « test.MSK » and the rule file « cmos65n.RUL ».

List of Commands in Microwind31

2D Vertical Cross-section

Click on the above icon to access process simulation. A mouse-operated line defines the cross-section. The screen of Fig. B-1 appears. The arrows can be used to move the cross-section to the right or to the left along the X-axis, and forward and backward along the Y-axis. Zoom-in and Zoom-out are available. Remove the layer names by removing the tick in front of “Layer infos”.

Fig. B-1 Process section of a portion of an IC showing two n-channel MOS devices (center), two polarization contacts (sides) and the metal structure

About MICROWIND31

Information about the software release and contact for support (see Fig. B-2).

Fig. B-2 Information about the software version and licence

Add Text to Layout

Use this icon to attach a text to one box or location in the design. The text documents the layout and should be used as much as possible for each significant node, such as inputs and outputs. To add some text at a particular place, use the following three steps:

• Click on the icon.
• Set the text location with the mouse. A dialog box appears.
• Enter the text in front of “Label name:” and press Assign. The text is set in the drawing.

A text can be modified as follows: click on the icon and click inside the existing text. The old text appears. Modify it and click on “Assign”. You may add a clock, a pulse, a VDD or a VSS voltage source to the text.

Compile One Line

The cell compiler is a specific tool designed for the automatic creation of CMOS cells from a logic description. Click on Compile → Compile One Line. The menu shown in Fig. B-3 appears. The default equation corresponds to an inverter gate. If needed, use the keyboard to modify the equation and then click on Compile. The gate is compiled and its corresponding layout is generated.

Fig. B-3 The cell compiler window

• The first item of the one-line syntax is the output name.
• It is followed by the sign « = » and by the list of input names separated by the operators AND ‘&’, OR ‘|’, XOR ‘^’, NOT ‘~’ and XNOR ‘~^’. Parentheses can be added if needed.
• The input and output names are strings of at most eight characters.

Table B-1 Examples of logic cell descriptions

Cell | Formula
Inverter | out=/in
NAND gate | n=/(a.b)
3 Input OR | s=a+b+c
3 Input NAND | out=/(a.b.c)
AND-OR Gate | cgate=a.(b+c)
CARRY Cell | cout=(a.b)+(cin.(a+b))
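To check what a one-line description computes before compiling it, the formulas of Table B-1 can be evaluated with a few lines of Python. The sketch below handles the notation used in the table, where ‘/’ is NOT, ‘.’ is AND and ‘+’ is OR. It assumes the usual precedence (NOT binds tighter than AND, which binds tighter than OR) and does not cover the other operator symbols; `tokenize` and `evaluate` are hypothetical helpers, not part of the cell compiler.

```python
import re
from itertools import product

# A token is either a name or one of the symbols / . + ( ) =
TOKEN = re.compile(r"\s*([A-Za-z_][A-Za-z0-9_]*|[/.+()=])")


def tokenize(text):
    """Split a formula such as 'n=/(a.b)' into names and operator symbols."""
    text = text.strip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def evaluate(formula, values):
    """Return (output_name, 0 or 1) for the given input values."""
    tokens = tokenize(formula)
    if len(tokens) < 3 or tokens[1] != "=":
        raise ValueError("expected 'output = expression'")
    output, body = tokens[0], tokens[2:]
    pos = 0

    def peek():
        return body[pos] if pos < len(body) else None

    def take():
        nonlocal pos
        token = body[pos]
        pos += 1
        return token

    def factor():  # NOT, parentheses or an input name
        token = take()
        if token == "/":
            return not factor()
        if token == "(":
            value = expr()
            if take() != ")":
                raise ValueError("missing ')'")
            return value
        return bool(values[token])

    def term():  # AND of factors
        value = factor()
        while peek() == ".":
            take()
            rhs = factor()  # always parse the right operand
            value = value and rhs
        return value

    def expr():  # OR of terms
        value = term()
        while peek() == "+":
            take()
            rhs = term()
            value = value or rhs
        return value

    result = expr()
    if pos != len(body):
        raise ValueError("unexpected trailing tokens")
    return output, int(result)


if __name__ == "__main__":
    # Truth table of the CARRY cell of Table B-1.
    for a, b, cin in product((0, 1), repeat=3):
        name, value = evaluate("cout=(a.b)+(cin.(a+b))",
                               {"a": a, "b": b, "cin": cin})
        print(a, b, cin, "->", name, "=", value)
```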


The p-channel transistors are located on the top of the n-channel transistor net. If some layout already exists in that area, the cell origin is moved to the right until enough free space is found. If the NOT operator (symbol ‘~’) has not been specified after the ‘=’ sign, an inverter is added at the right-hand side of the compiled cell. That is why an AND gate is compiled as a NAND gate followed by an inverter.

Compile VERILOG File

The cell compiler can automatically generate a layout from a primitive-based VERILOG description text. Click on Compile → Compile Verilog File. Select a VERILOG text file and click on Generate. For instance, the MICROWIND directory contains the « FADD.V » file, which corresponds to the description of a full-adder.

module fadd(C,B,A,Sum,Carry);
input C,B,A;
output Sum,Carry;
xor #(12) xor2(w4,A,B);
nand #(10) nand2(w5,A,C);
nand #(10) nand2(w6,B,C);
nand #(10) nand2(w7,B,A);
xor #(12) xor2(Sum,w4,C);
nand #(10) nand3(Carry,w7,w6,w5);
endmodule
// Simulation parameters
// C CLK 10 10
// B CLK 20 20
// A CLK 30 30
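The gate netlist of FADD.V can be checked functionally before compiling it into layout. The Python sketch below evaluates the same six primitives, with the same internal node names w4 to w7, for all input combinations and verifies that the result matches binary addition. Gate delays (the `#(12)` and `#(10)` annotations) are ignored; this is a purely logical check written for illustration, not a substitute for the simulation in MICROWIND.

```python
from itertools import product


def nand(*inputs):
    """NAND of any number of 0/1 inputs."""
    return 0 if all(inputs) else 1


def xor(a, b):
    """XOR of two 0/1 inputs."""
    return a ^ b


def fadd(c, b, a):
    """Evaluate the netlist of FADD.V and return (Sum, Carry)."""
    w4 = xor(a, b)
    w5 = nand(a, c)
    w6 = nand(b, c)
    w7 = nand(b, a)
    total = xor(w4, c)
    carry = nand(w7, w6, w5)
    return total, carry


if __name__ == "__main__":
    for c, b, a in product((0, 1), repeat=3):
        total, carry = fadd(c, b, a)
        # Sum and Carry must encode the arithmetic sum of the three inputs.
        assert 2 * carry + total == a + b + c
        print(f"C={c} B={b} A={a} -> Sum={total} Carry={carry}")
```

The three 2-input NAND gates followed by the 3-input NAND implement Carry = A.B + B.C + A.C using inverting gates only, which suits CMOS implementation.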

Fig. B-4 The VERILOG compiler window (callouts: add a pad ring around the layout; Verilog text; horizontal limit of the layout; compiling information; click here to start compiling; back to Microwind)

Table B-2 The VERILOG primitives supported by the CMOS compiler

Primitive | Nodes | Example
dreg | Inputs: Data, RESET, CLOCK; Outputs: Q, nQ | dreg reg1(d,rst,h,q,nq);
inv, not | Inputs: IN; Outputs: OUT | inv inv1(s,e); not inv1(s,e); // both ‘inv’ and ‘not’ can be used
and | Inputs: 2 to 4; Outputs: S | and and1(s,a,b,c,d); // limit inputs to 4
nand | Inputs: 2 to 4; Outputs: S | nand nand1(s,a,b,c,d);
or | Inputs: 2 to 4; Outputs: S | or or3(s,a,b,c);
nor | Inputs: 2 to 4; Outputs: S | nor my_nor4(s,a,b,c,d);
xor | Inputs: a, b; Outputs: S | xor xor_gate(xor_out,d0,d1);
nmos | Inputs: gate, source; Outputs: drain | nmos nmos1(d,s,g);

The I/O nodes are routed on the top and the bottom of the active parts, with a regular spacing to ease automatic channel-routing between cells. Click Compile → Show grid to superimpose the routing grid on the layout. In Fig. B-5, the routing of I/O between basic cells is presented. Notice that the routing is performed both on the top and on the bottom of the active parts.

Convert into CIF

MICROWIND converts the MSK layout into CIF using a specific interface, invoked by File → Convert Into → CIF layout file. The CIF file can be exported to VLSI CAD software. The right-side table of the screen (Fig. B-6) gives the correspondence between MICROWIND layers and CIF layers, the number of boxes in the layout and the corresponding over-etch. The over-etch is used to modify the final size of the CIF boxes in order to fit the exact design rules. Click on To CIF to start the conversion. Some parts of the result appear in the left-side window. The main unit is 1 nm. You may change it to fit the requirements of the target CAD tool. For the CMOS 0.25 µm rule file (cmos025.RUL), notice the over-etch applied to contact and via. This over-etch is mandatory to obey the final design rules, while keeping the user-friendly and portable lambda-based design.


Fig. B-5 The compiler grid

Fig. B-6 The CIF generation screen


Concerning diffusions, notice that the CIF generator produces active areas and implants. MICROWIND uses simple N+diffusion and P+diffusion while most industrial layout design tools use the concept of active area surrounded by implants, either N+ or P+, as illustrated in Fig. B-7.

Fig. B-7 The CIF conversion produces active areas and implants, to be compatible with industrial processes

This means that each N+ diffusion box drawn in MICROWIND is converted into two boxes: one linked with “Active Area”, with a code name declared in the design rule file, and a second one linked to “diffn”, with a given over-etch. Each P+ diffusion box is converted into two boxes too: one is linked with the same “Active Area”, and the second is linked to “diffp”, with a given over-etch.

Convert into GDS2 (Added to Version 3.1)

MICROWIND also converts the MSK layout into GDS2 format using a specific interface, invoked by File → Convert Into → GDS2 file. The GDS2 file can be exported to virtually all VLSI CAD software. The right-side table of the screen (Fig. B-8) gives the correspondence between MICROWIND layers and GDS2 layers, the number of boxes in the layout and the corresponding over-etch. The over-etch is used to modify the final size of the GDS2 boxes in order to fit the exact design rules. Click on Generate GDS2 to start the conversion. The result is a binary file, “myfile.gds” in this example. The layer numbers are configured for each technology using the keyword “GDS2” in the RUL file. An example of GDS2 layer definition as it appears in “CMOS90n.RUL” is listed below.


Fig. B-8 The GDS2 conversion window

* GDS2 Layers
*
gds nwell 1
gds diffp 17
gds diffn 16
gds aarea 2
gds poly 13
gds contact 19
gds metal 23
gds via 25
gds metal2 27
gds via2 32
gds metal3 34
gds via3 35
gds metal4 36
gds via4 52
gds metal5 53
gds via5 54
gds metal6 55
gds text 94
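The layer definitions above follow a simple "gds <layer> <number>" pattern. As an illustration only, the following Python sketch reads such entries into a dictionary. The function name parse_gds_layers and the treatment of lines starting with "*" as comments are assumptions made for this example; they are not part of MICROWIND.

```python
def parse_gds_layers(rul_text):
    """Return {layer_name: gds_number} from lines shaped like 'gds <layer> <number>'.

    Hypothetical helper: lines starting with '*' are assumed to be comments.
    """
    layers = {}
    for raw in rul_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("*"):
            continue  # skip blank lines and comment lines
        fields = line.split()
        if len(fields) == 3 and fields[0].lower() == "gds":
            layers[fields[1]] = int(fields[2])
    return layers


# A few of the entries quoted from the rule file listing
SAMPLE = """\
* GDS2 Layers
*
gds nwell 1
gds diffp 17
gds diffn 16
gds aarea 2
gds poly 13
gds contact 19
gds metal 23
gds via 25
"""

layers = parse_gds_layers(SAMPLE)
print(layers["poly"], layers["aarea"])
```

Such a table is convenient when checking that the layer numbers expected by the target CAD tool match those declared in the rule file.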


Convert into SPICE MICROWIND converts the MSK layout into SPICE format using a specific interface, invoked by File → Convert Into → SPICE file. The SPICE file can be exported to analog SPICE-compatible simulators such as PSPICE or WinSPICE. More information about MICROWIND interfacing to WinSPICE may be found in Appendix E. Colors • Switch to monochrome. The layout is drawn in black and white. This type of drawing is convenient to build monochrome documentation. Press “Alt”+“Print Screen” to copy the screen to the clipboard. Then, open Microsoft Word™ or WordPad, click Edit → Paste. The screen is inserted into the document. • White background. The layers appear with a palette of colors on a white background. Copy (CTRL+C)

Click on the Copy icon. Move the cursor to the design window, and delimit the active area with the mouse. Consequently, all the graphics included in this area are copied. The external shape of the copied elements appears. Fix those copied elements at the desired location by a click on the mouse. Click on Undo to cancel the copy command. Cut (CTRL+X)

Click on the Cut icon. Move the cursor to the design window, and delimit the active area with the mouse. All the graphics included in this area are erased. Click on Undo to restore those elements to the design.
• A layer is protected from erasing if you remove the tick in the palette twice. In the palette, an empty square to the right of the layer indicates a protected layer.
• A layer is unprotected from erasing if you select it again in the palette. A tick in the square to the right of the layer indicates an unprotected layer.
• A single box can be erased by clicking inside it while the Cut command is active.

Design Rules

Provides on-line help for using MICROWIND. It includes a summary of commands, and some details about the design rules, as shown in Fig. B-9.


Fig. B-9 Design rules and electrical rules proposed in the help menu

Design Rule Checker

The design rule checker (DRC) scans the entire design and verifies that all the minimum design rules are respected. Click on the icon shown above or on Analysis → Design Rule Checker to run the DRC. The errors are highlighted in the display window, with an appropriate message giving the nature of the error. Details about the position and type of the errors appear on the screen (Fig. B-10).

Fig. B-10 Example of design rule error


Draw a Box

The Draw Box icon is the default icon. It creates a box in the selected layer. The default layer is polysilicon. If the Draw Box icon is not selected, click on it. Then, move the cursor to the display window and fix the first corner of the box by pressing the mouse button. Drag the mouse to the opposite corner of the box, then release the button to create the box.
• The active layer is selected in the palette.
• The red color indicates the active layer.
• The gray key on the right of the layer button specifies that all boxes using the layer can be erased, stretched or copied.
• A click on the gray key turns the key to a red color. A red key protects the layer.

Duplicate XY

The command Duplicate XY is very useful to generate an array of identical cells, such as RAM cells. Click on Edit → Duplicate XY and enclose the elements to duplicate in an area defined with the mouse; the screen shown in Fig. B-11 appears. In both X and Y, the default multiplication factor is two. You may adjust the space between cells. By default, the cells touch each other. Selected boxes appear in the right window, and also in yellow on the main layout window.

Fig. B-11 The Duplicate X Y menu, used for a metal1/metal2 crosspoint


The data option (Assign data, Edit Hexadecimal data, Fill Array) is very useful to generate ROM masks or decoder arrays. An example of decoder array is given in Fig. B-12. The programming affects the via between metal1 and metal2, according to the list of hexadecimal values given in the editing list.
• Enter the desired data in the Edit hexadecimal data editing area. The values must be separated by a space.
• Click Fill array to convert these data into Boolean values, according to the X,Y array size.
• Select the appropriate box for programming. In this case, a 1 creates a via at each X,Y location.
• Click Generate. The following result appears.

Fig. B-12 Example of duplicating a pattern with programmed via (DuplicateXYExample.MSK)
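The conversion of hexadecimal data into Boolean values performed by Fill array can be pictured with the short Python sketch below. It is only an illustration: the function name hex_to_bool_array is hypothetical, and the mapping (one hexadecimal value per row, most significant bit at the left-most X position) is an assumption, since the exact bit ordering used by MICROWIND is not described here.

```python
def hex_to_bool_array(hex_values, nx, ny):
    """Convert space-separated hexadecimal values into ny rows of nx Booleans.

    Assumptions (not taken from MICROWIND): one value per row,
    most significant bit first.
    """
    words = hex_values.split()
    if len(words) != ny:
        raise ValueError("expected one hexadecimal value per row")
    rows = []
    for word in words:
        value = int(word, 16)
        if value >= (1 << nx):
            raise ValueError("value %s does not fit in %d bits" % (word, nx))
        # Bit nx-1 goes to column 0, bit 0 to column nx-1
        rows.append([bool((value >> (nx - 1 - col)) & 1) for col in range(nx)])
    return rows


array = hex_to_bool_array("A 5 F 0", nx=4, ny=4)
for row in array:
    # A '1' marks a location that would receive a programmed via
    print("".join("1" if bit else "0" for bit in row))
```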

Find Floating Nodes Use the command Analysis → Find Floating Nodes to locate small portions of layout without any active connection to other layers. If floating nodes are found, numbers and labels are automatically assigned and listed in the navigator. Click on the label list to locate the floating portion of layout. Flip

To apply a rotation or a flip to one part of the design, click on Edit → Flip and Rotate, and choose the appropriate Flip command (horizontal or vertical). Delimit the active area of the boxes in the layout which are to be modified.


Generate Box

This is the same as the Draw Box command described previously. Generate Contacts

This macro generates contacts such as polysilicon/metal, N-diffusion/metal, P-diffusion/metal, metal/metal2, metal2/metal3 and so on. Stacked contacts can also be obtained here. You may also click the above icon in the palette. Multiple contacts can be generated by entering a number of contacts greater than one in X and Y (Fig. B-13).

Fig. B-13 The contact menu

Generate nMOS, pMOS Devices

This macro generates either an n-channel or a p-channel transistor (Fig. B-14). The double-gate MOS is also available in some CMOS technologies, for building the EEPROM memory. The parameters of the cell are the channel length (default value is given by the design rules), its width, and the number of gates. Once those parameters are defined, the device outline appears. Click on the mouse to place it in the appropriate place.

Fig. B-14 The MOS generator menu

Generate Resistor This command generates a resistor in n-well, polysilicon or poly2, N+ or P+ diffusion. The default aspect of the resistor is a Z with three bars (parameter n in the menu of Fig. B-15). A virtual resistor symbol may be inserted in the resistor layout, to ensure the handling of the resistance effect during simulation. By default, an option layer configured to remove salicidation is added to the resistance layout. By default, all polysilicon and diffusions have a salicide surface metallization to decrease the sheet-resistance by a factor around 10. The unsalicide option is recommended for a high resistance value in a small area. Finally, contacts are added by default to the near and far ends of the resistance to facilitate further interconnection. Generate Inductor This command generates a coil made from user-defined metal layers. This item is used for very-high frequency oscillators. This inductor is viewed as an inductance thanks to a virtual inductor symbol inserted in the layout. An evaluation of the inductance is proposed at a click of the button Update L,Q. The inductor may be a stack of several layers, thanks to the layer menu. Generate Bus This command generates a set of parallel lines with user-defined layer, width and spacing. This command is useful to build coupled interconnects, or bus path used in the final routing of a chip.


Fig. B-15 The resistor generator menu

Fig. B-16 The inductor generator menu


Fig. B-17 The bus generator menu

Generate Path This command generates a path of interconnects using one single layer. The path width can be changed, as well as the alignment to the routing grid. A set of contacts can also be placed at both ends of the path. This command is very useful for VDD and VSS supply drawing and single-layer interconnects.

Fig. B-18 The path configuration menu


Generate I/O Pads

It is possible to add a single pad (usually 80 × 80 µm), or a set of pads all around the layout using the VDD and VSS power rings. In the latter case (adding more than one pad), give the number of pads on each side of the chip and, if need be, modify the width of the VDD and VSS tracks, as well as the number of VDD/VSS pad pairs.

Generate Diode

You may also generate a polarization seal around the contact to create a pad diode protection, for example.

Global Cross-talk Evaluation (Version 3.1)

An evaluation of the cross-talk effect, based on analytical approximations of the coupling amplitude, is available through the command Analysis → Global Crosstalk analysis. The complete crosstalk calculation of each interconnect for the layout “AddBCD.MSK” is displayed in Fig. B-19. The first-order formulations used for the computation of the cross-talk voltage ∆V are shown below.

$$C_x = \frac{C_{12}}{C_{victim}}$$

$$x = \frac{W_{victim}\,L_{affector}}{L_{victim}\,W_{affector}}$$

$$\Delta V = V_{dd}\,\frac{C_x}{1 + C_x}\,\frac{1}{1 + x}$$

Where:
C12 = crosstalk capacitance (Farad)
Cvictim = capacitance of victim (Farad)
W = width of MOS device (m)
L = length of MOS device (m)
Vdd = supply voltage (V)

[Figure: C12 couples the affector and victim lines; Caffector and Cvictim link each line to the substrate (ground).]

In Fig. B-19, the nodes in red correspond to the highest crosstalk noise, while the nodes in blue have almost no noise due to lateral coupling. Vss and Vdd nodes may be removed from the list, and interconnects with a length less than a user-defined value may also be removed. Values higher than 30% of VDD may jeopardize the safe behavior of signal propagation. In the list, three internal nodes (i0w9, iow10, iow4) may suffer noise above that limit. However, the evaluation takes into account a worst-case situation where all potential aggressors switch synchronously. A time-domain simulation including the evaluation of cross-talk noise should be conducted for these three victim nodes, to verify that the noise does not reach this worst-case value.


Fig. B-19 Global cross-talk extraction and classification of dangerous nodes (AddBCD.MSK)
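As a numerical illustration of the first-order cross-talk estimate, the Python sketch below assumes the expressions read Cx = C12/Cvictim, x = (Wvictim·Laffector)/(Lvictim·Waffector) and ∆V = Vdd·Cx/(1+Cx)·1/(1+x). The function name, the component values and the use of the 30% threshold as a simple flag are choices made for this example; the tool's actual implementation is not shown in this appendix.

```python
def crosstalk_voltage(c12, c_victim, w_victim, l_victim,
                      w_affector, l_affector, vdd):
    """First-order cross-talk amplitude on the victim line, in volts."""
    cx = c12 / c_victim                                # capacitance ratio
    x = (w_victim * l_affector) / (l_victim * w_affector)  # driver strength ratio
    return vdd * cx / (1.0 + cx) / (1.0 + x)


# Hypothetical values: 10 fF coupling, 30 fF victim capacitance,
# identical drivers on both lines (x = 1), 1.2 V supply.
vdd = 1.2
dv = crosstalk_voltage(c12=10e-15, c_victim=30e-15,
                       w_victim=1e-6, l_victim=0.12e-6,
                       w_affector=1e-6, l_affector=0.12e-6,
                       vdd=vdd)
risky = dv > 0.30 * vdd   # limit quoted in the text
print("dV = %.3f V, above 30%% of VDD: %s" % (dv, risky))
```

With these values the estimate is 12.5% of VDD. A wider victim driver (larger x) reduces the noise, while a larger coupling capacitance increases it.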

Global Delay Evaluation (Version 3.1)

At the IC level, the delay of each interconnect can be evaluated globally using analytical approximations. MICROWIND version 3.1 implements very simple approximations of the delay within interconnects, using the following formulation (more information may be found in Chapter 5 of the book “Basic CMOS Cell Design”):

delay = 0.43*Rline*Cline + 0.92*(Rline*Cgate + Rd_mos*(Cline + Cgate))

Where:
delay = RC delay of the propagation (in s)
Rline = resistance of the line (in Ω)
Cline = capacitance of the interconnect (in Farad)
Cgate = capacitance of the loading gates (in Farad)
Rd_mos = equivalent on-resistance of the MOS device driving the interconnect (in Ω)

Click Analysis → Global Delay Evaluation within MICROWIND to access this command. An example of the complete delay calculation of each interconnect is displayed in Fig. B-20. The classification of each node by decreasing delay appears in the navigator window. The worst delay appears at node Y1, with a delay estimated at 412 ps.


Fig. B-20 RC delay estimation at chip level (AddBCD.MSK)
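The delay formulation quoted above is easy to evaluate by hand or in a few lines of Python. The sketch below applies it to two interconnects and sorts them by decreasing delay, as the navigator window does. The node names and the R, C values are hypothetical and chosen only for illustration.

```python
def interconnect_delay(r_line, c_line, c_gate, rd_mos):
    """RC delay in seconds, using the first-order formulation quoted in the text."""
    return (0.43 * r_line * c_line
            + 0.92 * (r_line * c_gate + rd_mos * (c_line + c_gate)))


# Hypothetical nodes: (Rline in ohm, Cline in F, Cgate in F, Rd_mos in ohm)
nodes = {
    "a": (100.0, 50e-15, 5e-15, 2000.0),
    "b": (400.0, 200e-15, 5e-15, 2000.0),
}

delays = {name: interconnect_delay(*params) for name, params in nodes.items()}
# Classification by decreasing delay
ranked = sorted(delays.items(), key=lambda item: item[1], reverse=True)
for name, delay in ranked:
    print("%s: %.1f ps" % (name, delay * 1e12))
```

For node "a" the driver term Rd_mos*(Cline + Cgate) dominates, which is typical of short lines driven by small devices.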

Invert Diffusion N <-> P (Version 3.1) This command is useful to invert the nature of the diffusion. All N+ diffusions become P+, and vice versa, as illustrated in Fig. B-21.

Fig. B-21 Inverting the nature of the diffusion

Insert Layout The command File → Insert Layout is used to add an MSK file to the existing files. The inserted layout is fixed at the right-lower side of the existing layout. The current file name remains unchanged.


Interconnect Analysis with FEM A finite element method (FEM) has been implemented to display the electric field lines that appear between conductors used for interconnects. Click Analysis → Interconnect Analysis to access the screen shown in Fig. B-22. Four basic conductor configurations are proposed: single conductor over a ground plane, conductor between two ground planes, two conductors over a ground plane or between two ground planes. The default conductor dimensions are linked to the minimum design rules. Click Compute Field to compute the voltage distribution inside the insulator, from which the electrical field lines are deduced. Notice that the special low-K dielectric creates a discontinuity at the interface with SiO2. The R, L, C parameters listed in the Compute section are deduced from analytical formulations, as detailed in Chapter 5 of the book “Basic CMOS cell design”.

Fig. B-22 A view of the field lines around a conductor

Layer Connection

The command Edit → Layer Connection (or the icon above) is a very convenient way to establish automatically a design-rule compliant link between the lower-layer and the upper-layer found at the cursor location. An example of direct connection between N+ diffusion and metal5 is given in Fig. B-23.


Fig. B-23 Automatic generation of a contact between the upper and lower layers

Leave Microwind3.1 Click on File -> Leave MICROWIND3.1 (Or CTRL+Q) in the main menu. If you have made a design or if you have modified some data, you will be asked to save it. After confirmation, you can return to Windows. Generate The layout generator includes a set of predefined layout macros such as box, contacts, NMOS and PMOS devices, resistor, metal bus, metal path, inductor, diode, capacitor, logo and I/O pads (Fig. B-24). The cells are built according to design rules, and user-specified size parameters. Label List (Version 3.1) The most convenient way to find a text in the layout is to invoke View → Label List. The list of text labels appears in the navigator menu. If you click on the desired text, the screen is redrawn so that the text label is at the center of the window, with two lines drawing a cross at the text location. Its properties appear in the navigator menu. • Click on Hide to close the navigator window. • Click on Extract to add the electrical properties of the selected text if the layout has not been previously extracted.

Fig. B-24 The generator menu

In the case of a very long text list, select the first letter of the text at hand, press that letter on the keyboard. This will automatically effect an alphabetic search and the selector will move to the first label starting with the selected letter. Lambda Grid The command View → Lambda Grid draws or hides the lambda grid on the layout window. The grid is automatically fitted with the zoom scale.


Make SPICE File

Click on File → Make Spice File to translate your design into a SPICE-compatible description. The circuit extractor included in the software generates the equivalent circuit diagram of the layout and a SPICE-compatible netlist ready to be simulated. We recommend WinSpice shareware for Windows SPICE simulation (see www.winspice.com). You may select the model to be used for simulation. The choice lies between model 1, model 3 and BSIM4.
• The SPICE description includes the list of n-channel and p-channel transistors and their associated width and length extracted from the layout.
• The text file also details the node names, parasitic capacitances, and device models.
• The SPICE filename corresponds to the current filename with the extension .CIR

Measure Distance

The ruler gives the horizontal and vertical measurements (dx and dy) between two points, directly on the screen, in lambda and in microns. It is accessible through Analysis → Measure Distance. The algebraic distance d is also given in µm. The ruler is erased by the command View → Refresh the screen or by pressing <ESC>.

MOS Characteristics

Click on the icon. The Id/Vd curve of the default MOS (W = 20 µm, L = Lminimum) appears (Fig. B-25). The effects of changing the model parameters can be seen directly on the screen by clicking the little arrows (up/down), which change the parameter values.
• Click “Id vs. Vg” to highlight the threshold voltage
• Click “Id(log) vs. Vg” to see the sub-threshold behavior
• Add measurements by selecting a « .MES » file
• Switch from NMOS to PMOS device by a click on the corresponding button
• Select the size of the device in the lower list menu

MOS List Click Edit → MOS List to get the list of n-channel and p-channel MOS devices currently edited in the layout. The MOS list is displayed in the navigator window. Click the desired MOS in the list to zoom at the corresponding location in the layout.


Fig. B-25 The Id/Vd characteristics of an n-channel MOS device W = 1 µm, L = 0.1 µm, using the BSIM4 model

Move Area or Stretch

To move one box, click on the above icon. Using the mouse, create an area that includes the box. Then, drag the mouse to the new location and release the button. The box is moved to the new place. Proceed in the same way to move a set of boxes.
• To protect a layer from moving, click on the rectangle situated on the right side of the layer in the palette. This removes the tick.
• To stretch a box, click on the side of the box that you want to stretch. The box outline appears. Drag the mouse to the new location and release the button. The box is stretched.
TIP: To catch the desired border of the box, draw a line perpendicular to the border, entering the box.


Move Step by Step (CTRL+M) To move a box lambda by lambda, click Edit → Move Step by Step. Using the mouse, create an area that includes the boxes. The selection appears in yellow. Then, click the arrow until the selection has been moved to the new place. The step value (in lambda) is fixed in the edit line.

Navigator Window The navigator window is used to display various information such as the electrical node properties, as illustrated below, the device list, option layer properties, etc. The navigator window automatically appears when the command View Electrical Node is invoked.

Fig. B-26 The navigator displays the electrical node property


New Click on File → New in order to restart the software with an empty screen. The current design should be saved before asserting this command, as all the graphic information will be physically removed from the computer memory. No Undo is available to disable the New command. Open

Click on the above icon. In the list, double-click on the file to load. « .MSK » is the default extension that corresponds to the layout files. The CIF files « .CIF » can also be loaded. The appropriate conversion program transforms the input CIF into MSK format. Paste (CTRL+V) Used to paste elements previously deleted. Select the command Edit → Paste. The shape of the elements deleted using the command Edit → Cut appears. Move the cursor to the design window. Fix the desired location by a click on the mouse. Click on Undo to cancel the Paste command. Palette of Layers

The palette is located on the right-side of the screen. A little tick indicates the current layer. The selected layer by default is a polysilicon (PO). The palette aspect is given in Fig. B-27. The current layer is N+ diffusion.

Fig. B-27 The palette of layers and its most important features


If you remove the tick on the right side of the layer, the layer is switched to protected mode. The Cut, Stretch and Copy commands no longer affect that layer.
• Use View → Protect all to protect all layers. The ticks are erased.
• Use View → Unprotect all to remove the protection. All layers can be edited.

Parametric Analysis

The Parametric Analysis command is a powerful way to investigate the variation of the circuit performance with respect to one key parameter. The simulation is performed iteratively, with one parameter varying within a user-defined range and one parameter being evaluated. The example of an inverter delay variation with the output loading capacitance is illustrated in Fig. B-28.

Fig. B-28 Inverter delay increases with the output capacitance (InvCapa.MSK)

1. Load the file InvCapa.MSK
2. Invoke the command Analysis → Parametric Analysis
3. Click on the output node
4. Click Start Analysis.

By default, the capacitance of the output node is increased step-by-step from its default value Cdef to Cdef + 100 fF. For each value of the output capacitance, the analog simulation is performed, and the last computed rise time is plotted, appearing as one single red dot in the graph. The complete graph is built once all analog simulations have been completed. The memory button lets you store one curve (the evaluation of the rise times, for example) prior to a new parametric simulation, for comparison purposes.


Three main parameters may vary in the parametric analysis: the capacitance as in Fig. B-28, voltage or temperature. Several analog parameters may be monitored: rise and fall delay, oscillating frequency, power consumption, final voltage of a node, cross-talk, etc. For example, the parametric analysis tool may be invoked to monitor the power dissipation versus the supply voltage. The incremental change in the supply voltage is defined from 0.5 to 2.0 V. The supply voltage step is 0.1 V. In the measurement window, the item Dissipation must be selected. The result should show a non-linear dependence of power dissipation on VDD. Process Steps in 3D

Click Simulation → Process Steps in 3D. Click Next step to watch how the layout currently edited on the screen will be fabricated using the selected technology. An example of a 3D view of a layout is proposed in Fig. B-29. Use the arrow to shift the displayed portion. Zoom-in and Zoom-out are available.

Fig. B-29 Process section in 3D

Protect All Click on Edit → Protect All to protect all layers from editing. All ticks in the palette are removed.


Fig. B-30 The palette aspect after protecting all layers

Properties

The command File → Properties provides some information about the current technology, the percentage of memory used by the layout and the size of the layout plus its detailed contents. If the layout has previously been extracted or if you click Extract, the number of devices and nodes is updated. If you have loaded a technology using the command File → Select Foundry and you want to make it the default technology, click Set as default technology. You can also get access to the design rules details.

Print Layout

Click on File → Print Layout to transfer the graphical contents of the screen to the printer. Alternatively, you can copy the window to the clipboard by pressing Alt+Print Screen, in order to import the screen into your favorite text editor. In the text editor or in the graphic editor, simply click on Edit → Paste. We recommend that you switch to monochrome mode first by invoking the function File → Colors → Switch to Monochrome. In that case the layout is drawn on a white background using gray levels and patterns.


Fig. B-31 Layout properties

Refresh Click on View → Refresh to simply redraw the layout. Rotate

To apply a rotation to one part of the design, click on Edit → Flip and Rotate → Rotate 90° or Rotate –90°. Using the mouse, delimit the area containing the boxes to be modified.

Reference Manual

Click Help → Reference Manual to get access to a description of each command of the software. The main menu is shown in Fig. B-32.

Resonant Frequency

Use this command to compute the resonant frequency and characteristic impedance of an inductor and capacitor couple. The resonant frequency fr is given by the following formula.

$$f_r = \frac{1}{2\pi\sqrt{LC}}$$ (Eq. B-1)


Fig. B-32 The on-line help

The characteristic impedance is given by:

$$Z_0 = \sqrt{\frac{L}{C}}$$ (Eq. B-2)

The inductor impedance is estimated at a given frequency using the equation:

$$Z_L = 2\pi f L$$ (Eq. B-3)

The capacitor impedance is estimated at a given frequency using the equation:

$$Z_C = \frac{1}{2\pi f C}$$ (Eq. B-4)

For a given value of inductor and capacitor (5 nH and 1 pF in the example of Fig. B-33), the resonant frequency is directly computed in mega-hertz (MHz). The resonant frequency is around 2.25 GHz, with ZL = 70 Ω and ZC = 70 Ω. Routing Grid The command View → Routing Grid draws or hides the routing grid on the layout window. The routing grid is used for cell compiling into layout, invoked by the commands Compile one line or Compile Verilog File.


Fig. B-33 The resonant frequency interface
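The 5 nH and 1 pF example can be checked with a few lines of Python. The sketch below evaluates Eq. B-1 to Eq. B-4 using the standard LC expressions (fr = 1/(2π√(LC)), Z0 = √(L/C), ZL = 2πfL, ZC = 1/(2πfC)). The function names are chosen for this illustration and are not MICROWIND commands.

```python
import math


def resonant_frequency(l, c):
    """Resonant frequency of an LC couple, in Hz (Eq. B-1)."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l * c))


def characteristic_impedance(l, c):
    """Characteristic impedance in ohm (Eq. B-2)."""
    return math.sqrt(l / c)


def inductor_impedance(f, l):
    """Magnitude of the inductor impedance at frequency f (Eq. B-3)."""
    return 2.0 * math.pi * f * l


def capacitor_impedance(f, c):
    """Magnitude of the capacitor impedance at frequency f (Eq. B-4)."""
    return 1.0 / (2.0 * math.pi * f * c)


L_VALUE = 5e-9    # 5 nH
C_VALUE = 1e-12   # 1 pF
fr = resonant_frequency(L_VALUE, C_VALUE)
z0 = characteristic_impedance(L_VALUE, C_VALUE)
zl = inductor_impedance(fr, L_VALUE)
zc = capacitor_impedance(fr, C_VALUE)
print("fr = %.2f GHz, Z0 = %.1f ohm, ZL = %.1f ohm, ZC = %.1f ohm"
      % (fr / 1e9, z0, zl, zc))
```

At the resonant frequency the inductor and capacitor impedances are equal, and both match Z0, which is consistent with the values of about 70 Ω quoted in the text.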

Save Layout (CTRL+S)

Click on File → Save Layout to save the layout with its current name. The default name is « EXAMPLE.MSK ».

Save As

A new window appears, in which you enter the design name. Type the desired file name and press Save. Your design is now saved with the .MSK extension.

Select Foundry (CTRL+F)

Click on File → Select Foundry. The list of available processes appears. The default design rule file is written in bold characters. Various technologies are available, from 1.2 µm down to 45 nm. Click on the rule file name and the software reconfigures itself to adapt to the new process.

Simulation Parameters

Click Simulate → Simulation Parameters to access a set of parameters that pilot the simulation. The screen shown in Fig. B-34 appears.
• In the “models, parameters” menu, the MOS level can be chosen between Level one, Level three and BSIM4. The simulation can run in typical case, min case (where the threshold voltage is increased by 20% and the mobility is decreased by 20%), max case (where the threshold voltage is decreased by 20% and the mobility is increased by 20%) and Monte-Carlo, where the threshold and mobility are chosen in a random way within the min and max boundaries.


Table B-3 Description of the design rule files

Technology file   Description             Minimum gate length   Value of lambda
Cmos12.rul        2 metal layers, 5 V     1.2 µm                0.6 µm
Cmos08.rul        2 metal layers, 5 V     0.7 µm                0.35 µm
Cmos06.rul        2 metal layers, 5 V     0.5 µm                0.25 µm
Cmos035.rul       3 metal layers, 3.3 V   0.4 µm                0.2 µm
Cmos025.rul       5 metal layers, 2.5 V   0.25 µm               0.125 µm
Cmos018.rul       5 metal layers, 1.8 V   0.2 µm                0.1 µm
Cmos012.rul       6 metal layers, 1.2 V   0.12 µm               0.06 µm
Cmos90n.rul       6 metal layers, 1.0 V   0.1 µm                0.05 µm
Cmos65n.rul       6 metal layers, 0.8 V   0.07 µm               0.035 µm
Cmos45n.rul       8 metal layers, 0.7 V   0.05 µm               0.025 µm

Choose the model here (Level3 by default)

Fig. B-34 The simulation parameter menu

• The main parameters are the supply VDD, the I/O supply, the temperature, the simulation length and, if noise is added to the inputs, the RMS noise amplitude. In the same menu, the tick at the bottom enables the simulation result to be written to a “.DAT” output text file, by default every 100 ps.


• In the “Extractor Options” window (Fig. B-35), options related to the extraction from layout to electrical netlist are displayed. The default extraction includes the removal of redundant boxes (Purge) and the removal of overlaps (Merge). The fast extraction does not handle Purge or Merge operations. • Other options concern the computation of lateral capacitance and vertical cross-talk capacitance.
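The typical, min, max and Monte-Carlo cases of the “models, parameters” menu can be pictured with the following Python sketch. It is an illustration under stated assumptions, not the simulator's code: the function name corner_parameters is hypothetical, the max case is taken as the mirror of the min case (threshold lowered by 20%, mobility raised by 20%), and the Monte-Carlo draw is assumed to be uniform between the min and max boundaries.

```python
import random


def corner_parameters(vt, mobility, case, rng=None):
    """Return (threshold voltage, mobility) for the requested simulation case.

    Assumptions: +/-20% boundaries as described in the text; 'max' mirrors
    'min'; Monte-Carlo draws are uniform between the two boundaries.
    """
    if case == "typical":
        return vt, mobility
    if case == "min":
        return vt * 1.2, mobility * 0.8   # slow device
    if case == "max":
        return vt * 0.8, mobility * 1.2   # fast device
    if case == "monte-carlo":
        rng = rng or random.Random()
        return (rng.uniform(0.8 * vt, 1.2 * vt),
                rng.uniform(0.8 * mobility, 1.2 * mobility))
    raise ValueError("unknown case: %r" % case)


# Hypothetical typical values: Vt = 0.4 V, mobility = 0.06 m2/V.s
for case in ("typical", "min", "max"):
    print(case, corner_parameters(0.4, 0.06, case))
print("monte-carlo", corner_parameters(0.4, 0.06, "monte-carlo",
                                       rng=random.Random(1)))
```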

Fig. B-35 The simulation parameter menu

Simulation on Layout The simulation is performed directly on the layout with a palette of colors. Interesting layout files to be simulated in this mode are analog blocks such as the DAC (Fig. B-36).

Fig. B-36 Example of simulation on layout (ADC.MSK)


During simulation, a window also appears which displays the operating point of the n-channel MOS or p-channel MOS selected by the user. What may be seen is the trajectory of the operating point in the Id/Vd characteristics. In simulation mode Simulation on Layout, the node voltage is superimposed on the layout and appears with a palette of colors. Start Simulation

The above icon or the command Simulate → Start Simulation both give access to the automatic extraction and analog simulation of the layout. • Click on Voltage vs Time to obtain the transient analysis of all visible signals (Fig. B-37). The delay between the selected start node and selected stop node is computed at VDD/2. You can change the selected start node in the node list, in the right-upper menu of the window. You can do the same for the selected stop node.

Fig. B-37 Example of time-domain simulation of an inverter (CMOS.MSK)

• Click on Voltage and Currents to display all voltage curves in the lower window, and the VDD, VSS and desired MOS currents in the upper window. In that mode, the power dissipated during the simulation is also displayed.
• Click on Voltage vs Voltage to obtain the transfer characteristics between the X-axis selected node and the Y-axis selected node. Initially the start node is the first clock or pulse of the node list, and the stop node is the first varying node. This mode is useful for computing the inverter characteristics (commutation point), the DC response of the operational amplifier, or the hysteresis of the Schmitt trigger. The first simulation computes the value of the stop node for the start node varying from zero to VDD. The second click on « Simulate » computes the same for the start node varying from VDD to zero. This feature is interesting for circuits with memory effects (Schmitt trigger). Note that the curves may not be exactly the same. You may increase the precision by reducing the computational step “Precision”, accessible from the menu, and expressed in mV.

You can modify the minimum simulation step (default value of 0.3 ps in Fig. B-25), but this may be dangerous. If you increase the simulation step, the simulation speed improves but the numerical error increases too, which may lead to imprecise or even unstable simulations. If you decrease the simulation step, the simulation is slower but the numerical precision is improved. The risk of computing divergence is also reduced.

Simulation Properties

The simulation icons add properties to the nodes. Properties are applied to the electric nodes of the circuit in order to serve as simulation guides. You must specify which node is assigned to which voltage before starting the analog simulation.

VDD and VSS

The node is pushed to the power supply voltage with icon VDD (1.2 V in 0.12 µm for example), and pulled to the ground (0 V) with icon VSS. There also exists a high voltage VDDH, usually for input/output structures, represented in red with the same symbol as for VDD.


CLOCK

When a node becomes a clock, its parameters are divided as follows: rise time, level one, fall time, and level zero. All values are expressed in nanoseconds (ns). If you ask for a second clock, the period will be multiplied by two.
• You may alter level zero and level one by entering a new value with the keyboard.
• To generate a clock starting from VDD instead of VSS, change the values in the fields Level 1 and Level 0.
• Use Slower to multiply the clock period by two.
• Use Faster to divide the clock period by two.
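The four clock parameters can be read as the segments of one period of a trapezoidal waveform. The Python sketch below is a minimal illustration of that reading; the function name clock_value, the default levels, and the assumption that each period starts with the rising edge are choices made for this example rather than a description of the simulator.

```python
def clock_value(t, t_rise, t_high, t_fall, t_low, level0=0.0, level1=1.2):
    """Voltage of a trapezoidal clock at time t (all times in ns).

    Assumption: one period is rise, level one, fall, level zero, in that order.
    """
    period = t_rise + t_high + t_fall + t_low
    tp = t % period
    if tp < t_rise:                       # rising edge
        return level0 + (level1 - level0) * tp / t_rise
    tp -= t_rise
    if tp < t_high:                       # level one plateau
        return level1
    tp -= t_high
    if tp < t_fall:                       # falling edge
        return level1 + (level0 - level1) * tp / t_fall
    return level0                         # level zero plateau


# Hypothetical 2 ns clock: 0.1 ns edges, 0.9 ns plateaus
params = dict(t_rise=0.1, t_high=0.9, t_fall=0.1, t_low=0.9)
for t in (0.0, 0.05, 0.5, 1.05, 1.5):
    print("t = %.2f ns -> %.2f V" % (t, clock_value(t, **params)))

# "Slower" doubles the period: every segment is multiplied by two
slower = {name: 2.0 * value for name, value in params.items()}
```

Swapping level0 and level1 produces a clock that starts from VDD instead of VSS, which mirrors the Level 1 / Level 0 remark in the list above.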

PULSE

The pulse switches from “Level 0” (zero by default) to “Level 1” (VDD by default) after a delay ts defined in the field “Time start”.


SINUS

The sinusoidal waveform parameters are the amplitude, the offset and frequency. A noise may be added with a user-defined amplitude. The parameter “increase f ” is useful to generate a sinusoidal wave with a time-varying frequency (chirp signal). Such signals are useful for investigating the response of circuits to a range of frequencies. Examples may be found in Chapter 5 dedicated to radio-frequency design.
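A chirp of this kind can be sketched as follows. This is only an illustration: it assumes that the frequency increases linearly with time, and the parameter name `df_dt` (frequency increase in Hz per second) and the default values are hypothetical. The exact law applied by the “increase f” field is not specified here.

```python
import math


def chirp(t, amplitude=0.4, offset=0.6, f0=1e9, df_dt=0.0):
    """Sinusoid whose instantaneous frequency is f0 + df_dt * t.

    The phase is the integral of the instantaneous frequency, hence the
    0.5 * df_dt * t**2 term. A linear frequency law is an assumption.
    """
    phase = 2 * math.pi * (f0 * t + 0.5 * df_dt * t * t)
    return offset + amplitude * math.sin(phase)
```

With `df_dt=0.0` the function reduces to a plain sinusoid defined by its amplitude, offset and frequency.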

PWL

The node switches between “Level 0” (zero by default) and “Level 1” (VDD by default) according to the user-defined time-table. The easiest way to fill the table (Time, Value) is as follows:
• Enter the string “0101100”.
• Press “Insert”. The time-table is updated.
If you click “Clear”, all lines situated after the selected element of the time-table are erased. You may also invert the values (1 → 0, 0 → 1) using the button Invert. Random values may be introduced using “r”, the node may be left in a floating state using “x”, and a high voltage (as defined in the simulation options) may be assigned to the node using the character “2”.
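The expansion of such a string into a (Time, Value) table can be sketched as below. The spacing `step_ns` between rows, the `vddh` value and the function names are assumptions made for illustration; the tool chooses its own time step and reads the high voltage from the simulation options.

```python
import random


def pwl_table(pattern, step_ns=1.0, vdd=1.2, vddh=2.5, rng=None):
    """Expand a pattern such as "0101100" into (time_ns, value) rows.

    '0' -> 0 V, '1' -> vdd, '2' -> vddh (high voltage),
    'x' -> None (floating node), 'r' -> random choice of 0 V or vdd.
    step_ns is an assumed spacing between consecutive rows.
    """
    rng = rng or random.Random()
    levels = {"0": 0.0, "1": vdd, "2": vddh, "x": None}
    table = []
    for i, ch in enumerate(pattern.lower()):
        if ch == "r":
            value = rng.choice([0.0, vdd])
        elif ch in levels:
            value = levels[ch]
        else:
            raise ValueError(f"unsupported character {ch!r}")
        table.append((i * step_ns, value))
    return table


def invert(table, vdd=1.2):
    """Swap 0 V and vdd, as the Invert button does; other values are kept."""
    swap = {0.0: vdd, vdd: 0.0}
    return [(t, swap.get(v, v)) for t, v in table]
```

For instance, `pwl_table("0101100")` yields seven rows, one per character, and `invert` applied to the result exchanges the low and high levels.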


VISIBLE NODE

Click on the “eye” and click on the existing text in the layout to make the chronograms of the node appear. Initially, all nodes are invisible, but the clocks and impulse nodes are subsequently made visible.

MATH (New in version 3.1)

A user-defined equation may be entered to create virtually any type of waveform. Examples are given below. The full list of functions is reported in Table B-4.
t*1e8
pos(cos(2*pi*t/1e-9))
sqr(sin(2*pi*t/1e-9))
sin(2*pi*1e9*t)*sin(2*pi*1.45e9*t)
0.2*sin(2*pi*450e6*t)+0.1*sin(2*pi*2.0e9*t)
exp(-2*t*1e9)
white(vdd)
logic(1e-9)
gauss(0.1)+0.1*sin(2*pi*2.5e9*t)
rms(gauss(1))
avg(logic(0.1e-9))
vdd*Exp(-sqr((t-5e-9)/2e-9))*sin(2*pi*2e9*t)
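To get a feel for what these expressions produce, three of them can be evaluated outside the tool. The sketch below is written in Python and rests on assumptions: `VDD` is set to 1.2 V, `pos` is taken to clip negative values to zero, and `sqr` is taken to be the square. It mimics the expressions and is not the MATH parser of MICROWIND.

```python
import math

VDD = 1.2  # assumed supply voltage (the 0.12 µm example value)


def sqr(x):
    """Square of x (assumed meaning of Sqr)."""
    return x * x


def pos(x):
    """Positive part of x: negative values are clipped to zero (assumed meaning of Pos)."""
    return x if x > 0 else 0.0


def ramp(t):
    """t*1e8 : linear ramp reaching 1 V after 10 ns."""
    return t * 1e8


def rectified_cosine(t):
    """pos(cos(2*pi*t/1e-9)) : half-wave rectified 1 GHz cosine."""
    return pos(math.cos(2 * math.pi * t / 1e-9))


def gaussian_burst(t):
    """vdd*Exp(-sqr((t-5e-9)/2e-9))*sin(2*pi*2e9*t) : 2 GHz burst centred at 5 ns."""
    return VDD * math.exp(-sqr((t - 5e-9) / 2e-9)) * math.sin(2 * math.pi * 2e9 * t)
```

Sampling `gaussian_burst` over 0 to 10 ns shows a 2 GHz oscillation whose envelope peaks near 5 ns, which is the intent of the last expression in the list.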

Undo (CTRL+U)

The Undo command (Edit → Undo) cancels the last editing command. It is possible to undo the commands Cut, Paste, Copy, Move, Stretch, Edit and Compile. The Undo command also works after inserting layout from the library, predefined contacts, or layout from other files.


Table B-4 Functions provided in the MATH simulation property

Function   Description
Abs        Absolute value
Arcos      Inverse cosine
Arcsin     Inverse sine
Arctan     Inverse tangent
Avg        Average of the signal
Cos        Cosine
CosH       Hyperbolic cosine
Exp        Exponential
Gauss      Gaussian noise; the parameter is the variance
Int        Integral
Logic      Random logic value between VDD and VSS, changed at the period given as a parameter
Norm       Normal distribution
Pi         3.1415927
P2         2*pi
Pos        Positive value of the signal
RMS        Root mean square
Sin        Sine
SinH       Hyperbolic sine
Sqr        Square
Sqrt       Square root
White      White noise; the parameter is the amplitude
t          Time in seconds
TAN        Tangent
VDD        Voltage supply; given in the technology file
VDDH       High-voltage supply; given in the technology file
x          Time in seconds


Unprotect All (CTRL+P)

Click on View → Unprotect All to select all layers for editing purposes. All ticks in the palette are asserted.

Unselect All (ESC)

Click on View → Unselect All (or <ESC>) to unselect the layout. This command is useful to view the layout again in its default colors after commands such as View Interconnect or View Nodes, which highlight one single node.

Using Model

The electrical simulation may be performed using MOS Level one, MOS Level three or BSIM4.
• MOS Model One (Berkeley SPICE Level One) is very simple, but only valid for very long channel devices. This model is considered obsolete but remains interesting for comparison with advanced models.
• MOS Model Three (simplified version of Berkeley MOS Level Three) is still in use for first-order estimation of circuit performance. However, severe discrepancies are observed for non-standard width and length, as well as for gate voltages lower than the threshold voltage.
• BSIM4 (simplified version of Berkeley MOS BSIM4) is the state-of-the-art model for deep sub-micron device modeling.

UV Exposure to Discharge Floating Gates

The command Simulate → UV Exposure to Discharge Floating Gates enables the charges accumulated in double-poly MOS devices to be evacuated (Fig. B-38). In reality, the discharge is performed by ultraviolet light. The charging of the gate in such devices may be done using the command Simulate Mos Characteristics, by acting on the right-most cursor, which corresponds to the floating gate charge level.

Fig. B-38 Simulation of the floating gate discharge

View All

Click View → View All to fit the screen with all the graphical elements currently on display


Virtual R, L or C

A resistance, capacitance or inductance symbol may be placed directly on the layout. This feature makes it possible to include R, L and C elements in the simulation without drawing their corresponding layout. It is used in several cases, such as radio-frequency design and gate loading investigations.

View Electrical Node

Click on the icon above or on View → View Electrical Node. Then, click in the desired box in the layout. After an extraction procedure has been carried out, you will see all the boxes connected to that node. In the case of a large layout, the command may take time. The associated parasitic capacitance, the list of text labels added to the selected boxes, and the node properties are also displayed in a separate navigator window. Click “Unselect”, “Hide”, <Escape> or View → Unselect All to unselect the layout.

View Interconnect (CTRL+I)

The command View → View Interconnect performs an electrical extraction of the metal and polysilicon boxes connected to the desired point. Compared to View Electrical Node, this command works faster but does not consider diffused layers that can extend the node interconnect network. The command gives the list of connected text labels. Click on <Escape> or on View → Unselect All to unselect the layout.

With Crosstalk

Click on Simulate → With Crosstalk to add the effect of lateral and vertical coupling between conductors, in order to take into account the noise coupling. More information and several illustrations of the crosstalk effect may be found in the book “Basic CMOS cell design” by the same authors, in Chapter five, which is dedicated to interconnects.

Zoom-In and Zoom-Out

The above icons perform Zoom-In and Zoom-Out. When zooming in, the area determined by the mouse will be enlarged to fit the display window. When zooming out, the area determined by the mouse will contain the display window. If you click once, a zoom is performed at the desired location. Press CTRL+Z for Zoom-In and CTRL+O for Zoom-Out. You may also press CTRL+A for « View All ».


Appendix C DSCH31 Logic Editor Operation and Commands

Getting Started

To get your DSCH31 program started, use the following procedure:
• Connect to http://www.microwind.net
• Follow the download procedure
• Follow the install procedure
• Double-click the DSCH31.EXE icon
The software runs on Windows 98, 2000, NT and XP operating systems.

Commands

About DSCH31

Information about the software release and support.

Connect

Use the “Connect” icon to create the electrical contact between crossing interconnects.

Copyright © 2007 by The McGraw-Hill Companies, Inc. Click here for terms of use.


Cut (CTRL+X)

Click on the Cut icon or Edit → Cut. Move the cursor to the design window, and delimit the active area with the mouse. All the graphics included in this area are erased. Click on Undo to restore those elements in the design. A single symbol can be erased by a click inside its shape when the Cut command is active. A single interconnect can be erased by a click on its wire when the Cut command is active.

Check Floating Lines

The command Simulate → Check Floating Lines may be found in the Simulation menu. The schematic diagram is scanned in order to detect interconnects with a wrong connection to a symbol or to other interconnects, as in the example of Fig. C-1.

Fig. C-1 Example of floating line

Copy (CTRL+C)

Click on the Copy icon or Edit → Copy. Move the cursor to the design window, and delimit the active area with the mouse. All the graphics included in this area are copied. The external shape of the copied elements appears. Fix the copied elements at the desired location by a click on the mouse. Click on Undo to cancel the copy command.

Design Hierarchy

The design hierarchy command gives an interesting insight into the hierarchical structure of symbols, together with the list of input and output symbols.


Fig. C-2 An example of design hierarchy

Electrical Net

Click on the icon above or on View → Electrical Net. Then, click on the desired interconnect or pin in the schematic diagram. After an extraction procedure has been carried out, you will see all the wires connected to that node. Click <Escape> or View → Unselect All to unselect the diagram.

Find Critical Path

The critical path is the series of logic gates between the output and input with the longest propagation delay. The command Simulate → Find Critical Path shows the graph of the critical path. Invoke the command View → Critical Path Details to extract the list of symbols and cumulative delays which build the critical path.

Flip Vertical/Horizontal

To apply a horizontal or vertical flip to one part of the design, click on Edit → Flip. Then, delimit the area inside which the elements will be changed.

Generate SPICE File

DSCH converts the SCH schematic diagram into SPICE format using a specific interface (Fig. C-3), invoked by File → Generate SPICE file. The SPICE file can be exported to analog SPICE-compatible simulators such as PSPICE or WinSPICE. More information about DSCH interfacing to WinSpice may be found in Appendix E.

Help

Provides on-line help for using DSCH31. Includes a summary of commands, some details about the design rules, and some details about the current version of the software.


Fig. C-3 Converting a schematic diagram into a SPICE-compatible text file

Insert Another Schema

The command File → Insert another Schema is used to add an SCH file to the existing schematic diagram. Its contents are placed at the lower-right side of the existing schematic diagram. The current file name remains unchanged.

Insert User Symbol

The command Insert → User Symbol is used to add a user-defined symbol to the existing schematic diagram. The user symbol is created using the command File → Schema To New Symbol. The inserted symbol can be fixed at the desired location.

Leave Dsch31

Click on File → Leave Dsch3.1 (or CTRL+Q) in the main menu. If you have made a design or if you have modified some data, you will be asked to save it. After confirmation, you can return to Windows.

Line (or right-click with the mouse)

The “Line” icon is the default icon. It creates an interconnection between two points in the schematic diagram. If the “Line” icon is not selected, click on it. Then, move the cursor to the display window and fix the start point of the interconnect by pressing the mouse button. Keep it pressed and drag the mouse to the interconnect end. Release the mouse button and the line is created.


List of Symbols

The command gives the complete netlist corresponding to the schematic diagram. The internal structure of hierarchical symbols also appears. The symbol name, list of pins, related node numbers and model number are listed.

Make VERILOG File

DSCH3 converts the schematic diagram into VERILOG using a specific interface, invoked by File → Make Verilog file. The VERILOG text can be exported to VLSI CAD software. The right-side table of the screen (Fig. C-4) gives the list of options: module name, gate-delay information, list of labels, and general information about the size of the design. The conversion of the schematic diagram into a VERILOG description is useful for compiling the schematic diagram into layout using MICROWIND. The VERILOG description is a text with a predefined syntax. Basically, the text includes a description of the module (name, input, output), the internal wires, and the list of primitives. An example of VERILOG file generated by DSCH is given below.

Fig. C-4 Conversion into Verilog

Monochrome/Color (F5)

The command File → Monochrome/Color switches to monochrome mode and the layout is drawn in black and white. This type of drawing is convenient to build monochrome documentation by avoiding a black background. Alternatively, press “Alt”+“Print Screen” to copy the active window to the clipboard. Then, open Microsoft Word™ or WordPad and click Edit → Paste. The screen is inserted into the document.


Move

To move one graphical element, click on the “Move” icon or Edit → Move. Then, using the mouse, draw an area that includes the elements. Drag the mouse to the new location and release the mouse button. As a result, the elements are moved to the new place. A single line can be moved or stretched (depending on where you click) by a direct click on the line. A single text can be moved by a direct click on the text location.

New

Click on File → New in order to restart the software with an empty screen. The current design should be saved before asserting this command, as all the graphic information will be physically removed from the computer memory. No Undo is available to cancel the New command.

Open

Click on the above icon. In the list, double-click on the file to load. “.SCH” is the default extension that corresponds to the schematic diagrams.

Paste

Invoke the Paste command Edit → Paste. All previously copied elements are pasted at the desired location. Deleted elements can be restored that way. Click on Undo to cancel the paste command.

Print

Click on File → Print Layout to transfer the graphical contents of the screen to the printer. Alternatively, you can copy the window into the clipboard by pressing “Alt”+“Print Screen” in order to import the screen into your favorite text-editor. In the text-editor or in the graphic-editor, simply click on Edit → Paste. We recommend that you switch to monochrome mode first by invoking the function File → Monochrome/Color. In that case the layout will be drawn on a white background using gray levels and patterns.

Properties

The command File → Properties provides some information about the current technology, the percentage of memory used by the schematic diagram and its detailed contents. In the Technology part, details about the time unit, voltage supply, typical delay and typical wire delay are provided, which configure the delay estimation and current estimation during logic simulation.

Rotate

To apply a rotation to one part of the design, click on Edit → Rotate. Select one of the proposed actions:
• Rotate right or 90°
• Rotate left or –90°


Fig. C-5 File properties, including statistics about the number of symbols, nodes and lines

Then, delimit the area inside which the elements will be rotated.

Save, Save As

Click on File → Save to save the schematic diagram with its current name. The default name is “EXAMPLE.SCH”. In the case of “Save As…”, a new window appears, into which you are to enter the design name. Use the keyboard and type the desired file name. Press “Save”. Your design is now saved with the .SCH extension.

Select Foundry

Click on File → Select Foundry. The list of available processes appears. The initial design rule file is “default.tec”. Various technologies are available from 1.2 µm down to 45 nm. Click on the rule file name and the software reconfigures itself in order to adapt to the new process.

Show Critical Path

The critical path is the series of logic gates between the output and input with the longest propagation delay. The command Simulate → Show Critical Path gives the list of symbols and cumulative delays which build the critical path.

Simulation Options

The simulation parameters are: the simulation step (10 ps by default), the gate delay, wire delay, supply voltage, and elementary gate-current. These parameters are loaded from .TEC files at initialization or with the command File → Select Foundry. One example is shown in Fig. C-7.

Table C-1 Technology files used to configure DSCH3.1

Technology file   Description
Cmos12.tec        2 metal layers, 5 V
Cmos08.tec        2 metal layers, 5 V
Cmos06.tec        2 metal layers, 5 V
Cmos035.tec       3 metal layers, 3.3 V
Cmos025.tec       5 metal layers, 2.5 V
Cmos018.tec       5 metal layers, 1.8 V
Cmos012.tec       6 metal layers, 1.2 V
Cmos90n.tec       6 metal layers, 1.0 V
Cmos65n.tec       6 metal layers, 0.8 V
Cmos45n.tec       8 metal layers, 0.7 V

Fig. C-6 Critical path details
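The notion of critical path, as reported by Find Critical Path and Show Critical Path, can be illustrated by a longest-path search over the gate network. The sketch below is a minimal Python illustration under stated assumptions: the circuit is acyclic, each gate has a single fixed delay, and a constant wire delay is added per gate-to-gate connection. The function name, the data layout and the delay values are hypothetical and do not describe how DSCH computes the path internally.

```python
from functools import lru_cache


def critical_path(gates, gate_delay, wire_delay=0.0):
    """Return (total_delay, [gate names]) for the slowest input-to-output chain.

    gates maps each gate name to the list of gates driving its inputs
    (an empty list marks a gate fed only by primary inputs).
    gate_delay maps each gate name to its delay; wire_delay is added for
    every gate-to-gate connection. The circuit must be acyclic.
    """
    @lru_cache(maxsize=None)
    def arrival(name):
        # Latest arrival time at the output of `name`, with the path leading to it.
        best_delay, best_path = 0.0, ()
        for driver in gates[name]:
            delay, path = arrival(driver)
            delay += wire_delay
            if delay > best_delay:
                best_delay, best_path = delay, path
        return best_delay + gate_delay[name], best_path + (name,)

    total, path = max(arrival(name) for name in gates)
    return total, list(path)


# Illustrative full-adder-like network, delays in ps (assumed values).
example_gates = {
    "xor1": [], "and1": [],
    "xor2": ["xor1"], "and2": ["xor1"],
    "or1": ["and1", "and2"],
}
example_delay = {"xor1": 90, "xor2": 90, "and1": 70, "and2": 70, "or1": 70}
```

Calling `critical_path(example_gates, example_delay, wire_delay=10)` walks the network once per gate thanks to the cache and returns the cumulative delay together with the ordered list of gates, which is the kind of information listed in the critical path details window.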

Start Simulation

The command Start Simulation launches the electrical net extraction and the logic simulation. The simulation speed may be controlled by the cursor “Fast-Slow”. The simulation may be paused, run step-by-step and stopped. By default, the logic state of all interconnects is made visible. You may also see each pin state by ticking “Show pin state” (Fig. C-8).


Fig. C-7 Logic simulation parameters

Fig. C-8 The logic simulation control window

Schema to New Symbol

This command is very important to create user-defined symbols in order to build hierarchical designs. As an example, the full-adder diagram based on primitives can be translated into a single symbol which includes the structure, inputs and outputs, as shown in Fig. C-9.
• I/Os. The list of I/Os is based on active symbols (buttons, clocks, keyboards, etc.). The position and side in the symbol may be changed in the table.
• VERILOG. The structural description based on primitives is described in VERILOG format and added to the symbol description.
• REFRESH. Update the layout of the user symbol.
• SIZING. Act on the icons to change the shape of the user symbol.
• SYMBOL PROPERTIES. These properties may be changed by the user.

Fig. C-9 The user’s symbol control window

Symbol Library

The symbol library contains basic logic and electrical symbols, sources, displays and switches. The aspect of the logic library is reported in Fig. C-10. Most standard logic symbols (Inverter, Buffer, NAND, AND, NOR, OR, XOR) and D-latches are part of the “Basic” symbol menu. The analog components such as resistors, inductors, capacitors and operational amplifiers are found in the “Advanced” menu. Notice several I/O symbols, as well as a variety of switches for programmable arrays. Some more symbols may be found in the IEEE directory, accessible through the command Insert → User Symbol.

Fig. C-10 The symbol library


Text

Use this icon to fix a text to one box or location in the design. That text illustrates the layout and should be used as much as possible for all significant nodes such as inputs and outputs. To add some text to a particular place, proceed as follows:
• Click on the icon.
• Set the text location with the mouse. A dialog box appears.
• Enter the text in front of “Text:” and press “Ok”. The text is set in the drawing.
A text can be modified as follows: click on the icon and click inside the existing text. The old text appears. Modify it and click on “Ok”. Text is added for information only. It has no impact on simulation.

Timing Diagram

The timing diagram gives the time-domain aspect of all input and output nodes. An example of timing diagram is shown in Fig. C-11. You may zoom on a specific time window, add the evaluation of the consumed current, and get the exact value of each input/output at a desired location.

Fig. C-11 The timing diagrams of a logic simulation

Undo

The Undo command (Edit → Undo) cancels the last editing command. It is possible to undo the commands Cut, Paste, Copy, Move, Stretch, and Edit.

Unselect All (Escape Key)

Use the command View → Unselect All to cancel undesired commands, or to redraw the complete schematic diagram.

View All

Click View → View All to fit the screen with all the graphical elements currently on display.

View Same

Draws the schematic diagram again without changing the scale. Used to refresh the screen.

Zoom-In and Zoom-Out

The above icons perform Zoom-In and Zoom-Out. When zooming in, the area determined by the mouse will be enlarged to fit the display window. When zooming out, the area determined by the mouse will contain the display window.
• If you click once, a zoom is performed at the desired location.
• Press Ctrl+A for « View All », and Ctrl+O for Zoom-Out.


Appendix D Quick Reference Sheet

MICROWIND31 Menus File Menu


View Menu

Unselect all layers and redraw the layout Redraw the screen Fit the window with all the edited layout Zoom In, Zoom Out the layout window

Extract the electrical node starting at the cursor location

Give the label list

Show/Hide the lambda grid or the cell compiler grid

Give the list of nMOS and pMOS devices

View one interconnect without extracting the whole circuit

Show the palette of layers, the layout macro and the simulation properties

Show the navigator window to display the node properties

Edit Menu


Simulate Menu

Compile Menu

Analysis Menu


Palette

Navigator Window


List of Icons Open a layout file (MSK format)

Extract and simulate the circuit

Save the layout file in MSK format

Measure the distance in lambda and micron between two points

Draw a box using the selected layer of the palette

2D vertical aspect of the device

Delete boxes or text

Step by step fabrication of the layout in 3D

Copy boxes or text

Design rule checking of the circuit. Errors are notified in the layout.

Stretch or move elements

Add a text to the layout. The text may include simulation properties.

Zoom In

Connect the lower to the upper layers at the desired location using appropriate contacts.

Zoom Out

Static MOS characteristics

View the entire drawing

View the palette

Extract and view the electrical node pointed by the cursor

Move the layout up, left, right, down


MICROWIND3.1 Simulation Menu

DSCH3.1 Menus File Menu


Edit Menu

Insert Menu

View Menu

Redraw all the schematic diagram
Redraw the screen
Zoom In, Zoom Out the window

Extract the electrical nodes Show the timing diagrams Show the palette of symbols

Give the list of symbols Describes the design structure Show details about the critical path Unselect all the design


Simulate Menu

Symbol Palette

Silicon Tool

The software “silicon” gives a user-controlled 3D view of atomic structures such as SiO2 (Fig. D-1). The 3D view of the lattice shown in Fig. D-2 shows the regular arrangement of silicon atoms and the very specific properties of the material. One boron atom acts as a dopant in the structure.

Fig. D-1 The « silicon » main menu

Fig. D-2 The silicon lattice and a boron dopant


List of Files

File              Description
MICROWIND31.EXE   MICROWIND3.1 executable file
DSCH31.EXE        DSCH3.1 executable file
*.HTML            Help manuals for MICROWIND3 and DSCH3
*.RUL             TECHNOLOGY FILES. The MICROWIND3 program reads the rule file to update the simulator parameters, the design rules and parasitic capacitor values. A detailed description of the .RUL file is reported at the end of Appendix A.
*.MSK             LAYOUT FILES. The MICROWIND31 software creates data files with the extension .MSK. Those files are simple text files containing the list of boxes and layers, and the list of text declarations.
*.CIR             The command File → Make SPICE File generates a SPICE-compatible text description.
*.MES             MOS I/V measurements
*.V               Verilog text files
*.TEC             TECHNOLOGY FILES. The DSCH31 program reads the .TEC file to update the simulator parameters. A detailed description of the .TEC file is reported at the end of Appendix A.
*.SCH             Schematic diagram created by DSCH31
*.SYM             Symbols generated and used by DSCH31

File Organization

The MICROWIND3.1 and DSCH3.1 software are organized in four directories, as illustrated in Fig. D-3. The « example » directory includes all MSK and SCH files used in projects. The user’s files should be stored in this directory. The « html » directory includes the help manual. The « rules » sub-directory includes the files used to configure the tools (.RUL for MICROWIND, .TEC for DSCH). Finally, the « system » folder includes the executable files and initialization files.

Fig. D-3 MICROWIND and DSCH folders (examples, html, rules, system)

Appendix E Interface to WinSpice

About WinSpice3

WinSpice3 [1] is a general-purpose circuit simulation program for non-linear DC, non-linear transient, and linear AC analyses. A shareware version of WinSpice3 may be downloaded from the web site www.winspice.com. The tool was developed by Mike Smith, OuseTech Ltd. Circuits may contain resistors, capacitors, inductors, voltage and current sources, transmission lines and semiconductor elements such as diodes and MOS devices. WinSpice3 is based on Spice3F4 [2], which was developed by the Department of Electrical Engineering and Computer Sciences, University of California, Berkeley.

SPICE Syntax

The description of basic elements used by DSCH and MICROWIND for SPICE conversion is given in the following table.

Table E-1 Basic elements in SPICE format

RESISTOR
General form: RXXXXXXX N1 N2 VALUE
Example: Rvss 3 7 2ohm
N1 and N2 are the two element nodes. VALUE is the resistance (in ohms) and should be positive.

CAPACITOR
General form: CXXXXXXX N+ N– VALUE
Example: Cb 6 2 1n
N+ and N– are the positive and negative element nodes, respectively. VALUE is the capacitance in Farads.

INDUCTOR
General form: LYYYYYYY N+ N– VALUE
Example: Lvss 8 2 2n
N+ and N– are the positive and negative element nodes, respectively. VALUE is the inductance in Henries.

CURRENT SOURCE
General form: IYYYYYYY N+ N– <DC/TRAN VALUE>
Example: IB 23 21 DC 0.01
N+ and N– are the positive and negative nodes, respectively. A current source of positive value forces current to flow out of the N+ node, through the source, and into the N– node. DC/TRAN is the DC and transient analysis value of the source. If the source value is zero for both DC and transient analyses, this value may be omitted. If the source value is time-invariant (e.g., a power supply), then the value may optionally be preceded by the letters DC.

SUPPLY VOLTAGE
General form: VYYYYYYY N+ N– <DC/TRAN VALUE>
Example: VDD 1 0 DC 2.0V
N+ and N– are the positive and negative nodes, respectively. A voltage source of positive value is set between the N+ node and the N– node.

MOS DEVICES
General form: MXXXXXXX ND NG NS NB MNAME <L=VAL> <W=VAL>
Example: MN1 2 17 6 10 MOSN L=5U W=2U
ND, NG, NS, and NB are the drain, gate, source, and bulk (substrate) nodes, respectively. MNAME is the model name. L and W are the channel length and width, in meters.

DIODE
General form: DXXXXXXX N+ N– MNAME
Examples: DBRIDGE 2 10 DIODE1
DCLMP 3 7 DMOD 3.0 IC=0.2
N+ and N– are the positive and negative nodes, respectively. MNAME is the model name.

DSCH and MICROWIND use three MOS device models, which differ in the formulation of the I-V characteristic. The variable LEVEL specifies the model to be used (Table E-2).

Table E-2 MOS models available through DSCH and MICROWIND

LEVEL      Model
LEVEL=1    MOS1, Shichman-Hodges, very simple model (see [2])
LEVEL=3    MOS3, a semi-empirical model (see [2])
LEVEL=14   BSIM4, an advanced model for deep submicron technology (see [3])

Table E-3 The three most common types of SPICE analysis in relation with CMOS cell simulation

Keyword   Analysis
.TRAN     Transient Analysis
.DC       DC Transfer Function
.AC       Small-Signal AC Analysis

Current Source Description

The current source is assigned a time-dependent value for transient analysis. There are five independent source functions: pulse, exponential, sinusoidal, piece-wise linear, and single-frequency FM. In the schematic editor, the PULSE description has been implemented, as shown in Fig. E-1.

Fig. E-1 Current pulse parameters in the schematic editor

The PULSE description restricts the current shape to a periodic pulse, which has a triangular shape if the pulse width parameter is set to zero (Fig. E-2).

General form: PULSE(I1 I2 TD TR TF PW PER)
Example: IIcpu 5 7 PULSE(0 1.2A 1.0n 2n 2n 0.1n 50n)

Parameter   Description     Default value   Units
I1          initial value                   Amps
I2          pulsed value                    Amps
TD          delay time      0.0             seconds
TR          rise time       TSTEP           seconds
TF          fall time       TSTEP           seconds
PW          pulse width     TSTOP           seconds
PER         period          TSTOP           seconds

Fig. E-2 Current pulse parameters described as a PULSE (current versus time, showing I1, I2, TD, TR, TF and the period PER; PW is almost zero to obtain a triangle)
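The shape produced by a PULSE description can be checked numerically with a few lines of Python. This is a sketch of the standard SPICE PULSE semantics (delay, linear rise, flat top, linear fall, then back to the initial value until the next period); the function name `pulse` is an illustrative choice and the code is not taken from DSCH or WinSpice.

```python
def pulse(t, i1, i2, td, tr, tf, pw, per):
    """Value of a SPICE-style PULSE(I1 I2 TD TR TF PW PER) source at time t (seconds)."""
    if t < td:                       # before the delay: initial value
        return i1
    t = (t - td) % per               # position inside the current period
    if t < tr:                       # rising edge
        return i1 + (i2 - i1) * t / tr
    if t < tr + pw:                  # flat top (almost absent when PW is near zero)
        return i2
    if t < tr + pw + tf:             # falling edge
        return i2 + (i1 - i2) * (t - tr - pw) / tf
    return i1                        # rest of the period


# Parameters of the example: PULSE(0 1.2A 1.0n 2n 2n 0.1n 50n)
EXAMPLE = dict(i1=0.0, i2=1.2, td=1.0e-9, tr=2e-9, tf=2e-9, pw=0.1e-9, per=50e-9)
```

With the example parameters, `pulse(2e-9, **EXAMPLE)` falls halfway up the rising edge, and the very short pulse width of 0.1 ns gives the nearly triangular shape sketched in Fig. E-2.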

Voltage Source Description

The voltage source is assigned a constant value to model the supply source. In Fig. E-3, the voltage source is constant, with a DC value of 3 V.

Fig. E-3 Constant voltage source

There are mainly three analyses of interest that are listed below. Other types of analyses exist in WinSpice as described in [1], which are not introduced in this Appendix.


.AC: Small-Signal AC Analysis

General form:
.AC DEC ND FSTART FSTOP
.AC OCT NO FSTART FSTOP
.AC LIN NP FSTART FSTOP
Examples:
.AC DEC 10 1K 100MEG
.AC LIN 100 1MEG 10G
DEC stands for decade variation, and ND is the number of points per decade. OCT stands for octave variation, and NO is the number of points per octave. LIN stands for linear variation, and NP is the number of points. FSTART is the starting frequency, and FSTOP is the final frequency.

.DC: DC Transfer Function

General form:
.DC SRCNAM VSTART VSTOP VINCR [SRC2 START2 STOP2 INCR2]
Examples:
.DC VIN 0.25 5.0 0.25
.DC VDS 0 10 .5 VGS 0 5 1
The DC line defines the DC transfer curve source and sweep limits (with capacitors open and inductors shorted). SRCNAM is the name of an independent voltage or current source. VSTART, VSTOP, and VINCR are the starting, final, and incrementing values, respectively. The first example causes the value of the voltage source VIN to be swept from 0.25 Volts to 5.0 Volts in increments of 0.25 Volts. A second source (SRC2) may optionally be specified with associated sweep parameters. In this case, the first source is swept over its range for each value of the second source. This option can be useful for obtaining semiconductor device output characteristics.

.TRAN: Transient Analysis

General form:
.TRAN TSTEP TSTOP
Example:
.TRAN 1NS 1000NS
TSTEP is the printing or plotting increment for line printer output. TSTOP is the final time. The transient analysis always begins at time zero.
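The sweep parameters above translate into lists of analysis points. The Python sketch below shows one plausible way to enumerate them, assuming that both end points are included and that rounding is handled with a small tolerance; the exact end-point handling of WinSpice may differ, and the function names are illustrative.

```python
import math


def ac_dec_points(nd, fstart, fstop):
    """Frequencies for ".AC DEC ND FSTART FSTOP": ND log-spaced points per decade."""
    count = int(math.floor(nd * math.log10(fstop / fstart) + 1e-9)) + 1
    return [fstart * 10 ** (k / nd) for k in range(count)]


def ac_lin_points(np_, fstart, fstop):
    """Frequencies for ".AC LIN NP FSTART FSTOP": NP evenly spaced points."""
    if np_ == 1:
        return [fstart]
    step = (fstop - fstart) / (np_ - 1)
    return [fstart + k * step for k in range(np_)]


def dc_sweep(vstart, vstop, vincr):
    """Source values for ".DC SRCNAM VSTART VSTOP VINCR"."""
    count = int(math.floor((vstop - vstart) / vincr + 1e-9)) + 1
    return [vstart + k * vincr for k in range(count)]
```

For the examples in the text, `.AC DEC 10 1K 100MEG` spans five decades with ten points each, and `.DC VIN 0.25 5.0 0.25` steps VIN from 0.25 V to 5.0 V in 0.25 V increments.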

Generate a SPICE File with DSCH3.1

Schematic Diagram

Not all symbols may be translated into SPICE. Only R, L, and C elements, transmission lines, current sources, voltage sources, MOS and diode devices may be translated and simulated. Logic gates such as AND, NAND, NOR, XOR, etc. cannot be converted into SPICE. The only possibility is to replace the gates by their MOS-based equivalent circuits, as explained in Chapter four of the book “Basic CMOS cell design” [4].

Fig. E-4 A schematic diagram example

Let us consider the schematic diagram of Fig. E-4, containing one inverter and an RC load. Invoke the command File → Generate Spice file or click +. A screen appears (Fig. E-5).

Fig. E-5 The SPICE file generated from the schematic diagram and some options


The text is saved under the project name, with the extension .CIR. In Fig. E-5, the text file starts with comments (‘*’ in the first column), followed by the declaration of voltage sources (‘V’ as the first character), the R, L, and C components (here one capacitor C1 and one resistor R1), the active devices (one pMOS, one nMOS) and the simulation control.

Defining the Type of Analysis

By default, the analysis is the time-domain transient simulation “.TRAN”, with a duration of 250 ns. A text label added to the schematic diagram that starts with “.TRAN” (“.TRAN 0.1N 100N” in the case of Fig. E-6) is recognized by the SPICE translator as the new transient simulation control. Three keywords are recognized by DSCH 3.1:
• “.TRAN xxx”
• “.DC xxx”
• “.AC xxx”
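Values such as 0.1N or 100N rely on the usual SPICE scale suffixes. As an illustration, the following Python sketch converts such a token to a plain number. The function name spice_value is hypothetical, and the suffix table (T, G, MEG, K, M, U, N, P, F, case-insensitive, with trailing unit letters ignored) reflects common SPICE practice rather than anything specific to DSCH.

```python
import re

# Single-letter scale factors; "MEG" is handled separately because
# a lone "M" means milli in SPICE.
_SCALE = {"T": 1e12, "G": 1e9, "K": 1e3, "M": 1e-3,
          "U": 1e-6, "N": 1e-9, "P": 1e-12, "F": 1e-15}

_TOKEN = re.compile(r"^([+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?)([a-zA-Z]*)$")

def spice_value(text):
    """Convert a SPICE number such as '0.1N' or '100MEG' to a float."""
    match = _TOKEN.match(text.strip())
    if match is None:
        raise ValueError(f"not a SPICE number: {text!r}")
    mantissa, suffix = match.groups()
    suffix = suffix.upper()
    if suffix.startswith("MEG"):
        scale = 1e6
    elif suffix and suffix[0] in _SCALE:
        scale = _SCALE[suffix[0]]   # letters after the scale are units
    else:
        scale = 1.0                 # no suffix, or a pure unit such as 'V'
    return float(mantissa) * scale

for token in ["0.1N", "100N", "100MEG", "1K", "1.256fF", "2.0E-9"]:
    print(token, "->", spice_value(token))
```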

Fig. E-6 Defining the analysis parameters by adding a label in the schematic diagram

Input/output

All clocks, keyboards, LEDs and displays are declared as voltage outputs. In the case of the inverter, the input “IN1” and the output “OUT1” appear in the “plot” control line as voltages V(2) and V(4).

Table E-4 Control section at the end of the SPICE file

Control line                      Description
.TRAN 0.1N 100N                   Transient analysis, step 0.1 ns, duration 100 ns
.control                          Start the control section
run                               Run the transient analysis
set nobreak                       No break in the output text file
print V(2) V(4) > spiceInv.txt    Dump two voltages in the file “spiceInv.txt”
plot V(2) V(4)                    Open a window and plot the same voltages
.endc                             End of control section
.OPTIONS DELMIN=0 RELTOL=1E-6     Options for simulation
.END                              End of SPICE file
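To show how the control lines of Table E-4 fit together, here is a minimal Python sketch that assembles the same section from a list of node numbers. The helper control_section and its parameters are hypothetical; this is not how DSCH builds the file, only an illustration of the structure.

```python
def control_section(tran_step, tran_stop, nodes, out_file):
    """Return the text of a control section like the one in Table E-4.

    tran_step, tran_stop: strings such as "0.1N" and "100N"
    nodes: node numbers whose voltages are printed and plotted
    out_file: name of the text file receiving the printed voltages
    """
    signals = " ".join(f"V({n})" for n in nodes)
    lines = [
        f".TRAN {tran_step} {tran_stop}",   # analysis type and duration
        ".control",                         # start of the control section
        "run",                              # run the analysis
        "set nobreak",                      # no page breaks in the text output
        f"print {signals} > {out_file}",    # dump voltages to a file
        f"plot {signals}",                  # plot the same voltages
        ".endc",                            # end of the control section
        ".OPTIONS DELMIN=0 RELTOL=1E-6",    # simulation options
        ".END",                             # end of the SPICE file
    ]
    return "\n".join(lines)

section = control_section("0.1N", "100N", [2, 4], "spiceInv.txt")
print(section)
```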


Run WinSPICE Simulation

Start the WinSPICE program, and click File → Open (Fig. E-7). Select the desired .CIR file. In our example, the file generated by DSCH is “spiceInv.CIR”.

Fig. E-7 The WinSpice initial screen

The simulation is performed in the time domain, and the following screen appears. The .TRAN analysis covers 100 ns. The result is stored in a file called “spiceInv.txt”. The plot of the transient simulation appears in a new window, shown in Fig. E-8.

Fig. E-8 The transient simulation of the inverter


Generate a SPICE File with MICROWIND3.1

Introduction

The translation from layout to SPICE is performed through the command File → Convert Into → Spice Netlist. Let us consider the layout file “InvCapa.MSK”, as illustrated in Fig. E-9. The corresponding SPICE netlist is given below.

CIRCUIT InvCapa.MSK
*
* IC Technology: CMOS 0.12µm - 6 Metal
*
VDD 1 0 DC 1.20
Vin 5 0 DC 0 SIN(0.00 1.20 0.98N 0.02N 0.02N 0.98N 2.00N)
*
* List of nodes
* “inv” corresponds to n°3
* “in” corresponds to n°5
*
* MOS devices
MN1 0 5 3 0 N1 W= 0.24U L= 0.12U
MP1 1 5 3 1 P1 W= 0.24U L= 0.12U
*
C2 1 0 1.256fF
C3 3 0 0.487fF
C5 5 0 0.314fF
*
* Extra RLC
*
Cadd1 3 0 0.01pF
*
*
* n-MOS BSIM4 :
* low leakage
.MODEL N1 NMOS LEVEL=14 VTHO=0.40 U0=0.050 TOXE= 2.0E-9 LINT=0.010U
+K1 =0.450 K2=0.100 DVT0=2.300
+DVT1=0.540 LPE0=23.000e-9 ETA0=0.080
+NFACTOR= 1.6 U0=0.050 UA=3.000e-15
+WINT=0.020U LPE0=23.000e-9
+KT1=-0.060 UTE=-1.800 VOFF=0.050
+XJ=0.150U NDEP=170.000e15 PCLM=1.100
+CGSO=100.0p CGDO=100.0p
+CGBO= 60.0p
*
* p-MOS BSIM4:
* low leakage
.MODEL P1 PMOS LEVEL=14 VTHO=-0.45 U0=0.018 TOXE= 2.0E-9 LINT=0.010U
+K1 =0.450 K2=0.100 DVT0=2.300
+DVT1=0.540 LPE0=23.000e-9 ETA0=0.080
+NFACTOR= 1.6 U0=0.018 UA=1.500e-15
+WINT=0.020U LPE0=23.000e-9
+KT1=-0.060 UTE=-1.800 VOFF=0.050
+XJ=0.150U NDEP=170.000e15 PCLM=0.700
+CGSO=100.0p CGDO=100.0p
+CGBO= 60.0p
*
* Transient analysis
*
* (Winspice)
.options temp=27.0
.control
tran 0.1N 5.00N
print V(3) > out.txt
plot V(3)
.endc
.END

Fig. E-9 Layout and SPICE translation of the file “invCapa.MSK”

Notice the MOS model description using “LEVEL = 14”, which corresponds to BSIM4 in WinSpice. The model parameters are controlled by the technology file “default.RUL” of MICROWIND, in which a reduced set of BSIM4 model parameters is defined. The analysis is always the time-domain transient simulation “TRAN”. The duration of the simulation is the one appearing in the Time Scale box of the analog simulation menu. The parameter may also be changed in the simulation parameter window (Simulate → Simulation Parameters → Simulation Length).

Run SPICE Simulation

Start the WINSPICE program, and click File → Open. Select the desired .CIR file. In our example, the file generated by MICROWIND is “invCapa.CIR”. The simulation is performed in the time domain, and the following screen appears. The time-domain analysis covers 5 ns. The result is stored in a file called invCapa.TXT. The plot of the transient simulation appears in a new window, shown in Fig. E-10.


Fig. E-10 The transient simulation performed by WinSpice

Static Analysis

The example proposed in Fig. E-11 corresponds to the transfer function of the output voltage versus the input voltage. The script of the SPICE file is modified manually so that the “TRAN” analysis is replaced by the “DC” analysis. The DC parameters are the swept source (here Vclock1), the start voltage (0.0 V), the stop voltage (1.2 V) and the voltage step (10 mV). The WINSPICE result shows that the inverter switches when Vclock1 = 0.55 V.

.control
dc Vclock1 0 1.2 0.01
print V(3) V(5) > out.txt
plot V(3) V(5)
.endc

Fig. E-11 The transfer characteristics of the inverter using DC simulation

The MOS characteristics may also easily be obtained using the DC control. The current is plotted using the command “plot -i(Vdrain)” (Fig. E-12). The Id/Vd curve for varying gate voltage may be obtained using the control line “dc Vdrain 0 1.2 0.01 Vgate 0 1.2 0.2”. The Vgate step is 0.2 V, ranging from 0 to 1.2 V. The DC analysis is therefore performed 7 times, and generates seven curves that are superimposed in the same window (Fig. E-13).

CIRCUIT nmos
*
* IC Technology: CMOS 0.12µm - 6 Metal
*
Vgate 1 0 DC 1.20
Vdrain 2 0 DC 0
*
* MOS devices
MN1 0 1 2 0 N1 W= 0.60U L= 0.12U
*
*
* n-MOS BSIM4 :
* low leakage
.MODEL N1 NMOS LEVEL=14 VTHO=0.40 U0=0.050 TOXE= 2.0E-9 LINT=0.010U
+K1 =0.450 K2=0.100 DVT0=2.300
+DVT1=0.540 LPE0=23.000e-9 ETA0=0.080
+NFACTOR= 1.6 U0=0.050 UA=3.000e-15
+WINT=0.020U LPE0=23.000e-9
+KT1=-0.060 UTE=-1.800 VOFF=0.050
+XJ=0.150U NDEP=170.000e15 PCLM=1.100
+CGSO=100.0p CGDO=100.0p
+CGBO= 60.0p
*
* DC analysis
*
* (Winspice)
.options temp=27.0
.control
dc Vdrain 0 1.2 0.01
plot -i(Vdrain)
.endc
.END

Fig. E-12 The Id/Vd characteristics of the NMOS device

Fig. E-13 The Id/Vd characteristics for various gate voltages
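The count of seven curves follows directly from the sweep limits of the control line “dc Vdrain 0 1.2 0.01 Vgate 0 1.2 0.2”. The Python sketch below enumerates the nested sweep in the order described for the .DC statement: the first source (Vdrain) is swept fully for each value of the second source (Vgate). The helper name sweep is hypothetical and both end values are assumed to be included.

```python
def sweep(start, stop, step):
    """Values from start to stop (inclusive) in increments of step."""
    n_steps = int((stop - start) / step + 1e-9)  # tolerance for float rounding
    return [round(start + k * step, 9) for k in range(n_steps + 1)]

vdrain_values = sweep(0.0, 1.2, 0.01)   # first (inner) source
vgate_values = sweep(0.0, 1.2, 0.2)     # second (outer) source

# One Id/Vd curve per gate voltage, each with one point per drain voltage.
bias_points = [(vg, vd) for vg in vgate_values for vd in vdrain_values]

print("gate voltages:", vgate_values)
print("curves:", len(vgate_values))
print("points per curve:", len(vdrain_values))
print("total bias points:", len(bias_points))
```

With these limits the sketch gives 7 gate voltages and 121 drain-voltage points per curve.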


Frequency Analysis

An example of frequency simulation using WINSPICE is presented in this section. The goal of the frequency analysis is to find the cut-off frequency of an amplifier. We start from the layout of the follower-amplifier “AmpliFollow.MSK”, loaded by a 1 pF virtual capacitor at its output (Fig. E-14). The SPICE file corresponds to the direct translation of the layout using the command File → Convert Into → Spice Netlist.

Fig. E-14 The follower-amplifier used for AC analysis (AmpliFollow.MSK)

An AC simulation can be performed by declaring an AC source (here, the input voltage source is changed into “VIn 9 0 DC 0.6 AC 1 0”) and replacing the “TRAN” analysis by “AC”. The proposed analysis covers the frequency range from 1 MHz to 1 GHz, with 10 points per decade. The corresponding control line is:

ac dec 10 1meg 1g

The usual plot unit for the output voltage is the decibel, as given in Equation E-1.

$V_{dB} = 20 \log_{10}(V)$ (Eq. E-1)

The command line to plot the output voltage V(3) directly in dB is:

plot vdb(3)

The result is shown in Fig. E-15. The –3 dB loss is found around 500 MHz.

A powerful script language [1] has been developed in WINSPICE that enables iterative parametric analysis. For example, the output loading capacitor Cload may be changed from 1 pF to 5 pF using the following control lines:


.control
let ii=1
while ii<6
alter Cadd1 = ii*1e-12
ac dec 10 1meg 1g
let ii = ii+1
end
plot ac1.v(3) ac2.v(3) ac3.v(3) ac4.v(3) ac5.v(3)
.endc

Fig. E-15 Frequency response of the follower-amplifier
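As a small worked example of the decibel conversion of Equation E-1, the following Python sketch converts between a voltage ratio and dB. The function names are hypothetical. Since the AC source amplitude is 1 V, the output voltage is directly the gain.

```python
import math

def to_db(voltage):
    """Voltage (or voltage gain) expressed in dB: 20 * log10(V)."""
    return 20.0 * math.log10(voltage)

def from_db(db):
    """Inverse conversion: voltage ratio corresponding to a dB value."""
    return 10.0 ** (db / 20.0)

print("1 V in dB:", to_db(1.0))
print("voltage at -3 dB:", from_db(-3.0))
print("1/sqrt(2) in dB:", to_db(1.0 / math.sqrt(2.0)))
```

The cut-off frequency is therefore the frequency at which the output of a unity-gain follower has dropped to roughly 0.71 V.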

The result, shown in Fig. E-16, is a set of curves that represent the gain of the output (knowing that the input voltage has been fixed to 1 V) for Cload = 1 pF (ac1) to 5 pF (ac5).

Fig. E-16 Using a specific script to modify the Cload parameter and perform the iterative AC analysis
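The iteration behind Fig. E-16 can be mirrored in a few lines of Python, which list the capacitor value used for each run together with the plot name under which its result is stored. The naming ac1 to ac5 follows the labels quoted in the text; the assumption that each AC run creates the next numbered plot, and the function name cload_sweep, are illustrative only.

```python
def cload_sweep(first_pf=1, last_pf=5):
    """Return (plot name, capacitance in farads) for each AC run."""
    runs = []
    ii = first_pf
    while ii <= last_pf:
        capacitance = ii * 1e-12          # same expression as in the script
        runs.append((f"ac{ii}", capacitance))
        ii += 1
    return runs

runs = cload_sweep()
for name, capacitance in runs:
    print(f"{name}: Cload = {capacitance * 1e12:.0f} pF")
```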


References

[1] M. Smith, “WinSpice3 User’s Manual”, October 2003, www.winspice.com
[2] A. Vladimirescu and S. Liu, “The Simulation of MOS Integrated Circuits Using SPICE2”, ERL Memo No. ERL M80/7, Electronics Research Laboratory, University of California, Berkeley, October 1980.
[3] W. Liu, “Mosfet Models for SPICE simulation including Bsim3v3 and BSIM4”, Wiley & Sons, 2001, ISBN 0-471-39697-4
[4] E. Sicard, S. Ben Dhia, “Basic CMOS cell design”, Tata McGraw-Hill, 2005, ISBN 0-07-059933-5


Glossary

Term

Explanation

ADC

Analog-to-digital converters (ADC)

AOI

AND/OR inverted logic

APS

Active pixel sensors

CCD

Charge-coupled-device

CMOS technology

Complementary Metal-Oxide-Semiconductor, where complementary refers to the use of n-channel and p-channel MOS devices, metal to a metal gate, oxide to the oxide between the gate and the channel, and semiconductor to the structure of the channel.

Commutation point

Voltage value for a gate input at which the output state changes.

CSP

Chip-scale packaging

DAC

Digital-to-analog converters (DAC)

DIBL

Drain induced barrier lowering

Die

Piece of silicon that includes the active devices and the input/output interfaces. Usually 350 to 500 µm thick, with an area from 2 × 2 to 25 × 25 mm.

Dielectric

An insulator layer between conductors, or underneath the gate that isolates the gate from the channel.

Drain

The part of the transistor that the charge carriers flow to. In pMOS devices the carriers are holes, supplied by the source, which is the region with the highest potential.

Copyright © 2007 by The McGraw-Hill Companies, Inc.


ESD

Electrostatic discharge

Fanout

Number of gate inputs connected to an output node.

FBE

Floating-body effect

FPGA

Field-programmable gate array. An IC with programmable hardware that configures the circuit function according to the customer’s needs. See Chapter four for more details.

Gate

A region at the top of the transistor whose electrical state determines whether the transistor is on or off. In CMOS technology, the gate is made of polysilicon (polycrystalline silicon).

High-k dielectric

A material that can replace silicon dioxide as a gate dielectric. High-K materials can be thicker than silicon dioxide to reduce leakage, while keeping similar switching properties.

Hot electrons

Electrons accelerated in the MOS channel may acquire sufficient energy to provoke parasitic effects such as hole/electron pair generation following their impact on the silicon lattice.

Intrinsic carriers

Charges included in the pure silicon crystal.

IP

Intellectual property blocks

Kink effect

In SOI technology, when a MOS transistor passes strong current between the drain and the source, the current Ids suddenly rises and provokes a conductance discontinuity, due to impact ionization of high energy electrons.

Latchup

Destructive short-circuit effect caused by an NPNP stack of layers that creates a direct path from VDD to VSS under improper bias conditions.

LDD

Lateral drain diffusion. Light doping on the channel borders to prevent the hot electron effect.

Low-K

Dielectric material with low permittivity. Mainly used between interconnects to reduce coupling effects.

MOS

Metal Oxide Semiconductor

nMOS

n-channel MOS device

358 Advanced CMOS Cell Design

Optical masks

Films used to pattern the layout of the IC on the silicon.

Parasitic leakage effect

Electrons may cross the channel from the source to the drain even with a zero bias on the gate, which creates a parasitic leakage between drain and source.

Passivation

Thick oxide at the surface of the integrated circuit to prevent external contamination and protect active devices.

PIP

Programmable interconnect point. Used in FPGAs.

PLL

Phase-locked loop. Commonly used in microprocessors to generate a high-frequency clock from a low-frequency external clock.

pMOS device

p-channel MOS device

PPS

Passive pixel sensor

Process

The technology steps required to complete the fabrication of the integrated circuit.

PWL

Piece-wise-linear signal. Used to describe complex signals based on tabulated current or voltage versus time.

Salicide

Metal deposit at the surface of a doped silicon area to further decrease its resistance. MOS drain and source, as well as poly gate usually use salicide.

Semiconductor

A material, such as silicon, whose conducting properties may vary from those of a conductor to those of an insulator, depending on biasing conditions.

SHF

Super-high frequencies (SHF) ranging from 3 GHz to 30 GHz

Short channel effect

A set of parasitic effects observed in MOS devices with a very short channel.

Silicon dioxide (SiO2)

Material consisting of one silicon and two oxygen atoms. Its exceptional quality as an insulator has made this material suitable for the very thin dielectric beneath the gate and for the inter-metal insulator. However, new insulators may be introduced to minimize parasitic effects of SiO2 such as leakage and cross-talk coupling.

SiP

System-in-Package

Source

The part of the transistor that the charge carriers flow from. The source is the source of electrons in the case of nMOS devices, i.e. the region with the lowest potential.

SPD

Spectral power density

STI

Shallow trench isolation. Deep trenches of silicon insulator to isolate active devices.


Substrate

The silicon on which active devices are implanted. Usually lightly doped with P impurities.

THF

Tremendously high frequencies (THF) ranging from 3 THz (Tera-hertz) to 30 THz

Three-state

High impedance state. Corresponds to a floating node, meaning that no active device ties the node to any defined voltage value.

Threshold voltage

The voltage level (noted Vt) that distinguishes whether a transistor is on or off. Transistors are designed to have a low threshold voltage, usually 20% of the supply voltage VDD. A low Vt improves the device switching speed but significantly increases the leakage.

Transistor

A simple on/off switch. Current flows from the source to the drain depending on the gate voltage.

Transmission gate

Enables or disables the link between two nodes, without any loss of voltage.

UHF

Ultra-high frequencies ranging from 300 MHz to 3 GHz

UMTS

Universal Mobile Telecommunication System

VCO

Voltage-controlled oscillator.

Wafer

The initial substrate used to fabricate ICs. Usually around 500 µm thick, its diameter ranges from 4 inches to 12 inches.

XHF

Extremely high frequencies (XHF) ranging from 30 GHz to 300 GHz

ZTC

Zero temperature coefficient

