ANALOG CIRCUIT DESIGN
Analog Circuit Design Scalable Analog Circuit Design, High Speed D/A Converters, RF Power Amplifiers Edited by
Johan H. Huijsing Delft University of Technology
Michiel Steyaert KU Leuven and
Arthur van Roermund Eindhoven University of Technology
KLUWER ACADEMIC PUBLISHERS NEW YORK, BOSTON, DORDRECHT, LONDON, MOSCOW
eBook ISBN: 0-306-47950-8
Print ISBN: 0-7923-7621-8
©2003 Kluwer Academic Publishers, New York, Boston, Dordrecht, London, Moscow
Print ©2002 Kluwer Academic Publishers, Dordrecht
All rights reserved. No part of this eBook may be reproduced or transmitted in any form or by any means, electronic, mechanical, recording, or otherwise, without written consent from the Publisher.
Created in the United States of America
Visit Kluwer Online at: http://kluweronline.com
and Kluwer's eBookstore at: http://ebooks.kluweronline.com
Table of Contents

Preface

Part I: Scalable Analog Circuit Design
Introduction
Scalable High-Speed Analog Circuit Design
    M. Vertregt and P. Scholtens
Scalable High Resolution Mixed Mode Circuit Design
    R.J. Brewer
Scalable “High Voltages” Integrated Circuit Design for XDSL Type of Applications
    D. Rossi
Scalability of Wire-Line Analog Front-Ends
    K. Bult
Reusable IP Analog Circuit Design
    J. Hauptmann, A. Wiesbauer and H. Weinberger
Process Migration Tools for Analog and Digital Circuits
    K. Francken and G. Gielen

Part II: High-Speed D/A Converters
Introduction
Introduction to High-Speed Digital-to-Analog Converter Design
    R. van de Plassche
Design Considerations for a Retargetable 12b 200MHz CMOS Current-Steering DAC
    J. Vital, A. Marques, P. Azevedo and J. Franca
High-Speed CMOS DA Converters for Upstream Cable Applications
    R. Roovers
Solving Static and Dynamic Performance Limitations for High Speed D/A Converters
    A. Van den Bosch, M. Steyaert and W. Sansen
High Speed Digital-Analog Converters – The Dynamic Linearity Challenge
    A.R. Bugeja
A 400-MHz, 10-bit Charge Domain CMOS D/A Converter for Low-Spurious Frequency Synthesis
    K. Khanoyan, F. Behbahani and A.A. Abidi

Part III: RF Power Amplifiers
Introduction
Design Considerations for RF Power Amplifiers demonstrated through a GSM/EDGE Power Amplifier Module
    P. Baltus and A. van Bezooijen
Class-E High-Efficiency RF/Microwave Power Amplifiers: Principles of Operation, Design Procedures, and Experimental Verification
    N.O. Sokal
Linear Transmitter Architectures
    L. Sundström
GaAs Microwave SSPA’s: Design and characteristics
    A.P. de Hek and F.E. van Vliet
Monolithic Transformer-Coupled RF Power Amplifiers in Si-Bipolar
    W. Simbürger, D. Kehrer, A. Heinz, H.D. Wohlmuth, M. Rest, K. Aufinger and A.L. Scholtz
Low Voltage PA Design in Standard CMOS
    K. Mertens and M. Steyaert
Preface
This book contains the revised contributions of the 18 tutorial speakers at the tenth AACD 2001 in Noordwijk, the Netherlands, April 24-26. The conference was organized by Marcel Pelgrom, Philips Research Eindhoven, and Ed van Tuijl, Philips Research Eindhoven and Twente University, Enschede, the Netherlands.

The program committee consisted of:
Johan Huijsing, Delft University of Technology
Arthur van Roermund, Eindhoven University of Technology
Michiel Steyaert, Catholic University of Leuven

The program was concentrated around three main topics in analog circuit design. Each of these topics has been covered by six papers. The three main topics are:
Scalable Analog Circuit Design
High-Speed D/A Converters
RF Power Amplifiers

Other topics covered before in this series:
2000: High-Speed Analog-to-Digital Converters; Mixed Signal Design; PLL’s and Synthesizers
1999: XDSL and other Communication Systems; RF MOST Models; Integrated Filters and Oscillators
1998: 1-Volt Electronics; Mixed-Mode Systems; Low-Noise and RF Power Amplifiers for Telecommunication
1997: RF A-D Converters; Sensor and Actuator Interfaces; Low-Noise Oscillators, PLL’s and Synthesizers
1996: RF CMOS Circuit Design; Bandpass Sigma Delta and other Converters; Translinear Circuits
1995: Low-Noise, Low-Power, Low-Voltage; Mixed Mode with CAD Trials; Voltage, Current and Time References
1994: Low-Power Low Voltage; Integrated Filters; Smart power
1993: Mixed-Mode A/D Design; Sensor Interfaces; Communications Circuits
1992: Op Amps; ADC’s; Analog CAD

We hope to serve the analog design community with this series of books and plan to continue this series in the future.

Johan H. Huijsing
Scalable high-speed analog circuit design Maarten Vertregt and Peter Scholtens
Philips Research Eindhoven, The Netherlands
Abstract
The impact of scaling on the analog performance of MOS circuits was studied. The solution space for analog scaling was explored between two dimensions: a “standard digital scaling” axis and an “increased bandwidth and dynamic range” axis. Circuit simulation was applied to explore trends in noise and linearity performance under analog operating conditions at device level and for a basic circuit block. It appears that a single scaling rule is not applicable in the analog circuit domain.
1 Introduction
The two-year cycle of successive technology generations [1] has enabled an ever increasing amount of system integration per chip. For a long time, this increase in integration density was satisfied by adding extra digital functions and memory. Nowadays, interfaces to the analog world (both base-band and RF) are also packed onto these systems-on-chip. In addition to the dominant “constant field” CMOS scaling trend, and the associated continuous decrease of the power supply voltage, there are other major hurdles for system integration. Increasing demands for extended dynamic range and signal bandwidth of modern integrated systems must also be met (Figure 1; dynamic range is plotted as the resolution in bits of an A/D converter).
J. H. Huijsing et al. (eds.), Analog Circuit Design, 3-21. © 2002 Kluwer Academic Publishers. Printed in the Netherlands.
It is not necessarily true that the most advanced technology generation will have the highest value for the product of the dynamic range and signal bandwidth (scaling towards the upper right-hand corner of the graph in Figure 1) [2]. Additional devices (highly linear capacitors, gate-oxide for MOS transistors) can facilitate system-on-chip integration, since those “high quality” passives enable a performance increase, and “previous generation” analog blocks can easily be re-used (voltage levels are maintained). The combination of doing a trend analysis and having additional devices available creates two problems. Firstly (when the total function remains in a previous technology generation, because of the time needed to create and characterize high quality passives), the digital part of the system-to-be-integrated suffers from a lack of function density and an elevated supply voltage. The supply voltage has a quadratic effect on the dynamic power dissipation through $P = f\,C\,V_{DD}^{2}$. Secondly (with combined use of state-of-the-art MOS transistors for digital functions, and previous-generation MOS transistors for analog functions), the potential of the new technology is not exploited for these analog functions. The approach of adding devices is therefore useful for porting functions, but is not interesting when identifying scaling issues.
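The quadratic supply dependence can be illustrated with a short calculation. This is a minimal sketch using the common textbook estimate P = a·C·V²·f; the switched capacitance, the clock frequency and the two supply voltages are hypothetical values chosen only for illustration.

```python
def dynamic_power(c_switched, v_supply, f_clock, activity=1.0):
    """Textbook CMOS switching power estimate: P = a * C * V^2 * f."""
    return activity * c_switched * v_supply ** 2 * f_clock


# Hypothetical digital block: 1 nF of total switched capacitance at 100 MHz.
C_SWITCHED = 1e-9   # farad (assumed)
F_CLOCK = 100e6     # hertz (assumed)

p_elevated = dynamic_power(C_SWITCHED, 2.5, F_CLOCK)  # block kept at an older, higher supply
p_scaled = dynamic_power(C_SWITCHED, 1.8, F_CLOCK)    # same block at a scaled supply
ratio = p_elevated / p_scaled

print(f"elevated supply: {p_elevated:.3f} W, scaled supply: {p_scaled:.3f} W")
print(f"power penalty of the elevated supply: {ratio:.2f}x")
```

With everything else held equal, the penalty is simply the square of the supply-voltage ratio.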
2 Scaling goals
Scaling of digital functions is directly coupled to feature size reduction. Per function, this yields a combination of continuous area reduction, speed increase and dynamic power reduction (see [3] for an example). Static power dissipation becomes a major limitation with the integration of more functions at an increased density. Speed and power improvement for digital functions is then done concurrently by selecting the optimum on/off ratio of the MOS transistor for a certain application domain. The scaling space basically narrows down to two dimensions: on/off ratio vs feature size [4].

For analog functions, the goals of scaling are diverse. The focus can be on area efficiency, with the continued availability of a function at a fixed power dissipation, bandwidth and dynamic range. Alternatively, the focus can be on the exploration of the ultimate bandwidth capability, without limiting the power or area. It could also be on pushing the combined limits of dynamic range, bandwidth and power. The preferred scaling scenario heavily depends on the goal, and we must sacrifice the performance in directions that have a lower priority to obtain a feasible solution. The basic quadratic MOS current/voltage relationships (see [5] for example) are used to choose the relative change of the operating points across technology generations, as well as to approximate (for a limited bias range only) the analog scaling rules of Table 1.
To first order, the quasi-DC distortion depends on the variation across the signal amplitude, through modulation of the first-order derivatives of the drain current (the transconductance and the output conductance).
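For reference, the basic quadratic relations mentioned above can be written out as follows. This is a sketch that assumes the standard textbook square-law model in saturation (as in [5]). The symbols $\mu$, $C_{ox}$, $\lambda$ and $V_{gt} = V_{gs} - V_T$, and the numbering, are assumptions; the numbering is chosen so that (2) is the transconductance relation and (3) the output-conductance relation, which matches how the text later cites them.

```latex
\begin{align}
I_{ds} &\approx \tfrac{1}{2}\,\mu C_{ox}\,\frac{W}{L}\,V_{gt}^{2},
  \qquad V_{gt} = V_{gs} - V_T \tag{1}\\
g_m &= \frac{\partial I_{ds}}{\partial V_{gs}}
  \approx \mu C_{ox}\,\frac{W}{L}\,V_{gt} = \frac{2\,I_{ds}}{V_{gt}} \tag{2}\\
g_{ds} &= \frac{\partial I_{ds}}{\partial V_{ds}}
  \approx \lambda\,I_{ds}, \qquad \lambda \propto \frac{1}{L} \tag{3}
\end{align}
```

In this approximation the ratio $g_m/I_{ds}$ depends only on the gate overdrive, and the ratio $I_{ds}/g_{ds}$ depends only on the channel length.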
3 Scaling scenarios
We have applied several methods to explore analog scalability. The focus varies from general “power and SNR” considerations [6], to concurrent “power, SNR, and linearity” optimization for a fixed building block [7]. The focus also ranges from practical device artifacts, through compact model simulation [8, 9], to trend analysis at the functional block level [10, 11]. Here, the solution space for scaling (expressed in the well-known linear scaling factor s=0.7 from generation to generation) is explored using three different cases:

Relaxed Dynamic Range (Digital Scaling I)
Standard digital scaling, as in [3] for example. The focus is on area and power reduction per function. The performance metrics being sacrificed are linearity [9] and the signal-to-noise ratio (a power ratio, assumed to be dominated by thermal noise in the denominator). Neither linearity nor SNR degradation has to be a limiting factor when scaling a circuit; however, the fact that the SNR degrades with every generation under this scaling regime requires attention for wide-band circuits. The linearity degradation is consistent with the third harmonic intercept voltage findings in [9]. In case of a dominant third harmonic, the expected signal-to-distortion ratio deteriorates due to a combination of loss of intrinsic MOS gain and scaling that is insufficient with respect to the supply scaling (s).

Relaxed area and power (Analog Scaling II)
The major consideration is that analog circuits only occupy a minor portion of a system-on-chip. Area reduction is therefore not ranked as a top priority. Instead, with the application demands in mind, the focus during scaling is on a concurrent increase of bandwidth and of dynamic range at a fixed frequency. Maintaining the linearity part of the dynamic range requires the signal amplitude at least to scale with the supply voltage. The effect of the noise part of the dynamic range is treated through equation (4), in terms of the bandwidth of the circuit and the rms signal level. Thus, for a constant SNR, the active area is now the metric to be sacrificed.
Applying equation (4), we learn that a constant SNR requires a cubic increase of the transconductance, i.e. a cubic decrease of the impedance level, to compensate for the lower signal amplitude and the higher bandwidth. To reach this goal (bound by the feature-size scaling of L), equation (2) subsequently defines how the gate overdrive and W have to scale. From equation (3), it follows that the foremost sacrificed item is now gds. For the signal-carrying parts of the circuit, the overall decrease in the impedance level on the circuit nodes compensates for this sacrifice.

Constant area and constant power (Analog Scaling III)
The third scenario is a mixture of the previous two. A minor loss of SNR is accepted to avoid an increased area and power dissipation, whilst the linearity performance is maintained at the level of the “Analog Scaling II” scenario.

The results of these three scaling strategies were evaluated using circuit simulation (at both the device and the basic circuit block level). The use of circuit simulation ensures the inclusion of higher-order impairments on performance (such as moderate inversion), and gives insight into the performance latitude (in signal amplitude, distortion and noise) of the scaled circuit. The scaling rules applied within these three strategies are summarized in Table 1.
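The constant-SNR argument of the “Analog Scaling II” scenario can be checked numerically. The sketch below is not the authors' equation (4); it assumes the usual thermal-noise model in which the input-referred noise density is 4kTγ/gm and is integrated over the circuit bandwidth, with the signal amplitude scaling with s and the bandwidth with 1/s. The starting values and the function names are hypothetical.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def snr_db(v_rms, gm, bandwidth, temp=300.0, gamma=2.0 / 3.0):
    """SNR of an rms signal against thermal noise 4kT*gamma/gm integrated over a bandwidth."""
    noise_power = 4.0 * K_B * temp * gamma * bandwidth / gm  # V^2
    return 10.0 * math.log10(v_rms ** 2 / noise_power)


def scale_one_generation(v_rms, gm, bandwidth, s=0.7, gm_exponent=-3):
    """Assumed rule: amplitude scales with s, bandwidth with 1/s, gm with s**gm_exponent."""
    return v_rms * s, gm * s ** gm_exponent, bandwidth / s


# Hypothetical starting point: 0.1 V rms signal, 1 mS transconductance, 100 MHz bandwidth.
start = (0.1, 1e-3, 100e6)

snr_start = snr_db(*start)
snr_cubic = snr_db(*scale_one_generation(*start))                 # gm grows as 1/s^3
snr_fixed = snr_db(*scale_one_generation(*start, gm_exponent=0))  # gm left unchanged

print(f"start:              {snr_start:.2f} dB")
print(f"gm scaled by 1/s^3: {snr_cubic:.2f} dB")
print(f"gm unchanged:       {snr_fixed:.2f} dB")
```

Under these assumptions, the lower amplitude costs a factor s² in signal power and the higher bandwidth a factor 1/s in noise power, so only a 1/s³ increase in transconductance keeps the SNR where it was.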
4 Results of scaling
4.1 Single device scaling
Compact model simulation of single MOS devices [9] was applied for various technology generations. This identifies the impact of higher-order impairments on these approximate relationships and checks the validity of the selected operating range.
Figures 2 and 3 give an example of how this validity check on the usable operating range of the basic square-law model is done for an NMOS device. Figure 2 shows the drain current and the transconductance as a function of gate drive, and Figure 3 shows the drain current and the output conductance as a function of source-drain voltage.
We use the gate overdrive for all device biasing. This choice prevents variations in threshold voltage (for different device geometries, or for successive technology generations) from influencing the actual analog operating point. At first glance, Figure 2 shows a more or less linear relationship between transconductance and gate overdrive over a limited range (and under the obvious condition of sustained saturation). At the edges of this range, part of the distortion caused by the deviation from this linear relationship can be overcome by circuit design techniques such as a differential circuit topology and/or feedback. We will take a closer look at the transconductance linearity behavior for successive technology generations later on, and first inspect the output conductance for an NMOS example.
The output conductance varies by approximately an order of magnitude for a 300mV signal swing, at relatively low drain-source voltage. Multiple devices have to be stacked within the 1.8V supply for circuit reasons, so a higher drain-source voltage cannot be accommodated. Reliability reasons are not the critical issue. This lack of headroom explains why the signal level and the nominal gate overdrive in subsequent technology generations have to scale down, at least proportionally with the power supply.

As shown in Figures 2 and 3, the transconductance and output conductance derivatives give a better view of relevant analog device properties (such as linearity) than the current characteristics. We now rearrange these characteristics in a form that allows easy comparison across technology generations, by looking at the relative variation of these conductance curves with respect to the drain-source current. Figure 4 shows the trend of the transconductance-to-current ratio for an NMOS (normalized to q/kT units) against gate overdrive for four subsequent technology generations. According to equation (2), we expected this ratio to be proportional to 1/s through the gate overdrive. This curve is also plotted in Figure 4. Higher-order impairments present in the simulation model show up as a loss of improvement, with a deviation that increases as scaling proceeds. Figure 5 shows the consequence of this effect when an NMOS device is scaled through successive technology generations. The gate overdrive is scaled with s, according to the scaling rule choices of Table 1 (from 300mV at the first generation to 100mV at the last).
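The square-law expectation for this trend is easy to tabulate. The following sketch assumes the ideal relation gm/Ids = 2/Vgt, a temperature of 300 K, and an overdrive that starts at 300mV and scales with s=0.7 over four generations; the function name and the generation indexing are hypothetical.

```python
K_B = 1.380649e-23     # Boltzmann constant, J/K
Q_E = 1.602176634e-19  # elementary charge, C


def gm_over_id_normalized(v_gt, temp=300.0):
    """Ideal square-law gm/Ids = 2/Vgt, expressed in units of q/kT."""
    return (2.0 / v_gt) * (K_B * temp / Q_E)


S = 0.7
overdrives = [0.3 * S ** n for n in range(4)]  # four successive generations

for n, v_gt in enumerate(overdrives):
    print(f"generation {n}: Vgt = {v_gt * 1e3:.0f} mV, "
          f"gm/Ids = {gm_over_id_normalized(v_gt):.2f} q/kT")
```

Three scaling steps take the 300mV overdrive to about 103mV, which matches the 300mV to 100mV range quoted above, and the ideal ratio grows by 1/s per generation. Since gm/Ids cannot exceed roughly one q/kT unit in weak inversion, the ideal curve has to flatten at small overdrive, which is consistent with the loss of improvement seen in simulation.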
Higher-order impairments, present in the simulation model, show up as a deterioration of the expected scaling across technology generations by an approximately constant multiplication factor.
We performed a similar exercise for the output conductance. Figure 6 shows the trend for the "early voltage" under scaled bias conditions in the simulation. According to equation (3), we would expect the early voltage to scale in proportion to L.
We therefore verify this relationship in Figure 7 with a graph of the early voltage vs the minimum feature size of successive technology generations. Higher-order impairments show up as a deterioration of scaling by an approximately constant multiplication factor.
To summarize these observations on conductance scaling, we found that the intrinsically “obtainable gain” of the MOS transistor is a factor s per generation lower than expected from the square-law model. This affects all three scaling rule scenarios. We will now verify the consequences of this deviation at circuit block level.

4.2 Circuit block scaling
A basic voltage follower was used to explore the bandwidth and dynamic range consequences at circuit level (Figure 8). The starting point was an implementation with a signal amplitude of 0.3V, targeted for a signal-to-noise-and-distortion ratio of approximately 50dB. The three scenarios listed in Table 1 were subsequently applied, scaling both backward and forward across technology generations.
Figures 9 to 11 show the results of the voltage-follower simulation exercise across four technology generations for the performance criteria bandwidth, signal-to-noise ratio (SNR), and signal-to-distortion ratio (SDR). The circuit is current driven (with a diode generating the gate voltage of the tail current source), which means that we expect the deviations from the applied scaling rule to show up primarily in the linearity, and not in the quantities that are representative of the circuit bandwidth.

For this voltage-follower circuit, Figure 9 shows that the “digital I” scaling scenario gives the best bandwidth improvement across technology generations. The improvement factor of 1.75x more or less equals the expected value from Table 1. The bandwidth improvement for one generation is slightly less than expected. This means that the parasitic capacitances outside the scaled MOS devices and the scaled external load are not scaling as well as before.

Figure 10 shows that, as expected from Table 1, the “digital I” scaling scenario is worst for thermal noise in this wide-band circuit. The aim of keeping SNR constant is clearly fulfilled by the “analog II” scenario. The “analog III” scaling scenario creates the expected, minor, degradation in SNR. Due to the combination of a relatively reduced bandwidth and a lower impedance level, that same generation consistently shows a slightly better SNR.
Striking in the signal-to-distortion plot of Figure 11 is the tremendous impairment of linearity that occurs throughout technology generations when “digital I” scaling is applied to this kind of wide-band circuit block. The drop in circuit linearity at higher frequencies is a consequence of the lower bandwidth of the generation concerned. We do not get the low-frequency circuit linearity beyond 70dB in “reverse” scaling from the initial circuit definition. We attribute this to undefined higher-order impairments far beyond the initial circuit linearity. The severe degradation across two generations is approximately 10dB/generation (with one generation positioned slightly off the trend). This is a combination of insufficient scaling with respect to the supply and an additional degrading factor that affects both analog scaling scenarios as well (see below).

Contrary to the original expectation for the “Analog II” and “Analog III” scaling scenarios, we see no systematic linearity improvement across technology generations (and some degradation of the DC linearity in one case). We attribute this lack of voltage-follower linearity improvement to the additional loss of a factor s per generation of the intrinsically “obtainable gain” of a voltage-driven MOS transistor (shown by the device-level simulations of Figures 4 and 5). The proper current, bandwidth, and conductance levels are maintained at the expense of an increased gate drive and thus a degraded scaling. Within the context of the voltage follower circuit block, the impairment therefore mainly shows up in the linearity performance.
5 Discussion and functional block examples
The application demands from Figure 1 have been confronted with the practical consequences of analog scaling. Concurrent analog building block improvements in area, bandwidth and dynamic range cannot be created through feature size scaling alone. A drawback of using simulation as a forward-looking scaling tool is the lack of model card parameters with an “analog” quality. This is a major reason for the time shift in applied technology generation between state-of-the-art digital circuits and high dynamic range analog circuits.

Circuit topology plays a major role in attacking high-dynamic-range application problems with low-dynamic-range circuitry. Examples are sigma-delta conversion (where single bit quantization is capable of delivering high dynamic range), spread spectrum techniques (where a very low or negative SNR is allowed without hampering proper communication), and dynamic correction techniques that compensate for degraded device properties (as in wide-band Nyquist A/D conversion). To highlight the last point: a 10-bit A/D function was implemented in [12]. In a scaled technology, this A/D function occupies 1 mm² at a 12-bit resolution, thanks to the “mixed-signal chopping and calibration” circuit topology technique [13]. This scaled realization results in a 4-fold increase in dynamic range and twice the bandwidth. Together with a higher power consumption and a larger area, this means a performance improvement in one generation.
6 Conclusions
We can therefore draw the following conclusions:
Concurrent improvement of bandwidth and dynamic range by a feature-size scaling rule results in a power and area increase of signal-carrying devices in critical blocks.
Porting fixed functions benefits most from previous generation device availability and/or a higher power supply voltage (maintaining the original operating points and signal levels). However, the scaled technology is not exploited and scaling trends cannot be identified.
Down-scaling analog circuits by applying a feature-size scaling rule does not fulfill the application demand. Circuit topology improvement does.
Therefore, new application domain demands are best served by employing a mixture of scaling rules and by optimal system level choices.
7 Acknowledgments
We are grateful to Anne Johan Annema, Pierre Woerlee and Ronald van Langevelde for their constructive discussions and supporting material.
8 References
[1] ITRS 2000, http://public.itrs.net/Files/2000UpdateFinal/2kUdFinal.cfm
[2] Kelly, D. et al.: "A 3V 340mW 14b 75MSPS CMOS ADC with 85dB SFDR at Nyquist", Technical Digest ISSCC, 2001, pp. 134-439.
[3] Veendrick, Harry: "Digital goes Analog", Proceedings ESSCIRC 1998, pp. 44-50.
[4] Jurczak, M. et al.: "Dielectric pockets - a new concept of the junctions for deca-nanometric CMOS devices", IEEE Transactions on Electron Devices, Vol. 48, No. 8, Aug. 2001, pp. 1776-82.
[5] Razavi, B.: "Design of Analog CMOS Integrated Circuits", McGraw-Hill, 2001.
[6] Vittoz, E.A.: "Low-power design: ways to approach the limits", Proceedings of ISSCC '94, San Francisco, CA, USA, 16-18 Feb. 1994, pp. 14-18.
[7] Annema, Anne-Johan: "Analog circuit performance and process scaling", IEEE Transactions on Circuits and Systems II, Vol. 46, No. 6, June 1999, pp. 717-725.
[8] Pelgrom, M.J.M. et al.: "CMOS Technology for mixed signal ICs", Solid-State Electronics, Vol. 41, No. 7, 1997, pp. 967-974.
[9] Woerlee, P. et al.: "RF-CMOS Performance Trends", IEEE Transactions on Electron Devices, Vol. 48, No. 8, Aug. 2001, pp. 1776-82.
[10] Walden, R.H.: "Analog-to-Digital Converter Survey and Analysis", IEEE Journal on Selected Areas in Communications, Vol. 17, No. 4, April 1999, pp. 539-550.
[11] Bult, Klaas: "Analog Design in Deep Sub-Micron CMOS", Proceedings ESSCIRC 2000, pp. 11-17.
[12] Ploeg, Hendrik van der et al.: "A 3.3-V, 10-b, 25-MSample/s Two-Step ADC in CMOS", IEEE Journal of Solid-State Circuits (JSSC), Vol. 34, No. 12, December 1999, pp. 1803-1811.
[13] Ploeg, Hendrik van der et al.: "A 2.5V, 12b, 54MSample/s 0.25um CMOS ADC in 1mm²", Technical Digest ISSCC, 2001, pp. 132-439.
SCALABLE HIGH RESOLUTION MIXED MODE CIRCUIT DESIGN
R.J. Brewer
Analog Devices, Pembroke Road, Newbury RG14 1BX, U.K.

ABSTRACT
This paper discusses architectures for analog to digital interchange which are suitable for implementation in deep sub-micron CMOS mixed mode technologies. Discussed in detail are successive approximation and low over-sampling ratio sigma-delta converters giving >12 bits resolution at bandwidths of order MHz. Also discussed are architectures potentially suitable for operational amplifiers buffering such converters, integrated in the same technology.

1. INTRODUCTION
The topic of “scalable high resolution mixed mode circuit design” is potentially broad. The focus of this paper is on architectures suitable for fabrication in deep sub-micron (DSM) CMOS technology which implement analog to digital interchange at bandwidths from DC to several MHz and with resolutions of 12 bits and above. Analog design in DSM is dominated by the reality that the process driver is digital. Typically, a mixed mode DSM technology will lag in development by about a year behind its digital substrate and comprise a digital process with the addition of reasonably linear double polysilicon capacitors, with a layer of medium-resistivity polysilicon available to create non-trimmable resistors with matching no better than a few tenths of a percent and realistic values up to several tens of kohms. Moore’s Law famously identifies a trend line of a 70% linear geometry shrink per 18 months; the current range of widely available mixed mode processes runs as follows: 0.5um/5V to 0.35um/3V to 0.25um/2.5V to 0.18um/1.8V. In many cases higher voltage devices are also offered on the lower voltage processes; it is tempting to assume these may be used for the analog sections of a mixed-mode circuit, but in some cases these higher voltage devices are intended for digital I/O and have poor electrical properties which make them less, not more, suitable for analog circuits. Thus this paper assumes that scalable means using the small geometry devices. Another subtle assumption is that application forces are driving up signal processing information bandwidth and that a driver in scalable design is to use the speed of these processes, so that analog bandwidths of MHz are more interesting than kHz.
J. H. Huijsing et al. (eds.), Analog Circuit Design, 23-42. © 2002 Kluwer Academic Publishers. Printed in the Netherlands.

Although this paper addresses scalable design, it is worth remarking that in many cases this may not be the economically optimum design approach to the implementation of systems combining complex logic and high performance analog. There are several approaches to avoiding rather than solving the problem. In some DSM technologies, high performance analog devices with thicker gate oxide and higher supply voltage are made available, permitting what is essentially a hybrid design approach on one substrate. However, this means the digital section must carry the cost overhead of the dual gate oxide and other analog components, such as double polysilicon. If the number of interconnects required between the analog and digital sections is small, it may actually be better to split the die. Package engineers are becoming increasingly comfortable with dual paddle packages with die-to-die bonding; alternatively, low pin-count packages are becoming increasingly small: e.g. an MSOP-8 at just 3mm x 4.9mm.

2. TRENDS IN DEVICE PROPERTIES
The most obvious effect of scaling is an approximately linear shrink in permitted supply voltage, allowing about 1 volt per 0.1 um of minimum gate length. Unfortunately shrinking does little if anything for one of the dominating noise sources in CMOS data conversion design: kTC noise. This results in an approximately linear compression of signal to noise ratio with scaling. However, process designers do usually scale the threshold voltages to some degree with the shrink, so that the ability to stack devices in, for example, an opamp design, does become compressed with process scaling but relatively softly. Thus scalable design implies maximising the p-p signal range within the supply voltage, whilst some stacking of devices is still acceptable but becomes increasingly undesirable. Some processes may also offer optional low threshold devices for analog design; but note these may be leaky if used as switches. A major problem is a rising 1/f (flicker) noise corner frequency: for example, for a minimum length NMOS with W/L=10 at 100uA the 1/f corner may be around 1MHz at 0.35um and escalate to tens of MHz at 0.18um. Amplifier stage gains are low (e.g. 25dB) and decline with scaling and, as already mentioned, stacking devices for cascoding is permitted but increasingly difficult. Finally, leakage currents (between all terminals of the MOS device) rise with scaling. Scalable design implies accommodating all these trends.

3. ANALOG to DIGITAL CONVERSION
Two architectures are coming to dominate sub-micron and deep sub-micron CMOS design in the resolutions and bandwidths discussed here: SAR (successive-approximation) converters with electrically-trimmable capacitor-array DACs [1,2] and low-oversampling-ratio sigma-deltas [14-30]. For very high resolution conversion at low bandwidths, sigma-delta converters with high oversampling ratios (>16, perhaps typically 128) are the predominant technology. These typically have architecturally simple single loops with single bit quantization and noise shaping of 2-4 orders. These work very well and are very scalable to very deep sub-micron. It may be expected that they will remain a very commercially important class of converter, but they will not be discussed further here, where the focus is on medium bandwidth (of order MHz) leading edge architectures.
At higher bandwidths (10-1000MHz) flash and pipelined converters dominate; again, these will not be discussed further here.
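The kT/C limit noted in Section 2 can be made concrete with a small calculation. This is a sketch under stated assumptions: a hypothetical 1 pF sampling capacitor held constant across generations, a sine-wave signal whose peak-to-peak swing equals the full supply, a temperature of 300 K, and the rule of thumb of about 1 volt per 0.1 um of gate length (so the 0.35 um entry comes out at 3.5 V rather than the 3 V quoted earlier).

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def ktc_noise_rms(c_sample, temp=300.0):
    """RMS thermal noise voltage sampled onto a capacitor: sqrt(kT/C)."""
    return math.sqrt(K_B * temp / c_sample)


def snr_db(v_supply, c_sample, swing_fraction=1.0, temp=300.0):
    """SNR of a sine with p-p swing = swing_fraction * supply, against kT/C noise."""
    v_rms = swing_fraction * v_supply / (2.0 * math.sqrt(2.0))
    return 20.0 * math.log10(v_rms / ktc_noise_rms(c_sample, temp))


C_SAMPLE = 1e-12  # hypothetical 1 pF sampling capacitor

print(f"kT/C noise on 1 pF: {ktc_noise_rms(C_SAMPLE) * 1e6:.1f} uV rms")
for gate_length_um in (0.5, 0.35, 0.25, 0.18):
    v_supply = 10.0 * gate_length_um  # rule of thumb: about 1 V per 0.1 um
    print(f"{gate_length_um:.2f} um, {v_supply:.1f} V supply: "
          f"SNR = {snr_db(v_supply, C_SAMPLE):.1f} dB")
```

Because the noise voltage stays fixed while the available signal shrinks with the supply, each halving of the supply costs about 6 dB of SNR in this model unless the capacitor is enlarged.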
In the early to mid 1990s there was a wave of great interest in self-calibrated ADCs. Typical lithography permits analog element matching to around 12 bits resolution. To achieve high yields at 12 bits, and performance beyond, wafer level laser trimming was the dominant technology. Self-calibration became an attractive alternative apparently better suited to volume-manufactured CMOS technologies [3-5], giving a high resolution ADC capability to designers without access to laser trimming. There are two fundamental approaches. The first uses some slow but high linearity method, such as an integrator, to create a very linear sequence of voltage levels which are then used to calibrate the capacitor or resistor array which forms the working converter. The second relies on the observation that in an ideal linear binary-weighted element (e.g. capacitor) array each element equals the sum of all the lesser weight elements (plus 1 LSB). Thus a calibration algorithm can be devised which relies solely upon establishing internal equalities, without reference to any absolute calibration standards. In either case, some on-chip memory is then required to store calibration constants which are applied dynamically to some form of trim-DAC during conversions. This all works; however, the trend seems to be away from self-calibration, as it increasingly appears that similar performance can be achieved more economically and more conveniently for the user with one-time electrical trimming, as discussed below.

3.1 Capacitor Array Successive Approximation ADCs
The core element is a DAC implemented with an array of binary weighted double-polysilicon capacitors. This is usually broken into two similar sections, making the upper and lower bits, linked by a series capacitor which de-weights the lower bits. To achieve >12 bits performance various of these capacitors are trimmed by further small arrays of capacitors which are switched in or out at test [1,2] (Fig 1). The capacitors are usually double plates of polysilicon with silicon dioxide dielectric which are very mechanically and electrically stable. Since these structures are very stable post-manufacturing, the trim is usually once-only with a small on-chip PROM, often comprising electrically-blown polysilicon fuses. This architecture is relatively cheap to manufacture and easy to use in the end application as no
calibration cycles are needed. There are very few publications describing the internal detail of such converters but they are of great commercial importance.
A problem with SAR converters is that all bit trials are critical and errors are non-recoverable. A popular trend is to incorporate redundancy into the bit trials with an algorithm which permits errors made in the earlier (MSB) bit trials to be corrected later [8-13]. This gives improved noise immunity and permits a higher sampling rate by permitting accelerated bit trials. This also plays well to DSM scaling and mixed mode design. Various methods can be used, although all ultimately serve the same purpose and overcome the same weakness in a conventional binary weighted successive approximation search algorithm. The problem is illustrated in Fig 2a. Assume the true input voltage is slightly below mid-scale, but the comparator makes an error and believes it lies slightly above. This error could be due to allowing insufficient time for DAC settling (e.g. settling to the 10 time constants required for 0.005% accuracy), analog noise, digital noise or reference or ground bounce. It is obvious that with a simple binary
weighted search path there is no path through the search space which recovers the error. However, consider the search path of Fig. 2b with “one bit per bit redundancy”. After the first bit trial (with erroneous result) the search space is shifted by one quarter of its span in the direction of the result and the bit trial repeated. After this second bit trial the search space is halved normally according to the comparator result, but it is now certain that the true input value lies within the new reduced search space even with an error in the first comparator result of ¼ of the span of the search space. This algorithm converges on the correct result with a tolerance for error of ¼ of the current search space at each bit trial. Twice as many bit trials are required but each may be many times faster with, for example, DAC settling of only a few time constants.
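The recovery mechanism of Fig. 2b can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions: the input and the search window are normalised to the range 0 to 1, the comparator is ideal except on trials that are deliberately forced wrong, and all function names are hypothetical rather than taken from any published converter.

```python
def make_comparator(wrong_trials=()):
    """Return a comparator that answers 'vin >= threshold', except on the
    trial indices listed in wrong_trials, where the answer is inverted
    (a hypothetical stand-in for noise or incomplete DAC settling)."""
    count = {"n": 0}

    def compare(vin, threshold):
        result = vin >= threshold
        if count["n"] in wrong_trials:
            result = not result
        count["n"] += 1
        return result

    return compare


def sar_binary(vin, n_bits, compare):
    """Conventional binary search: one trial per bit, no recovery path."""
    lo, span = 0.0, 1.0
    for _ in range(n_bits):
        span /= 2
        if compare(vin, lo + span):
            lo += span
    return lo, span  # final window is [lo, lo + span)


def sar_redundant(vin, n_bits, compare):
    """Search with one redundant trial per bit, as described for Fig. 2b."""
    lo, span = 0.0, 1.0
    for _ in range(n_bits):
        # First trial at the midpoint; shift the window a quarter span
        # in the direction of the (possibly wrong) result.
        lo += span / 4 if compare(vin, lo + span / 2) else -span / 4
        # Repeat the trial at the shifted midpoint, then halve normally.
        if compare(vin, lo + span / 2):
            lo += span / 2
        span /= 2
    return lo, span


vin = 0.49  # slightly below mid-scale, as in Fig. 2a
lo_b, span_b = sar_binary(vin, 10, make_comparator(wrong_trials=(0,)))
lo_r, span_r = sar_redundant(vin, 10, make_comparator(wrong_trials=(0,)))
print("binary window contains vin:   ", lo_b <= vin < lo_b + span_b)
print("redundant window contains vin:", lo_r <= vin < lo_r + span_r)
```

With the first (MSB) decision forced wrong, the conventional search ends in a window that excludes the input, while the redundant search still brackets it, at the cost of twice as many trials.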
In practice, one redundant bit per bit may be excessive. Some designers favour only one redundant bit in the whole array, typically around half way down the sequence of bit trials. Others favour perhaps one redundant bit per 4 bits; this latter still allows large errors but only increases the number of bit trials by a factor of 5/4. A conceptually elegant alternative which achieves the same result is a successive approximation with an array with slightly less than binary weighting, as in Fig 2c. It will be apparent from inspection that this also allows a search path which converges on the correct result despite moderate errors on the way. However, it suffers the major disadvantage of using non-binary-weighted elements. The analog elements, typically capacitors, are difficult to make with accurate non-binary weighting, and the results of the non-binary bit trials require significant computation to map onto a binary coded output word.

As supply voltages are reduced with scaling there is generally a less-than-proportional decrease in MOS threshold voltages, together with a need to make the signal range as close to the full supply voltage as possible, so that on-resistance in the input sampling switches becomes increasingly important as the effective over-voltage reduces. This is eased by pumping or boot-strapping the gate voltage on the NMOS switches, but of course the boost should ideally be as large as possible without exceeding the process's absolute maximum voltage rating. Gate boot-strapping methods have been developed to do this accurately [6,7,17].

The above collection of design features make for cheap, easy to use and robust converters which are suitable for scaling to deep submicron CMOS processes. Successive approximation converters are the subject of very little published work but of great commercial importance which is likely to continue into the foreseeable future. For every paper published on the SAR solutions there are likely ten on the sigma-delta; but for every sigma-delta ADC manufactured there are likely ten SAR. Converters with signal bandwidths in the range 1-10MHz and resolutions in the range 12-16 bits are likely achievable in the near future at time of writing using both the SAR architectures discussed above and the low oversampling ratio sigma-delta discussed below as the two approaches increasingly converge.

3.2. Low Oversampling Ratio Sigma-Deltas
There is an observable architectural evolution or fashion trend at the leading edge:
- very high order single-bit single-loop: e.g. -order shaping such as the established CS5396 or AD7722
- single-bit multi-loop: e.g. 4 loops of order 2+1+1+1 (5th order) such as the more recent AD7723
- multi-bit single-loop or multi-loop designs using bit-shuffling to suppress non-linearity in the multi-bit DAC: e.g. 3 loops of order 2+1+1 (4th order), (3x4)-bit [15]
For high resolution at low bandwidths simple (e.g. order) single-bit single-loop modulators appear to have a commercially very important long term future. However, we focus here on the information-bandwidth leading edge, where the performance of interest could be 16 bits at 2Ms/s or 20 bits at 100ks/s (which have the same resolvable information rate). Increasing the order of the noise shaping in single-bit single-loop designs has apparently reached its natural limit as instability limits the benefit of increasing orders of noise shaping. Converters have been made (and are commercially successful products from more than one company) with order loop filtering, but in truth order probably represents a useful maximum. Multi-loop and multi-bit designs achieve higher performance but there exists a very wide range of architectural alternatives. We will now discuss the optimisation of such architectures.
The principle of multi-loop architectures (Fig 3.) is that the quantization noise in the first loop is copied as an analog voltage into
a second loop where it is measured, to be subtracted in the digital domain. This process may be repeated or cascaded indefinitely, resulting in a theoretical possibility of any desired SNR. The main limitation is the accuracy of the analog copy from the first to second loop; thus the lower the quantization noise in the first loop, the lower the accuracy requirement of this analog amplifier. This suggests that multiple orders of noise shaping and multiple bits of quantization be pushed into the first loop. However, aggressive noise shaping in the first loop may result in stability problems which limit the achievable shaping, as is well known. Also, if multi-bit quantization is used in the first loop, it must be bit-shuffled as its integral non-linearity otherwise appears directly in the digitised output. If the multi-bit quantizers are in second or subsequent loops, which are converting quantization noise and not signal, bit-shuffling is likely not needed. Further, the number of comparators needed in the multi-bit flash converters is minimised if the multi-bit quantization is spread across multiple flash converters in multiple loops: for example, if 6 bits of quantization are chosen, this requires 63 comparators (2^6-1) in one converter but ideally 3x3=9 comparators if split into three 2-bit converters. Whether dither is required in the first loop is also arguable: for very low residual tones (e.g. <-100dB) in the digitised output, dither may be useful to ease the demands on the difference amplifier which must copy any tones generated by the first loop into the second loop for digitisation and cancellation. Considering scalability to very deep sub-micron geometries, it is generally true that the analog difference amplifier which copies the quantization noise out of the first loop into the second becomes increasingly difficult to make accurate.
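The sensitivity of the cascade to the accuracy of the inter-loop copy can be illustrated with a minimal behavioural sketch. It assumes two first-order, single-bit loops (a simpler cascade than any of the designs cited), ideal integrators, and a hypothetical parameter `copy_gain` that models the gain of the analog amplifier passing the first-loop quantization error to the second loop; none of this is taken from a specific product.

```python
import math


def first_order_loop(u):
    """First-order single-bit loop: y[n] = u[n-1] + e[n] - e[n-1].
    Returns the bit stream y and the quantization error e."""
    v = u_prev = y_prev = 0.0
    y, e = [], []
    for sample in u:
        v = v + u_prev - y_prev          # delaying integrator
        out = 1.0 if v >= 0 else -1.0    # 1-bit quantizer
        y.append(out)
        e.append(out - v)
        u_prev, y_prev = sample, out
    return y, e


def cascade_leakage(u, copy_gain):
    """Return the largest first-loop quantization noise term left in the
    combined output of a 1+1 cascade for a given inter-loop copy gain."""
    y1, e1 = first_order_loop(u)
    y2, e2 = first_order_loop([copy_gain * x for x in e1])
    leak = []
    for n in range(2, len(u)):
        # Digital recombination: delay y1, subtract differentiated y2.
        combined = y1[n - 1] - (y2[n] - y2[n - 1])
        # Ideal result: delayed input plus second-order shaped e2 only.
        ideal = u[n - 2] - (e2[n] - 2 * e2[n - 1] + e2[n - 2])
        leak.append(combined - ideal)
    return max(abs(x) for x in leak)


u = [0.5 * math.sin(2 * math.pi * n / 256) for n in range(4096)]
print("leakage, exact copy:   ", cascade_leakage(u, 1.00))
print("leakage, 1% gain error:", cascade_leakage(u, 0.99))
```

With an exact copy the first-loop error cancels to within rounding; a 1% gain error leaves a first-order shaped residue of the first-loop error in the output, which is why lower first-loop quantization noise relaxes the amplifier accuracy requirement.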
At the same time, with scaling the multi-bit flash converters tend to take less die area and power; similarly the bit-shuffling required for multi-bit quantizers in the first loop. Thus scaling tends to favour pushing more noise shaping and more bits of quantization into the first loop, possibly to the extreme of moving back to a single loop. It will be clear from the above that there are contradictory arguments in optimising a sigma-delta architecture. All will thus be compromises and the optimisation appears quite flat and broad, permitting different sensible designers to adopt very different solutions. Extreme architectural optimisations seen from different designers within one company are illustrated entertainingly by the AD7722 (unpublished) and AD9260 [17]: one is a 1-loop, 1-bit order design; the other has one-and-a-half loops with a 5-bit order loop followed by a 12-bit unshaped pipeline converter, thus 1.5 loops, 17-bit order (Fig. 4). Both doubtless seemed sensible decisions at the time (and work).
The above examples probably represent extremes of sensible design optimisations. A reasonable view could be that a good scalable optimised architecture has parameters in the following ranges:
- a total of 4-5 orders of noise shaping: fewer than 4 is undoubtedly “leaving potential performance on the table” while greater than 5 orders probably buys little improvement
- a total of some 4-8 bits of quantization: a few bits of quantization bring significant improvements, perhaps particularly in loop stability characteristics, but circuit complexity increases rather non-linearly with too many bits
- two or maybe three cascaded loops (although a single noise-shaped loop could also be a well optimised solution, as multi-bit quantization eases the loop stability criteria and permits very aggressive higher order noise shaping)
- at least 2 orders of noise shaping and 2-3 bits of quantization in the first loop, to ease the required accuracy of the amplifier which copies the quantization noise into the second loop
- the remaining bits of quantization spread as much as possible and probably not bit-shuffled
- bit-shuffling of the first loop multi-bit quantizer, of course
- possibly dither in the first loop.
Such an architecture has characteristics which should make it potentially scalable to very deep sub-micron CMOS.

Bit-shuffling algorithms are the subject of much research and are becoming increasingly complex but effective. The principle is that a multi-bit flash ADC and then DAC in the noise shaping loop increases the SNR and also, importantly, increases the loop stability under normal and overload conditions. However, in the first loop where the signal is being directly digitised, any integral non-linearity in the loop feedback DAC will appear directly in the digitised signal. This non-linearity may be converted to shaped noise by various versions of the old idea of dynamic element matching [31]. If a 3-bit DAC is made of 8 supposedly matched elements, and these elements are used in randomised combinations to create the 8 voltage levels, then it is obvious that this maps non-linearity due to element mis-match into noise. As always the detail is more complex but work in this area is relatively well published [14-30].
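The mapping of mismatch into noise can be seen in a small numerical sketch. It assumes eight hypothetical unit elements with up to +/-3% mismatch (giving codes 0 to 8) and plain random selection; this is the basic randomisation idea only, not any of the noise-shaping algorithms of [14-30].

```python
import random


def dac_fixed(code, elements):
    """Always use the first `code` unit elements (thermometer style)."""
    return sum(elements[:code])


def dac_shuffled(code, elements, rng):
    """Use a randomly chosen set of `code` unit elements each time."""
    return sum(rng.sample(elements, code))


# Hypothetical unit elements with up to +/-3% mismatch (mean exactly 1.0).
elements = [1.03, 1.02, 1.01, 1.00, 1.00, 0.99, 0.98, 0.97]
rng = random.Random(1)
trials = 20000

print("code  fixed error  mean shuffled error")
for code in range(len(elements) + 1):
    fixed_error = dac_fixed(code, elements) - code
    mean_shuffled = sum(dac_shuffled(code, elements, rng)
                        for _ in range(trials)) / trials
    print(code, round(fixed_error, 3), round(mean_shuffled - code, 4))
```

The fixed selection shows a repeatable, code-dependent error (integral non-linearity), whereas the shuffled selection has an average error close to zero at every code: the mismatch now appears as sample-to-sample noise instead of distortion.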
4. OPAMP ARCHITECTURES
The A/D converters discussed above are very difficult to drive. The capacitor-based successive approximation converters typically sample the input voltage directly onto the array capacitance to form a simple inherent sample-and-hold action. The capacitance is of order a few tens of picofarads and this will not reduce with scaling as kT/C noise is typically the determinant of the noise floor. Sigma-delta converters are different but not necessarily more benign in that sampling is on a slightly smaller capacitance but at a higher (over-) sampling rate. This puts a very challenging charge-gulp recovery requirement on the driving opamp in addition to the straightforward challenge of achieving SINAD in the range 70-100dB at several MHz signal bandwidth with near rail-to-rail signal swing. Inherently, switched capacitor A/D converters, whether SAR or sigma-delta, do not have very good DC accuracy due to MOS switch injection. However it is quite easy to self-calibrate this out in the ADC, to create a DC accuracy specification which cannot be met by any opamp which is not trimmed or auto-zeroed. There are few discrete bipolar opamps which can meet this challenge and fewer which can be integrated in CMOS; and fewer still which are scalable. The process scaling issues which directly impact opamp design are: supply voltage compression, low stage gain with reducing stackability, and rising 1/f noise corner frequency. Conventionally CMOS opamps are designed using an architecture originally developed for bipolar implementation with +/-15v supplies (which in turn derives from vacuum tube designs): this typically comprises two gain stages with a differential input pair transconductance stage driving a single stage Miller integrator via a current mirror differential to single ended converter. We assume this is very familiar. However, it is badly affected by all the scaling issues summarised above.
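The reason the sampling capacitance cannot shrink with scaling follows from the kT/C noise relation mentioned above, which a short calculation makes concrete. The 300 K temperature and the 2.5 V p-p full-scale sine are illustrative assumptions, not figures from the text.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K


def ktc_noise_vrms(capacitance_f, temperature_k=300.0):
    """RMS thermal noise voltage sampled onto a capacitor: sqrt(kT/C)."""
    return math.sqrt(BOLTZMANN * temperature_k / capacitance_f)


def snr_db(signal_pp_v, noise_vrms):
    """SNR of a full-scale sine of the given peak-to-peak amplitude."""
    signal_vrms = signal_pp_v / (2 * math.sqrt(2))
    return 20 * math.log10(signal_vrms / noise_vrms)


for c_pf in (1, 10, 20, 50):
    noise = ktc_noise_vrms(c_pf * 1e-12)
    print(f"{c_pf:3d} pF: {noise * 1e6:6.1f} uV rms, "
          f"SNR for 2.5 V p-p sine = {snr_db(2.5, noise):5.1f} dB")
```

Because the noise voltage falls only as the square root of capacitance, and the available signal swing shrinks with the supply, holding a given SNR at lower supply voltages requires the same or a larger capacitor.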
The following discussion assumes an objective of a scalable opamp architecture suited to buffering A/D and D/A converters with >12 bits resolution over a signal bandwidth from DC to several MHz. In the light of the scaling issues summarised above, we postulate that an opamp architecture is “scalable” if it offers:
- “rail-to-rail” signal swing: this is most important at the output, where a reasonable target is a signal p-p swing of >75% of VDD, but is also desirable as a common mode range at the input for high impedance unity gain buffering
- multiple gain stages: e.g. 5 inverter stages
- suppression of low frequency flicker noise (which of course also brings good DC performance).
Rail-to-rail output stage design is well known [32] and the textbook methods appear adequately scalable with stacking of threshold and saturation voltages which can be accommodated within the shrinking supplies. However the equivalent textbook input stages [ibid.] with rail-to-rail common mode (CM) range appear challenging for high performance (MHz bandwidth and low THD) applications because of input offset shifting over the CM range as the input stage transitions between N- and P-MOS conduction. Two scalable architectures are shown. Both can use rail-to-rail output stages to maximise p-p signal swing and thus SNR. Both are scalable in that they further meet the requirements of multiple (five) gain stages at low frequencies and suppression of flicker noise whilst retaining wide signal bandwidth and the capability of low distortion at order MHz bandwidths.
The design shown in Fig. 5 (unpublished) essentially splits the signal by frequency into three paths which are recombined by summing the
outputs of transconductance stages, with the lowest frequencies passing through a chopped or auto-zeroed path with 5 gain stages while the high frequencies have a short un-chopped 2-stage path. This design only works well over a limited input CM range as the input differential pairs require some voltage headroom and, if used in a non-inverting configuration, the harmonic distortion will be limited by the input pair common mode rejection ratio. Further, the chopped or auto-zeroed path has a bandwidth much lower than the full signal bandwidth and cancellation of DC offsets and low frequency flicker noise effectively relies on time-averaging the input offset to zero. It is thus intolerant of non-linear transient overloads which will generally not average correctly to zero. The low frequency path in this design can use any of the wide variety of chopping and auto-zeroing methods. This field has itself been the subject of a comprehensive review paper [33] so will not be discussed further here. The architecture is thus suited to applications where the input common mode range and frequency spectrum are well defined and thus known to lie within the architecture’s limitations. It has been used, to the author’s knowledge successfully, as a D/A converter output reconstruction buffer and driver, with application-specific implementations in both 0.6um with 5v supply and 0.35um with 3v supply. The application in 0.6um delivers 6v p-p (differential) into 1 kohm with -85dB distortion at 300kHz; the 0.35um driver application delivers 4v p-p (differential) into just 8 ohms load with a class A/B output stage with order 10mm wide output MOS devices with -75dB distortion at 140kHz [Hurrell, private communication]. This architecture should scale well to smaller geometries and lower supply voltages.
The design in Fig. 6 (unpublished at time of writing) has an inherently rail-to-rail wide bandwidth input CM range and is tolerant of nonlinear transients, which makes it better suited as an A/D converter buffer where the input signal is undefined. However it contaminates the signal with some (few millivolts) level of modulator frequency (>100MHz) noise which must then be suppressed by further filtering: for example, when driving an A/D converter with a 20pF capacitance a series resistance of around 500 ohms is necessary to provide the necessary attenuation of modulator feed-through. Alternatively, with some ADC architectures the ADC may be operated synchronously with the modulation at a somewhat reduced modulation frequency and the output of the opamp sampled twice per clock cycle with the signal voltage taken as being the sum or average of the two samples. The input voltage is modulated up to a frequency well beyond the signal bandwidth; in this example a modulation frequency in the range 20-200MHz is practical in geometries in the range 0.6-0.25um with signal bandwidths of a few MHz. It is AC amplified by a 2-stage amplifier and demodulated back to base-band with a demodulator which incorporates a differential to single ended conversion, followed by a 3-stage integrator. With 3 gain stages the integrator requires an internal nested pole to preserve stability. The use of two gain stages in the AC amplifier does not greatly degrade the loop stability as the first stage must be run at quite high current levels to achieve adequately low thermal noise, resulting in it having very low delay.
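The filtering figures quoted above can be checked with a rough calculation. The sketch below assumes the 500 ohm series resistance and the 20pF ADC input capacitance form a simple single-pole RC low-pass, and ignores the opamp output impedance and any switching behaviour of the ADC input; the function names are illustrative.

```python
import math


def rc_cutoff_hz(r_ohm, c_farad):
    """-3 dB frequency of a single-pole RC low-pass."""
    return 1.0 / (2 * math.pi * r_ohm * c_farad)


def rc_attenuation_db(freq_hz, r_ohm, c_farad):
    """Magnitude response of a single-pole RC low-pass, in dB."""
    ratio = freq_hz / rc_cutoff_hz(r_ohm, c_farad)
    return -10 * math.log10(1 + ratio ** 2)


R_SERIES = 500.0    # ohms, from the text
C_ADC = 20e-12      # farads, from the text

print(f"cutoff: {rc_cutoff_hz(R_SERIES, C_ADC) / 1e6:.1f} MHz")
for f_mhz in (1, 100, 200):
    att = rc_attenuation_db(f_mhz * 1e6, R_SERIES, C_ADC)
    print(f"{f_mhz:4d} MHz: {att:6.2f} dB")
```

Under these assumptions the corner sits near 16 MHz, so a signal at around 1 MHz passes almost unattenuated while modulator feed-through at 100 MHz and above is reduced by roughly 16 dB or more.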
Whilst this is a functional design, it benefits greatly from two key improvements, as follows (Fig. 7). It will be apparent that, with 5 gain stages and a nested pole, the amplifier’s transient response will tend to be poor. A transconductor (Gm stage) is thus added leapfrogging the 3-stage integrator. The 3-stage integrator is merged with the transconductor via a resistor with a value R=1/Gm. An analysis of this compound structure shows that it combines much of the low frequency gain of the 3 stages and the transient behaviour of the simple transconductor.
A further improvement is to incorporate the passive current filtering network shown between the demodulator and the integrator. Analysis will show that this network has a band-stop current transfer function with zero phase shift at a selected high frequency (Fig. 8), chosen to be the amplifier’s unity gain frequency.
In this example, optimised for an opamp with unity gain bandwidth of 40MHz and a maximum signal frequency of 1 MHz, it is seen that the effect of the filter is to permit a factor 3 reduction in integrator time constant to give 3x loop gain increase at the maximum signal frequency with zero phase loss at the unity gain bandwidth. With suitable optimisation of component values this permits a significant reduction in the value of the integrator time constant without loss of overall loop phase margin, with a corresponding increase in gain and thus reduction in harmonic distortion at the higher end of the signal spectrum. This architecture has been implemented in 0.6um with 5v supply achieving –80dB THD at 500kHz as a 2.5v p-p follower.
5. CONCLUSION
This review paper has identified the issues facing the designer of ADCs, DACs and buffering opamps which are inherently robust in DSM CMOS from 0.5um / 5v down to 0.18um / 1.8v and potentially further, and which achieve resolutions of >12 bits at bandwidths up to several MHz. Architectures which meet these requirements have been discussed.
6. REFERENCES
(successive approximation converters)
1) “A Two-Stage Weighted Capacitor Network for D/A-A/D Conversion” Yee, Terman and Heller, IEEE Jnl. of Solid State Circuits, Vol. 14, pp. 778-781, Aug. 1979
2) “A Low Power 12b Analog to Digital Converter with On-Chip Precision Trimming” de Wit et al., IEEE Jnl. of Solid State Circuits, Vol. 28, pp. 455-461, Apr. 1993
(self-calibration)
3) “A Self-Calibrating 15 bit CMOS A/D Converter” Lee, Hodges and Gray, IEEE Jnl. of Solid State Circuits, Vol. 19, pp. 813-819, Dec. 1984
4) “Architecture and Algorithm for Fully Digital Correction of Monolithic Pipelined ADCs” Soenen and Geiger, IEEE Trans. Circuits and Systems II, Vol. 42, pp. 143-153, March 1995
5) “200mW 1Ms/s 16-b Pipelined Converter with an On-chip 32-b Microcontroller” Mayes et al., IEEE Jnl. of Solid State Circuits, Vol. 31, pp. 1862-1872, Dec. 1996
(pumped switches)
6) “Two-phase Bootstrapped CMOS Switch Drive Technique and Circuit” Singer and Brooks, USP 6118326, Sep. 2000
7) “Very Low-Voltage Digital-Audio Delta-Sigma Modulator with 88dB Dynamic Range Using Local Switch Bootstrapping” Dessouky and Kaiser, IEEE Jnl. of Solid State Circuits, Vol. 36, pp. 349-355, Mar. 2001
(bit trial error correction algorithms)
8) “A 16 bit 500ks/s 2.7v 5mW ADC/DAC in 0.8um CMOS using Error-correcting Successive Approximation” Schofield, Dedic and Kemp, Proc. European Solid-State Circuits Conference, Southampton, 1997
9) “Successive Approximation Type Analog to Digital Converter with Repetitive Conversion Cycles” Dedic and Beckett, USP 5870052, Feb. 1999
10) “Method for Successive Approximation A/D Conversion” Cooper and Bacrania, USP 4620179, Oct. 1986
11) “Analog to Digital Conversion with Multiple Charge Balance Conversions” Cotter and Garavan, USP 5621409, Apr. 1997
12) “Charge Redistribution Analog to Digital Converter with Reduced Comparator Hysteresis Effects” Hester and Bright, USP 5675340, Oct. 1997
13) “Algorithmic Analog to Digital Converter Having Redundancy and Digital Calibration” Kerth and Green, USP 5644308, July 1997
(multibit sigma delta modulators)
14) “An Audio ADC Delta-Sigma Modulator with 100dB Peak SINAD and 102dB DR Using a Second-Order Mismatch-Shaping DAC” Fogleman et al., IEEE Jnl. of Solid State Circuits, Vol. 36, pp. 339-348, Mar. 2001
15) “A 90dB SNR 2.5MHz Output Rate ADC Using Cascaded Multibit Delta Sigma Modulation at 8x Oversampling Ratio” Fujimori et al., IEEE Jnl. of Solid State Circuits, Vol. 35, pp. 1820-1828, Dec. 2000
16) “113dB SNR Oversampling DAC with Segmented Noise shaped Scrambling” Adams, Nguyen and Sweetland, IEEE Jnl. of Solid State Circuits, Vol. 33, pp. 1871-1878, Dec. 1998
17) “Cascaded Sigma-Delta Pipeline A/D Converter with 1.25MHz Signal Bandwidth and 89dB SNR” Brooks et al., IEEE Jnl. of Solid State Circuits, Vol. 32, pp. 1896-1906, Dec. 1997
18) “Tree Structure for Mismatch Noise-Shaping Multibit DAC” Keady and Lyden, Elec. Letters, Vol. 33, pp. 1431-1432, Aug. 1997
19) “A 74dB Dynamic Range 1.1 MHz Signal Band 4th Order 2-1-1 Cascade Multibit CMOS Sigma Delta Modulator” Medeiro et al., Proc. European Solid-State Circuits Conference, Southampton, 1997
20) “Delta-Sigma Data Converters” Norsworthy, Schreier and Temes, IEEE Press, 1997
21) “A Monolithic 19 bits 800kHz Low Power Multibit Sigma Delta Modulator CMOS ADC Using Data Weighted Averaging” Nys and Henderson, Proc. European Solid-State Circuits Conference, pp. 252-255, Southampton, 1996
22) “A Low Oversampling Ratio 14-b 500kHz Delta-Sigma ADC with a Self-Calibrated Multibit DAC” Baird and Fiez, IEEE Jnl. of Solid State Circuits, Vol. 31, pp. 312-320, Mar. 1996
23) “Linearity Enhancements of Multi Bit Delta-Sigma D/A and A/D Converters using Data Weighted Averaging” Baird and Fiez, IEEE Trans. Circuits and Systems II, Vol. 42, pp. 753-762, Dec. 1995
24) “A high Resolution Multi Bit Sigma Delta Modulator with Individual Level Averaging” Chen and Leung, IEEE Jnl. of Solid State Circuits, Vol. 30, pp. 453-460, Apr. 1995
25) “Data-directed Scrambler for Multi-Bit Noise-Shaping D/A Converters” Adams and Kwan, USP 5404142, Apr. 1995
26) “Noise Shaped Multi Bit D/A Converter Employing Unit Elements” Schreier and Zhang, Elec. Letters, Vol. 31, pp. 1712-1713, 1995
27) “A High Resolution Multi Bit Sigma Delta Modulator with Digital Correction and Relaxed Amplifier Requirements” Sarhang-Nejad and Temes, IEEE Jnl. of Solid State Circuits, Vol. 28, pp. 648-660, June 1993
28) “Fourth Order Two Stage Delta Sigma Modulator using both 1 Bit and Multi Bit Quantizers” Tan and Eriksson, Elec. Letters, Vol. 29, pp. 937-938, May 1993
29) “Multi Bit Sigma Delta A/D Converter Incorporating a Novel Class of Dynamic Element Matching Technique” Leung and Sutarja, IEEE Trans. Circuits and Systems II, Vol. 39, pp. 35-51, Jan. 1992
30) “A 50MHz Multi Bit Sigma Delta Modulator for 12 Bit 2MHz A/D Conversion” Brandt and Wooley, IEEE Jnl. of Solid State Circuits, Vol. 26, pp. 1746-1756, Dec. 1991
31) “Current Distribution Arrangement for Realising a Plurality of Currents having a Specific Very Accurately Defined Ratio Relative to Each Other” van de Plassche, USP 4125803, Nov. 1978
(operational amplifiers)
32) “Design of Low-power Low-voltage Operational Amplifier Cells” Hogervorst and Huijsing, Kluwer Academic Pub., 1996
33) “Circuit Techniques for Reducing the Effects of Opamp Imperfections: Autozeroing, Correlated Double Sampling and Chopper Stabilisation” Enz and Temes, Proc. IEEE, Vol. 84, pp. 1584-1614, Nov. 1996
SCALABLE “HIGH VOLTAGES” INTEGRATED CIRCUIT DESIGN FOR XDSL TYPE OF APPLICATIONS
Domenico ROSSI Telecommunication and Peripheral/Automotive Group Wireline Communication Division ST Microelectronics, 20041 Agrate Brianza, V.Olivetti 2, Italy
ABSTRACT
Service providers and telcos are largely adopting ADSL technology to deliver high-speed data communication over traditional copper twisted pair. Continuous growth of this market has led to new requirements for lower cost, higher transmission bandwidth, improved power efficiency and longer reach. Most of these targets depend heavily on the electrical performance of XDSL line drivers and receivers which, for cost reasons, are nowadays often embedded with other functions. This paper describes the most recent advances in semiconductor technology and design techniques specifically adopted to comply with these technical demands. Practical examples of line drivers realized in different technologies and adopting different circuit architectures are also reported.

INTRODUCTION AND TUTORIAL ON SYSTEM REQUIREMENTS
XDSL technology features significant improvements in data transmission compared to traditional analog modems by combining advanced signal processing techniques (digital modulation, digital equalization, error correction, etc.) with high performance analog interfaces. To better understand what such analog interfaces (from the hybrid to the line drivers) ask for and how this translates into specific requirements for semiconductor technologies and design skills, a short tutorial on XDSL system top-level requirements is reported here. For the sake of simplicity, this tutorial is limited to ADSL, but most of the considerations made here are easily extendable to any XDSL transmission.

J. H. Huijsing et al. (eds.), Analog Circuit Design, 43-56. © 2002 Kluwer Academic Publishers. Printed in the Netherlands.
Moreover, the electrical characteristics of an ADSL analog front-end, such as line driver linearity, are particularly stressed in the case of DMT, the modulation/demodulation technique typical of this kind of transmission. As said before, ADSL relies on DMT modulation to carry digital data. For instance, the ADSL spectrum is composed of individual QAM-modulated sub-bands uniformly spaced 4.3125kHz apart in frequency and extending up to 1.1 MHz (see Figure 1-a).
Viewed in the time domain, a DMT signal appears as pseudo-random noise typically having a low rms voltage level (see Figure 1-b), but ADSL line drivers must also be capable of delivering the high voltage peaks that sometimes occur.
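The combination of a low rms level with occasional high peaks can be reproduced with a toy DMT symbol. The sketch below makes several simplifying assumptions that are not from the text: 255 tones of equal unit amplitude on the 4.3125kHz grid, independent random phases in place of real QAM data, and one symbol sampled at 512 points.

```python
import math
import random

TONE_SPACING_HZ = 4312.5
N_TONES = 255      # assumed: tones 1..255, all with unit amplitude
N_SAMPLES = 512    # one symbol period sampled at 512 * 4312.5 Hz


def dmt_symbol(rng):
    """Sum of N_TONES cosines with random phases over one symbol period."""
    phases = [rng.uniform(0, 2 * math.pi) for _ in range(N_TONES)]
    return [
        sum(math.cos(2 * math.pi * (k + 1) * n / N_SAMPLES + phases[k])
            for k in range(N_TONES))
        for n in range(N_SAMPLES)
    ]


def crest_factor(samples):
    """Ratio of peak absolute value to rms value."""
    rms = math.sqrt(sum(x * x for x in samples) / len(samples))
    return max(abs(x) for x in samples) / rms


rng = random.Random(0)
symbol = dmt_symbol(rng)
print("top tone frequency (kHz):", N_TONES * TONE_SPACING_HZ / 1e3)
print("crest factor of one symbol:", round(crest_factor(symbol), 2))
```

The rms value is fixed by the number of tones, while the peak depends on how the phases happen to line up, which is why the driver must be sized for a peak several times larger than the rms level.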
Apart from voltage ratings, intermodulation is another key feature to look at carefully. To preserve signal integrity, the information contained in each sub-band must not be corrupted by any signal from other sub-bands. MTPR (multi-tone power ratio), the relative difference in dBc between the measured power in a sub-band left empty and the power of another sub-band, is the parameter used to quantify this feature. As a consequence, a good line driver is a component featuring high voltage handling capability, high slew-rate and bandwidth and very good linearity.
Summarizing, the minimum specifications required to start the line driver design are:
1. Average power level required on the line (PL),
2. Crest Factor for the modulation chosen,
3. Line impedance assumed for the average power specification (RL),
4. Transmission frequency band (BW),
5. Target harmonic distortion.
The first three of these parameters may be used to compute the initial requirements for any line driver, which at a minimum, has to deliver both the required voltage and current output swings. The maximum required line voltage, VLPP, might be computed by stepping through the following equations
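One standard way to carry out this calculation is sketched below. It assumes PL is specified in dBm (referred to 1 mW), converts it to an rms line voltage across RL, scales by the crest factor to obtain the peak, and doubles the peak to obtain the peak-to-peak value. The crest factor of 5.3 and the 100 ohm line impedance used in the example are illustrative assumptions only.

```python
import math


def line_vrms(power_dbm, line_ohm):
    """RMS line voltage for an average power given in dBm into line_ohm."""
    power_w = 1e-3 * 10 ** (power_dbm / 10)
    return math.sqrt(power_w * line_ohm)


def line_vpp(power_dbm, crest_factor, line_ohm):
    """Maximum peak-to-peak line voltage: VLPP = 2 * CF * Vrms."""
    return 2 * crest_factor * line_vrms(power_dbm, line_ohm)


# Assumed example values: crest factor 5.3, 100 ohm line.
for pl_dbm in (13.0, 20.4):
    print(f"PL = {pl_dbm:4.1f} dBm: "
          f"Vrms = {line_vrms(pl_dbm, 100.0):.2f} V, "
          f"VLPP = {line_vpp(pl_dbm, 5.3, 100.0):.1f} V")
```

Under these assumptions a 20.4dBm line power corresponds to roughly 35 V peak-to-peak on the line, which is what makes the voltage handling of the driver and the choice of hybrid a central design issue.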
The maximum VLPP on the line has to be taken as a primary design goal. For a given VLPP, the voltage handling capability of the line drivers depends on the characteristics of the hybrid used. The hybrid, as anyone familiar with communications over twisted copper pair knows, is the component used to separate Rx from Tx signals, perform line termination, isolate the line from the modem, and optimize, when possible, the power delivered to the line. Even though examples of fully monolithic line transceivers exist [1, 2], most of today's hybrids are transformer based (see Figure 2-a).
The transformer is, in fact, an “almost perfect” component, long used to match the load impedance while meeting the obvious voltage and current constraints of the component driving it (by changing the transformer turns ratio). In practice, for a given VLPP and impedance, the amplifier’s output voltage and current can be traded off with the transformer’s turns ratio. Increasing the turns ratio will not only decrease the required voltage swing (at the expense of a higher output current), but will also allow a lower supply voltage and, in turn, the use of low-voltage components/technologies. There are, however, limitations to increasing the transformer turns ratio. For instance:
- High peak output currents will start to limit the available voltage swing, impacting the power efficiency of the power supply.
- A high turns ratio in transformers can limit bandwidth and be more prone to distortion.
- Often the transformer is in the path of the received signal coming down the line. A high step-up ratio will cause a high step-down for the received signal, impacting the noise characteristics of the RX path and, hence, reach.
Examples of peak voltage and current output requirements for an ADSL line driver vs. different output power and transformer turns ratio are reported in Table 1. It must be noted that 13dBm corresponds to the power transmitted upstream, 20.4dBm to the power transmitted downstream.
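The trade-off between driver voltage and current can be illustrated numerically. The sketch below is not a reproduction of Table 1: it assumes an ideal, lossless 1:n step-up transformer, ignores the termination network, and uses an assumed crest factor of 5.3 and a 100 ohm line; the function name is hypothetical.

```python
import math


def driver_requirements(power_dbm, turns_ratio,
                        crest_factor=5.3, line_ohm=100.0):
    """Peak voltage (V) and peak current (A) on the amplifier side of an
    ideal 1:n step-up transformer, termination network ignored."""
    power_w = 1e-3 * 10 ** (power_dbm / 10)
    v_line_peak = crest_factor * math.sqrt(power_w * line_ohm)
    i_line_peak = v_line_peak / line_ohm
    return v_line_peak / turns_ratio, i_line_peak * turns_ratio


for power_dbm in (13.0, 20.4):
    for n in (1.0, 2.0, 4.0):
        v_peak, i_peak = driver_requirements(power_dbm, n)
        print(f"{power_dbm:4.1f} dBm, 1:{n:.0f}  "
              f"Vpeak = {v_peak:5.2f} V, Ipeak = {i_peak * 1e3:6.1f} mA")
```

Each doubling of the turns ratio halves the voltage the amplifier must swing and doubles the current it must deliver, which is the trade-off that lets lower-voltage technologies be used at the cost of higher output current.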
This said, it is also mandatory to match the line impedance. This can be achieved by adopting either passive impedance synthesis, as implemented in Figure 2-a, or active impedance synthesis (Figure 2-b).
While for passive impedance synthesis a series resistor is generally put at the output of the amplifier (but this resistor dissipates significant power, while the load is fed by just a part of the amplifier’s voltage swing), it is nowadays common practice to use active impedance synthesis to match line impedance while minimizing the maximum output voltage swing and the dissipated power. This driver uses both voltage and current feedback (through R3) to independently set the output impedance Rout and voltage gain G. Calculating output voltage and current, it is possible to determine both the line driver output impedance and the gain given as:
By using active impedance synthesis it is then possible to minimize the output voltage swing of the line driver.
WHAT HIGH VOLTAGE TECHNOLOGY FOR XDSL?
From the beginning, microelectronics has been driven by the insatiable requirement for better performance and lower cost. This translates not only into smaller size for a given function but also into proper integration of different functions on a single semiconductor substrate. Both approaches have often proven feasible and, on a case-by-case basis, economically applicable to different applications and market segments; both have also been adopted for XDSL. Common to the two approaches is the requirement for high-speed components showing high ft and minimum parasitic capacitances even when withstanding high voltages. To better understand what realizing such components implies, it is worth referring, for the sake of simplicity, to the voltage limitations of the npn transistor. [3]
As shown in Figure 3, its collector-to-emitter breakdown voltage with base shorted (BVces, usually made equal to BVcbo) mainly depends on the breakdown voltages of diodes D1 and D2. The net epitaxial layer thickness W1, its resistivity and the reach-through mechanism define the breakdown voltage of D1, while the breakdown of D2 mainly depends on the radius of curvature of the base diffusion. In standard bipolar processes, a higher maximum sustainable voltage is achieved by increasing the thickness and the resistivity of the epitaxial layer. The bigger these two values, the bigger the lateral diffusion of the isolation layer and the bigger the size of all the junction-isolated components. Minimizing the size of all these components means minimizing the epitaxial thickness and the out-diffusion of the buried layer during all thermal steps following the epitaxial growth. In the following, two examples, not mutually exclusive, are reported. Dielectric isolation is another technique often used. It often offers significant advantages over junction-isolated processes for high-speed analog circuits. Trench lateral isolation of SOI bonded wafers drastically improves circuit density for thick epitaxy because the lateral diffusion of the isolation diffusion is eliminated.
Table 2 details the difference between devices featuring the same current density and realized in the two technologies. The junction-isolated PNP transistor’s area is roughly four times the area of a comparable dielectrically isolated PNP, while a junction-isolated NPN is roughly 1.5 times as large as the equivalent dielectrically isolated NPN.
Another way of minimizing the size of high voltage components is the adoption of high voltage DMOS components. Once BVcbo is fixed, the voltage capability of a bipolar technology is, in fact, defined by the breakdown voltage BVceo (emitter-to-collector breakdown voltage with open base). [4] Since BVceo is lower than BVcbo, a bipolar transistor cannot fully exploit the maximum voltage the technology is capable of. On the contrary, a DMOS component (see Figure 4) is capable of working at a breakdown voltage BVdss equal to the BVcbo of the parasitic npn component, provided that the base-to-emitter short circuit is good enough. To some extent, DMOS-capable junction-isolated technologies feature small component size. As an additional advantage (see the following), DMOS technologies can also be made compatible with CMOS transistors, which in turn enables the realization of highly complex mixed ICs.
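The relation between BVceo and BVcbo is commonly written in textbooks as BVceo ≈ BVcbo / β^(1/n), where β is the current gain and n an empirical exponent. The sketch below uses that textbook form as an assumption; the default n = 4 and the example values are illustrative only and are not taken from this chapter.

```python
def bvceo_from_bvcbo(bvcbo, beta, n=4.0):
    """Textbook estimate of the open-base breakdown voltage.

    BVceo ~ BVcbo / beta**(1/n); n is an empirical exponent (assumed here,
    commonly quoted in the range of roughly 3 to 6 for silicon).
    """
    if beta <= 1.0 or n <= 0.0:
        raise ValueError("beta must exceed 1 and n must be positive")
    return bvcbo / beta ** (1.0 / n)

# Illustrative numbers only: an 80 V BVcbo with a current gain of 100.
print(round(bvceo_from_bvcbo(80.0, 100.0), 1))
```

Under these assumptions the usable bipolar voltage is well below BVcbo, whereas a DMOS in the same technology can approach it.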
INTEGRATION OF HIGH VOLTAGE DMOS COMPONENTS INTO SUBMICRON TECHNOLOGIES
Designing highly complex Smart Power ICs requires taking advantage of available low voltage IPs, ideally by adding high-voltage devices to an already existing VLSI process platform. Unfortunately, the evolution of smart power technologies toward finer and finer lithography requires reconciling conflicting requirements: drive-in steps for high voltage power components are usually long and at high temperature, while sub-micron technologies need “low temperature” processing to guarantee good yield and process reproducibility (mainly for thin oxide layers). This has been achieved, for instance, by exploiting innovative technology steps that have made it possible to realize H.V. fully complementary N-Channel and P-Channel DMOS components in a standard VLSI CMOS technology. [5] High voltage lateral DMOS are implemented by realizing the body region by means of a large-angle tilt implantation masked by the gate layer, without requiring any specific thermal treatment. Implant energy and tilt angle depend on the compromise between lateral/vertical junction depth and doping charge, i.e. between the required source-to-drain punch-through voltage and the component threshold voltage (while large tilt angles are more effective in pushing charge into the DMOS active channel, low tilt angles reduce channel charge and length, causing premature punch-through). A 45° angle is usually found to be the best compromise between these two opposite requirements. In BCD6 (0.35um) the N-LDMOS P-body layer is directly embedded in CMOS epi-pockets. Scaling down the gate oxide thickness also requires proper engineering of the LDMOS drain structure. In BCD6, LDMOS and CMOS share exactly the same gate oxide (70nm).
To avoid dangerous crowding of the equipotential lines at the drain side, it is possible to adopt a gate layout stepping over the field oxide, while by changing the doping profile of the N-Well it is possible to properly size the drain extension region. With different DMOS drain solutions, voltage capabilities from 16V to 20V are achievable. When higher operating voltages are required, a dedicated low-doping N-Well is to be added. In this way, breakdown voltages in excess of 60V are achievable. To further increase BVdss, the heavily doped N+ buried layer is replaced with a
low doping buried well and the RESURF technique is adopted. In this case a breakdown voltage in excess of 100V is easily achievable. Table 3 summarizes the main features of the N-Channel Lateral DMOS realized in BCD6 (0.35 um CMOS).
Exploiting the flexibility offered by the large tilt implant technique used to realize the N-channel DMOS P-body region, it is possible to implement an N-type body region to build P-channel DMOS transistors. As a matter of fact, fully complementary N-channel and P-channel components are nowadays available in low voltage semiconductor processes.
COMPLEMENTARY, DIELECTRICALLY ISOLATED BIPOLAR TECHNOLOGY ON SOI
For XDSL applications, SOI and dielectric isolation are nowadays gaining more and more acceptance because of their good characteristics in terms of speed. Minimizing the component size translates automatically into reduced parasitic capacitances. [6] P and N buried collectors are usually formed by ion implantation, after which an n-type epitaxial layer is grown to form the intrinsic collector of the NPN.
A p-well is added to form the intrinsic collector of the PNP. Lateral isolation is achieved by etching trenches down to the buried oxide. The trenches are usually filled with LPCVD oxide and polysilicon. Transistor emitters can be either silicon or poly. Referring again to Table 2, the base-to-collector junction capacitance of the junction-isolated NPN is roughly twice that of the dielectrically isolated NPN, and its substrate capacitance is three times bigger. The same applies to PNPs. Measurements reveal that the cut-off frequency of the bipolar transistors is much higher. Nowadays it is possible to easily obtain NPN and isolated-collector PNP transistors featuring Ft of more than 2 / 6 GHz for NPN and 2 / 4 GHz for PNP.
AN EXAMPLE OF ADSL LINE DRIVER REALIZED IN MIXED BIPOLAR, CMOS, DMOS TECHNOLOGY
Even if most of the line drivers available today for C.O. (Central Office) are realized in fully complementary high-speed bipolar processes, an example of a line driver realized in Multipower BCD (Bipolar, CMOS, DMOS) technology is reported here. The functional diagram is shown in Figure 5.
It consists of a differential gain stage followed by a class AB output stage. The input stage is a simple emitter-coupled pair where low voltage, high speed (ft=7GHz) npn transistors are used to achieve low input-referred noise. The
intermediate stage is a classical Class AB stage used to guarantee high slew rate while featuring low quiescent current. While low voltage npn transistors (cascoded) are still used here for low noise, the unavailability of a pnp counterpart led to the use of PDMOS components. The outputs of the Class AB intermediate stage (Vp and Vn) directly drive the gates of a push-pull common-drain output stage (PDMOS M13 and NDMOS M14). The quiescent current of M13 and M14 is controlled by current mirroring between M12 and M13, closed through the OTA. The key features of this ADSL line driver are reported in Table 4.
AN EXAMPLE OF ADSL LINE DRIVER REALIZED IN HIGH SPEED COMPLEMENTARY SOI TECHNOLOGY
Advanced complementary, SOI-isolated bipolar processes, which sometimes also allow the integration of submicron CMOS, have recently been developed to allow the realization of high performance ADSL line drivers. [7] High voltage (BVces>30V) semiconductor technologies offering transistors with ft in excess of 4GHz for pnp and in excess of 6GHz for npn are, as a matter of fact, available nowadays. In these technologies, current feedback is very often adopted (see Figure 6).
The superior characteristics of SOI in terms of ft and parasitic capacitances easily allow high small-signal bandwidth and slew rate, while small base resistance (often shown in SOI technologies) and reduced biasing current result in low input voltage and current noise. Some key features of a commercially available current feedback C.O. driver realized in SOI are reported in Table 5.
Moreover, examples of SOI technologies allowing also the fabrication of accurate laser trimmed analog filters have been recently announced [XX].
CONCLUSIONS
The analog front-ends (AFE) of XDSL modems are typically partitioned into two technologies. Data converters, analog filters and Rx amplifiers are fabricated in low voltage technologies, while XDSL line drivers employ higher voltage processes. However, the high voltage processes available today, which often embed submicron CMOS components, make it possible to conceive a different system partitioning, with data converters, analog filters and Rx amplifiers integrated together with line drivers. Examples exist of ICs that economically integrate all these functions and are realized either in a fully complementary bipolar technology or in a CMOS/DMOS-centered technology.
REFERENCES
(1) Zojer et al., “A Broadband High-Voltage SLIC for a Splitter and Transformerless Combined ADSL-Lite/POTS Line Card”, ISSCC Digest of Technical Papers, pp. 304-305, Feb. 2000.
(2) Berton et al., “A High Voltage Line Driver (HVLDR) for Combined Data and Voice Services”, ISSCC Digest of Technical Papers, pp. 302-303, Feb. 2001.
(3) P. Antognetti (Editor), “Power Integrated Circuits: Physics, Design, and Applications”, McGraw-Hill, pp. 4.13-4.17.
(4) B. Murari, F. Bertotti, G.A. Vignola, “Smart Power ICs: Technologies and Applications”, Springer, pp. 179-180.
(5) C. Contiero et al., “LDMOS Implementation by Large Tilt Implant in 0.6 BCD Process, Flash Memory Compatible”, Proceedings ISPS’99.
(6) R. Patel et al., “A 30V Complementary Bipolar Technology on SOI for High Speed Precision Analog Circuits”, IEEE BCTV 2.3, pp. 48-50.
(7) M. Cresi et al., “An ADSL Central Office Analog Front-End Integrating Actively-Terminated Line Driver, Receiver and Filters”, ISSCC Digest of Technical Papers, pp. 304-305, Feb. 2001.
SCALABILITY OF WIRE-LINE ANALOG FRONT-ENDS Klaas BULT Broadcom Netherlands B.V. Bunnik, The Netherlands.
ABSTRACT
Analog design in deep sub-micron technologies is a reality now and poses severe challenges to the circuit designer. Trends in technologies as well as their effects on circuit design are discussed. It is shown that, specifically for Wire-Line AFE’s, the power required for a certain dynamic range and bandwidth decreases with minimum feature size as long as a constant ratio between signal swing and supply voltage can be maintained. However, below a certain channel length, predictions of the threshold voltage endanger that requirement.
1. INTRODUCTION
In Wire-Line applications (like Ethernet, Gigabit, Set-Top Boxes, Cable Modem’s, etc.), analog integration in deep sub-micron CMOS has become an economic necessity. Several papers have already discussed the problems and design challenges of analog circuits integrated in purely digital deep sub-micron CMOS technologies [1] - [16]. This paper discusses trends in technologies and their effects on circuit design, specifically focused on Analog Front-End’s (AFE’s) for Wire-Line applications. Emphasis is on the effect of supply voltage scaling on circuit design and performance. After discussing a generic Wire-Line Analog Front-End in section 2, section 3 deals with process scaling. Section 4 then deals with the effect of process scaling on Power Dissipation, and in section 5 experimental data from literature corroborates the findings of section 4. Section 6 puts the previous results in perspective by discussing some details and caveats. In section 7, finally, the scalability of Wire-Line AFE’s is discussed. Section 8 summarizes the conclusions.
J. H. Huijsing et al. (eds.), Analog Circuit Design, 57-70. © 2002 Kluwer Academic Publishers. Printed in the Netherlands.
2. WIRE-LINE ANALOG FRONT-ENDS
Wire-Line IC’s are a typical example of ULSI integration dominated by digital circuitry, with some peripheral analog circuitry. Analog signal processing is usually kept to a minimum. A generic AFE is depicted in Fig. 1. The analog input signal comes either from the wire-line hybrid (as in Ethernet) or through an RF tuner (as in cable applications). Gain, gain control and filtering may or may not be applied, depending on the application. The Track and Hold (T&H) and ADC functions, however, are mandatory and form the core of the AFE.
3. PROCESS SCALING
Of all the aspects of design in deep sub-micron technologies, the scaling of the supply voltage is the most obvious and the one that most severely affects analog circuit design [5, 10, 12, 14, 15]. Fig. 2 shows the 1999 International Technology Roadmap for Semiconductors predicting a maximum 0.6V supply voltage for the year 2010 [17]. To get a feeling for how process parameters have changed over time and will change in the near future, Table 1 gives an overview of 14 different processes, of which only the last two are predictions (data from [15, 17, 18, 21]). The supply voltage, oxide thickness, threshold voltage and matching parameter of these 14 processes are plotted on a log-log scale in Fig. 3. For the larger technologies, the supply voltage stays flat and equals 5.0V. In smaller technologies, it scales roughly linearly with minimum feature size (although it follows a
staircase function). Fig. 3 shows that both oxide thickness and matching scale down linearly with technology. Fig. 3 also shows that the threshold voltage clearly does not scale linearly, but more like a square-root function. The effect of that on voltage headroom is not yet strong, as the threshold voltage is still only 25% of the supply voltage. This might change at smaller feature sizes, as preliminary estimates show the threshold voltage to have a lower limit of approximately 300mV.
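To see why a threshold-voltage floor matters, the figures quoted above can be combined in a small calculation. This is only a rough sketch: it assumes that the 300mV floor and the 0.6V supply predicted for 2010 apply to the same technology node, which the text does not state, and the function name is hypothetical.

```python
def threshold_fraction(v_threshold, v_supply):
    """Fraction of the supply voltage taken up by one threshold voltage."""
    if v_supply <= 0:
        raise ValueError("v_supply must be positive")
    return v_threshold / v_supply

# Figures quoted in the text: a 0.6 V supply predicted for 2010 and a
# threshold-voltage floor of roughly 300 mV (assumed to coincide here).
future = threshold_fraction(0.3, 0.6)
# The text puts the present-day fraction at about 25 %.
today = 0.25
print(f"threshold / supply: today {today:.0%}, at 0.6 V supply {future:.0%}")
```

Under that assumption the threshold would consume about half of the supply, twice today's share, leaving correspondingly less room for signal swing.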
4. VOLTAGE SCALING AND POWER DISSIPATION
Due to the continual down-scaling of the supply voltage over time, the Dynamic Range (DR) requires extra attention in circuit design. It has been shown that, especially in ADC front-end circuitry, matching is more dominant than noise in determining the low end of the dynamic range [4,5]. Matching is reported to scale with oxide thickness, which is clearly visible in Fig. 3. Defining the Dynamic Range (DR) as the ratio between the signal swing and n times the standard deviation of the offset voltage (Fig. 4a), and defining the latter as the matching parameter divided by sqrt(WL) [20,21], where n is the number of sigmas necessary for a certain yield, we find:
with the voltage efficiency being the ratio between signal swing and supply voltage. Fig. 4a depicts the DR and the terms it consists of, where sqrt(WL) (height of lower
shaded area) is adjusted such that a constant DR is obtained over all processes. Using the inverse of (1):
the gate capacitance may be derived:
where the leading factor is a constant depending purely on technology. This capacitance is the gate capacitance of the transistor with a matching requirement to support a certain Dynamic Range (DR). Assuming a maximum signal frequency, the slew-rate current needed to support the signal swing at that frequency is:
If this current is delivered by a driver with a given current efficiency, then the power dissipation P follows:
The relationship described by (5) is depicted in Fig. 4b. The process data of Table 1 is used in Fig. 5, where power (P) is plotted against technology according to equation (5). As can be seen from this figure, the data follows the same shape as predicted by the curves in Fig. 4a and 4b. Expression (5) consists of 4 separate terms. The first term is process dependent only and is mainly dependent on oxide thickness. The second term is the product of voltage efficiency and current efficiency and reflects circuit “smartness”. The third term is a result of yield requirements and the last term depicts the system needs. These components are also visible in Fig. 4b. If constant circuit smartness, yield and system requirements are assumed, power scales down with oxide thickness. From this point of view the future of analog design in deep sub-micron does not seem so bleak. The crux of the above assumption obviously lies in maintaining a constant product of voltage efficiency and current efficiency. As can be clearly seen from Fig. 3, the voltage efficiency will be negatively affected by the fact that the threshold voltage is not scaling linearly with the supply voltage.
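The chain of reasoning in this section (dynamic range, gate area, gate capacitance, slew-rate current, power) can be followed step by step in code. The sketch below is a reconstruction under stated assumptions, not the chapter's equations (1) to (5): it assumes DR = signal swing / (n · sigma), sigma = A_VT / sqrt(WL), a sinusoidal signal whose amplitude equals the signal swing, and therefore a prefactor of 2π that may differ from the original. All parameter names and example values are hypothetical.

```python
import math

def matching_limited_power(c_ox, a_vt, eta_vol, eta_cur, n_sigma,
                           f_sig, dynamic_range, v_dd):
    """Power of one matching- and slew-rate-limited transistor (SI units).

    Assumed model: DR = (eta_vol * v_dd) / (n_sigma * sigma_vt) and
    sigma_vt = a_vt / sqrt(W * L). The 2*pi factor assumes a sine wave
    with amplitude eta_vol * v_dd.
    """
    v_swing = eta_vol * v_dd                         # usable signal swing
    sigma_vt = v_swing / (n_sigma * dynamic_range)   # allowed offset sigma
    gate_area = (a_vt / sigma_vt) ** 2               # W*L from the matching law
    c_gate = c_ox * gate_area                        # gate capacitance
    i_slew = 2.0 * math.pi * f_sig * v_swing * c_gate
    return v_dd * i_slew / eta_cur                   # power drawn from the supply

# Illustrative values: 5 fF/um^2, 5 mV*um, 50 % efficiencies, 3 sigma,
# 100 MHz signal, DR of 64 (6 bits).
args = dict(c_ox=5e-3, a_vt=5e-9, eta_vol=0.5, eta_cur=0.5,
            n_sigma=3.0, f_sig=100e6, dynamic_range=64.0)
print(matching_limited_power(v_dd=1.8, **args))
print(matching_limited_power(v_dd=3.3, **args))
```

In this model the supply voltage cancels out, leaving a product of four groups: C_ox · A_VT² (process), 1 / (η_vol · η_cur) (circuit smartness), n² (yield) and f_sig · DR² (system needs), which matches the four terms described in the text.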
5. EXPERIMENTAL DATA FROM LITERATURE
As a test of the above derivation of circuit performance versus technology, data was gathered from 15 different 6-bit ADC’s [24-38]. An overview of this data is shown in Table 2. A figure of merit for 6-bit ADC’s can be defined as:
and is plotted against technology in Fig. 6. Assuming the majority of the layout scales with technology (i.e. source and drain diffusion areas, contacts and wiring) and (5) predicting the power P to scale with oxide thickness, the figure of merit is expected to scale with technology, as is shown clearly in Fig. 6. Compensating for this effect yields a technology-independent figure of merit, shown in Fig. 6 by the open dots. The best fitting straight line is indeed independent of technology (i.e. has a slope of 0).
6. DISCUSSION
The derivation of Power Dissipation as a function of Technology scaling given in section 4 was done under the assumption that Matching and Slew-Rate are the dominant design issues. Fig. 7 shows the Power Estimate versus Technology based on this assumption (curve a). It also shows 2 other Power Estimates. Curve b) is based on the assumption that Matching and Bandwidth are dominant. It can be shown that this requirement is basically independent of Technology and is currently still significantly less important than Bandwidth and Slew-Rate. Curve c) shows the required Power Dissipation to meet the Thermal Noise specifications. As is also shown by other authors [4,5], the required Power under this condition is currently still several orders of magnitude lower than curve a), but increases with smaller Technologies. Flicker Noise still has to be added to that, but Flicker Noise predictions for future technologies
have proven to be hard. In any case, the effect of Flicker Noise is that curve c) will be raised dramatically and ultimately, Noise will be the dominant requirement as far as Power Dissipation is concerned. The question is how many Technology generations we are away from this point. Moreover, all of the above estimates are the required Power for one single Transistor meeting either the Matching, Slew-Rate, Bandwidth or Noise specifications. To obtain the Power dissipation of a complete circuit, one has to multiply this estimate with the number of Transistors (or rather branches) in the circuit having to comply with these requirements. Moreover, the estimated power dissipation also assumes no circuit tricks such as Dynamic Element Matching [39], Chopping
[39], Auto-Zero Techniques [29, 34] or Averaging [40]. Use of such techniques can reduce the power requirements based on matching by as much as an order of magnitude and will lower curves a) and b) in Fig. 7 equivalently. Noise usually is affected for lower frequencies only and as a result curve c) remains more or less at its place. Although the effect on the current situation is not dramatic, it does however move the cross-over point several Technology generations earlier.
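The way the three estimates of Fig. 7 combine can be sketched as follows. This is an illustrative model only: it assumes that the power needed per branch is set by whichever requirement is most demanding, that matching-improvement techniques divide curves a) and b) by a common factor (up to about 10, per the text) while leaving curve c) untouched, and that the circuit total is the per-branch figure times the number of branches. The function and parameter names are hypothetical.

```python
def required_power(p_matching_slew, p_matching_bw, p_noise,
                   branches=1, matching_reduction=1.0):
    """Circuit power from per-transistor estimates (curves a, b and c).

    matching_reduction models techniques such as auto-zeroing or averaging,
    which lower the matching-based curves a) and b) but not the noise curve c).
    """
    if matching_reduction < 1.0 or branches < 1:
        raise ValueError("matching_reduction and branches must be >= 1")
    matching = max(p_matching_slew, p_matching_bw) / matching_reduction
    return branches * max(matching, p_noise)

# Illustrative per-transistor estimates in watts.
print(required_power(1e-3, 1e-4, 1e-6))                         # matching dominates
print(required_power(1e-3, 1e-4, 5e-4, matching_reduction=10))  # noise takes over
```

The second call shows how lowering the matching curves brings the noise limit into play sooner, which is the earlier cross-over point mentioned above.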
7. SCALING OF WIRE-LINE ANALOG FRONT-ENDS
Consider again the generic block diagram of Wire-Line AFE’s in Fig. 1. The PGA and the LPF, if present, are usually primarily passive
and do not contribute considerably to the overall Power Dissipation. The main blocks to consider are the Track & Hold Amplifier and the ADC. As discussed above, the ADC is a perfect example of a circuit dominated by Matching, and Noise is much less of a problem. Therefore, ADC Power Dissipation will follow curve a) as a result of Technology scaling. Amplifier design is usually not affected by Matching and is usually governed by its Noise requirements. However, as the Load-Capacitance of the Track & Hold circuit is formed by the input capacitance of the ADC and hence is dominated by Matching requirements, T&H Power Dissipation will also follow curve a) as a result of Technology scaling. This leads to the conclusion that Wire-Line AFE’s will require less Power as a result of Technology scaling. Flicker Noise, however, may change that picture at some point in the future.
8. CONCLUSION
Analog design in deep sub-micron technologies has become a reality now and poses severe challenges to the circuit designer. Trends in technologies and their effects on circuit design have been discussed. It has been shown that, specifically for Wire-Line AFE’s, the power required for a certain Dynamic Range and Bandwidth decreases with minimum feature size. This is primarily due to the fact that Wire-Line AFE’s are dominated by the ADC design, which in turn is dominated by Matching requirements, and Matching improves with thinner Oxides. The reduction of power dissipation with Technology scaling is based, however, on a constant voltage and current efficiency. This is where the design challenge lies, as below a certain feature size predictions of the threshold voltage endanger that requirement.
REFERENCES
[1] W. Sansen, “Mixed Analog-Digital Design Challenges”, IEEE Colloq. System on a Chip, pp. 1/1 - 1/6, Sept. 1998.
[2] B. Hosticka et al., “Low-Voltage CMOS Analog Circuits”, IEEE Trans. on Circ. and Syst., vol. 42, no. 11, pp. 864-872, Nov. 1995.
[3] W. Sansen, “Challenges in Analog IC Design in Submicron CMOS Technologies”, Analog and Mixed IC Design, IEEE-CAS Region 8 Workshop, pp. 72-78, Sept. 1996.
[4] Peter Kinget and Michiel Steyaert, “Impact of transistor mismatch on the speed-accuracy-power trade-off of analog CMOS circuits”, Proc. IEEE Custom Integrated Circuit Conference, CICC96, pp. 333-336, 1996.
[5] M. Steyaert et al., “Custom Analog Low Power Design: The problem of low-voltage and mismatch”, Proc. IEEE Custom Int. Circ. Conf., CICC97, pp. 285-292, 1997.
[6] V. Prodanov and M. Green, “Design Techniques and Paradigms Toward Design of Low-Voltage CMOS Analog Circuits”, Proc. 1997 IEEE International Symposium on Circuits and Systems, pp. 129-132, June 1997.
[7] W. Sansen et al., “Towards Sub 1V Analog Integrated Circuits in Submicron Standard CMOS Technologies”, IEEE Int. Solid-State Circ. Conf., Dig. Tech. Papers, pp. 186-187, Feb. 1998.
[8] Q. Huang et al., “The Impact of Scaling Down to Deep Submicron on CMOS RF Circuits”, IEEE J. Solid-State Circuits, vol. 33, no. 7, pp. 1023-1036, July 1998.
[9] R. Castello et al., “High-Frequency Analog Filters in Deep-Submicron CMOS Technologies”, IEEE Int. Solid-State Circ. Conf., Dig. Tech. Papers, pp. 74-75, Feb. 1999.
[10] Klaas Bult, “Analog Broadband Communication Circuits in Deep Sub-Micron CMOS”, IEEE Int. Solid-State Circ. Conf., Dig. Tech. Papers, pp. 76-77, Feb. 1999.
[11] J. Fattaruso, “Low-Voltage Analog CMOS Circuit Techniques”, Proc. Int. Symp. on VLSI Tech., Syst. and Appl., pp. 286-289, 1999.
[12] Daniel Foty, “Taking a Deep Look at Analog CMOS”, IEEE Circuits & Devices, pp. 23-28, March 1999.
[13] D. Buss, “Device Issues in the Integration of Analog/RF Functions in Deep Submicron Digital CMOS”, IEDM Techn. Dig., pp. 423-426, 1999.
[14] A. J. Annema, “Analog Circuit Performance and Process Scaling”, IEEE Trans. on Circ. and Syst., vol. 46, no. 6, pp. 711-725, June 1999.
[15] M. Steyaert et al., “Speed-Power-Accuracy Trade-off in high-speed Analog-to-Digital Converters: Now and in the future...”, Proc. AACD, Tegernsee, April 2000.
[16] J. Burghartz et al., “RF Potential of a 0.18-µm CMOS Logic Device Technology”, IEEE Trans. on Elec. Dev., vol. 47, no. 4, pp. 864-870, April 2000.
[17] Abrishami et al., “International Technology Roadmap for Semiconductors”, Semiconductor Industry Assoc., 1999.
[18] C. Hu, “Future CMOS Scaling and Reliability”, IEEE Proceedings, vol. 81, no. 5, pp. 682-689, May 1993.
[19] B. Davari et al., “CMOS Scaling for High Performance and Low-Power - The Next Ten Years”, IEEE Proceedings, vol. 83, no. 4, pp. 595-606, April 1995.
[20] K. Lakshmikumar et al., “Characterization and Modelling of Mismatch in MOS Transistor for Precision Analog Design”, IEEE J. of Solid-State Circ., vol. SC-21, no. 6, pp. 1057-1066, Dec. 1986.
[21] M. Pelgrom et al., “Matching Properties of MOS Transistors”, IEEE J. of Solid-State Circ., vol. 24, no. 5, pp. 1433-1439, Oct. 1989.
[22] T. Mizuno et al., “Experimental Study of Threshold Voltage Fluctuation Due to Statistical Variation of Channel Dopant Number in MOSFET’s”, IEEE Trans. on Elec. Dev., vol. 41, no. 11, pp. 2216-2221, Nov. 1994.
[23] M. Pelgrom et al., “Transistor matching in analog CMOS applications”, IEEE IEDM Techn. Dig., pp. 915-918, 1998.
[24] K. McCall et al., “A 6-bit 125 MHz CMOS A/D Converter”, Proc. IEEE Custom Int. Circ. Conf., CICC, 1992.
[25] M. Flynn and D. Allstot, “CMOS Folding ADCs with Current-Mode Interpolation”, IEEE Int. Solid-State Circ. Conf., Dig. Tech. Papers, pp. 274-275, Feb. 1995.
[26] F. Paillardet and P. Robert, “A 3.3 V 6 bits 60 MHz CMOS Dual ADC”, IEEE Trans. on Cons. Elec., vol. 41, no. 3, pp. 880-883, Aug. 1995.
[27] J. Spalding and D. Dalton, “A 200MSample/s 6b Flash ADC in 0.6µm CMOS”, IEEE Int. Solid-State Circ. Conf., Dig. Tech. Papers, pp. 320-321, Feb. 1996.
[28] R. Roovers and M. Steyaert, “A 175 Ms/s, 6b, 160 mW, 3.3 V CMOS A/D Converter”, IEEE J. of Solid-State Circ., vol. 31, no. 7, pp. 938-944, July 1996.
[29] S. Tsukamoto et al., “A CMOS 6-b, 200 MSample/s, 3 V-Supply A/D Converter for a PRML Read Channel LSI”, IEEE J. of Solid-State Circ., vol. 31, no. 11, pp. 1831-1836, Nov. 1996.
[30] D. Dalton et al., “A 200-MSPS 6-Bit Flash ADC in 0.6-µm CMOS”, IEEE Trans. on Circ. and Syst., vol. 45, no. 11, pp. 1433-1444, Nov. 1998.
[31] M. Flynn and B. Sheahan, “A 400-MSample/s 6-b CMOS Folding and Interpolating ADC”, IEEE J. of Solid-State Circ., vol. 33, no. 12, pp. 1932-1938, Dec. 1998.
[32] S. Tsukamoto et al., “A CMOS 6-b, 400-MSample/s ADC with Error Correction”, IEEE J. of Solid-State Circ., vol. 33, no. 12, pp. 1939-1947, Dec. 1998.
[33] Y. Tamba and K. Yamakido, “A CMOS 6b 500MSample/s ADC for a Hard Disk Drive Read Channel”, IEEE Int. Solid-State Circ. Conf., Dig. Tech. Papers, pp. 324-325, Feb. 1999.
[34] K. Yoon et al., “A 6b 500MSample/s CMOS Flash ADC with a Background Interpolated Auto-Zero Technique”, IEEE Int. Solid-State Circ. Conf., Dig. Tech. Papers, pp. 326-327, Feb. 1999.
[35] I. Mehr and D. Dalton, “A 500-MSample/s, 6-Bit Nyquist-Rate ADC for Disk-Drive Read-Channel Applications”, IEEE J. of Solid-State Circ., vol. 34, no. 7, pp. 912-920, July 1999.
[36] K. Nagaraj et al., “Efficient 6-Bit A/D Converter Using a 1-Bit Folding Front End”, IEEE J. of Solid-State Circ., vol. 34, no. 8, pp. 1056-1062, Aug. 1999.
[37] K. Nagaraj et al., “A 700MSample/s 6b Read Channel A/D Converter with 7b Servo Mode”, IEEE Int. Solid-State Circ. Conf., Dig. Tech. Papers, pp. 426-427, Feb. 2000.
[38] K. Sushihara et al., “A 6b 800MSample/s CMOS A/D Converter”, IEEE Int. Solid-State Circ. Conf., Dig. Tech. Papers, pp. 428-429, Feb. 2000.
[39] R. v.d. Plassche, “Integrated Analog-to-Digital and Digital-to-Analog Converters”, Kluwer Academic Publishers, Dordrecht, The Netherlands, 1994.
[40] K. Bult and A. Buchwald, “An embedded 240-mW 10-b 50-MS/s CMOS ADC in 1-mm²”, IEEE J. of Solid-State Circ., vol. 32, no. 12, pp. 1887-1895, Dec. 1997.
Reusable IP Analog Circuit Design Jörg Hauptmann, Andreas Wiesbauer, Hubert Weinberger Infineon Technologies, Design Centers Austria GmbH Villach, Austria
ABSTRACT
As ‘time to market’ plays a crucial role in successful System on Chip (SoC) business, all chip companies try to drastically reduce development cycle times. In analog circuit design especially, this is an extraordinarily challenging target. Decreasing supply voltages, along with the fast introduction of new sub-micron technologies and increased performance and functionality, would rather suggest an increase in design effort. But IP reuse can help a lot in reducing development cycle time. A review of possible reuse methods and comments on their feasibility are presented in this paper.
1) INTRODUCTION
In the last 10 years the development and introduction of new sub-micron technologies has been very aggressive, with a new technology released every other year. Today’s sub-micron technologies allow the integration of millions of digital gates on one silicon die, thereby creating the complex SoC designs requested by the market. Because of the cost saving potential, the market demands that existing system solutions be migrated into the most recent and smallest technologies available, while further increasing the on-chip functionality.
J. H. Huijsing et al. (eds.), Analog Circuit Design, 71-88. © 2002 Kluwer Academic Publishers. Printed in the Netherlands.
Not only is the size of the transistors scaled down, but the supply voltage also has to be drastically reduced. Coming from 5V for 0.5u technologies and 3.3V for 0.35u technologies, the voltage has been reduced to 1.8V for 0.18u, or even below for 0.13u and 0.1u (see Fig. 1). Deep sub-micron processes are optimized for digital circuits, making it more difficult for analog designers to shrink designs into more recent technologies. Down to a certain feature size, the threshold voltage decreased almost proportionally with the supply voltage. For smaller feature sizes the threshold voltage decreases more slowly, leaving less room for linear analog voltage swing. In addition,
the specific capacitance is reduced and gds is increased. There are also some beneficial changes, such as increased speed and improved matching properties, which help to implement the analog functionality [4]. All these circumstances however, ask for changing building-block topologies in order to fulfill the specified performance. Since direct shrinking without architectural changes is almost impossible, maintaining an efficient reuse strategy is difficult. Many of these systems, such as xDSL transceivers, Ethernet PHY’s or first IF wireless receivers, need complex analog functions on the same die with complex digital circuits. In other cases the analog
functionality is rather simple, e.g. in micro controllers one or two analog building blocks are sufficient. According to the complexity of the analog functions, different levels of reuse can be defined. Section 2 deals with the reuse of complete analog front-ends (AFE) for SoC designs, basically showing that there is a huge challenge for the system architects and concept engineers to define several SoCs such that the same AFE can be reused without major changes. Another type of reuse, focusing on standard building blocks, is described in Section 3. Here, the strategy is to design one analog building block with some overhead for reusability and make it available to many SoC designers. Very often the reused block is not optimized for the specific application and therefore consumes more power and/or more silicon area than necessary. Whenever the power consumption and silicon area of the analog functionality are much smaller than those of the digital part, this approach seems to be feasible. Limits of this strategy, such as competitiveness, power optimization, performance optimization and the number of necessary reuses, are discussed. Section 4 describes possibilities of reuse for high volume, state of the art designs, where, for reasons of competitiveness, compromises in performance or power consumption are not acceptable. Usually these AFE’s take a significant part of the area and/or power consumption of the SoC. The performance is also typically close to the physical limits of the sub-micron technology used. Thus optimum AFE design is required, challenging the designers to find efficient reuse possibilities. Within each reuse level we will discuss different types of reuse. Plug & play reuse is given if a specific building block can be inserted in the design without changing anything inside the macro. Of course, there might be some programming features to adapt the module to the specific application.
Essentially, the designer does not need a module specific know how. Mix & match reuse, on the other hand, is somewhat less restrictive. A module from a different design is taken as a basis and then adapted to the new requirements. The designer needs to know the building block very well and can change it at the required nodes, e.g. changing aspect ratios or bias currents. While plug & play reuse requires library type of modules with all kind of different views, mix & match reuse can be handled less formal and is based on interpersonal contacts towards IP reuse of the designer. In
terms of quality assurance, mix & match reuse is much more person-specific than plug & play reuse. The ‘time to market’ issue, together with a limited number of available analog resources, leads to a very strong demand for IP reuse in analog circuit design. Additional aspects of cycle time reduction, such as the use of appropriate design tools and the need for innovative project structures, are discussed in Section 5.
2) Reuse of complete analog front-ends for SoC

In this section several examples of the reuse of complex analog front-ends (AFEs) are presented. Several applications allow one analog module (front-end macro) to be defined that can be reused in all of them in the same manner. A macro can even be standardized, as is done for 10/100Base-T Ethernet PHYs. Once a macro has been carefully defined by the system engineers, it can be used in several derivatives of a whole product family. A well-proven example is the standard analog voice macro, which could be used in many different ISDN or plain old telephone service (POTS) applications. A common AFE design specification is possible only with strict discipline in system definition and sometimes also drawbacks in the digital design.
Technology roadmap, supply voltage, functionality and power consumption are only a few of the parameters that have to be aligned across all the different applications. Bug-fixing problems may also occur if several of these projects are carried out concurrently with the development of the macro itself. Figure 2 shows the block diagram of a cable-modem AFE designed for SoC usage [5]. It consists of two downstream channels, one upstream channel, a biasing block, an automatic filter tuning circuit and a low-jitter PLL, designed in 0.18µm CMOS with a single 1.8V power supply. The architecture was defined in such a way that its downstream part also fits a digital terrestrial TV receiver (DVB-T) SoC application [1]. Since the PLL, the central biasing and the filter tuning could also be reused, a total reuse of 95% could be achieved. The most significant changes were the use of a different sampling rate (PLL) and a different filter order for the anti-aliasing filter. This AFE also fits the requirements of a HiperLAN or LMDS SoC application quite nicely, such that a reuse rate of more than 90% would be possible.
Some system solutions have similar architectures by definition, as in the family of xDSL products. It is then possible to adapt the circuits to the new system requirements with little effort. For example, it was possible to design a complete analog front-end for HDSL2 within 2 months by reusing an existing ADSL analog front-end. Fig. 3 shows the architecture of the ADSL front-end chip [2]. From a system point of view, the main difference between the AFEs for ADSL and HDSL2 is the analog bandwidth, which is 1.1MHz for ADSL and 450kHz for HDSL2. Of course, the modulation schemes (PAM for HDSL2 and DMT for ADSL) and the data rates also differ. However, the requirements for the AFEs are nearly the same: 14-bit A/D and D/A converters and harmonic distortion better than 75dBc at half full-scale signals. Table 1 shows the key performance data of both systems.
As can be seen in Table 1, the AGC has the same dynamic range for both AFEs, so no change in the topology was necessary. We only reduced the power consumption, taking advantage of the lower bandwidth. Since this was done simply by reducing the bias current of the opamp, no layout effort was necessary for this adaptation. The PREFI had to be redesigned for the lower corner frequency, and therefore a new layout also had to be done. In the A/D converter, a 3rd-order multibit sigma-delta converter, we just redesigned the OTAs of the integrators for the lower clock frequency, thereby reducing the power consumption dramatically. Only minor layout changes were necessary. In the D/A converter, a 7-bit current-steering DAC, we only optimized the power consumption by reducing the bias current of the opamp, so again no layout effort had to be spent. For the POFI the same effort had to be spent as for the
PREFI. No change was necessary for the line driver either, apart from a change in the supply voltage due to the different output voltage swings. The reuse rate in this case was very high and came close to 80%. Reuse of AFEs always requires concept engineers, digital designers and analog designers to work closely together when specifying the system requirements. For all the mentioned products the necessary changes could be made in a very short time frame, because the analog design team stayed the same within each product family.
3) Reuse of analog standard building blocks

In Section 2 we briefly discussed the reuse of complete analog cores. Another approach to reuse can be found in the building blocks themselves. Here we have to distinguish between standard building blocks, which have moderate performance and can be standardized, and high-performance state-of-the-art building blocks, which have to be designed in a particular way for each project. Standard building blocks that can be designed as ‘ready to use’ modules are, for example, comparators, bandgaps, power-on resets, standard PLLs and oscillators. These basic plug & play building blocks need additional design effort in order to guarantee quality and reusability without requiring special knowledge of the block or even analog design knowledge. Due to this additional effort, the reuse rate (number of reuses per technology) must be larger than 3 for such a library element to pay off. Table 2 shows elementary building blocks and gives an estimate of their reuse number within the same process technology. The numbers are for a mid-size SoC group with approximately 100 employees, including designers, concept engineers, layout and product definition. An especially high potential for saving effort can be seen for standard PLLs and standard ADCs, used for example in microcontrollers as the standard interface to the analog world.
A 10 bit SAR ADC was designed once with a design effort of 15MM. This was the basis for 28 ADC modules with an average effort of about 2MM per module. Fig.4 shows the tree of all these converters. The converters were designed in different technologies by means of mix & match strategy, and were reused in most of these technologies several times in a plug & play manner by digital designers. All these modules have been delivered with very high quality – an important aspect for plug & play macros in digital projects.
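The effort figures above can be put into a quick back-of-the-envelope comparison. The sketch below uses only the numbers quoted in the text (15 MM for the first design, 28 derived modules at about 2 MM each). The from-scratch baseline, in which every module would cost the full 15 MM, is an assumption made purely for illustration, and the function names are hypothetical.

```python
# Effort figures quoted in the text, in man-months (MM).
INITIAL_DESIGN_MM = 15.0      # first 10-bit SAR ADC
DERIVED_MODULES = 28          # modules derived from it
EFFORT_PER_DERIVED_MM = 2.0   # average effort per derived module


def reuse_effort(initial, n_modules, per_module):
    """Total effort when every module is derived from one initial design."""
    return initial + n_modules * per_module


def scratch_effort(initial, n_modules):
    """Assumed baseline: each module is redesigned at the initial cost."""
    return n_modules * initial


with_reuse = reuse_effort(INITIAL_DESIGN_MM, DERIVED_MODULES, EFFORT_PER_DERIVED_MM)
from_scratch = scratch_effort(INITIAL_DESIGN_MM, DERIVED_MODULES)

print(f"with reuse:   {with_reuse:.0f} MM")
print(f"from scratch: {from_scratch:.0f} MM (assumed baseline)")
print(f"saving:       {100 * (1 - with_reuse / from_scratch):.0f} %")
```

The comparison is only as good as the assumed baseline; a real from-scratch redesign in a known topology would likely cost less than the very first design.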
A similar strategy is possible for standard PLLs, used for generating appropriate clocks for digital ICs. The design must have some overhead for flexible programming of the output in order to allow more plug & play reuse of one dedicated design. Digital crystal oscillators and central biasing are candidates for almost 100% reuse. But again, care has to be taken with the grounding strategy of the central bias, which may differ from application to application.
4) Reusable IP in high volume, state of the art designs

In products designed for state-of-the-art technologies (xDSL, cable applications, fiber optics), the probability of finding modules ready for reuse is low. The performance of the blocks has to be close to the physical limits of state-of-the-art circuitry, and the area and power consumption must be fully optimized in order to be competitive.
Too many parameters besides the technology used have to match the specification: bandwidth, signal swing, load, supply voltage and open-loop gain for opamps on the one hand, or bit accuracy, signal bandwidth, clock rate and supply voltage for ADCs and DACs on the other. This makes it hardly possible to reuse blocks in different projects. But this does not mean that there is no reuse at all and that everything has to be designed from scratch. The IP available in analog design teams is usually very substantial and can be reused in all the different blocks.

Opamps: The IP reuse in opamps is very high (about 70%), so that new opamps are ready, including layout, within 2 days. Mathematical documents and schematics from former projects ease the design of new opamps drastically, so that the design can be done within one day. 60% of this effort is simulation, which can be reduced further by using automatic simulation shells. Because the pin structure of opamps has not changed at all (2 inputs and 2 outputs for differential opamps), automatic simulation shells for opamps can always be reused and will remain suitable in the future. The IP in opamp design can also be programmed into commercial tools for automatic design including layout, but this is limited to fixed structures, which may change with decreasing supply voltage and thus quickly limit the capability of such tools. In any case, compared to the overall design effort of a project (about 40 to 60 MM for the initial design and 100 MM until production release), the contribution of the opamp design effort is minor.

Amplifiers, Filters: For designing amplifiers and filters, nearly the same mix & match approach can be used, but the IP reuse is on average lower (40-50%) and the design takes about 1 to 2 man-weeks. Automatic simulations are also not very useful in this case.

Converters: ADCs and DACs are usually the most critical parts in high-performance products (e.g. xDSL, cable modem) and need a lot of design effort. New technologies with low supply voltages, together with state-of-the-art specifications, always require new circuit structures and circuit improvements. Nevertheless, Fig. 5 shows that only a few ADC topologies are commonly used for telecommunication applications.
Sigma-delta converters are widely used in high-resolution, medium-bandwidth applications such as ADSL, HDSL and SDSL, whereas 2-step flash sub-ranging converters are best suited for medium resolution (up to 11 bits) and high bandwidth, as needed in VDSL, cable modem, DVB-T and Gigabit Ethernet. Although each of the mentioned products has more or less different specifications, IP can be reused to a high degree (60 to 70%). Once one converter type has been designed in a typical technology, the design effort and risks are significantly reduced for each additional converter of the same topology and technology. A typical IP reuse is described next: A 2-step flash converter (Fig. 6) with 10-bit effective resolution, a 150MHz sampling rate and a 1.8V supply voltage in 0.18µm technology was designed for a cable modem front-end, using the IP of a version with 5V supply. The identical converter could be reused afterwards in a COFDM project, a terrestrial receiver for digital TV (DVB-T). By introducing oversampling and adding digital filters, the same converter core was again suitable for VDSL with 11-bit effective resolution and 12 MHz
signal bandwidth. Only the driver circuit and the reference buffers had to be optimized for the VDSL requirements.
Fig. 7 shows the layout of this converter type in a) the technology with 5V supply voltage and b) the 0.18µm technology with 1.8V supply voltage. Both converters have the same performance of about 11-bit effective resolution for 12 MHz signal bandwidth. The power consumption could be reduced from 250mW to 180mW, and the silicon area is drastically reduced for the 0.18µm version.
Using available parts of this converter and adapting them to new specifications (different bit resolution, different bandwidth) is a good mix & match approach for easily arriving at converters that are also suitable for a QPSK satellite receiver (DVB-S) or Gigabit Ethernet. It is the same mix & match strategy as for opamps, amplifiers or filters. Similar reuse is possible for multibit sigma-delta converters, needed in xDSL products, with adjustments to different resolutions and bandwidths. The reference design was a 3rd-order multibit sigma-delta converter used in an ADSL-RT (Remote Terminal) chip; see Fig. 8 for the block diagram. A cascaded 2-1 structure with 3-bit resolution in the first stage and 5-bit resolution in the second stage was chosen [3]. The analog bandwidth is 1.1 MHz with 14-bit resolution and a sampling frequency of 26 MHz. The first design was done in a technology with 5 V supply, with an effort of 15 MM.
Then this converter was redesigned for an HDSL2 application with 450kHz bandwidth in the same technology. Due to the smaller bandwidth, the second-stage resolution could be reduced to 3 bits and the sampling frequency was reduced to 16MHz, which resulted in a smaller area and lower power consumption of the converter. The design and layout effort for this converter was only 5MM. The next step was a redesign for an ADSL-COT (Central Office Terminal) application with a bandwidth of 250kHz. Again we changed the structure of the second stage, to 4-bit resolution, and the sampling frequency, to 4MHz. The effort dropped to 3MM. These two reuse designs were done in the same technology; the next step was a technology change for the ADSL-COT converter. Thanks to the very fast new technology we could again change the topology. We increased the sampling frequency to 53MHz and decreased the converter order from 3 to 2 with 3-bit resolution, resulting in a very small silicon area, as can be seen in Fig. 9. Since we made a technology change, each block had to be designed anew, and a completely new layout was also made. The effort for this new converter increased to 7MM.
In summary, Table 3 lists several projects to demonstrate the percentage of IP and schematic reuse by means of mix & match. The percentage of reuse can differ from block to block, from 5% up to 90%. Although the probability of 100% reuse of designed blocks is fairly low, a reuse capability of 60% to 90% in some projects is still very high, achieved just by using available IP, schematics, simulation shells, layout cells, testing facilities etc. and doing the new design by means of mix & match.
5) Additional aspects towards cycle time reduction

Up to now, the paper has focused on cycle time reduction in analog layout and analog design through reuse within these tasks. However, product development speed can also benefit from improved tooling and from speeding up other tasks besides analog layout and analog design. There is high potential for tooling in the support and automation of standard design tasks such as: definition and execution of block-specific simulation runs; efficient (higher-level) modeling of analog building blocks; interactive back-annotation of layout data and re-simulation; and efficient modeling and simulation of substrate effects, thermal coupling of building blocks and packaging impact. Some of the tooling aspects target quality improvement, which can help to make the design right first time. Tools for the automatic design of specific building blocks are not very efficient, due to their restriction to low circuit complexity and predefined circuit topologies and the minor potential for saving design time. As a typical example, the opamp was discussed in Section 4.
For a typical project, Fig. 9 shows the percentage of effort for design, definition, architecture, layout and management with respect to the overall project effort. Design and layout make up about half of the product development effort. Clearly, minimum cycle time can be achieved only by attacking all tasks in the product development. Certainly, a lot of IP reuse is also possible in definition and architectural work.
Reuse strategies require good cooperation within a design team and between different design teams. Thus the human component must be considered as well. Team building, motivation and information flow are essential to make reuse work. Besides reuse and adaptation work, each project should have some innovative parts. This helps to keep engineers motivated and their know-how up to date.
6) Conclusions

Driven by the technology roadmap, increasing system requirements and ‘time to market’ targets, reuse of analog IP is nowadays very important. This does not only mean using plug & play analog modules or macros; it also means IP reuse with a so-called mix & match strategy. Architectural considerations should also not be neglected as an important factor in this strategy. The key enabler for IP reuse is the team spirit within a company, and thus special attention
has to be paid to interpersonal relations. Last but not least, with all the reuse, do not forget to design new and innovative circuits in order to have innovative steps in the product roadmap and to keep up with the leading edge of mixed-signal design.
7) Acknowledgements

Special thanks to B. Seger for the contributions concerning layout, F. Cepl for providing the SAR ADC reuse tree, and R. Schledz for contributing the table of reuse categories. Furthermore, we appreciate the valuable discussions with M. Clara, Ch. Fleischhacker and Ch. Sandner.
8) References

[1] M. Christian et al., “0.35µm CMOS COFDM Receiver Chip for terrestrial Digital Video Broadcasting”, ISSCC 2000, pp. 76-77.
[2] H. Weinberger et al., “A 800mW, Full-Rate ADSL-RT Analog Frontend IC with integrated Line Driver”, CICC 2001.
[3] A. Wiesbauer et al., “A 13.5 Bit Cost Optimized Multi-Bit Delta Sigma ADC for ADSL”, Proceedings of ESSCIRC, September 1999, pp. 82-88.
[4] K. Bult, “Analog Design in Deep sub micron CMOS”, Invited Paper, Proceedings of ESSCIRC, September 2000, pp. 11-17.
[5] A. Wiesbauer et al., “A Fully Integrated Analog Front-end Macro for Cable Modem Applications in CMOS”, submitted to ESSCIRC 2001, unpublished.
PROCESS MIGRATION TOOLS FOR ANALOG AND DIGITAL CIRCUITS

Kenneth FRANCKEN, Georges GIELEN
Katholieke Universiteit Leuven, ESAT-MICAS
Kasteelpark Arenberg 10, B-3001 Leuven, Belgium
e-mail : francken [email protected]

J. H. Huijsing et al. (eds.), Analog Circuit Design, 89-112. © 2002 Kluwer Academic Publishers. Printed in the Netherlands.

ABSTRACT

The rapid progress in CMOS VLSI technologies, together with the shortening time-to-market constraints of a competitive market and the shortage of designers, necessitates the use of computer-aided design (CAD) tools for the automatic porting of existing designs from one technology process to another. Both horizontal and vertical technology porting are considered, where during vertical porting the intrinsically better capabilities of the new process can be exploited to either improve the performance of the circuit, or to keep the same performance while reducing power and/or chip area consumption. This paper presents CAD techniques for the automatic porting of both analog and digital circuits. Both the circuit resizing and the layout regeneration are discussed. For the circuit resizing, a scaling step is followed by a finetuning step. For the layout regeneration, a template-based approach is suggested. Experimental results illustrate the capabilities of the presented methods. Finally, the importance of proper design documentation will be stressed as a necessary means to facilitate easy technology porting.

1. INTRODUCTION

Advances in very deep submicron CMOS VLSI integrated circuit processing technologies offer the possibility to integrate more and more functionality on one and the same die, enabling today the integration of complete systems that before occupied one or more printed circuit
boards onto a single piece of silicon. An increasing part of these integrated systems contains digital as well as analog circuits, in application areas such as telecommunications, automotive and multimedia, among others. The growing complexity of these integrated systems, in combination with the tightening time-to-market constraints, however, poses a serious challenge to the designers’ productivity. That is why new design methodologies are being developed, such as platform-based design, object-oriented system-level design refinement flows, hardware-software co-design, and IP reuse, on top of the already established use of CAD tools for logic synthesis and digital place & route. For analog circuits, however, the basic level of design abstraction is still the transistor level, although commercial CAD tool support for cell-level circuit and layout synthesis is emerging [1], allowing designers to concentrate more on the high-level architectural design issues and on the design of key critical blocks only. One serious problem that challenges both analog and digital designers is the extremely fast pace at which new, ever deeper submicron CMOS technologies are introduced, at a rate that is even faster than the predicted technology roadmaps [2]. Before any new process can be used, however, a library of digital standard cells or selected IP blocks, such as a processor core or a memory generator, has to be developed and qualified. Developing this from scratch is very time-consuming and expensive, and delays the production use of the new process. At the same time, many existing analog and digital blocks are reused in new system designs for new applications, or in newer versions of an existing system that is redone in a newer process to reduce cost. The effort that is needed to guarantee at least the same performance for these blocks in the new technology, however, is not negligible and is not at all regarded as very creative by designers.
Computer-aided or even automated technology porting of integrated circuit blocks, both analog and digital, is therefore getting more and more attention today. Two types of process migration or process retargeting can be distinguished, as shown in Fig. 1. The first one is called horizontal porting where the same cell performance has to be obtained in a process with the same minimum transistor length but from a different foundry (e.g. CMOS of company ABC to CMOS of company EDF). The second one is called vertical porting where the same cell performance or better has to be obtained in a process with a smaller minimum transistor length from the same or another foundry (e.g. CMOS of company GHI to CMOS of company JKL). For the vertical porting, the intrinsically better
capabilities of the new process can be exploited to either improve the performance of the circuit, or to keep the same performance while reducing power and/or chip area consumption.
This paper will discuss techniques for the automatic process porting (both vertical and horizontal) of both analog and digital cells. Both advantages and limitations will be presented. Section 2 will describe a possible flow for an automatic porting tool, distinguishing between the sizing retuning phase and the layout retargeting phase. Section 3 will then illustrate this for an analog design case (a ΔΣ modulator), while Section 4 will illustrate this for the porting of a digital standard cell library. Guidelines or measures to be taken into account during design to facilitate easy porting of that design later on will be discussed in Section 5. Finally, conclusions will be drawn in Section 6.
2. PORTING METHODOLOGY While digital circuits can often be retargeted to a new technology by geometrically scaling the layout, this procedure does not automatically guarantee success for analog circuits and not even for digital standard cells. This is due to the different scaling needed for different components in the circuit to keep at least the same performance. We can distinguish two steps in the technology porting task (see Fig. 2) : 1) the circuit tuning or resizing in which the device sizes and biasing are modified such that at least the same performance is obtained in the new process, and 2) the regeneration of the layout with the new layout rules and the updated device sizes. Since new technology parameters can influence the circuit performances, it is imperative to perform simulations at the circuit level to verify the correct performance of the circuit after tuning and after layout. Both the tuning and the layout steps will be further discussed in the next sections. Note that for a more complex circuit, like
an analog-to-digital converter, this process will be performed hierarchically, first at the level of the converter and secondly at the level of the circuit blocks, as will be illustrated for the ΔΣ modulator later on.
In the remainder of this paper, we assume that the new technology process (called the target process) is compatible with the original process (called the source process) in the sense that we can use the same circuit topology. If this is not the case, for instance because the target process has a supply voltage that is much lower than the source process (e.g. 1.8 V versus 3.3 V), then new topology structures (e.g. low-voltage structures) will have to be used and the design becomes a completely new design instead of the process migration of an existing design. We exclude such cases here. The largest difference between the porting of a design on the one hand and the creation of a new design, be it by manual handcrafting or with a circuit and layout synthesis tool, on the other hand, is that, in the case of porting, a good reference design exists that serves as a basis to start from. This is not the case with a new design that basically starts from scratch.
In the case of porting, the existing design has already proven to be working, and only needs updating in the sense of slight modifications to the device sizes and a regeneration of the layout tailored to the new layout technology rules and the tuned device sizes. In many cases the designers even prefer the new layout to look very much like the old one, which makes it easier for them to “read” the new layout. Therefore, advantage can be taken of both the existing device sizes and the existing layout to reduce the complexity of the porting task. This will be discussed in detail in the next section.
3. PORTING OF ANALOG CELLS

3.1. Sizing step

For the porting of analog cells, we perform the resizing in two steps. Keeping the same topology, the first step is to perform an initial scaling of the original design, which gives us a starting point already close to the final solution in the target technology process. Following this step, a finetuning phase using optimization takes place to correct for possible violations of certain performance specifications or to reduce power and/or area while keeping the same performance. This is graphically illustrated in Fig. 3. Both steps can be automated, although they require some information from the original design.
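The two-step resizing described above can be sketched as a small control flow. This is only an illustration: the function `port_sizing`, its callable arguments and all numbers in the toy stand-ins are hypothetical and do not come from the authors' tool.

```python
def port_sizing(source_sizes, scale, simulate, specs_met, finetune):
    """Two-step resizing: initial scaling, then finetuning only if needed."""
    sizes = scale(source_sizes)          # step 1: initial scaling
    if specs_met(simulate(sizes)):       # verify by circuit simulation
        return sizes, "scaled only"
    sizes = finetune(sizes)              # step 2: optimization-based finetuning
    return sizes, "finetuned"


# Toy stand-ins (made up, for illustration only).
def toy_scale(sizes):
    return {name: 0.5 * w for name, w in sizes.items()}


def toy_simulate(sizes):
    return {"gain": sizes["W1"] / 10.0}


def toy_specs_met(perf):
    return perf["gain"] >= 6.0


def toy_finetune(sizes):
    return {**sizes, "W1": 60.0}


source = {"W1": 100.0, "W2": 40.0}       # transistor widths in um
sizes, how = port_sizing(source, toy_scale, toy_simulate, toy_specs_met, toy_finetune)
print(sizes, how)
```

In this sketch finetuning runs only when a specification is violated; as the text notes, it can also be run on a passing design to reduce power and/or area.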
3.1.1. Initial scaling

The first step in the resizing process is an initial scaling. The existing netlist is linked to the new technology file and all transistor model parameters are updated. Then the biasing currents and transistor widths W (relative to ) are altered. If the supply voltage stays the same and bias currents and W's are scaled, we choose not to alter the bias voltages. The scaling factor is determined by writing down the current equations (we neglect the Early effect) under constant constraint. (Note that other scaling factors are obtained if other constraints are used.) If index A stands for the source process and index B indicates the target process, we get:
Or: with
and The minimal transistor lengths of the two processes are known. The parameters KP (for both nMOS and pMOS) and $V_T$ (consisting of $V_{T0}$ and a term dependent on the body effect) are, on the contrary, typically not given in the technology file of deep-submicron CMOS processes. They can be obtained by simulating a test circuit in SPICE and then fitting the output data. For a digital circuit we can take $V_{GS} = V_{DD}$ but for analog circuits this is dependent on the sizing. We assume that the supply voltage is the same in both technologies and we assume that the transistors will be designed with the same gate overdrive voltage in the new technology, which is equivalent to making equal to 1. Hence, we get for the scaling of the transistor widths :
with The numerical value of these factors is different for nMOS and pMOS transistors. Capacitor values on the other hand are scaled depending on the function
of the capacitor. To keep poles and zeros at the same frequency, their value is kept constant, and their area is updated according to the new per-unit capacitance value. When the matching of the capacitors is important (as for sampling and integration capacitors in switched-capacitor circuits), the scaling is performed according to the new mismatch data (i.e. keep the ratio, but alter the size to meet the matching specification). This depends on the function of the capacitor in the original design, and this information should therefore be available from the original designer (see Section 5). Capacitors can cause even more difficulty during porting when the same type of capacitor implementation (e.g. poly-poly capacitor) is not present in the new technology. New CMOS deep-submicron processes will always be available first in a digital-only version. Analog extensions or analog options to this technology (like a poly-poly capacitor) will only be available in a later phase, if at all. Therefore, if the original design was implemented with poly-poly capacitors, these might now have to be replaced with metal-sandwich (MiM) capacitors (if linearity is important) or MOS gate capacitors (otherwise, because of their small area). Similar problems can arise for other passive components like resistors, and certainly for on-chip spiral inductors, which have to be regenerated completely based on the new technology data (the substrate resistivity is especially important).
Table 1 summarizes the initial scaling factors for different device parameters. Applying these formulas results in a sized circuit in the new process. The circuit is verified with numerical simulations (e.g. SPICE), which yield numerical data for all performances of concern. We can then compare these performances with the specifications to see whether they are all satisfied or not. If they are, the whole porting process has finished successfully, unless we additionally want to reduce power and/or chip area. In that case, and in the case where not all the specifications are satisfied, we have to continue with the finetuning step.
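As a rough illustration of the initial scaling step, the sketch below applies the simple square-law current model, I = (KP/2)(W/L)(V_GS - V_T)^2, and assumes equal gate overdrive together with an unchanged drain current in both processes. That constraint, the function names and all numerical parameters are assumptions chosen for illustration and are not taken from the original design; other constraints lead to other scaling factors. The capacitor helper follows the rule stated above: the capacitance value is kept and the area follows from the new per-unit capacitance.

```python
def scale_width(w_a, l_a, l_b, kp_a, kp_b):
    """Width in target process B for the same current and overdrive as in A.

    From KP_A * W_A / L_A = KP_B * W_B / L_B (square law, equal overdrive).
    """
    return w_a * (kp_a / kp_b) * (l_b / l_a)


def capacitor_area(c_value, c_per_area):
    """Area needed to realize c_value with a given capacitance per unit area."""
    return c_value / c_per_area


# Hypothetical numbers, not taken from any real process.
w_b = scale_width(w_a=100.0, l_a=0.5, l_b=0.25, kp_a=100e-6, kp_b=200e-6)
print(f"scaled width: {w_b:.1f} um")

area_a = capacitor_area(3e-12, 1.0e-15)   # 3 pF at 1.0 fF/um^2
area_b = capacitor_area(3e-12, 1.5e-15)   # same 3 pF at 1.5 fF/um^2
print(f"capacitor area: {area_a:.0f} um^2 -> {area_b:.0f} um^2")
```

Because KP and the channel length differ for nMOS and pMOS devices, the width factor would be evaluated separately for each device type.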
3.1.2. Finetuning step

After the initial scaling, the next step is to simulate the circuit blocks, verify their performances and, if necessary, adjust some device sizes. This problem can be cast as an optimization problem in which an optimization algorithm minimizes a cost function. This cost function consists of terms that penalize any violations of the performance characteristics compared to the original design, and possibly in addition contains the implementation cost of the solution, i.e. power consumption and/or chip area, that has to be minimized. The optimization variables are the device sizes and biasing. In the mathematical formulation, a penalty function assumes a large value when a performance does not satisfy its specification, and weighting coefficients set the relative importance of the individual terms. Compared to circuit optimization during synthesis [1], the optimization algorithm used here is preferably a local method, since a good starting point is already available from the original design after the scaling. In addition, the method can be sped up further using information from qualitative reasoning, which indicates in a tabular format, called a dependency matrix, which parameter has to be changed to improve a certain performance [3]. This is close to the human way of thinking. The information in the table can, for instance, be generated using sensitivity analysis, and the different possible parameter changes are prioritized according to their impact on the violated specifications, while also considering their (possibly negative) impact on already satisfied specifications.
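The finetuning formulation can be sketched as follows. This is a minimal illustration under stated assumptions: the quadratic penalty, the weights `w_pen` and `w_impl`, the coordinate pattern search standing in for the local optimizer, and the toy performance model replacing the circuit simulator are all hypothetical choices, not the authors' implementation.

```python
import math


def penalty(value, spec):
    """Zero when value >= spec; grows quadratically with the relative shortfall."""
    shortfall = max(0.0, (spec - value) / abs(spec))
    return 1e3 * shortfall ** 2


def cost(x, performances, specs, impl_cost, w_pen=1.0, w_impl=1.0):
    """Weighted sum of spec-violation penalties and implementation cost."""
    perf = performances(x)
    pen = sum(penalty(perf[name], spec) for name, spec in specs.items())
    return w_pen * pen + w_impl * impl_cost(x)


def local_search(x0, f, step=0.1, shrink=0.5, tol=1e-4, max_iter=10_000):
    """Coordinate pattern search: a simple local method started from x0."""
    x, fx = list(x0), f(x0)
    for _ in range(max_iter):
        if step <= tol:
            break
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = list(x)
                trial[i] = x[i] * (1.0 + delta)
                f_trial = f(trial)
                if f_trial < fx:
                    x, fx, improved = trial, f_trial, True
        if not improved:
            step *= shrink
    return x, fx


# Toy stand-ins for the circuit simulator (made up, x = [W in um, I in mA]).
def toy_performances(x):
    w, i = x
    return {"gbw": 10.0 * math.sqrt(w * i), "gain": 50.0 * math.sqrt(w / i)}


def toy_impl_cost(x):
    w, i = x
    return i + 0.01 * w          # power proxy plus area proxy


SPECS = {"gbw": 100.0, "gain": 100.0}   # made-up minimum requirements


def f(x):
    return cost(x, toy_performances, SPECS, toy_impl_cost)


x_start = [20.0, 4.0]            # pretend result of the initial scaling
x_best, f_best = local_search(x_start, f)
print(f"cost: {f(x_start):.2f} -> {f_best:.2f}, sizes: {x_best}")
```

A local search is sufficient here only because the scaled design is assumed to be close to a feasible solution; a dependency matrix, as described above, would replace the blind coordinate sweep with prioritized parameter changes.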
3.1.3. Example: resizing of a ΔΣ modulator
The approach is now illustrated for the resizing of a ΔΣ modulator designed for ADSL specifications, from the Alcatel Microelectronics 0.5 µm CMOS technology (C05) with analog options (poly-poly capacitor) to the 0.35 µm digital CMOS technology (C035) of Alcatel Microelectronics. As no poly-poly capacitors could be used in this target process, a 5-metal-layer sandwich capacitor was chosen instead for the sampling and integration capacitors.
Circuit structure. The ADSL specification requires an accuracy of 12 bits, but the goal of the prototype was set higher, namely a 13-bit accuracy. This means a dynamic range of 80 dB. The required signal bandwidth for ADSL is 1.1 MHz. Furthermore, an oversampling ratio (R) of 24 was chosen for the original design, resulting in a sampling frequency of 52.8 MHz. A 2-1-1 (4th-order) cascade structure was selected for the original implementation, as shown in Fig. 4. The complete block diagram is shown in Fig. 5. More details on this original design can be found in [4].
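The two numbers quoted above follow from standard relations, shown here as a small check. The sketch assumes the usual ideal-quantizer rule of 6.02 dB per bit plus 1.76 dB and a sampling frequency of twice the bandwidth times the oversampling ratio; the function names are illustrative.

```python
def dynamic_range_db(bits):
    """Ideal dynamic range of an n-bit converter for a full-scale sine wave."""
    return 6.02 * bits + 1.76


def sampling_frequency_hz(bandwidth_hz, oversampling_ratio):
    """Sampling frequency for a given signal bandwidth and oversampling ratio."""
    return 2.0 * bandwidth_hz * oversampling_ratio


dr_13bit = dynamic_range_db(13)                 # roughly 80 dB
fs_adsl = sampling_frequency_hz(1.1e6, 24)      # 52.8 MHz
```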
In the original design, the size of the sampling capacitor of the first stage was determined by the required kT/C noise floor. From the kT/C noise requirement at OSR=24, the minimum size of C turned out to be 3 pF. The capacitor size of the last stage was mainly determined by matching considerations. A matching of 1% was sufficient for these capacitors, so a unit capacitance of 0.25 pF was chosen, which matches to better than 1%. The other capacitors were scaled down from the first to the last integrator. This means that the capacitive loads of all the integrators are reduced, and the OTAs were scaled down as well over the different stages. From behavioral simulations a minimum OTA gain of 80 dB, a closed-loop OTA pole of 190 MHz, a minimum slew rate of 400 V/µs and a maximum switch resistance (see Table 2) were derived. For the comparator the requirements for offset and hysteresis were maximum 100 mV and 40 mV, respectively. All building block specifications are summarized in Table 2.
We will now discuss the integrator and the comparator. We used 3 scaling factors to perform the initial scaling: scale_n (for nMOS transistors), scale_p (for pMOS transistors) and scale (for capacitances and bias currents). All three have been calculated using Table 1 as 0.54065, 0.51639 and 0.7 respectively.
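As a minimal sketch of the initial scaling step, the three factors quoted above can be applied to a design description as follows. The sketch assumes that scale_n and scale_p act on transistor widths (Table 1 is not reproduced here), and the device names and sizes are hypothetical; only the three factors and the 2.5 mA bias current come from the text.

```python
SCALE_N = 0.54065   # nMOS transistors
SCALE_P = 0.51639   # pMOS transistors
SCALE = 0.7         # capacitances and bias currents


def initial_scaling(transistors, capacitors_pf, bias_currents_ma):
    """Return scaled copies of the device sizes, capacitors and bias currents."""
    scaled_transistors = {}
    for name, (kind, width_um) in transistors.items():
        factor = SCALE_N if kind == "nmos" else SCALE_P
        scaled_transistors[name] = (kind, width_um * factor)
    scaled_caps = {name: c * SCALE for name, c in capacitors_pf.items()}
    scaled_bias = {name: i * SCALE for name, i in bias_currents_ma.items()}
    return scaled_transistors, scaled_caps, scaled_bias


# Hypothetical widths and capacitor value, for illustration only.
devices = {"M_in": ("nmos", 100.0), "M_load": ("pmos", 200.0)}
new_devices, new_caps, new_bias = initial_scaling(
    devices, {"C_x": 1.0}, {"I_ota": 2.5})
```

With these factors the 2.5 mA OTA bias current becomes 1.75 mA, the value quoted for the integrator after the initial scaling.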
The integrator. Table 3 illustrates the specifications and the performances of the integrator building block in the source and the target process after the initial scaling step. As can be seen, the switch resistance specification is violated, and therefore a second step, namely the finetuning phase, is necessary to correct for this violated specification. The finetuning is also used to reduce the power consumption of the integrator, as Table 3 shows that there is margin on performances like GBW and slew rate. The schematic of the employed OTA, a gain-boosted differential folded-cascode, is shown in Fig. 6 (without common-mode feedback or biasing circuitry).
The sizes for the OTA, for the gain-boosting stages and for the switched-capacitor integrator, together with their scaling factors, are shown in Tables 4, 5 and 6 respectively. Note that, due to the finetuning, the effective scaling factors are different from the initial ones, precluding a simple geometric scaling. The bias current of the OTA was changed by a factor 0.7 from 2.5 mA to 1.75 mA. Due to the different parasitics of the metal-sandwich capacitor, Cload was changed from 18 pF to 12 pF. The 3 other integrators were scaled in the original design by factors 0.5, 0.35 and 0.35 compared to the first integrator (due to the sampling capacitances decreasing in each stage). The same factors were applied during the porting.
Part of the qualitative dependency matrix that can be used to finetune the performance is shown in Table 7. Possible parameters to tune the performance are the bias current and the width of the input transistors, as illustrated by the qualitative numbers in the table.
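How such a dependency matrix can prioritize parameter changes is sketched below. The matrix entries are hypothetical qualitative numbers (they are not the values of Table 7), and the scoring rule, which adds the benefit for violated specifications and subtracts the harm to satisfied ones, is only one plausible reading of the prioritization described in the text.

```python
# Qualitative effect of INCREASING a parameter on each performance:
# positive improves the performance, negative degrades it (hypothetical values).
DEPENDENCY = {
    "GBW":       {"ibias": 2, "w_in": 1},
    "slew_rate": {"ibias": 2, "w_in": 0},
    "power":     {"ibias": -2, "w_in": 0},
}


def rank_parameters(matrix, violated, satisfied):
    """Rank parameters by benefit to violated specs minus harm to satisfied ones."""
    scores = {}
    for perf in violated:
        for param, effect in matrix[perf].items():
            scores[param] = scores.get(param, 0) + effect
    for perf in satisfied:
        for param, effect in matrix[perf].items():
            if effect < 0:                      # only count negative side effects
                scores[param] = scores.get(param, 0) + effect
    return sorted(scores, key=lambda p: scores[p], reverse=True)


ranking = rank_parameters(DEPENDENCY,
                          violated=["GBW"],
                          satisfied=["slew_rate", "power"])
```

In this example the input transistor width is ranked ahead of the bias current, because raising the bias current would help GBW but hurt the already satisfied power budget.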
An additional finetuning step was performed to correct for the violated switch resistance specification (see Table 2) and to additionally reduce the power consumption by means of the OTA bias current, while keeping the performances at least equivalent to the ones in the original design. The scaling factors of the switches are determined to be 0.6758 and 0.6455 for nMOS and pMOS transistors respectively (see Table 6). This results in the same switch on-resistance as in the original design. As can be seen from Table 3, the limiting performance was the slew rate. If we want to make the slew rate in C035 equal to that of the original design in C05, the current can be reduced by about 15%, down to 1.5 mA. After comparing the simulated results of this final design with the performances of the original design, we can see that the tuned version performs at least as well as the original design with a 40% lower power consumption for the integrator.
The comparator. Table 8 illustrates the specifications and the performances of the comparator building block for both processes. The schematic of the comparator is shown in Fig. 7. Table 9 summarizes all device sizes for both the source and the target process, together with the effective scaling that was applied. To avoid kickback noise, the input of the comparator is sampled on a 0.25 pF capacitor, the value of which is left unaltered. The design variable Ibias was scaled as well (from an original value of 100). As in the original design, the second comparator is a scaled version of the first one to reduce the load on the C2a clock signal; the third one is identical to the second one.
3.2. Layout step
Analogous to the sizing problem, we want to take as much advantage of the original layout as possible to synthesize the new layout based on the new device sizes. This means that we want to generate the new layout as much as possible with the original layout as guide or reference, called a “template”. The preferred layout approach here is therefore “template-driven” layout. There are however a few practical limitations. One of them is that, even with the original layout at our disposal, we must be able to automatically recognize all the devices of the original circuit and their interconnections on that layout. This is a very complex task that to some extent is also performed in LVS tools, but these tools don’t provide the full information needed to build up a template from an existing layout. Another practical inconvenience that one is likely to encounter when trying to recognize and resynthesize devices is a possibly different technological implementation of certain devices. For analog circuits this can be the case for special resistor and capacitor layers available in one technology but not in the other. For both analog and digital circuits the number of metal layers (used for interconnection) can differ between the two processes. We will distinguish between the top-level layout or floorplan, which is needed for more complex cells, and the layout regeneration of the basic cells.
3.2.1. The floorplan
For more complex cells like a ΔΣ modulator, the layout is generated hierarchically according to a floorplan that is defined first. The layout of the floorplan is an important step which has an impact on all blocks that are part of it. Mostly, a lot of reasoning has preceded the final floorplan of the original layout. It is therefore only logical to reuse the original floorplan in the new target process, or more specifically: to keep the relative positions of the building blocks, and to keep the aspect ratios of the building blocks. All of this should be done as far as practically manageable.
3.2.2. Template-based cell layout generation
Once the floorplan has been determined, the layout of the different blocks can be generated accordingly. In order to take advantage of the existing layout as much as possible, a template-based approach is preferred in this case [5,6]. The layout then looks like the original layout, and the parasitics will likely be similar as well (in a scaled sense). The template fixes the relative position and interconnection of the devices. The layout is then completed by correctly regenerating the devices (with the possibly updated device sizes) and the wires for the new process according to this fixed geometric template, thereby trying to use the area as efficiently as possible. These approaches work best when the changes to the circuit’s device sizes are not too large, so that there is little need for global alterations in the general circuit layout structure and hence the existing template can be used. Fig. 8 shows for example three different instances of the layout of a circuit generated with a template-driven layout methodology [7]. The main problem, however, as already stated above, is the automatic extraction of the template from an existing layout. Most template-based approaches published in the past [5,6] generated a template a priori for every circuit and stored it in some library, to be used during layout generation. If no such template is available, it will have to be extracted from the layout, which is much more difficult. In practice, this will often be the case, unless the design has been documented properly, as will be discussed in section 5 later on.
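The idea that a template fixes the relative position of the devices while the devices themselves are regenerated can be illustrated with a deliberately simplified, one-dimensional sketch. The function name, the device names and all dimensions are hypothetical; real template-driven layout tools handle two-dimensional placement, wiring and design rules.

```python
def regenerate_row(template, widths, spacing):
    """Place devices left to right in the order fixed by the template.

    Returns {device: (x_left, x_right)} for the given device widths and
    the minimum spacing of the target process.
    """
    x = 0.0
    placement = {}
    for name in template:
        placement[name] = (x, x + widths[name])
        x += widths[name] + spacing
    return placement


template = ["M1", "M2", "M3"]                      # relative order from the original layout
source = regenerate_row(template, {"M1": 10.0, "M2": 10.0, "M3": 20.0}, spacing=1.0)
target = regenerate_row(template, {"M1": 5.4, "M2": 5.4, "M3": 10.8}, spacing=0.7)
```

The relative order of the devices is identical in both placements; only the device extents and the spacing change with the process.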
4. PORTING OF DIGITAL CELLS
Digital standard cell libraries are a key element in every modern VLSI design flow. The most important issues are compactness and speed of the cells. Therefore, the performance of these cells and their layout are individually tuned. This job is not only complex, but also very time-consuming, considering that this is mostly handcrafted work. Of course, this only needs to be done once for every technology. But a new optimized library is also needed when multiple foundries are used for reasons of multi-sourcing, or when different flavors of the same process (e.g. with or without germanium option) are used. Let us also keep in mind that new and smaller feature-size technologies are becoming available at an increasingly faster rate [2] and that even “older” processes get tweaked over time to increase performance and yield. On the other hand, market pressure demands quick product introductions, and the availability of the standard cell library is therefore often a bottleneck in adopting a new technology today. It would therefore be beneficial to have very quick access to a first version of the new library, generated by the computer from an existing previous library, which can still be tuned manually afterwards if the need arises to squeeze out the last square micrometer. We will now discuss such a porting methodology for digital standard cell sizing based on a genetic algorithm. Since we use a SPICE-level circuit simulator for the transient delay simulations of the cells, accurate performance results are guaranteed. Our approach is optimization-based in combination with SPICE simulations, as this is the only approach that provides the necessary accuracy for the library cell performances, similar to what is normally used in hand-crafted sizing.
The optimizer therefore iterates over different values of the device sizes to tune the cell’s performances to the required specifications while minimizing cost such as power and/or chip area. At each iteration transient SPICE simulations are performed to extract the desired performance characteristics (propagation delays, rise or fall times, etc.). To this end, parameterizable netlist descriptions for the different cells have been developed. These descriptions use standard SPICE syntax, and the desired performances are also represented in each netlist as measured variables, which are automatically parsed by a tool implementing the porting approach. For the porting itself, a user can choose to keep the performances in the target process equal to those in the source process, or to tune them by relaxing some specifications or making them more stringent. Of course, in practice, one will set the specifications, mainly in terms of delays, more stringent for the target technology; otherwise there would be no need for the new process.
4.1. Flow of the tool Fig. 9 shows the flow of the tool. The user provides the specifications of the performances, mainly delays and rise/fall times, that have to be evaluated by means of the measurement statements in the SPICE netlist. These specifications for the target technology can be chosen to be the same as in the source technology or other, more stringent, values can be specified. The properties of both source and target technology are specified in an ASCII configuration file. The tool then returns the optimum cell sizes that ensure that every performance satisfies its specification.
As the optimization algorithm guiding the parameter selections we employed the differential-evolution genetic-based program described in [8], which we altered slightly. It is a genetic algorithm that searches for a global optimum and uses continuous parameter values. Among the changes compared to [8] are the inclusion of parameter bounding and stop criteria. Every population member in the genetic algorithm is represented as shown in Fig. 10. The different genes represent the lengths and widths of the transistors in the circuit. These parameters are passed to the simulator, which performs the requested analyses. The simulation results together with the specifications are then used to evaluate the fitness of the member by means of a cost function.
This is a minimax problem formulation. The algorithm will try to minimize the cost, which is equal to the maximum normalized deviation of a performance from its specification. Each performance is thus normalized to have an equally important influence. Also, a weight factor W is included, which is different when the specification is met (100) or not (100000). Note that with W = 100 and a cost threshold stop criterion of 1, a tolerance of 1% is achieved. It is, however, also possible that the genetic algorithm proposes bad combinations of parameters (e.g. out of range). A “high” cost (e.g. 1e+8) is then assigned to such solutions.
4.2. Digital standard cell porting examples
We will now demonstrate the capabilities of the tool to automatically find the scaling factors for the transistor widths (nMOS and pMOS) that are necessary to migrate digital standard cells from one technology to a newer one. To have an optimal performance, the scaling factors are not necessarily the same for each type of cell. The source technology is a CMOS process and the target process has a smaller gate length. Since all cells have minimum gate length, we don’t optimize the transistor lengths. A first experiment keeps the performance specifications of the original cell. We migrate a simple inverter cell from the source to the target process, where we try to keep the performances. So, the question is: how small can the transistors be sized in the target technology so as to still have the same performance as in the source technology? Note that the scaling factor for the transistor length is 0.714 (1/1.4). The results are given in Table 10, where a comparison is made with actual (“real”) data of cells that were hand-crafted by the manufacturer in the same target technology process. The final cost function value is given together with the time taken by the tool and the number of generations of the genetic algorithm. Although a genetic algorithm is very well suited for parallel execution, the numbers presented here are the results of execution on a single host computer (SUN Ultra 30). We can conclude from the table that the optimized performances are within the given tolerance of 1% (0.5% for the low-to-high propagation delay (PLH) and 0.2% for the high-to-low propagation delay (PHL) respectively). Nevertheless, the optimized parameters, i.e. the nMOS and pMOS scaling factors, deviate by as much as 62% from the “real” values that we had in the manufacturer’s library. This is, of course, due to the fact that the speed specification for a standard cell library in practice is always increased when moving to a faster technology process; otherwise no advantage of the faster process would be taken.
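The minimax cost described in Section 4.1 can be sketched as follows. This is a minimal illustration, assuming that every specification is an upper bound (as for delays) and that the normalized deviation is (performance - specification) / specification; the exact expression used by the tool is not reproduced here, and the function name and delay values are hypothetical. The weights 100 and 100000 and the penalty of 1e+8 are taken from the text.

```python
W_MET = 100.0          # weight when the specification is met
W_VIOLATED = 100000.0  # weight when the specification is violated
BAD_COST = 1e8         # cost assigned to invalid parameter combinations


def minimax_cost(performances, specs, valid=True):
    """Maximum weighted, normalized deviation of any performance from its spec."""
    if not valid:
        return BAD_COST
    worst = 0.0
    for name, spec in specs.items():
        deviation = (performances[name] - spec) / spec   # > 0: slower than allowed
        weight = W_VIOLATED if deviation > 0 else W_MET
        worst = max(worst, weight * abs(deviation))
    return worst


specs = {"tplh": 1.0e-9, "tphl": 1.0e-9}                    # hypothetical delay specs
cost_ok = minimax_cost({"tplh": 0.995e-9, "tphl": 0.998e-9}, specs)
cost_violated = minimax_cost({"tplh": 1.010e-9, "tphl": 0.998e-9}, specs)
```

With W = 100 a cost of at most 1 means that every performance lies within 1% of its specification, which is why a stop threshold of 1 corresponds to a 1% tolerance.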
Hence, in the second experiment, the speed specification is increased as well. In Table 11, we present the results of an experiment similar to the first, but where the target performance specification is set equal to the real simulated target cell specifications from the manufacturer’s hand-crafted library. This is done for three different cells (inverter, 2-input AND, EXOR). Again, a comparison has been made between results from the tool and the actual cell data. It is clear that the scaling factors now match better with the real values. Nevertheless, they deviate by 3 to 8%, even though the optimized performance is within 1% of the specification (as requested). This is likely due to extra design margins that are taken in a real design. The above mentioned experiments show that the migration flow for digital standard cells works and that the user can arbitrarily set the target specifications. The performances of the optimized cells are within the accuracy specified by the user (1% in our example). Also, the optimization times are well within reasonable limits, since the library migration will be done only once for every new process. In addition, we didn’t make use of parallel execution on different host computers, which would speed up the optimization even further. Therefore, by making templates of the library cells only once, a fast migration at the level of cell sizing is possible for every subsequent technology. Again, it is assumed that the cell topology does not change when porting to the new process.
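The optimization loop of Section 4.1 can be sketched with a generic differential-evolution scheme that includes parameter bounding and a cost-threshold stop criterion. This is a textbook DE/rand/1/bin variant written for illustration, not the modified program of [8]; the toy delay model replaces the SPICE simulations, and all names and numbers are hypothetical.

```python
import random


def differential_evolution(cost, bounds, pop_size=20, f=0.7, cr=0.9,
                           max_gen=500, stop_cost=1.0, seed=0):
    """Minimize cost(x) within bounds; stop once the best cost <= stop_cost."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    costs = [cost(member) for member in pop]
    best = min(range(pop_size), key=costs.__getitem__)
    generations = 0
    for gen in range(max_gen):
        for i in range(pop_size):
            a, b, c = rng.sample([k for k in range(pop_size) if k != i], 3)
            forced = rng.randrange(dim)           # at least one gene is mutated
            trial = []
            for j in range(dim):
                if j == forced or rng.random() < cr:
                    v = pop[a][j] + f * (pop[b][j] - pop[c][j])
                    lo, hi = bounds[j]
                    v = min(max(v, lo), hi)       # parameter bounding
                else:
                    v = pop[i][j]
                trial.append(v)
            trial_cost = cost(trial)
            if trial_cost <= costs[i]:            # greedy selection
                pop[i], costs[i] = trial, trial_cost
        best = min(range(pop_size), key=costs.__getitem__)
        generations = gen + 1
        if costs[best] <= stop_cost:              # stop criterion
            break
    return pop[best], costs[best], generations


def toy_cost(widths):
    """Toy delay model: delay inversely proportional to width (illustrative)."""
    wn, wp = widths
    delays = {"tphl": 1.0 / wn, "tplh": 1.0 / wp}
    specs = {"tphl": 0.25, "tplh": 0.5}
    return max(100.0 * abs(delays[k] - specs[k]) / specs[k] for k in specs)


best_widths, best_cost, n_gen = differential_evolution(
    toy_cost, bounds=[(0.5, 10.0), (0.5, 10.0)])
```

In the real flow each call to the cost function would trigger transient simulations of the cell, so the number of generations directly determines the run time reported in the tables.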
5. DOCUMENTATION FOR PORTING
In order to allow an easy porting, the original design should be somehow “prepared” for the porting. It is difficult for another designer unfamiliar with the previous design, and even more so for a computer tool, to understand all the details, the intent and the little “twists” in the mind of the original designer when completing his/her original design. Therefore, to facilitate porting, a minimum amount of documentation should be generated by the original designer and should be delivered together with the design itself. This small initial overhead certainly pays off in the long run for the company when the design has to be ported to other processes later on. And it is only the original designer who has the information that is needed for this and who therefore has to provide it. Besides, “design flow capturing” tools that operate in the background could be set up here to help the designer in this job. Also, standardized verification tools that generate a standardized datasheet for each circuit would certainly be useful here. As a start in this discussion, we will specify here what kind of information should be included in the documentation accompanying the original design. For a design to be portable, we propose the following mandatory set:
1. System specifications + derivation of the specifications of each block (top-down) in order to ensure that the system will work, together with other essential specifications/performances
2. Top-level architecture + external PIN connections + the topology of all blocks in the hierarchy + their interconnections to form the system (= hierarchical netlist)
3. The circuit sizes for each block together with a list of the critical devices, possible problems and the relation between important performances and the device sizes having the most impact on these specifications/performances (e.g. the GBW increases with increasing channel width of the input transistors, etc.)
4. Simulation or verification method, applied inputs, outputs to be checked, how to verify that the specification is met, simulation examples (graphs)
We understand the extra effort needed for the original designer of the circuit to document all this information in an orderly fashion. On the other hand, almost all of the information in the list is generated at some point in one or another file during the course of the design anyway. Moreover, the designer him/herself can also benefit from this documentation, either for a next design or for some kind of reporting. Finally, we want to point out that documentation will play an increasingly important role in the trend towards complex integrated systems-on-a-chip. Organisations like the VSI (Virtual Socket Interface) Alliance [9] have acknowledged this need and have proposed an open interface to make design re-use possible. The circuit that is being reused is then a so-called VC (Virtual Component) and will have to be accompanied by a minimum standardized set of documentation. Retargeting benefits from the same documentation.
6. CONCLUSIONS
This paper has presented CAD techniques for the automatic horizontal and vertical porting of both analog and digital circuits. Both the circuit resizing and the layout regeneration have been discussed. In both cases, advantage is taken as much as possible of the existing design as a reference to start from. For the analog circuit resizing, a scaling step was followed by a finetuning step. For the layout regeneration, a template-based approach was presented. For the digital standard cells, a simulation-based optimization approach was adopted. Experimental results have illustrated the capabilities of the presented methods. The importance of proper design documentation has also been discussed as a necessary means to facilitate easy technology porting. Future work will have to concentrate on improving the methods and integrating them into a flawless automated environment for both analog and digital circuit porting. The role of documentation and techniques to minimize the overhead of design for reuse will also have to be further investigated and implemented.
ACKNOWLEDGEMENTS This work has been supported in part by the ESPRIT project NAOMI and the IWT project FRONTENDS.
REFERENCES
[1] G. Gielen, R. Rutenbar, “Computer-aided design of analog and mixed-signal integrated circuits,” Proceedings of the IEEE, Vol. 88, No. 12, pp. 1825-1854, December 2000.
[2] “International Technology Roadmap for Semiconductors,” 1999 version + 2000 update, http://public.itrs.net.
[3] K. Francken, G. Gielen, “Methodology for analog technology porting including performance tuning,” Proceedings International Symposium on Circuits and Systems (ISCAS), June 1999.
[4] Y. Geerts, A. Marques, M. Steyaert, W. Sansen, “A 3.3 V 15-bit ΔΣ ADC with a signal bandwidth of 1.1 MHz for ADSL applications,” IEEE Journal of Solid-State Circuits, Vol. 34, No. 7, pp. 927-936, 1999.
[5] G. Beenker, J. Conway, G. Schrooten, A. Slenter, “Analog CAD for consumer ICs,” chapter 15 in “Analog circuit design” (edited by J. Huijsing, R. van der Plassche and W. Sansen), Kluwer Academic Publishers, pp. 347-367, 1993.
[6] H. Koh, C. Séquin, P. Gray, “OPASYN: a compiler for CMOS operational amplifiers,” IEEE Transactions on Computer-Aided Design, Vol. 9, No. 2, pp. 113-125, February 1990.
[7] R. Castro-López, M. Delgado-Restituto, F. Fernández, A. Rodríguez-Vázquez, “Reusability methodology for IC layouts,” Proceedings Workshop on Advanced Mixed-Signal Tools, ESDMSD Mixed-Signal Design Cluster initiative of the European Union, March 2001.
[8] R. Storn, “On the usage of differential evolution for function optimization,” in NAFIPS, pp. 519–523, 1996.
[9] Virtual Socket Interface Alliance, several documents including VSIA Architecture Document and Analog/Mixed-Signal Extension Document, http://www.vsi.org.
Introduction to High-speed Digital-to-Analog Converter Design
Rudy van de Plassche
Broadcom Netherlands BV, Bunnik

Abstract
In this paper limitations in static linearity (INL, DNL) and dynamic range (Effective Number of Bits, ENOBs) of digital-to-analog converters due to clock jitter, finite linearity, component matching and switching uncertainty will be calculated. Secondly, quantization error spectra are analyzed and the influence on distortion and cross modulation effects is derived. Practical design examples will be discussed.
1 Introduction
Digital-to-analog converters are the link between digital signal processing and the analog world. In Fig. 1 the different signal conditions present in a converter are given. From Fig. 1 it is seen that a digital signal is a discrete time, discrete amplitude signal. An analog signal is a time continuous, amplitude continuous signal. To convert from digital into analog a signal reconstruction takes place. Sampling limits the maximum frequency range to the Nyquist frequency. A filtering operation is needed to limit the maximum signal frequency and avoid aliasing. Amplitude quantization discretizes the amplitude into well known steps. A quantization error is introduced. This quantization error limits the dynamic range of a system. The quantization error depends on the number of steps used in the system.
J. H. Huijsing et al. (eds.), Analog Circuit Design, 115-150. © 2002 Kluwer Academic Publishers. Printed in the Netherlands.
2 Ideal converter
In an ideal converter the sampling time is fixed and constant and does not introduce any error. Only the amplitude quantization error causes a limitation to the system. Because all errors discussed in this paper will be referred to the quantization error, this error will be calculated first. In Fig. 2 the quantization of a signal at the sampling moment to the amplitude level is shown. The quantization error $\varepsilon$ is the difference between the analog signal and the quantization level. In the lower part of Fig. 2 the probability density of the error over the quantization interval is shown; it is uniform, with value $1/q_s$ for $-q_s/2 \le \varepsilon \le q_s/2$. Here $q_s$ is the quantization step of the converter. The uniform probability error function assumes that there is no correlation between the signal frequency and the sampling frequency. The quantization error power can be calculated using $q_s$ as the quantization step. The average quantization error power becomes:
$$\overline{\varepsilon^2} = \frac{1}{q_s}\int_{-q_s/2}^{q_s/2} \varepsilon^2 \, d\varepsilon$$
Solving this equation we get the well known result:
$$\overline{\varepsilon^2} = \frac{q_s^2}{12}$$
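Applying a sine wave with a peak-to-peak amplitude of $2^n q_s$, with $q_s$ the quantization step, to an n-bit system, the RMS signal amplitude can be calculated as:
$$A_{rms} = \frac{2^n q_s}{2\sqrt{2}}$$
The dynamic range (signal-to-noise ratio) of the n-bit system becomes, with $e_{rms} = q_s/\sqrt{12}$ the RMS quantization error:
$$\frac{S}{N} = \frac{A_{rms}}{e_{rms}}$$
Inserting values for amplitude and quantization error we get:
$$\frac{S}{N} = \frac{2^n q_s / (2\sqrt{2})}{q_s/\sqrt{12}} = 2^n \sqrt{\frac{3}{2}}$$
Converting this into decibels we obtain:
$$\frac{S}{N} = 6.02\,n + 1.76 \ \text{dB}$$
These results give a global analysis of quantization error. To obtain a better understanding of quantization errors, an analysis of the quantization error spectra will be given. This model can be applied to analog-to-digital and digital-to-analog converters.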
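The signal-to-noise result for an ideal quantizer can be checked numerically with a short sketch. It quantizes a full-scale sine wave with a rounding quantizer and compares the measured ratio with 6.02 n + 1.76 dB. The sample count and the irrational frequency ratio (chosen so that signal and sampling frequency are uncorrelated) are assumptions made for this illustration.

```python
import math


def quantized_sine_snr_db(bits, samples=20000):
    """Measured SNR in dB of a full-scale sine after ideal rounding quantization."""
    q = 1.0                                   # quantization step
    amplitude = (2 ** bits) * q / 2.0         # peak-to-peak amplitude is 2**bits * q
    ratio = (math.sqrt(5.0) - 1.0) / 2.0      # signal/sampling frequency ratio
    signal_power = 0.0
    error_power = 0.0
    for k in range(samples):
        x = amplitude * math.sin(2.0 * math.pi * ratio * k)
        xq = q * round(x / q)                 # ideal quantizer, no clipping
        signal_power += x * x
        error_power += (xq - x) ** 2
    return 10.0 * math.log10(signal_power / error_power)


measured_db = quantized_sine_snr_db(10)
ideal_db = 6.02 * 10 + 1.76
```

The measured value should lie close to the ideal figure; small deviations are expected because the error of a sine wave is only approximately uniformly distributed.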
2.1 Quantization error spectra
Suppose the quantized ramp signal shown in Fig. 3 is reconstructed; the error signal can then be determined as a sawtooth whose amplitude and repetition are shown in Fig. 4. Note that at this moment ONLY AMPLITUDE QUANTIZATION is used. SAMPLING of the signal will be performed at a later stage. By shifting the DC value of the signal as shown in Fig. 4, a Fourier analysis of that signal gives only odd harmonics described as:
In case a sine wave is applied, the output spectrum becomes more complex. From [1] we obtain:
Simplifying this equation we get:
With
defined by:
The amplitude of each harmonic is given by equation 10. The quantization error spectra can be plotted using this equation. In Fig. 5 a spectrum with up to 30,000 odd components of a 10-bit quantizer is shown. The spectrum decreases slowly with increasing harmonic number and extends to infinity. The spectrum furthermore shows peaks whose positions, at harmonics of the input signal, can be determined mathematically. A more detailed part of the frequency spectrum of the same quantizer of Fig. 5 is shown in Fig. 6. Lower order odd harmonic amplitudes can be estimated from this figure. The third harmonic is about 90 dB down with respect to full scale. A relation between, for example, the third harmonic component and the number of bits of a converter needs some mathematical manipulations that go beyond the scope of this material. The result for the third order distortion, however, can be expressed as a function of the number of bits. As a result, a 10-bit converter has a third order distortion component 90 dB down with respect to full scale. Increasing the resolution of the converter by 1 bit reduces the distortion component further. In Fig. 7 the quantization error, the third order distortion component and the intermodulation component (IM3) as a function of the number of bits are shown. The IM3 products will be described in one of the following sections.
2.2 Amplitude dependence of the quantization components
So far the calculation of the quantization spectra has been performed for signals exactly fitting within the quantization levels. In case a signal varies within a quantization level, the total spectrum changes. Suppose that the amplitude varies over the range [0, 1]; equation 10 is then modified accordingly. Taking as an example p=3 and p=31, the result for a 6-bit converter is shown in Fig. 8. This figure shows that, depending on the signal amplitude, distortion components can be reduced to zero. Maximum third order distortion is found roughly at the quantization levels.
2.3 Multiple signal distortion
When two or more signals are quantized, it is important to know what the so-called cross-modulation or intermodulation products (IM3) will be. A formula for two input signals will be given. Suppose we apply two input signals. Using the previously described analysis and some mathematical manipulations we obtain an expression for the cross-modulation. Applying signals with equal amplitude, the intermodulation product changes with the number of bits $n$ of the converter in proportion to $2^{-2n}$. This means that with an increase in resolution of 1 bit, the intermodulation product will decrease by 12 dB. Again, some mathematical manipulations of the equations are needed to obtain this result. Quantization of analog signals results in errors that are correlated with the signal, mostly as odd harmonics of the signal. The analyzed spectra show basically an infinite number of spectral components.
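The 12 dB per bit figure corresponds to an amplitude that shrinks by a factor of four for every extra bit. The small sketch below makes this arithmetic explicit for a component whose amplitude scales as a power of two in the number of bits; the function name and the exponent argument are illustrative.

```python
import math


def db_change_per_bit(exponent):
    """dB change per extra bit for an amplitude scaling as 2**(exponent * n)."""
    return 20.0 * math.log10(2.0 ** exponent)


im3_step_db = db_change_per_bit(-2)      # amplitude falls by a factor 4 per bit
snr_step_db = db_change_per_bit(1)       # full-scale-to-step ratio grows by 2 per bit
```

The second value reproduces the familiar 6.02 dB per bit of the ideal quantization signal-to-noise ratio.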
3 Sampling of a quantizer
In a converter a quantization and a sampling operation are performed, as has been shown in the introduction. When the quantizer spectra are sampled using a sampling frequency $f_s$, a large distortion called aliasing occurs for all frequencies outside the band from 0 to $f_s/2$. As a result of this aliasing all higher order frequency components are shifted to this baseband. This operation is shown in Fig. 9, which gives a single sided frequency picture of the sampling operation. From this figure it can be seen that if a correlation between the sampling frequency and the signal frequency exists, then distortion products will add, while in the case of uncorrelated sampling and signal frequencies these components fall in between the harmonics and result in a “noise-like” quantization error. Increasing the sampling frequency will result in a reduction of the amplitude of the quantization error components; however, the total power of all the error components in the baseband never exceeds the ideal quantization error power $q_s^2/12$, with $q_s$ the quantization step.
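The folding of higher order components into the baseband can be made concrete with a small helper that maps any frequency to its alias between 0 and half the sampling frequency. The function name and the example frequencies are hypothetical and only serve as an illustration.

```python
def alias_frequency(f, fs):
    """Frequency in the band 0..fs/2 onto which a component at f is folded."""
    f_mod = f % fs
    return f_mod if f_mod <= fs / 2.0 else fs - f_mod


fs = 10.0                                  # sampling frequency (arbitrary units)
f_in = 4.0                                 # input signal frequency
harmonics = [k * f_in for k in (1, 3, 5, 7)]
aliases = [alias_frequency(h, fs) for h in harmonics]
```

Here the odd harmonics at 12, 20 and 28 fold to 2, 0 and 2. Because the signal and sampling frequencies have a simple rational ratio, several harmonics land on the same frequency and add, which is the correlated case described above.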
4 Non-ideal converter
Practical converters show deviations from the analysis given above. In particular, the finite matching of practical elements such as resistors and transistors has a big influence on the maximum performance of a converter. These non-idealities can be split up into timing errors and non-linearity errors. The timing errors can be random, due to noise and jitter introduced by the sampling clock, or systematic, mostly introduced by the layout of a converter. These systematic errors must be avoided as much as possible, but beyond a certain point they are unavoidable. Sometimes a change in converter architecture is needed to avoid systematic timing errors. In a layout, for example, an interconnection wire of about 100 µm introduces a time error of about 1 psec. At high signal and clock speeds these errors will introduce glitches and/or mostly third order distortion products of the signal frequency. First we will start with timing errors.
4.1 Timing
Timing in a practical system is not ideal. Timing errors can be:
- Random timing errors due to clock jitter or noise on clock circuits
- Systematic errors due to layout, differences in wire lengths or the system architecture
Random timing errors result mostly in an increase of quantization errors at high signal frequencies. Systematic errors result in distortion that in turn results in a reduction of dynamic range. Systematic errors are mostly architecture or layout dependent. Wires from the sampling clock to the bit-switches can have different lengths if a layout is not carefully designed.
4.2 Random timing errors
In practical systems sampling clocks show a certain instability called clock jitter, and noise in the clock distribution circuitry can increase this jitter further. In Fig. 10 the influence of clock jitter on the sampling of an analog signal is shown. The influence of this clock jitter on the amplitude
quantization of an analog signal can be calculated. From Fig. 10 it is seen that clock jitter only has influence on the amplitude quantization at high input frequencies. This random clock jitter exhibits itself as an extra quantization error and thus reduces the dynamic range of the converter. This effect will be calculated for an n-bit converter with $2^n$ quantization steps. A sine wave will be applied as signal, because it is the highest frequency possible in a band-limited system that avoids aliasing. With $V(t) = A\sin(\omega t)$ we obtain after differentiation:
$$\Delta A = A\,\omega\cos(\omega t)\,\Delta t$$
with $\omega = 2\pi f_{in}$ and $A$ the amplitude of the signal. The step size $q_s$ of the converter equals:
$$q_s = \frac{2A}{2^n}$$
A quick indication of the allowed peak-to-peak clock jitter can be obtained by stating that the amplitude uncertainty $\Delta A$ may not exceed the quantization step $q_s$. After rearranging the equation we obtain:
$$\Delta t \le \frac{2^{-n}}{\pi f_{in}\cos(\omega t)}$$
The worst case condition is found if $\cos(\omega t) = 1$, so this equation simplifies into:
$$\Delta t \le \frac{2^{-n}}{\pi f_{in}}$$
With MHz and n = 10 bit, the peak-to-peak clock uncertainty must be below 65 psec. However, we want to know the influence of the RMS clock jitter on the Effective Number Of Bits (ENOB’s) of a converter. With ENOB defined as:
Here is the “measured” effective resolution of a converter in dB and includes all the non-ideality effects of a practical converter. The error power due to jitter equals: In this power equation the slope of the signal determines the sensitivity of the converter to clock jitter. We can average this slope over the signal period and get:
Inserting 26 into 25 gives:
With The total error power due to quantization and jitter becomes:
Where
defined as:
Using equation 28 we can rearrange equation 29 and obtain for
We will call the sample clock phase noise (rms). The dynamic range of the converter due to clock jitter noise power changes according to:
The reduction in ENOB’s as a result of jitter then equals:
If then the ENOB’s reduce with bit or 3 dB. The ratio between the clock jitter and the signal frequency can be calculated as a function of the number of bits in the converter. In Fig. 11 the decrease in ENOB’s as a function of is shown for converters having a resolution between 4 and 16 bits.
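The two jitter estimates of this section can be checked numerically with a short Python sketch. It uses the common textbook model of a full-scale sine wave with amplitude $A$, step size $q = 2A/2^n$, quantization noise power $q^2/12$ and jitter noise power $(A\omega\sigma_t)^2/2$. The function names, the symbol `sigma_t` for the rms jitter and the example values are assumptions of this sketch and need not match the notation of the original equations.

```python
import math


def jitter_pp_limit(n_bits, f_in):
    """Peak-to-peak timing error that keeps the amplitude error of a
    full-scale sine wave below one quantization step."""
    return 2.0 ** (-n_bits) / (math.pi * f_in)


def enob_loss_from_jitter(n_bits, f_in, sigma_t):
    """ENOB reduction (in bits) when rms jitter noise adds to the ideal
    quantization noise of an n-bit converter."""
    # Ratio of jitter noise power (A*w*sigma_t)^2 / 2 to quantization noise
    # power q^2 / 12 with q = 2A / 2^n; the amplitude A cancels.
    ratio = 6.0 * (math.pi * f_in * sigma_t) ** 2 * 4.0 ** n_bits
    # 10*log10(1 + ratio) dB divided by 6.02 dB per bit
    return 0.5 * math.log2(1.0 + ratio)


# Hypothetical example: 10-bit converter, 5 MHz signal, 10 ps rms jitter.
print(jitter_pp_limit(10, 5e6))
print(enob_loss_from_jitter(10, 5e6, 10e-12))
```

In this model the loss is exactly half a bit (3 dB) when the jitter noise power equals the quantization noise power, which is the break point referred to in the text.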
4.3 Glitches
Glitches can be seen as a systematic error occurring in the reproduction of an analog signal via a digital-to-analog converter. Especially when a binary weighted converter architecture is used and a small signal around a major carry transition is converted, a glitch can be produced. As an example of this phenomenon, consider a binary weighted converter with offset binary coding reproducing an LSB code step at the 011111.. to 100000.. transition. At this code transition the MSB value will be switched on or off at the same time that all other values are switched off or on. If switching time errors occur, the output code can reach full scale (1111..) or all zeros (0000..) during a short period of time. This produces an unwanted signal glitch. Filtering this glitch will reduce its amplitude, but will NOT reduce the amount of distortion produced by this glitch. Suppose that the MSB switch switches faster than all the other bits by a time $\Delta t$; then the influence of the glitch energy can be calculated. With $q_s$ the LSB step size, half scale equals $2^{n-1} q_s$. The glitch area becomes:
$$A_{glitch} = 2^{n-1}\,q_s\,\Delta t$$
With $T_s$ the sample time, the LSB area is found as:
$$A_{LSB} = q_s\,T_s$$
Suppose that an acceptable reduction in dynamic range is obtained when the glitch energy equals the LSB energy; then we have:
$$\Delta t = \frac{T_s}{2^{n-1}}$$
With and n = 14 bits we obtain that psec. Such a small value indicates that an accurate layout of the converter concerning the switching is needed. Changing the architecture into a step by step switching of the information having 1023 switches with unit currents for example in a 10-bit converter avoids the glitch problem. However, at high output frequencies close to the Nyquist frequency a switching time uncertainty is introduced by the layout of a converter. Every single switch can be seen as having a certain switching uncertainty compared to an ideal switching system. As a result the reproduced signal moves in time giving after filtering a third order distortion. Again this third order distortion depends on the signal frequency and the timing accuracy that can be designed in the layout.
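The glitch criterion above (half-scale glitch area equal to one LSB area) reduces to a one-line calculation. The sketch below assumes that reading of the text; the function name `max_switch_skew` and the 10 ns sample time are illustrative assumptions, not values from the original.

```python
def max_switch_skew(n_bits, t_sample):
    """Largest MSB-to-LSB switching skew for which the major-carry glitch
    area stays within one LSB area."""
    # glitch area = half scale * skew = 2^(n-1) * q * skew
    # LSB area    = q * t_sample
    return t_sample / 2 ** (n_bits - 1)


# Hypothetical example: 14-bit converter with a 10 ns sample time.
print(max_switch_skew(14, 10e-9))
```

For these assumed numbers the allowed skew is on the order of one picosecond, which illustrates why the layout of the switching paths is so critical in a binary weighted architecture.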
4.4 Linearity
In a non-ideal (practical) converter the quantization steps have a limited accuracy because of finite matching of components. This results in an Integral Non-Linearity (INL) and a Differential Non-Linearity (DNL) of the converter. The INL is important for large signals because it determines the overall linearity, while the DNL of a converter is important for small signals [2]. Basically the DNL determines the accuracy of the quantization step from quantization level to quantization level. This non-ideality results in distortion of the signal and thus in a reduction of the dynamic range. Mostly the INL is specified as ± LSB. The DNL depends on the construction of the converter but is at most 2*INL in the case of a binary weighted converter. In Fig. 12 the INL and DNL characteristics of a converter are shown. Note that sometimes codes are missing, giving a large DNL, or the output signal can even step back with an increase in digital input code. Non-monotonicity of the converter is observed at that moment.
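As an illustration of these definitions, INL and DNL can be computed from a list of measured output levels. The sketch below uses an end-point fit, which is one common convention and an assumption here; the function names and the sample levels are hypothetical.

```python
def dnl_inl(levels):
    """Return (dnl, inl) in LSB for a list of measured output levels,
    one level per input code.

    The LSB is taken from the end points (end-point fit), so the INL is
    zero at the first and last code by construction.
    """
    n = len(levels)
    if n < 2:
        raise ValueError("need at least two levels")
    lsb = (levels[-1] - levels[0]) / (n - 1)
    dnl = [(levels[k + 1] - levels[k]) / lsb - 1.0 for k in range(n - 1)]
    inl = [(levels[k] - levels[0]) / lsb - k for k in range(n)]
    return dnl, inl


def is_monotonic(levels):
    """True if the output never steps back when the input code increases."""
    return all(b >= a for a, b in zip(levels, levels[1:]))


# Hypothetical 5-level example in which the output steps back once.
levels = [0.0, 1.0, 2.5, 2.0, 4.0]
dnl, inl = dnl_inl(levels)
print(dnl, inl, is_monotonic(levels))
```

In this example the step from the third to the fourth level has a DNL below -1 LSB, which is exactly the condition under which the output steps back and the converter is non-monotonic.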
4.5 Matching accuracy of converter elements
When designing converters with resolutions from 8 to 16 bits the following question arises: How accurately do I need to design the unit currents or resistors to obtain a certain INL and/or DNL? To answer this question a Matlab program has been used to obtain information about the INL and ENOB’s of converters. The converter has been modeled using unit current sources or unit resistors to determine every quantization level. In Fig. 13 the results of this program for a 10-bit converter are shown. A total number of 1000 “converters” have been analyzed using this program. At the same time the ratio between the largest distortion component and the signal component, defined as Spurious Free Dynamic Range (SFDR), is analyzed too. The matching of the unit elements has a σ of 2.5 % in this simulation. In Fig. 14 a histogram of the converters as a function of the INL is shown. This
histogram shows that of the converters reaches ± LSB INL and are within 1 LSB INL. In the range of 8 to 12 bits of resolution of a converter identical simulations have been performed. As a result of these simulations Fig. 15 shows the relation between the required unit element matching and the number of effective bits (ENOB’s). Results for and yield of a converter are shown too. It must be noted, that in case a segmented or binary weighting in a converter is used, then the matching accuracy between the segmented elements or the binary weighted elements increases according to the value or the amount of elements used to obtain the required weight. In practice mostly a number of elements is put in parallel to increase the unit value. As a result the accuracy increases with The finite matching of components in a converter results in a limited linearity of such a converter. A very useful relation between INL and reduction in ENOB’s of a converter is proposed. The finite INL results in a systematic error signal that introduces
distortion. Because the INL is directly related to the LSB of a converter the distortion introduced will be related to the quantization error using a simple “fitting” model. Identical to what has been done with clock jitter the dynamic
range of a converter changes according to:
Here is the peak to peak systematic signal distortion component due to finite converter accuracy. This value gives the worst case condition, because it is not known how the INL curve as a function of the signal value behaves. A Fourier analysis would give exactly the value of the different distortion components and in that way a better estimation of the total distortion can be obtained. To verify the model the ENOB reduction will be calculated. The ENOB’s reduce with:
This model is valid for yield of converters. In Fig. 16 the simplified model is inserted into the ENOB simulation using Matlab. In this figure only a limited range of INL is shown. However, the model has been verified over larger variations of INL. In Fig. 17 the worst case reduction in ENOB’s of a converter as a function of the INL is shown. This graph is very useful to get quick information about the converter resolution and the linearity.
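The kind of statistical experiment described in this section can be sketched in plain Python. This is not the Matlab program used for the figures; it is a simplified stand-in that assumes a pure unit-element (thermometer) converter, Gaussian element mismatch and an end-point INL definition. All names and default values are hypothetical.

```python
import random


def simulate_inl(n_bits, sigma_unit, rng):
    """Worst-case |INL| in LSB of one unit-element converter with Gaussian
    element mismatch (relative standard deviation sigma_unit)."""
    n_units = 2 ** n_bits - 1
    units = [rng.gauss(1.0, sigma_unit) for _ in range(n_units)]
    lsb = sum(units) / n_units          # end-point fit: average element is the LSB
    worst, level = 0.0, 0.0
    for k, u in enumerate(units, start=1):
        level += u                      # output at code k is the sum of the first k elements
        worst = max(worst, abs(level / lsb - k))
    return worst


def inl_yield(n_bits, sigma_unit, inl_limit, n_converters=1000, seed=1):
    """Fraction of simulated converters whose worst-case |INL| stays within
    inl_limit (in LSB)."""
    rng = random.Random(seed)
    passed = sum(simulate_inl(n_bits, sigma_unit, rng) <= inl_limit
                 for _ in range(n_converters))
    return passed / n_converters


# 1000 simulated 10-bit converters with 2.5 % element matching.
print(inl_yield(10, 0.025, 0.5))
print(inl_yield(10, 0.025, 1.0))
```

Because the seed is fixed, both calls evaluate the same set of simulated converters, so the two printed yields can be compared directly. Halving `sigma_unit` and rerunning shows how strongly the yield depends on element matching.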
5 MOS matching models
When designing converters with unit elements, the required matching accuracy between the elements as a function of the number of bits is now known. A 10-bit converter needs for yield and ± LSB INL a matching of 2.5 %. In case we want to increase the yield to then the matching must be increased to 1.25 %. In MOS technology information about matching of components is available as a function of technology and a limited amount of model parameters
[3]. Suppose the MOS devices are in saturation then:
The first condition that will be considered is: 1) Equal Drain Currents so:
This results in:
Defining small difference between the two MOS devices using:
We obtain:
Working out this equation we obtain:
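For orientation, the equal-drain-current case can be sketched with the standard square-law model of a MOS device in saturation. The symbols $\beta$ (gain factor), $V_T$ (threshold voltage) and the $\Delta$ quantities are assumptions of this sketch and may differ from the notation and numbering of the original equations; second-order terms in the small differences are neglected.

```latex
\begin{align}
I_D &= \tfrac{\beta}{2}\,(V_{GS}-V_T)^2 \\
I_{D1} = I_{D2} \;&\Rightarrow\;
\beta_1\,(V_{GS1}-V_{T1})^2 = \beta_2\,(V_{GS2}-V_{T2})^2 \\
\beta_{1,2} &= \beta \pm \tfrac{\Delta\beta}{2}, \quad
V_{T1,2} = V_T \pm \tfrac{\Delta V_T}{2}, \quad
V_{GS1,2} = V_{GS} \pm \tfrac{\Delta V_{GS}}{2} \\
\Delta V_{GS} &\approx \Delta V_T - \frac{V_{GS}-V_T}{2}\,\frac{\Delta\beta}{\beta} \\
\sigma^2(\Delta V_{GS}) &\approx \sigma^2(\Delta V_T)
  + \left(\frac{V_{GS}-V_T}{2}\right)^{2} \sigma^2\!\left(\frac{\Delta\beta}{\beta}\right)
\end{align}
```

The last line treats threshold and gain mismatch as independent. The two contributions are equal when $\sigma(\Delta V_T) = \tfrac{1}{2}(V_{GS}-V_T)\,\sigma(\Delta\beta/\beta)$, so threshold mismatch dominates at small gate overdrive and gain mismatch at large gate overdrive.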
The matching of an MOS pair is equally influenced by the threshold matching or by the slope mismatch if:
In practical MOS technologies
In Fig. 18 the threshold matching of MOS devices having a 1 µm by 1 µm device size versus the gate oxide thickness of the technology is shown. From this figure it can be seen that the mismatch reduces with decreasing gate oxide thickness. The validity of this relation has been proven even for submicron technologies. The
gain mismatch of MOS devices versus gate oxide is shown in Fig. 19. From this figure we see that the gain mismatch is nearly independent of the gate oxide thickness. This means that
with increasing drain current the mismatch of a differential pair or a current mirror will become independent of technology (gain mismatch limited). If
then In practice this means that the current density in the MOS device must be below a value corresponding with the given gate-source voltage. The calculated offset is valid for MOS devices with a 1 µm by 1 µm gate size. It is known from literature that the offset decreases with increasing device area, or:
In Fig. 20 the measured threshold mismatch has been plotted against the device size of an MOS transistor. Increasing the size reduces the mismatch according to equation 57. The designer has the option to size the devices regarding offset. In a
practical situation the device size variations are limited. A 1 to 100 size variation is still possible; however, the capacitance of the devices increases. This mostly results in an increase in biasing current and thus power.
2) Equal gate-source voltage:
With the devices in saturation we obtain:
Solving these equations for a difference in drain currents:
Inserting small difference between the drain currents using:
we obtain
Using a first order approximation for the square root we get:
The variable
can be replaced by:
Then we obtain for the current mismatch:
At small current densities we have that:
The current mismatch at small current densities can be simplified into:
At large current densities the matching is determined by:
Note that the calculated current offset is again valid for 1x1 sized devices! When the size of the devices is changed then the offset varies depending on of the gate area.
The final mismatch at small current densities, including the device size dependence, becomes:
In case of large current densities we obtain finally for the current mismatch:
In Fig. 21 the measured gain mismatch versus the device size WL is shown. The designer has again the option to scale the
device size to reduce the current offset.
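The area scaling and the two current-density regimes discussed above can be tried out numerically. The sketch below follows the widely used area law from the matching literature [3], in which the mismatch falls with the square root of the gate area. The parameter names `a_vt` and `a_beta` (mismatch of a 1 by 1 device), the helper names and the example numbers are assumptions of this sketch, not values from this chapter.

```python
import math


def sigma_vt(a_vt, w, l):
    """Threshold mismatch sigma(dVT) for gate area w*l, from the value a_vt
    of a 1 by 1 device (area law)."""
    return a_vt / math.sqrt(w * l)


def sigma_beta(a_beta, w, l):
    """Relative gain mismatch sigma(dBeta/Beta) for gate area w*l, from the
    value a_beta of a 1 by 1 device."""
    return a_beta / math.sqrt(w * l)


def sigma_current(a_vt, a_beta, w, l, v_ov):
    """Relative drain-current mismatch of two devices sharing the same
    gate-source voltage.

    v_ov is the gate overdrive V_GS - V_T. Threshold and gain mismatch are
    treated as independent, so their contributions add in power.
    """
    vt_term = 2.0 * sigma_vt(a_vt, w, l) / v_ov
    return math.sqrt(vt_term ** 2 + sigma_beta(a_beta, w, l) ** 2)


# Hypothetical technology: 10 mV threshold and 2 % gain mismatch for a 1 by 1 device.
for v_ov in (0.1, 0.3, 1.0, 3.0):
    print(v_ov, sigma_current(10e-3, 0.02, 1.0, 1.0, v_ov))
```

At small overdrive (small current density) the threshold term dominates, and at large overdrive the result approaches the gain mismatch alone. Quadrupling the gate area halves the mismatch in both regimes.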
6 Digital-to-analog converter architectures
In this part different architectures to construct digital-to-analog converters in CMOS technology will be given. Which architecture will be used depends on the application field and the choices a designer makes. Only a few examples can be discussed. Output signals can be a current or a voltage. Mostly differential structures
will be used producing the converted output signal and its complement. In a differential operation of a system mostly a very good symmetry exists resulting in the absence of even order distortion components in the output signal and in the quantization error. Differential systems can furthermore apply a twice as large output signal to the load. This is important in CMOS submicron technologies that have smaller breakdown voltages (about 1 V). Single ended systems on the other hand might show odd and even distortion components at half the output swing. A large output swing is preferred to improve the dynamic range in a system application. Cross-talk from other system parts may limit the dynamic range in such an application. Differential operation improves the performance by rejecting part of the cross-talk.
6.1 10-bit current mode digital-to-analog converter
Suppose we want to design a 10-bit digital-to-analog converter with a 1σ INL of ±0.5 LSB. The technology we have shows a of 2 % and we want to use 1023 equal devices to generate all the current steps [4]. The DC current is set at a value such that the threshold mismatch equals the gain mismatch. This means that the average element mismatch becomes:
To obtain a 1σ INL of ±0.5 LSB an element matching accuracy of 2.5 % is required. This means that we have to increase the unit device size to at least Depending on how the output signal is generated, a cascode current source construction might be needed to make the matching independent of the drain-source voltage of the current generating elements. Mostly cascoded stages are used to avoid output signal dependent matching problems. The next design choice is: switching unit current sources using 1023 switches or using a binary weighted construction of the digital-to-analog converter. The unit current switching has the advantage of generating small glitches and having a good Differential Non-Linearity. A problem is the systematic error that can be introduced because of different lengths of the clock wires that control the switches in the layout. A very careful layout is needed, having in mind that 100 µm of metal interconnect gives a systematic timing error of 1 psec. Binary weighting of the currents, obtained by connecting current sources distributed over the layout area, mostly causes larger glitches because the on and off switching of the currents needs more accurate timing, and it increases the DNL to about ±1 LSB. In many designs a combination of segmented current sources (equal to 8 or 16 times the LSB current) and unit weighting is used. In Fig. 22 an example of a 10-bit digital-to-analog current generating network is shown. In this network only equal sized MOS devices are used. Note that in a layout of such a network at least one row of dummy transistors must be added at the outside to improve the overall matching. In case a current to voltage converter is used to sum the output currents of this network, a cascode current source may not be needed. However, the voltage drop across the switches must be equal in all cases to avoid current modulation due to a variation in the drain-source voltage of this network.
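One way to read the sizing step in this design example is sketched below. It assumes that the 2 % figure is the gain mismatch of a 1 by 1 device, that the bias point makes the threshold contribution equally large, and that mismatch scales with the inverse square root of the gate area. These are assumptions of the sketch, and the function name is hypothetical.

```python
import math


def required_area_scale(sigma_gain, sigma_required):
    """Factor by which the gate area of a 1 by 1 device must grow to reach
    sigma_required.

    Assumes the bias point makes the threshold contribution equal to the gain
    contribution, so the total mismatch is sqrt(2) * sigma_gain, and that the
    mismatch falls with the square root of the gate area.
    """
    sigma_total = math.sqrt(2.0) * sigma_gain
    return (sigma_total / sigma_required) ** 2


# 2 % gain mismatch for a 1 by 1 device, 2.5 % required element matching.
print(math.sqrt(2.0) * 0.02)             # total element mismatch, about 2.8 %
print(required_area_scale(0.02, 0.025))  # area factor, about 1.28
```

Under these assumptions only a modest increase in gate area is needed, because the two equal contributions add in power rather than linearly.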
6.2 10-bit Coarse-Fine voltage mode digital-to-analog converter
In most IC processes the matching of resistors is a lot better than the matching of the active elements. Resistor matching depends on size and mask accuracy of the technology [5, 6]. Without too much difficulty a resistor matching better than 0.25 % is obtained. This means that the resolution and accuracy limits of resistor matching dominated designs are between 12 and 14 bits without needing special precautions. In Fig. 23 an example of a 10-bit coarse-fine resistor matched digital-to-analog converter is shown. As is seen from Fig. 23 the system consists of a coarse
ladder using rather low valued resistors to obtain the coarse converter levels. Across each coarse converter level a fine “ladder” is connected to obtain the fine steps. At each step a switch has been connected that will be controlled by the input digital data and then an output voltage is generated. Analyzing this system we can see that the output impedance of the total system depends on the digital code applied to the converter. Furthermore
all these switches are connected together at the output terminal, giving a large output capacitance. As a result of this variable output impedance a signal dependent delay of the analog output signal is found, resulting in distortion. Secondly, a high impedance loading is required. A buffer amplifier can be used to decouple the output load from the converter; however, this buffer can introduce distortion due to slew rate limitations and finite bandwidth. Furthermore generating output voltages from 0 to makes the buffer difficult to design.
6.3 Continuous current calibration converter
When the resolution of a converter increases, then the matching accuracy of the individual elements must increase. However, the increase in matching can so far be only obtained by increasing the device size WL. In submicron technology, however, this increase in size can become impractically large. At that moment other techniques are required to obtain the high accuracy [7]. Furthermore scaling of technology does NOT reduce the size of the current network because the gain mismatch dominates the accuracy. As has been shown the gain mismatch is technology independent and therefore the sizing of the devices cannot be used. Calibration or Dynamic Element Matching techniques can be used to improve matching accuracy without increasing size. The continuous current calibration principle is another possibility. In Fig. 24 the basic idea of current calibration is shown. As is seen from Fig. 24 the calibration principle has two states. During calibration the MOS device is via connected as a diode and switch supplies the calibration current to the diode. At this moment across the gate input capacitor a voltage is generated that fits exactly the input current During the operation of the system, the switch is opened and the switch connects the drain of to the output terminal. The voltage on the gate in principle remains fixed, resulting in an output current to be exactly equal to In a practical situation, however, the
operation of the system is not as ideal as described above. Because of leakage currents introduced by the drain-substrate diode of the switching MOS and the charge feed-through of this switch, a rather large error is found. To overcome these problems the basic system has to be modified into the circuit shown in Fig. 25. The
basic operation of the system is identical to the circuit shown in Fig. 24. However, in this system an extra constant current
being 95% of is added. The calibration now takes place on the ERROR signal and NOT on the full signal This means that errors only influence the accuracy of the calibrated ERROR signal. An improvement of at least a factor 20 with respect to the original system is obtained. The application of the continuous current calibration system into a 16-bit digital-to-analog converter is shown in Fig. 26. The 16-bit converter consists of 65 current sources that are continuously calibrated using an interchanging system. One output current of this high-accuracy 6-bit coarse converter network is subdivided using a MOS-only 1024-element binary weighted current divider. The output currents of the coarse and the fine elements are supplied to the output switches. These switches are controlled by the digital input signals and so the digital-to-analog conversion
takes place. Depending on practical design limitations, switching spikes and small mismatches between the calibrated currents introduce extra quantization errors. As long as the sampling clock and the calibration/interchanging clock are not correlated these errors can be below the quantization error. In that case only a slight deterioration of the dynamic range of the converter is found.
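The factor-20 improvement quoted for the modified calibration circuit follows from simple arithmetic: if a fixed source carries 95 % of the current, storage errors only act on the remaining 5 %. The sketch below illustrates this under that reading of the text; the function names and the 1 % example error are hypothetical.

```python
def output_error(rel_cal_error, fixed_fraction=0.95):
    """Relative output-current error when only the remainder
    (1 - fixed_fraction) of the current is calibrated.

    rel_cal_error is the relative error of the calibrated part, for example
    due to leakage and charge feed-through. Any static error of the fixed
    source is absorbed by the calibration itself.
    """
    return (1.0 - fixed_fraction) * rel_cal_error


def improvement_factor(fixed_fraction=0.95):
    """Error reduction relative to calibrating the full current."""
    return 1.0 / (1.0 - fixed_fraction)


# A hypothetical 1 % storage error on the calibrated part.
print(output_error(0.01))      # about 0.05 % of the output current
print(improvement_factor())    # about 20
```

A larger fixed fraction gives a larger improvement, but it also reduces the range of element mismatch that the calibrated part can still correct.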
7 Conclusion
The following conclusions can be drawn from this paper:
- Spectra of quantization errors and the influence of the amplitude on distortion and cross-modulation products have been calculated.
- Quantization errors have minor influence on the performance of practical converters with finite linearity.
- A relation between element matching and overall linearity (INL and DNL) has been practically determined.
- A practical “fitting” model giving the relation between linearity and Effective Number of Bits has been demonstrated.
- Distortion in a converter is dominated by the matching accuracy of the elements used.
- The influence of sampling clock jitter on the Effective Number of Bits of a converter has been determined.
- Systematic layout problems resulting in timing errors have been determined and analyzed.
- Matching parameters of MOS devices have been determined.
- Practical solutions for converters using element matching parameters and system solutions to obtain a very high accuracy have been discussed.
Depending on the required performance of a digital-to-analog converter a designer can find a number of design rules to help with architectural and circuit design issues.
8 Acknowledgment
The author wants to thank Frank van der Goes of Broadcom Netherlands for the Matlab programming.
References
[1] N.M. Blachman, “The Intermodulation and Distortion due to Quantization of Sinusoids,” IEEE Transactions on Acoustics, Speech and Signal Processing, vol. ASSP-33, no. 6, pp. 1417-1426, December 1985.
[2] R.J. van de Plassche, “Integrated Analog-to-Digital and Digital-to-Analog Converters,” Kluwer Academic Publishers, ISBN 0-7923-9436-4, 1994.
[3] M.J.M. Pelgrom, A.C.J. Duinmaijer, A.P.G. Welbers, “Matching properties of MOS transistors,” IEEE Journal of Solid-State Circuits, vol. 24, pp. 1433-1439, October 1989.
[4] H.J. Schouwenaars, D.W.J. Groeneveld, H. Termeer, “A stereo 16-bit CMOS D/A converter for digital audio,” IEEE Journal of Solid-State Circuits, vol. SC-23, pp. 1290-1297, December 1988.
[5] M.J.M. Pelgrom, “A 10-b 50-MHz CMOS D/A converter with buffer,” IEEE Journal of Solid-State Circuits, vol. 25, pp. 1347-1352, December 1990.
[6] P. Holloway, “A trimless 16-bit digital potentiometer,” ISSCC Digest of Technical Papers, pp. 66-67, February 1984.
[7] D.W.J. Groeneveld, H.J. Schouwenaars, H. Termeer, “A self calibration technique for monolithic high-resolution D/A converters,” IEEE Journal of Solid-State Circuits, vol. SC-24, pp. 1517-1522, December 1989.
[8] A.W.M. van den Enden, N.A.M. Verhoekx, “Discrete-time signal processing,” Prentice Hall, 1989.
[9] W.R. Bennett, “Spectra of quantized signals,” Bell System Technical Journal, vol. 27, pp. 446-472, July 1948.
[10] M. Schwartz, “Information transmission, modulation, and noise,” McGraw-Hill, 1980.
[11] A.B. Carlson, “Communication systems,” McGraw-Hill, 1975.
[12] K-C. Hsieh, Th.A. Knotts, G.L. Baldwin, T. Hornak, “A 12-bit 1-Gword/s GaAs digital-to-analog converter system,” IEEE Journal of Solid-State Circuits, vol. 22, pp. 1048-1055, December 1987.
[13] G. Wegmann, E.A. Vittoz, “Analysis and improvements of accurate dynamic current mirrors,” IEEE Journal of Solid-State Circuits, vol. 25, pp. 699-706, June 1990.
Design Considerations for a Retargetable 12b 200MHz CMOS Current-Steering DAC
J. Vital, A. Marques¹, P. Azevedo, J. Franca
ChipIdea-Microelectrónica, S.A., Porto Salvo, Portugal
Abstract
This paper addresses design considerations for high-speed moderate-to-high resolution current-steering digital-to-analogue converters (DACs) in CMOS technology. A design example of a 12b 200MHz DAC in CMOS digital technology is used to illustrate the design techniques, which are then validated through experimental results obtained from the integrated prototypes. Additionally, some techniques used to render the layout of this DAC easily retargetable are also explained.
1. Introduction
High-speed, medium-to-high resolution digital-to-analog converters (DACs) are essential blocks in graphical interfaces and in many transmit ports of modern communication systems. In these applications, the current-steering DAC architecture has become a widely used platform, owing to its linearity, dynamic behaviour, robustness and power efficiency. This paper gives an overview of the best-known techniques for designing current-steering DACs, and describes in more detail a specific implementation of a 12b 200 MHz DAC in a CMOS technology. Section 2 is dedicated to various aspects of importance for architecture selection, and in Section 3 the requirements for static performance are analysed. Section 4 is dedicated to the circuit design alternatives, focusing on the implementations used in the 12-bit design example. Finally, the integrated prototype is described in Section 5, together with the experimental results.
¹ Augusto Marques was with ChipIdea - Microelectrónica, S.A. until May 2000. Since then he has been with Silicon Laboratories, TX., U.S.A.
J. H. Huijsing et al. (eds.), Analog Circuit Design, 151-170. © 2002 Kluwer Academic Publishers. Printed in the Netherlands.
2. Architecture Selection
2.1 Basic Architecture
The basic topologies of a current-steering DAC are shown in Fig. 1. The simpler forms of this type of DAC are just digitally programmable current sources/sinks that dump their output current into the load. The resistive part of the load is responsible for the static current-to-voltage conversion function, whereas the capacitive part represents the ultimate limitation for the settling behaviour of the resulting output voltage. As a remark, it is important to notice that most of the implementations use, in fact, two complementary outputs, such that the internal elementary current sources can be steered from one output to the other without the need to be shut off. This is extremely important for achieving the best performance at high frequency.
The basic topologies can be further refined by adding a fixed half-scale current sink/source, such that the total output current can assume both positive and negative values, as represented in Fig. 1(c). Finally, a generic topology employing an additional output block can also be considered, as shown in Fig. 1(d). This block can represent a
transimpedance amplifier for current-to-voltage conversion, allowing more flexible driving capabilities, and/or an output re-sampler for more sophisticated output formats other than the zero order hold, which have benefits from the frequency domain performance point of view [1]. In this paper only the simplest topologies will be considered, since they are the most suitable alternatives for very high update rates.
2.2 Decoding Options
One of the most important distinguishing factors in current-steering DACs is the way the digital input data is decoded to drive the internal elementary current sources and implement the digitally programmable output current. The two opposite alternatives for an N-bit DAC are represented in Fig. 2. The simplest scheme is obtained by organising the current sources in N elementary binary weighted current sources, as represented in Fig. 2(a). This requires no decoding, because the bit weights are directly assigned to the current sources, resulting in a digital part with low complexity. However, there are significant drawbacks of this scheme, especially related to major bit transitions [2]. In this topology, major bit transitions result in the most significant bit (MSB) current source being switched to one output and all the others being switched to the other. On one hand, it is very difficult to guarantee good differential nonlinearity (DNL) and monotonicity, since it is necessary to ensure that the MSB current source matches, to within 0.5 least significant bit (LSB), the sum of all the other current sources plus one unit. This imposes a very stringent requirement on the allowable mismatch on the elementary current sources. On the other hand, the fact that all the elementary current sources are simultaneously switched at these major bit transitions produces large glitch areas, resulting in large spurious components in the frequency domain.
The opposite decoding scheme uses a thermometer decoder to individually control all the elementary current sources. This has advantages in terms of required matching to guarantee good DNL and monotonicity, because the unit current sources are incrementally
switched from one output to the other as required. In addition, the glitch area is proportional to the amplitude of the transition steps in the output current. This means that glitches are linearly related to the signal, resulting in signal filtering rather than distortion [2]. The only drawback of this decoding scheme is area, because the number of decoding elements is now proportional to $2^N$. The best alternative for thermometer decoding is achieved by organising the current sources in a matrix and by using fast row-column decoders, which, in turn, address local decoders associated with the current sources.
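The difference between the two decoding options at a major bit transition can be made concrete by counting how much current changes state. The following Python sketch is illustrative only; the function names are hypothetical and the weights are expressed in LSB currents.

```python
def thermometer(code, n_bits):
    """Thermometer code for `code`: 2^n - 1 unit-source controls, of which
    the first `code` are switched on."""
    n_units = 2 ** n_bits - 1
    if not 0 <= code <= n_units:
        raise ValueError("code out of range")
    return [1] * code + [0] * (n_units - code)


def switched_weight_binary(a, b):
    """Total current weight (in LSB) that changes state between codes a and b
    in a binary weighted DAC."""
    # Each toggling bit k switches a source of weight 2^k, so the sum of the
    # switched weights is simply the XOR of the two codes read as an integer.
    return a ^ b


def switched_weight_thermometer(a, b, n_bits):
    """Total current weight (in LSB) that changes state between codes a and b
    in a fully thermometer-decoded DAC."""
    ta, tb = thermometer(a, n_bits), thermometer(b, n_bits)
    return sum(x != y for x, y in zip(ta, tb))


# Mid-scale transition of a 10-bit converter.
print(switched_weight_binary(511, 512))           # 1023 LSB of current is re-steered
print(switched_weight_thermometer(511, 512, 10))  # 1 LSB of current is re-steered
```

For a one-LSB step at mid-scale the binary weighted array re-steers its entire current, while the thermometer array re-steers a single unit source. That is why the glitch of the latter scales with the step size.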
2.3 Segmentation
In order to get the advantages of the thermometer decoding without penalising too much the area of the DAC, some sort of segmentation of the input digital word is normally introduced. In the most popular segmentation scheme, the M MSBs address $2^M-1$ elementary current sources of value $2^L I_{LSB}$ using thermometer decoding, whereas the L least significant current sources have a binary weighted arrangement directly addressed by the LSBs [3].
The selection of the right segmentation into thermometer decoding and binary-weighted arrangement depends on the trade-off between such factors as required area for DAC implementation, simplicity of the decoding scheme and tolerable level of dynamic non-idealities. The decoding scheme must be kept simple and compact to allow very high update rates without significant degradation of the dynamic behaviour [4]. A 6-bit (3+3) row-column decoder uses 3-input gates, whereas an 8-bit (4+4) row-column decoder uses 4-input gates. As the number of inputs required in the basic gates is increased, their intrinsic speed is progressively decreased if no pipelining schemes are employed. Therefore, in high-speed practical implementations the number of MSBs involved in the thermometer decoding scheme has been limited to a maximum of 6 to 8. The selected segmentation for the 12-bit design example considered here is 6+2+4 [4]. The M=6 MSBs use a 3+3 row-column decoding to simultaneously address four 6-bit DACs connected in parallel and arranged in a fully symmetrical way with respect to the centre. This specific arrangement, together with the adopted switching scheme, implements an effective compensation for the systematic errors present across the matrix of current sources. This will be further analysed in the next section. The I=2 intermediate bits also use thermometer decoding and are implemented with the unused current source in the 8×8 matrix of each of the four 6-bit DACs. The required scaling factor of 4 between the MSB elementary current source and the one for the intermediate bits is intrinsically obtained in this way [4]. The remaining L=4 LSBs directly address 4 binary weighted current sources, which are obtained by subdivision of the elementary current sources in the matrices. The overall segmentation is, in fact, logically identical to the one presented in [4], but its electrical and physical implementation was simplified and made more compact.
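The 6+2+4 segmentation can be checked with a small Python model that splits the 12-bit input word into its three fields and sums the segment currents in LSB units. This is a logical sketch only; the function names are hypothetical and nothing here models the physical arrangement of the four parallel 6-bit matrices.

```python
def split_code(code, m=6, i=2, l=4):
    """Split an (m+i+l)-bit input word into MSB, intermediate and LSB fields."""
    n = m + i + l
    if not 0 <= code < 2 ** n:
        raise ValueError("code out of range")
    lsb = code & (2 ** l - 1)
    mid = (code >> l) & (2 ** i - 1)
    msb = code >> (l + i)
    return msb, mid, lsb


def output_in_lsb(code, m=6, i=2, l=4):
    """Output current in LSB units, summed per segment."""
    msb, mid, lsb = split_code(code, m, i, l)
    msb_unit = 2 ** (i + l)   # one thermometer-decoded MSB source: 64 LSB
    mid_unit = 2 ** l         # one thermometer-decoded intermediate source: 16 LSB
    # msb and mid are the numbers of unit sources switched on in their
    # segments; the LSB field drives binary weighted sources of 8, 4, 2, 1 LSB.
    return msb * msb_unit + mid * mid_unit + lsb


print(split_code(4095))      # (63, 3, 15)
print(output_in_lsb(4095))   # 4095
```

The ratio of 64 to 16 LSB between an MSB source and an intermediate source is the scaling factor of 4 mentioned in the text, and 63 MSB sources plus 3 intermediate sources plus the binary LSB sources add up to the full scale of 4095 LSB.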
3. Static Performance

The static behaviour of the DAC is affected by a number of factors, the most important of which are random mismatches in the current sources, systematic errors due to gradients in process, stress and temperature, errors due to voltage drop in the power distribution lines, and finite output impedance. These factors are discussed in the following sections.

3.1 Random Mismatch of Current Sources

Due to the adopted segmentation, the current source matching requirements to satisfy the condition DNL<0.5 LSB are largely relaxed. In fact, the most critical transitions for DNL occur when a bit transition in the MSB segment coincides with a simultaneous change of state of all the less significant bits. In this particular segmentation scheme, unit current sources must be matched to within 0.5 LSB to other unrelated unit current sources. The condition to satisfy, equation (1), can be written following [5], and the required relative standard deviation of the unit current source is obtained by considering L=4 and I=2 in (1).

The requirements to guarantee a good INL can be obtained with the help of a Monte Carlo analysis, which relates the relative standard deviation of the unit current sources to the yield for INL<0.5 LSB. The results of such an analysis with a simple MATLAB model using Gaussian distributions for the unit current sources are presented in Fig. 3, for 8-bit, 10-bit, 12-bit and 14-bit DACs. Closed-form analytical expressions to obtain such results have also been derived in [6]. These results depend only on the equivalent total number of unit current sources employed in the DAC, and are independent of the type of segmentation adopted. Fig. 3 gives the relative standard deviation that must be considered for a 12-bit DAC to obtain an INL yield of more than 99%. The requirements imposed by the INL condition are, therefore, more stringent than those imposed by the DNL condition, owing to the use of thermometer decoding in the MSBs.

Given the above constraint and assuming that a statistical mismatch characterisation of the process [7] is available, the minimum gate area to be used in the MOS transistor that defines the unit current source can be estimated from the required relative standard deviation and the mismatch parameters of the process.
On the one hand, the overdrive voltage of the MOS transistor defining the current source must be as large as possible to minimise the area of the current sources. On the other hand, the maximum overdrive voltage of these transistors is limited by the headroom available between the supply voltage and the combination of the output full-scale voltage and the various drain-to-source voltages of the saturated transistors in the current source and switch structure. The present 12-bit DAC case study must operate from a supply voltage as low as 2.7 V. This value, together with an output swing of 0.5 V, limits the overdrive voltage of the current source to a maximum of 0.8 V. The dimensions of the unit current source transistor then follow for a full-scale current of 20 mA.
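The Monte Carlo analysis mentioned in this section can be sketched in a few lines. The code below is only an illustration in Python, not the MATLAB model used by the authors: it assumes independent Gaussian unit sources, a fully unary DAC (the text notes that the result does not depend on the segmentation), and an end-point definition of the INL. The function names, trial count and seed are hypothetical.

```python
import random


def inl_max(n_bits, sigma_rel, rng):
    """Worst-case |INL| in LSB for one random DAC sample (end-point INL)."""
    n = (1 << n_bits) - 1
    # Unit current sources with unity mean and relative spread sigma_rel.
    sources = [rng.gauss(1.0, sigma_rel) for _ in range(n)]
    lsb = sum(sources) / n  # average step: the end-point line through 0 and full scale
    worst = 0.0
    acc = 0.0
    for k, s in enumerate(sources, start=1):
        acc += s  # output level for code k
        worst = max(worst, abs(acc / lsb - k))
    return worst


def inl_yield(n_bits, sigma_rel, trials=200, seed=1):
    """Fraction of random DAC samples whose worst-case |INL| is below 0.5 LSB."""
    rng = random.Random(seed)
    passed = sum(inl_max(n_bits, sigma_rel, rng) < 0.5 for _ in range(trials))
    return passed / trials
```

Sweeping `sigma_rel` for a given `n_bits` and reading off where `inl_yield` crosses the target yield gives curves of the kind described for Fig. 3. A 12-bit run is slower in pure Python but follows the same pattern.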
3.2 Systematic Errors and Switching Strategy

Systematic errors produced by various processing and environmental factors are well-known disturbing elements in current-steering DACs. However, their effects can be partially compensated by using spatial arrangements in the matrix of current sources controlled by the MSBs, together with specific sequences for switching the current sources as a function of the MSB code. Many different strategies have been proposed so far, with different properties of error accumulation depending on the type of spatial error profile considered [2, 4, 8, 9, 10]. From a brief analysis we can conclude that complex switching sequences can be very effective in improving the DC performance of the DAC, but they suffer from a fast degradation of the characteristics when the input signal frequency is increased. A good trade-off between DC performance and switching scheme complexity must be obtained.

The present 12-bit design example uses the spatial arrangement and switching sequence proposed in [4]. The matrix is arranged in four quadrants fully symmetric with respect to the central horizontal and vertical axes, as shown in Fig. 4. This topology implements a two-dimensional (2D) centroid-switching scheme capable of better compensating for 2D linear and parabolic errors [4]. In fact, 2D linear-type gradients are fully cancelled, as is expected from a common-centroid arrangement. For a 2D parabolic gradient, as shown in Fig. 5(a), the resulting INL is represented in Fig. 5(b). This represents an improvement by a factor of more than two when compared to a conventional hierarchical switching scheme.
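The cancellation of linear gradients by the four-quadrant arrangement can be checked with a small sketch. It assumes a simplified geometry in which each MSB unit source is the parallel combination of four cells mirrored about both central axes, with cell pitch normalised to one. The error profiles and their coefficients are hypothetical and serve only to illustrate the symmetry argument.

```python
def quad_positions(row, col):
    """Centres of the four mirrored copies of cell (row, col).

    The origin is the centre of the full matrix; the pitch is 1 (assumed).
    """
    x = col + 0.5
    y = row + 0.5
    return [(x, y), (-x, y), (x, -y), (-x, -y)]


def summed_error(row, col, error_fn):
    """Total error of the four parallel cells for a given spatial profile."""
    return sum(error_fn(x, y) for x, y in quad_positions(row, col))


def linear_gradient(x, y):
    """Hypothetical 2D linear error profile."""
    return 0.01 * x - 0.02 * y


def parabolic_gradient(x, y):
    """Hypothetical 2D parabolic error profile."""
    return 0.001 * (x * x + y * y)
```

For every cell of an 8×8 quadrant, `summed_error(row, col, linear_gradient)` is zero, whereas the parabolic profile leaves a residue that depends on the distance from the centre. That residue is what the switching sequence then has to spread over the code range.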
3.3 I×R Drop Effects

High-speed current-steering DACs are normally designed for large full-scale currents, so that moderate output voltage swings can be generated on small resistive loads. Typically, full-scale currents of 10 mA to 20 mA are considered, which means that significant voltage drops can develop along the power distribution lines in the matrix of current sources if these lines are not properly sized. This can degrade the INL characteristic in high-speed, high-resolution current-steering DACs. To size the power lines correctly in the 12-bit case study DAC under analysis, a model of the interconnections was used. Fig. 6 shows the distribution of the voltage drop of the positive supply across the matrix of current sources, and its influence on the INL with the adopted 2D centroid-switching scheme.
3.4 Output Impedance

The finite output impedance of the current source is the last effect considered in this paper for static performance degradation. It is a source of linearity degradation, since the value of each unit current source becomes a function of the output voltage of the DAC. It can easily be shown that the DAC output voltage can be expressed, following [3], as a function (3) of the input code, corresponding to the number of unit current sources switched to the output, the resistive output load, and the output conductance of the unit current source. The condition to meet INL<0.5 LSB then sets a lower bound on the output impedance of the unit current source.

This condition is normally easy to satisfy using cascode current source topologies, due to the large channel lengths required to satisfy the matching conditions imposed on the current sources. Furthermore, the nonlinearity present in (3) predominantly generates second-order harmonic distortion, which is further suppressed in differential applications. In the 12-bit DAC under analysis, the output load is a doubly terminated cable, which sets the minimum total output impedance required of the DAC.
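The effect of finite output conductance on linearity can be estimated numerically. The sketch below uses the standard textbook model in which k unit sources, each with current `i_unit` and output conductance `g_unit`, drive a load `r_load`. It is assumed here that this corresponds to the form of the output voltage expression referred to as (3). The names and the numerical values in the comments are illustrative only and are not taken from the design.

```python
def vout(k, i_unit, r_load, g_unit):
    """Single-ended output voltage with k unit sources switched to the load.

    Assumed model: the k output conductances appear in parallel with the load.
    """
    return k * i_unit * r_load / (1.0 + k * g_unit * r_load)


def inl_max_lsb(n_bits, i_unit, r_load, g_unit):
    """Worst-case end-point INL, in LSB, caused by finite output conductance."""
    n = (1 << n_bits) - 1
    lsb = vout(n, i_unit, r_load, g_unit) / n
    return max(abs(vout(k, i_unit, r_load, g_unit) - k * lsb) / lsb
               for k in range(n + 1))


# Illustrative values only: 20 mA full scale over 4095 units, 25 ohm load,
# and a hypothetical unit output conductance of 1 nS.
example_inl = inl_max_lsb(12, 20e-3 / 4095, 25.0, 1e-9)
```

For small `g_unit * r_load` the worst case occurs near mid-scale and is close to `g_unit * r_load * n**2 / 4` LSB, so the impedance requirement grows with the square of the number of unit sources.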
4. Circuit Design for Dynamic Performance

4.1 Current-Source Design

In order to satisfy the output impedance requirements for the 12-bit DAC under consideration, a cascode current source must be considered. It was also decided to implement the DAC current sources with PMOS transistors, since the output voltage is naturally built on a grounded load. In addition, improved substrate noise isolation can be achieved, although this may only be effective at low frequencies. Fig. 7 shows two possible alternatives for PMOS cascode current source implementation. The alternative in Fig. 7(b) was proposed in [4] as a means to reduce the feedthrough of the control signals q and qz to the outputs. However, this topology has an inherent asymmetry between the falling and rising transitions, because the pole at the source of the cascode transistor starts to move to lower frequencies when the current in the corresponding branch is cut off. The falling transition therefore settles more slowly than the rising transition, which is a potential source of distortion. The implementation presented in this paper adopted the more conventional current source topology represented in Fig. 7(a).
4.2 Switch Drivers

The control nodes q and qz must swing in a limited voltage range, sufficient to steer the tail current from one branch to the other while keeping the feedthrough from these nodes to the outputs at controlled levels. In addition, the impedance defined at these nodes is important, as it may be responsible for low-frequency poles degrading the settling behaviour of the DAC. In this design the low level of the control signals was set to a clean analogue ground to simplify the switch driving scheme. For a single-ended 0.5 V output swing, this low level guarantees that the PMOS switches are saturated when steering the current to the output. This represents another contribution to increasing the output impedance of the DAC. In addition, the low impedance of the ground node results in a good settling behaviour of the steering action. The generation of the high level, which is less critical for the settling behaviour, is performed by diode-connected NMOS transistors biased at a constant current. The complete switch driver scheme is shown in Fig. 8.
4.3 Synchronization and Timing Equalization

The synchronisation of the switching instants in all the elementary current sources in the DACs is fundamental to achieving good dynamic performance and a low glitch area. Additionally, as PMOS switches are employed, the crossing point of the control signals q and qz must be kept low to guarantee that the switches never cut off simultaneously. A clocked latch scheme can be used to provide the required synchronisation together with the necessary overlap between q and qz. In this implementation a ratioed-logic latch was used, as depicted in Fig. 9 [4].
This scheme easily guarantees that synchronisation is achieved in the elementary current cells in the matrix. However, to further reduce the glitch area in code transitions, it is also necessary to equalise the switching timing in the binary-weighted LSB current sources to the timing present in the matrix, because the LSB circuitry is scaled down. In theory, this could be done by scaling down the latch according to the load imposed by the switch drivers in the LSB section. As this is not easy to achieve, a different strategy was adopted in this implementation. The latches and switch drivers in the LSB section are exactly the same as in the matrix, and the switches, whose gate widths are scaled down in the same proportion as the current sources, are complemented with dummy structures that restore the geometries removed in the scaling-down process. This results in an effective timing equalisation and also in a simple and very regular layout implementation, which also improves timing.
5. Integrated Prototype and Measured Results

5.1 Retargetable DAC Prototype

The layout of the DAC prototype in a digital CMOS technology is shown in Fig. 10. It follows a basic principle similar to that of the DAC described in [4], which consists of moving the latches and local decoders out of the matrix for improved matching of the current sources and reduced coupling of digital circuitry into the sensitive analogue part. The organisation of the matrix closely follows the explanation in Fig. 4. Two additional columns and rows of dummy cells have been added to the surrounding edges of the matrix for improved matching. The LSB section of the current sources is implemented in one of these dummy rows at the top, while the current mirrors for bias generation are implemented in the dummy rows at the bottom. The switches, switch drivers, latches, local decoders and column and row decoders were organised in a stack along the left side of the matrix, leading to a very compact layout. The core cell area is 1 mm × 2 mm.
The layout of this DAC was developed for easy retargetability. This was made possible by its own modularity and by conceiving a number of parameterised cells integrated in Cadence Design Framework. The full layout can be instantly modified for different sizing of current sources, decoding logic and driving circuitry. Some examples of different instantiations of parameterised cells used in the DAC layout are presented in Fig. 11.
5.2 Test Set-up

An RF test set-up was prepared for the characterisation of the high-speed DAC prototypes. The die was mounted on a ceramic substrate containing local terminating resistors for the digital signals and for the output voltages, and also decoupling capacitors for the supplies. This assembly was enclosed in a metallic case, as shown in Fig. 12.

The full-scale current was set by an external precision current source, and the output load was defined by the local terminating resistor together with a cable terminated by the equipment. The input data was supplied to the DAC by a high-speed pattern generator, and the measurements of the output were performed in either single-ended or fully differential mode, depending on the test.

5.3 Static Characteristics

For the static characteristics, the DAC was clocked at the nominal rate of 200 MHz and a very slow staircase was applied to its input code, while the single-ended output was measured by a digital multimeter. The resulting INL and DNL characteristics obtained with a best-straight-line method are presented in Fig. 13. The INL is within ±0.65 LSB, while the DNL is less than ±0.3 LSB. These results indicate that the measures adopted for improving static performance were effective.
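A best-straight-line evaluation of a measured staircase can be sketched as follows. This is a generic least-squares formulation and not necessarily the exact post-processing applied to the measurements: the function names are hypothetical, and `levels` stands for the list of output voltages measured for consecutive input codes.

```python
def best_fit_line(levels):
    """Least-squares straight line through (code, level); returns (slope, offset)."""
    n = len(levels)
    mean_x = (n - 1) / 2
    mean_y = sum(levels) / n
    sxx = sum((k - mean_x) ** 2 for k in range(n))
    sxy = sum((k - mean_x) * (y - mean_y) for k, y in enumerate(levels))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def inl_dnl(levels):
    """INL and DNL in LSB, with the LSB taken as the slope of the fitted line."""
    slope, offset = best_fit_line(levels)
    inl = [(y - (offset + slope * k)) / slope for k, y in enumerate(levels)]
    dnl = [(levels[k + 1] - levels[k]) / slope - 1.0
           for k in range(len(levels) - 1)]
    return inl, dnl
```

An ideal ramp gives zero INL and DNL. Raising a single level by half a step produces a DNL of about +0.5 LSB followed by about -0.5 LSB at the adjacent transitions.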
5.4 Frequency Domain Performance

The frequency domain characteristics have been obtained by programming full-scale sinewaves of various frequencies in the pattern generator running at the nominal rate of 200 MHz, and by coupling the differential output of the DAC to a high-frequency spectrum analyser by means of a wide-band transformer.

The first results, presented in Fig. 14, were obtained with a low-noise-floor spectrum analyser, and reflect the performance of the DAC operating at the nominal update rate of 200 MHz with output frequencies up to 20 MHz. Under these conditions the spurious-free dynamic range (SFDR) is always above 70 dBc.
For higher output frequencies a high-bandwidth spectrum analyser was used. The result in Fig. 15 (a) corresponds to the same situation indicated in Fig. 14(e) and is included here to compare the type of noise-floor existing in the high-bandwidth spectrum analyser. Fig. 15(b) shows that the SFDR of the DAC clocked at 200 MHz starts to fall very quickly for output frequencies above 20 MHz. At 40 MHz the SFDR is 51 dBc. Although the DAC was designed for a nominal clock rate of 200 MHz, the design had to satisfy all the corners of process, temperature and supply voltage. This means that in typical conditions the frequency of operation can be higher. Fig. 15(c) and Fig. 15(d) show the type of performance that can be reached at 500 MHz and 800 MHz clock rate.
5.5 Power Dissipation

The current consumption can be divided into a static part, which is independent of the clock rate and input activity, and a dynamic part. In this prototype the measured static current consumption is 40 mA, while the dynamic current consumption is 20 mA for a clock rate of 200 MHz and an output frequency of 20 MHz. This leads to a total power dissipation of 180 mW at a 3 V power supply. The overall characteristics of the DAC are summarised in Table I.
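The quoted power figure follows directly from the two measured currents and the supply voltage. The arithmetic is written out below with a hypothetical helper name.

```python
def total_power_mw(static_ma, dynamic_ma, supply_v):
    """Total power in mW from static and dynamic supply currents in mA."""
    return (static_ma + dynamic_ma) * supply_v


# 40 mA static plus 20 mA dynamic at 3 V, as reported in the text.
measured_power_mw = total_power_mw(40, 20, 3.0)
```

Two thirds of the dissipation is therefore static at this operating point. Only the dynamic third depends on the clock rate and on the input activity.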
6. Conclusions

Design considerations have been presented for high-speed current-steering DACs, with a special focus on a specific implementation of a 12-bit 200 MHz DAC in a digital CMOS technology. The presented design techniques were supported by experimental results from the integrated prototype. Some considerations for layout retargetability of such DACs were also introduced in this presentation.
Acknowledgements

The authors would like to thank ESAT-MICAS, K.U. Leuven, and in particular Prof. W. Sansen, for having kindly supported the experimental characterisation of the prototypes at the Laboratory of the University. The contributions of P. Jesus to the design and characterisation of the prototypes are also acknowledged.
References

[1] A. Bugeja, B.-S. Song, P. Rakers, S. Gillig, "A 14b 100Msample/s CMOS DAC Designed for Spectral Performance", Proc. ISSCC 1999, pp. 148-149, Feb. 1999.
[2] C.-H. Lin, K. Bult, "A 10-b, 500-Msample/s CMOS DAC in 0.6 mm²", IEEE JSSC, Vol. 33, No. 12, pp. 1948-1958, Dec. 1998.
[3] T. Miki, Y. Nakamura, M. Nakaya, S. Asai, Y. Akasaka, Y. Horiba, "An 80-MHz 8-bit CMOS D/A Converter", IEEE JSSC, Vol. SC-21, No. 6, pp. 983-988, Dec. 1986.
[4] J. Bastos, A. Marques, M. Steyaert, W. Sansen, "A 12-bit Intrinsic Accuracy High-Speed CMOS DAC", IEEE JSSC, Vol. 33, No. 12, pp. 1959-1969, Dec. 1998.
[5] A. Bosch, M. Borremans, M. Steyaert, W. Sansen, "A 10-bit 1GSample/s Nyquist Current-Steering CMOS D/A Converter", IEEE JSSC, Vol. 36, No. 3, pp. 315-324, Mar. 2001.
[6] A. Bosch, M. Steyaert, W. Sansen, "An Accurate Statistical Yield Model for CMOS Current-Steering D/A Converters", Proc. IEEE ISCAS 2000, Vol. IV, pp. 105-108, May 2000.
[7] M. Pelgrom, A. Duinmaijer, A. Welbers, "Matching Properties of MOS Transistors", IEEE JSSC, Vol. 24, No. 5, Oct. 1989.
[8] Y. Nakamura, T. Miki, A. Maeda, H. Kondoh, N. Yazawa, "A 10b 70-MS/s CMOS D/A Converter", IEEE JSSC, Vol. 26, No. 4, pp. 637-642, Apr. 1991.
[9] T.-Y. Wu, C.-T. Jih, J.-C. Chen, C.-Y. Wu, "A Low-Glitch 10 bit 75-MHz CMOS Video D/A Converter", IEEE JSSC, Vol. 30, No. 1, pp. 68-72, Jan. 1995.
[10] J. Vandenbussche, G. Plas, A. Bosch, W. Daemens, G. Gielen, M. Steyaert, W. Sansen, "A 14b 150Msample/s Update Rate Q² Random Walk CMOS DAC", Proc. ISSCC 1999, pp. 146-147, Feb. 1999.
HIGH SPEED CMOS DA CONVERTERS FOR UPSTREAM CABLE APPLICATIONS. Raf ROOVERS Philips Research, Prof. Holstlaan 4 5656AA Eindhoven, The Netherlands.
ABSTRACT
In the first part of this paper the function of Analog to Digital (AD) and Digital to Analog (DA) converters in communication systems is discussed. The relation between the system data rate and the converter data rate is explored and the impact on the power dissipation in the AD and DA converter is shown. The second part describes the realisation of a DA converter for cable upstream application.
1. INTRODUCTION
The demand for higher data bandwidth delivered to the home is a driving factor for AD and DA converter design. Whereas developments in AD and DA converter design in the eighties were driven by audio and video signal converters, a driving force is nowadays found in digital communication systems, both wired and wireless. To avoid the costs and delays associated with laying new cables to the homes, systems are built to get the highest bandwidth out of the existing physical links to the homes: telephone and cable wires. The increasing possibilities of digital signal processing have made systems feasible that maximise the transmitted data rate up to the theoretical limits. These are imposed by the physical constraints of the transmission medium related to the bandwidth and signal to noise limitations.

171 J. H. Huijsing et al. (eds.), Analog Circuit Design, 171-187. © 2002 Kluwer Academic Publishers. Printed in the Netherlands.
Various standards have emerged that coexist with services already present over copper wires and put specific requirements on AD and DA converters. After some general considerations on the AD and DA converters for digital communication systems, a specific design for cable upstream is discussed.

2. HISTORY AND EVOLUTION OF SYSTEMS
The evolution of copper wired digital data communication to the home is shown in figure 1. It started in the eighties with low bit rate voice band modem communication and led to modem-speeds of 33.6kbit/s which is close to what can be theoretically achieved according to the Shannon information theorem in telephone voice band. To increase data rates over existing phone wires, ISDN was defined, with data rates that are still far below the theoretical limits of the telephone wires. Presently the channel limits are nearly reached with the developments in xDSL technology with its different flavours. The only way to increase the data rate any further is to use another type of wire with higher bandwidth and/or better SNR.
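The Shannon limit referred to above is C = BW·log2(1 + SNR). The sketch below evaluates it for a voice-band channel. The bandwidth of 3.1 kHz and the SNR of 35 dB are illustrative assumptions, not values given in the text.

```python
import math


def shannon_capacity(bandwidth_hz, snr_db):
    """Channel capacity in bit/s for a given bandwidth and SNR in dB."""
    snr = 10 ** (snr_db / 10)
    return bandwidth_hz * math.log2(1 + snr)


# Assumed voice-band figures, for illustration only.
voice_band_capacity = shannon_capacity(3100, 35)
```

With these assumed figures the capacity comes out at roughly 36 kbit/s, which is consistent with the statement that 33.6 kbit/s modems operate close to the theoretical limit.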
Apart from telephone wire, cable networks connect homes with a high-bandwidth cable, offering a theoretical capacity that is orders of magnitude larger than that of phone wires. Originally the cable system was intended for one-way broadcast services. When this cable is used to deliver bits to and from the home, its wide bandwidth has to be shared with other services and users. The effective available bandwidth for a single user is therefore orders of magnitude lower than the total cable capacity, but it is usually higher than the telephone wire capacity.
In contrast to the past, the physical constraints of the wires have become the limitation of the data rate. In order to get the maximum amount of data through the wire, both analog and digital signal processing are needed on both sides, as shown in figure 2. Moore’s law made digital signal processing abundantly available to implement complex coding and filtering, while coding schemes have been developed that maximise data throughput and minimise symbol interference in the link. The digital signal has to be converted to an analog signal and conditioned to the required levels according to the wire properties. Both the DA and AD converters have data rates (n·fs) that are a factor above the net system data rate and can even exceed the wire capacity.

3. AD AND DA CONVERTER: BIT/S, BW AND SNR
As AD and DA converters link the analog world to the digital world, it is here that analog bandwidth (BW) and signal to noise ratio (SNR) meet digital bandwidth, or bit/s. This is shown in figure 3. The AD or DA converter can be regarded as a limitation to the analog BW and SNR, which is shown in the graph of SNR versus input signal frequency. This SNR (or SINAD) defines an effective number of bits (enob), while the effective resolution bandwidth (ERBW) defines the useful bandwidth of the converter. For an ideal converter the conversion capacity equals the number of bits at the output of the converter, i.e. every bit is a unit of information. For a real AD converter this is somewhat lower, as both the enob and the useful bandwidth are below their ideal values.
4. AD AND DA CONVERTER PERFORMANCE
The performance of AD and DA converters is plotted in a bandwidth-resolution graph (see figure 4). In this type of graph the bandwidth (X-axis) is defined as the useful bandwidth of the converter, i.e. the minimum of the effective resolution bandwidth and the Nyquist rate, while the resolution (Y-axis) is defined as the effective resolution for low-frequency input signals: enob=(SINAD-1.76dB)/6.02, with SINAD the signal to noise and distortion ratio for low-frequency input signals. As a reference point, state-of-the-art AD and DA converters from JSSC 89-90 and JSSC 99-00 are plotted [JSSC]. Steady progress has been made during the decade towards higher resolution and bandwidth. The two regions, audio and video, indicate the performance required for AD-DA conversion of these signals, which is determined by the BW and SNR of microphone and image sensor signals. The audio converters became mainstream in the early eighties and video converters in the late eighties and early nineties. Also plotted in the same graph are the requirements for direct AD and DA conversion of a transmitted signal: twisted pair and coax cable. It is clear that direct conversion of a cable signal (1 GHz, 60 dB) with a single AD or DA converter is not yet feasible. Hence for cable signals, analog signal processing blocks are required to bring part of the cable signal into the bandwidth-resolution area where AD-DA conversion is feasible. On the other hand, even if conversion with a single AD or DA converter is feasible, this is not always the optimal solution from a power consumption point of view.

5. ANALOG-DIGITAL SIGNALS IN A COMMUNICATION LINK.
The effective AD or DA bit rate is at least equal to the net system data rate, and its maximum is related to the capacity of the transmission medium. In general the effective data rate of the AD and DA converters is positioned somewhere in between, depending on the architectural choices. The position of the AD and DA converters within the signal path determines the factor by which the AD or DA effective bit rate exceeds the actual net system data rate. This factor is called the implementation factor, as shown in figure 5. The implementation factor indicates how much of the signal conditioning is done in the analog or in the digital domain. Wired single-user systems show the lowest factors, while cellular (multi-user, wireless) systems have the highest. For some system realisations the implementation factor is indicated in table 1. As long as AD-DA power dissipation is not dominant in the system, architectures with a higher factor are preferred, as these reduce the analog functionality in favour of more flexible digital functionality. Hence it is important to know the power dissipation as a function of data rate.
6. POWER DISSIPATION OF CONVERTERS IN A COMMUNICATION LINK
The estimation of the power dissipation of AD and DA converters is a rather difficult task, as bandwidth and resolution can span several decades. The presented formulas take into account only the effective resolution, the bandwidth and the power consumption, and exclude all other parameters (input range, technology used, need for external components or trimming, ...); they are shown in figure 6. A figure of merit is calculated from existing AD and DA realisations based on the formula in figure 6.

The same formula can be used to predict the power dissipation in an AD converter based on the effective data rate.
This formula states that power dissipation is proportional to the effective AD performance in signal to noise and useful bandwidth. A state-of-the-art realisation has a figure of merit ranging from 1 to 10 pJ. For DA converters a similar figure of merit is defined, with state-of-the-art values ranging from 0.5 to 10 pJ. The power dissipation can be plotted in the bandwidth-resolution graph as lines of equal power dissipation, as shown in figure 7. It is interesting to compare these lines with lines of equal information capacity (bit/s): both the power dissipation and the information capacity are linearly proportional to the bandwidth, whereas the SNR proportionality is different due to the log function in the information capacity expression.
Based on figure 7 an energy per information bit can be calculated by dividing the power dissipation by the information capacity and normalising it for BW. This is an ideal case with the assumption that the implementation factor is 1, i.e. an effective output bit of the AD-DA equals an information bit. This also implies a theoretically optimal coding and an ideal analog circuit implementation without any losses.
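The comparison between equal-power lines and equal-capacity lines can be made concrete with a short sketch. The enob expression is the one quoted in section 4. The power model, P = FoM · 2^enob · 2·BW, is an assumption, since the formula of figure 6 is not reproduced in the text; it is chosen because it yields a figure of merit in joules, consistent with the quoted picojoule range. The capacity is the Shannon expression, and the SNR is used in place of the SINAD for simplicity.

```python
import math


def enob_from_sinad(sinad_db):
    """Effective number of bits from SINAD in dB."""
    return (sinad_db - 1.76) / 6.02


def information_capacity(bandwidth_hz, snr_db):
    """Shannon capacity in bit/s."""
    return bandwidth_hz * math.log2(1 + 10 ** (snr_db / 10))


def converter_power(bandwidth_hz, sinad_db, fom_j=1e-12):
    """Assumed power model: P = FoM * 2**enob * 2*BW (FoM of 1 pJ by default)."""
    return fom_j * 2 ** enob_from_sinad(sinad_db) * 2 * bandwidth_hz


def energy_per_bit(bandwidth_hz, snr_db, fom_j=1e-12):
    """Energy per information bit in joules, for an implementation factor of 1."""
    return (converter_power(bandwidth_hz, snr_db, fom_j)
            / information_capacity(bandwidth_hz, snr_db))
```

Under this model the bandwidth cancels in `energy_per_bit`, while the energy per bit rises steeply with SNR: the power grows exponentially with the resolution and the capacity only linearly.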
7. AD-DA CONVERSION POWER PER BIT
Finally, all these assumptions can be used to predict the power dissipation of AD-DA converters in a telecom link as a function of the net data rate, the implementation factor and the bandwidth used, together with the introduced FoM. In figure 8 the power dissipation (Y-axis) is plotted as a function of data rate (X-axis) for a given bandwidth and implementation factor. It shows that for relatively low data rates the implementation factor is not that important and can be chosen high. However, for high data rate systems, the right choice of implementation factor can make a big difference in power dissipation.
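The trend described for figure 8 can be reproduced with a simple model. The sketch below is an assumption-laden illustration, not the formula behind the figure: it takes the converter to sample at twice the bandwidth, sets the effective converter bit rate (enob times sample rate) equal to the implementation factor times the net data rate, and uses the assumed power model P = FoM · 2^enob · 2·BW. All names and default values are hypothetical.

```python
def converter_power_w(net_rate_bps, impl_factor, bandwidth_hz, fom_j=1e-12):
    """Estimated converter power in watts under the assumptions stated above."""
    sample_rate = 2 * bandwidth_hz
    # Effective resolution needed to carry impl_factor * net_rate bits per second.
    enob = impl_factor * net_rate_bps / sample_rate
    return fom_j * 2 ** enob * sample_rate


def factor_penalty(net_rate_bps, bandwidth_hz, low_factor=2, high_factor=4):
    """Power ratio between a high and a low implementation factor."""
    return (converter_power_w(net_rate_bps, high_factor, bandwidth_hz)
            / converter_power_w(net_rate_bps, low_factor, bandwidth_hz))
```

In a 5 MHz bandwidth, doubling the implementation factor from 2 to 4 costs about 15% more power at 1 Mbit/s, but a factor of 256 at 40 Mbit/s. This matches the qualitative conclusion that the implementation factor matters mainly at high data rates.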
8. CABLE MODEM UP-STREAM : FUNCTIONALITY AND STANDARDS.
A part of the cable frequency spectrum is reserved for upstream data communication as shown in figure 9. This spectrum is shared by several homes and different standards have emerged (Davic, Harmony, Docsis, ..) that define the channel bandwidths, access schemes, modulation scheme, power control, out of band spurious, ... These standards are developed to coexist with other services on the cable. QPSK and QAM modulation schemes have been proposed with different data rates, channel allocation and power control.
9. WHERE TO PLACE THE DA IN SIGNAL PATH ?
The balance between the signal processing done in the analog domain and that done in the digital domain is determined by several factors: the feasibility of the DA converter function; the digital versus analog complexity of the function (~ silicon area); and the power dissipation of a digital versus an analog implementation of the function. For a 16-QAM modulation scheme the different options are shown in figure 10.
A first option is to use two DA converters with a relatively low number of bits and sample rate. The implementation factor is low, but additional analog circuits are required for upconversion, filtering and amplification. A second approach is to generate the modulated carrier directly in the digital domain, requiring a DA converter with about 10 bit resolution and a sample rate of about 200 MS/s. The implementation factor is higher, but the upconversion is done in the digital domain, offering higher flexibility. A third approach is to implement the complete variable gain range in the digital domain as well, requiring more than 16 bit of resolution. This last option is not feasible from a DA converter design point of view. The second is feasible and turns out to be a cost-effective and flexible solution: modulation and upconversion are implemented digitally and the variable gain is analog.
10. CONDITIONS FOR DA SPECIFICATIONS

DA converters for telecom applications require good dynamic performance. These converters are often specified by distortion and spurious-free dynamic range when a single full-scale sinewave is applied. In the actual application a modulated signal is generated with the DA converter, requiring spectral purity under these conditions. The use of a single full-scale sinewave for testing or specifying the DA converter is therefore not completely adequate. This is illustrated in figure 11 by showing a QPSK signal and its square component both in the time and the frequency domain. The square component has a power of 0.62 and a power density of 0.47 compared to the QPSK signal.

Figure 12 shows the second- and third-order distortion components of -50 dBc of a single sinewave applied to a DAC, together with the distortion components of a modulated signal applied to the same DAC.
11. EXAMPLE OF UPSTREAM DA REALISATION

The realisation of an integrated upstream signal path is shown in figure 13. It is based on a 10 bit DA core and a variable gain amplifier (VGA). The DA and VGA are realised in a CMOS technology and require no process options such as thick-oxide transistors or dual or double poly. The power supply is 2.5 V for the complete design. The nominal sample rate is 162 MS/s, which is sufficient for channel frequencies up to 42 MHz. The 10 bit DAC core provides a 0.8 Vpp differential signal level and the VGA adds 12 dB of scaling. This results in a 0.3-1.2 Vpp differential signal swing in 75 ohm load resistors. The remaining part of the power control is realised in the external line driver. The complete realisation of power control in CMOS is not feasible due to the required voltage levels.
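The numbers quoted for the signal path can be cross-checked with two small helpers, whose names are hypothetical. The first converts the output swing range into a gain range in dB; the second compares the Nyquist frequency of the 162 MS/s sample rate with the highest channel frequency.

```python
import math


def gain_range_db(v_min, v_max):
    """Gain range in dB spanned by two signal amplitudes."""
    return 20 * math.log10(v_max / v_min)


def nyquist_hz(sample_rate_hz):
    """Nyquist frequency for a given sample rate."""
    return sample_rate_hz / 2


vga_range_db = gain_range_db(0.3, 1.2)        # swing range at the VGA output
gain_from_core_min = gain_range_db(0.8, 0.3)  # negative: attenuation of the 0.8 Vpp core level
gain_from_core_max = gain_range_db(0.8, 1.2)  # positive: amplification of the core level
```

The 0.3-1.2 Vpp range corresponds to about 12 dB, in agreement with the stated VGA scaling. The Nyquist frequency of 81 MHz leaves close to an octave between the 42 MHz band edge and the first image.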
12. CIRCUIT REALISATIONS
The integrated up stream signal path required a gain of 12 dB and a resolution of 10 bit in the converter. The DAC core and the VGA are separated by a Track and Hold (T/H) to de-couple the static and dynamic performance requirements. The timing accuracy is now localised in one switch (clk2) while the actual DAC core can operate with a less demanding clock signal. Only a 2.5 V power supply was available which significantly reduces the available internal signal swing. The realisation consists of three circuit parts : the DAC core, a Track and Hold (T/H) as de-glitcher and the VGA as shown in figure 14.
The 10 bit DAC core circuit is shown in figure 15. It consists of an array of 32x32 p-MOS current sources combined in a 6b binary / 4b segmented configuration. Transistors are sized for a static accuracy level (INL) of 0.3 LSB. The output of the current source array is internally converted into a voltage with a swing of 0.8 Vpp differential.
The T/H is added as a de-glitcher, as shown in figure 16, reducing the demands on the clock accuracy for the DAC core. The DAC core switches can be driven by a digital clock, while the T/H uses the only accurate clock signal required. The T/H is based on a simple n-MOST switch between the input and output buffers.
The variable gain amplifier / output buffer consists of a degenerated voltage to current converter and the gain steps are realised by switching additional transistors in a current mirror. Figure 17 shows the transistor implementation of the VGA.
The dynamic performance is limited by the harmonics in the VGA. Figure 19 shows a measured QPSK signal with 160 kSymbols/s. The carrier frequency is such that aliased distortion components occur in the neighbouring channel. A measurement of a single sine wave is shown in figure 20.
13. CONCLUSIONS
Some considerations on AD and DA power consumption are made based on the figure of merit, the implementation factor in a digital communication system front-end and the net data rate. Very high data rate communication systems will require a low or moderate implementation factor to keep the power consumption in the AD or DA affordable, which puts a limit on the digitisation of the front-end. A cable upstream signal path implementation is shown that uses a Track and Hold as a de-glitcher and has a part of the required power control range on chip.
14. REFERENCES
[JSSC] Selected AD and DA papers from Vol.24, Vol.25, Vol.34 and Vol.35 of IEEE Journal of Solid State Circuits.
SOLVING STATIC AND DYNAMIC PERFORMANCE LIMITATIONS FOR HIGH SPEED D/A CONVERTERS
Anne Van den Bosch, Michiel Steyaert and Willy Sansen
Katholieke Universiteit Leuven, Afdeling ESAT-MICAS
Kasteelpark Arenberg 10, 3001 Heverlee, BELGIUM
ABSTRACT
In this paper the factors determining the static and the dynamic performance of a current-steering CMOS D/A converter will be discussed. The impact of these factors will be translated into design guidelines that have to be implemented in order to realize a D/A converter with state-of-the-art performance.
1. INTRODUCTION
The recent growth of the telecommunication market pushes the designer to put an increasing amount of effort into the integration of digital and analog systems on one chip. Consequently, the interface between these systems is becoming one of the most challenging blocks to design in the telecommunication devices of today. High performance D/A converters find applications in the area of broadband and wireless communications. Because they are inherently fast and cost effective, CMOS current-steering D/A converters are the ideal candidates for such applications. Until a few years ago, open literature used to mention mainly static specifications of D/A converters [1,2]. Recent publications [3,4,5] have revealed that a combination of a high update rate, a high resolution and a good linearity up to the Nyquist frequency is difficult to achieve. Except in [6], where 60 dB of spurious free dynamic range (SFDR) was achieved up to a 200 MSample/s clock, the degradation of the output signal linearity starts at a few (tens of) MHz.
In this paper both the static and the dynamic performance of a current-steering D/A converter will be discussed in detail. In the first section, the D/A converter's basic operation principles and the topology selection are discussed. Based on this analysis, some high speed 10 and 12-bit D/A converters have been realized that show a significant dynamic performance improvement. A 12-bit realization will be presented in section 6 of this paper as an example. Finally, a figure of merit will be defined that provides a fair method for comparing the performance of D/A converters.
J. H. Huijsing et al. (eds.), Analog Circuit Design, 189-210. © 2002 Kluwer Academic Publishers. Printed in the Netherlands.
2. CURRENT-STEERING TOPOLOGY TRADE-OFF
Current steering D/A converters are based on an array of matched current sources that are switched to the output according to the digital input code. Three different architectures are possible depending on the implementation of this array, namely the binary, the unary and the segmented architecture. Each architecture will be briefly discussed including some advantages and disadvantages. A comprehensive schematic overview is given in Fig. 1.
2.1. Specifications
The DNL error is the worst case deviation from an ideal one LSB step between two subsequent output codes. The INL error is defined as the maximum deviation from the D/A converter's ideal transfer function. Both the INL and the DNL errors are static non-linearity specifications that determine the limit of the D/A converter's performance at a low frequency. The INL specification has the same requirement independent of the architecture. The influence of this specification on the design of a current-steering D/A converter will be described in detail in section 3.1, where the relation between the technological matching properties and the performance of the D/A converter will be presented.
2.2. The Binary Weighted Architecture
In the binary implementation, every switch controls a current that is twice as large as that of the next less significant bit. The digital input code directly steers these switches. The advantages of this architecture are its simplicity (since no decoding logic is necessary) and the small required silicon area. On the other hand, a large DNL error and an increased dynamic error are intrinsically linked with this architecture. At the half scale transition, $2^{N-1}$ unit sources are switched on/off and $2^{N-1}-1$ other independent sources are switched off/on. Assuming a normal distribution for the unit current sources with a standard deviation $\sigma(I)$, this step has a $\sigma_{\Delta I}$ determined by:
$$\sigma_{\Delta I}^2 = 2^{N-1}\,\sigma^2(I) + (2^{N-1}-1)\,\sigma^2(I) = (2^N-1)\,\sigma^2(I) \qquad (1)$$
This sigma, $\sigma_{\Delta I}$, is a good approximation for the DNL. The $\sigma_{\Delta I}$ at this most significant bit transition is approximately a factor $\sqrt{2^{N-x}}$ larger than at the other bit transitions (with N the total number of bits and x the number of the most significant switching bit, counting the LSB as bit 1).
2.3. The Unary Decoded Architecture
In the unary decoded architecture every unit current source is addressed separately. The digital input code is converted to a thermometer code that controls the switches. The advantages of this architecture are its good DNL error and the small dynamic switching errors. In this architecture, the D/A converter has a guaranteed monotonic behavior since only one additional current source has to be switched to the output for one extra LSB. The major disadvantage of the unary decoded architecture is the complexity, the area and the power consumption of the thermometer decoder. Performing similar calculations as in (1) for the unary architecture leads to the following result:
$$\sigma_{\Delta I} = \sigma(I) \qquad (2)$$
This formula mathematically represents the idea behind the unary decoding. The error between two consecutive codes is just the deviation on the additional unit current source. The DNL was defined as the maximum deviation at a single LSB transition. For an N bit converter, this means that the DNL is determined by the maximum when taking $2^N-1$ samples from a normal distribution with the sigma defined in (2).
2.4 The Segmented Architecture
To get the best of both worlds, most current-steering D/A converters are implemented using a segmented architecture. In this case, the D/A converter is divided into two sub-DACs: the B LSBs (least significant bits) are implemented using a binary architecture while the (N-B) MSBs (most significant bits) are implemented in a unary way. In this architecture, a balance between good static and dynamic specifications versus a reasonable decoder power, area and complexity can be found. Since the segmented architecture is a mixture of the previous two architectures, the result for the most critical transition is of the same form:
$$\sigma_{\Delta I} = \sqrt{2^{B+1}-1}\;\sigma(I) \qquad (3)$$
Note that the formula (3) for the segmented architecture is a general formula that is valid for the binary (B=N-1) and the unary implementation (B=0).
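The general step formula can be checked numerically. The following is a minimal sketch, assuming formula (3) has the form $\sigma_{\Delta I} = \sqrt{2^{B+1}-1}\,\sigma(I)$, which is what the two limiting cases named in the text (B=N-1 and B=0) imply. The function names, the seed and the Monte Carlo check are illustrative and not taken from the paper.

```python
import math
import random

def critical_step_sigma(b_binary, sigma_unit):
    """Sigma of the most critical step for a DAC with B binary LSBs (assumed form of eq. 3)."""
    return math.sqrt(2 ** (b_binary + 1) - 1) * sigma_unit

def simulate_critical_step(b_binary, sigma_unit, trials=10000, seed=1):
    """Monte Carlo estimate: one unary source (2^B units) turns on, 2^B - 1 binary units turn off."""
    rng = random.Random(seed)
    n_on = 2 ** b_binary
    n_off = 2 ** b_binary - 1
    errors = []
    for _ in range(trials):
        on = sum(rng.gauss(1.0, sigma_unit) for _ in range(n_on))
        off = sum(rng.gauss(1.0, sigma_unit) for _ in range(n_off))
        errors.append(on - off - 1.0)  # deviation from the ideal 1 LSB step
    mean = sum(errors) / trials
    variance = sum((e - mean) ** 2 for e in errors) / (trials - 1)
    return math.sqrt(variance)

if __name__ == "__main__":
    sigma_unit = 0.01  # hypothetical 1 % unit current mismatch
    for b in (0, 3, 7):
        print(b, critical_step_sigma(b, sigma_unit))
    print("simulated, B=3:", simulate_critical_step(3, sigma_unit))
```

With B=0 the function returns the unit sigma itself (the unary case), and with B=N-1 it returns $\sqrt{2^N-1}\,\sigma(I)$ (the binary case).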
3. THE STATIC PERFORMANCE OF A CURRENT-STEERING CMOS D/A CONVERTER
3.1 The influence of random mismatch
Due to the mismatch of the current source transistors, the INL specification of different D/A converters made in the same process technology will vary randomly. It is therefore important to be able to predict this specification within certain boundaries. For this purpose, the concept of the D/A converter's INL_yield has been introduced. This yield is defined as the percentage of functional D/A converters with an INL specification smaller than half an LSB (least significant bit). To obtain an accurate estimation of the INL_yield, Monte Carlo simulation [7] is frequently used since the available yield expressions [8,9] do not provide sufficiently accurate results. However, these simulations need a large amount of CPU time. Running a Monte Carlo simulation for a high-resolution D/A converter takes several hours, which is a major drawback of this approach.
The statistical relationship has been investigated analytically, resulting in a new and accurate formula expressing directly the relationship between the INL yield specification, the resolution and the relative unit current standard deviation for the D/A converter. The basic idea behind this theory is the assumption that if at any point the error between the calculated and the ideal output value reaches half an LSB, there exists a 50% chance that the error increases and a 50% chance that it decreases again, since a normal distribution with mean value zero is used. For the mathematical derivation, the reader is referred to [10]. The result is given by:
$$\frac{\sigma(I)}{I} \le \frac{1}{2C\sqrt{2^N}} \qquad (4)$$
where $\sigma(I)/I$ is the relative unit current standard deviation, N is the resolution of the D/A converter and the value of the coefficient C is given by:
$$C = \mathrm{inv\_norm}\left(0.5 + \frac{\mathrm{INL\_yield}}{2}\right) \qquad (5)$$
Here inv_norm is the inverse function of the normal cumulative function integrated from $-\infty$ to x. In fig.2 the INL_yield of an 8 bit, a 10 bit and a 12 bit D/A converter calculated using the new formula (eq.4) and simulated values using the Monte Carlo approach are depicted. From this figure, it can be concluded that the formula is in good agreement with the Monte Carlo simulations.
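The yield formula is short enough to evaluate directly. The sketch below assumes that eq.4 and eq.5 take the form $\sigma(I)/I \le 1/(2C\sqrt{2^N})$ with $C = \mathrm{inv\_norm}(0.5 + \mathrm{INL\_yield}/2)$, as in the yield model of [10]. Both forms should be checked against that reference, and the function names are illustrative.

```python
import math
from statistics import NormalDist

def yield_coefficient(inl_yield):
    """Assumed eq. 5: C = inv_norm(0.5 + yield/2); requires 0 < inl_yield < 1."""
    return NormalDist().inv_cdf(0.5 + inl_yield / 2)

def max_relative_sigma(n_bits, inl_yield):
    """Assumed eq. 4: largest sigma(I)/I that still meets INL < 0.5 LSB with the given yield."""
    return 1.0 / (2.0 * yield_coefficient(inl_yield) * math.sqrt(2 ** n_bits))

if __name__ == "__main__":
    for n_bits in (8, 10, 12):
        sigma = max_relative_sigma(n_bits, 0.997)
        print(f"{n_bits} bit, 99.7 % yield: sigma(I)/I <= {100 * sigma:.3f} %")
```

Each two extra bits of resolution halve the allowed relative standard deviation, which is why the matching requirement tightens so quickly at high resolution.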
To gain more insight in eq.4, the unit current standard deviation is plotted in logarithmic scale versus the resolution of the D/A converter (fig.3). As can be seen from fig.3 these are straight lines since
$$\log\frac{\sigma(I)}{I} = -\frac{N}{2}\log 2 - \log(2C) \qquad (6)$$
Furthermore, one can easily conclude from this figure that for the design of a high accuracy current-steering CMOS D/A converter the matching parameters play a significant role. A small deviation from the required sigma(I)/I can lead to a severe yield degradation. The time to create a figure like fig.3 using Monte Carlo simulations in MATLAB is given in table 1. In this table the results for the INL_yield from 100% to 10% for a current-steering D/A converter with different resolutions can be found. For all the simulations twenty values for the relative unit current standard deviation were taken. This can be understood as follows. In a first coarse approximation a simulation using 10 values for sigma(I)/I, spanning a wide range, is run. From the obtained result the interval for the sigma(I)/I values that obtain a high INL_yield can be specified. In this interval another 10 points are simulated. In almost all cases this procedure gives accurate results. Constructing fig.3 using the new formula takes only a few minutes. Writing the short program is, so to speak, the most time-consuming part. It is also worth noting that the time necessary to calculate the yield is independent of the resolution of the D/A converter, while the time consumption of the Monte Carlo simulations "explodes" with an increasing D/A converter accuracy. Based on these results and the size versus matching relation [11] for MOS transistors, the dimensions of the current source transistors are given by (7):
Increasing the gate overdrive voltage $(V_{GS}-V_T)$ of the current sources reduces the area consumed by the current source array. However, the value of $(V_{GS}-V_T)$ is limited by the fact that both switch transistors have to operate in the saturation region (fig.4.a).
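The trade-off between overdrive voltage and area can be illustrated with the mismatch model of [11]. The sketch below is an assumption rather than the paper's own sizing equation: it combines the Pelgrom relations $\sigma^2(\Delta V_T) = A_{VT}^2/(WL)$ and $\sigma^2(\Delta\beta/\beta) = A_\beta^2/(WL)$ with a square-law transistor, which gives a minimum gate area $WL = \left[A_\beta^2 + 4A_{VT}^2/(V_{GS}-V_T)^2\right] / \left[2\,(\sigma(I)/I)^2\right]$. The coefficient values used here are hypothetical and do not describe any particular process.

```python
def min_gate_area_um2(sigma_rel, v_ov, a_vt, a_beta):
    """Minimum W*L in um^2 for a single current source transistor.

    sigma_rel: required sigma(I)/I (fraction)
    v_ov:      gate overdrive voltage VGS - VT in V
    a_vt:      threshold mismatch coefficient in V*um   (hypothetical value below)
    a_beta:    current factor mismatch coefficient in um (fraction*um, hypothetical)
    """
    return (a_beta ** 2 + 4.0 * a_vt ** 2 / v_ov ** 2) / (2.0 * sigma_rel ** 2)

if __name__ == "__main__":
    A_VT = 0.010    # 10 mV*um, illustrative only
    A_BETA = 0.02   # 2 %*um, illustrative only
    for v_ov in (0.25, 0.5, 1.0):
        area = min_gate_area_um2(0.0026, v_ov, A_VT, A_BETA)
        print(f"VGS-VT = {v_ov:.2f} V -> W*L >= {area:.0f} um^2")
```

The area falls as the overdrive voltage rises, but only until the threshold term becomes small compared with the current factor term; beyond that point a larger overdrive costs headroom without saving area.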
3.2 The influence of systematic errors
Apart from the random errors, the static performance of the D/A converter is determined by the following systematic errors:
Although the transistor mismatch effect of the current source has already been taken into account during the sizing of these transistors, it can still have a negative influence on the static performance of the D/A converter due to the "edge effect" [12]. This effect states that the mismatch behavior of a transistor is heavily dependent on its immediate surroundings. To avoid this error, the current source array has to be expanded by inserting dummy rows and columns so as to provide identical surroundings for all the active current source transistors.
The voltage drop along the ground line will slightly change the output current of the different current source transistors placed on the same row. This error is given by [13]:
where $\partial I_{FS}/\partial V_{bias}$ is the derivative of the full scale current with respect to the current source bias voltage, R is the total resistance of the ground line and f is a factor depending on the switching scheme used. If all the current sources are switched sequentially from the left to the right, the value for f equals 9. This error can be reduced either by using sufficiently wide power supply lines (reducing the resistance R) or by using a special switching scheme (increasing the factor f).
If the resolution of the D/A converter increases by a single bit, the number of current sources in the current source array doubles. The area occupied by a single unit current source also doubles because of the random matching constraint. This leads to a four-fold area increase for the current source array for each additional bit. For D/A converters with a resolution of 10 bits and higher, the dimensions of the current source array become so large that process and temperature gradients have to be considered. The non-linearity errors introduced by these gradients can be (partially) compensated by the introduction of a special switching scheme.
If the error contributions of the current sources are totally random and uncorrelated, the yield of the D/A converter dictates the minimal requirement for the matching precision of these current sources as is indicated in the previous paragraph. The random error can then be kept within the specified boundaries (INL<0.5LSB) by adjusting the active area. This implies that in order to guarantee a good static performance of the D/A converter, the systematic errors introduced by linear and/or symmetrical gradients have to be compensated in order to keep the random errors dominant. This is done using optimized switching schemes for the current sources. A switching scheme is actually a layout technique that determines the interconnection between the thermometer decoder and the inputs of the switches of the current source matrix. Several switching schemes have been presented in literature [5,13,14]. In section 6, an example of an advanced switching scheme is given for a 12 bit current steering CMOS D/A converter.
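The effect of a switching scheme on a gradient error can be illustrated with a one-dimensional toy model. The sketch below is not the scheme used in the paper. It assumes a single row of unary sources with a linear gradient error and compares sequential switching with a simple symmetrical order around the array centre, in the spirit of the schemes in [13,14]. All names and the gradient value are illustrative.

```python
def linear_gradient_errors(n_sources, slope):
    """Relative error of each source for a linear gradient centred on the array."""
    centre = (n_sources - 1) / 2
    return [slope * (i - centre) for i in range(n_sources)]

def inl_profile(order, errors):
    """Accumulated error after switching on the sources in the given order."""
    total = 0.0
    profile = []
    for index in order:
        total += errors[index]
        profile.append(total)
    return profile

def sequential_order(n_sources):
    """Switch the sources from left to right."""
    return list(range(n_sources))

def symmetrical_order(n_sources):
    """Alternate left and right of the centre so consecutive errors cancel."""
    left, right = n_sources // 2 - 1, n_sources // 2
    order = []
    while left >= 0 or right < n_sources:
        if left >= 0:
            order.append(left)
            left -= 1
        if right < n_sources:
            order.append(right)
            right += 1
    return order

if __name__ == "__main__":
    errors = linear_gradient_errors(16, 0.01)
    for name, order in (("sequential", sequential_order(16)),
                        ("symmetrical", symmetrical_order(16))):
        worst = max(abs(v) for v in inl_profile(order, errors))
        print(f"{name}: worst accumulated error = {worst:.3f} LSB")
```

In this toy case the symmetrical order reduces the worst accumulated error by about a factor of four, because each source on one side of the centre is followed by its mirror image on the other side.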
4. THE DYNAMIC PERFORMANCE OF A CURRENT-STEERING CMOS D/A CONVERTER
4.1 Introduction
To obtain a thorough understanding of the behavior of current steering D/A converters, system designers nowadays are not only interested in the static performance of current steering D/A converters but also in their frequency-domain performance since both the dynamic and the static non linearities are visible in the frequency domain as noise and distortion. Where open literature used to mention only static specifications [1,2], recent papers [3,4,5] reveal the problem that high speed Nyquist D/A converters are difficult to design. The limited spurious free output signal bandwidth is the major bottleneck in high speed high resolution designs.
4.2 The influence of the timing errors of the switch control signal
If the control signals of both switches are not exactly matched in time, a glitch error will be directly visible at the output of the D/A converter. This problem can be solved by placing a synchronization block immediately in front of the switch transistors. In this way any delay introduced by the digital decoding logic is canceled and the timing error is minimized. However, one should keep in mind that at the layout level, the implementation of this circuit is of no use unless identical connections between the synchronizing circuit and the switching transistors are drawn.
4.3 The influence of capacitive feed-through
The gate-drain capacitances of the switch transistors form a feedthrough path that allows the digital control signals to have a direct impact on the output of the D/A converter. The glitch energy error that is generated in this way can be significantly lowered by the use of a reduced voltage swing at the input of the switches, or it can be minimized by placing a cascode transistor on top of the switch transistors [3]. Since the introduction of these cascode transistors (that also have to be switched on/off) does not solve the problem entirely and leads to a higher area consumption and to a distortion of the fully symmetrical operating principle of the basic current cell, recent D/A converter designs opt for the first solution. In some designs the reduced voltage swing can be implemented by the same synchronization circuit that is used to solve the problem described in the previous section. Hence, no extra layout work is necessary.
4.4 The influence of voltage stability errors
If the crossing point of the switch control signals is situated at exactly the value, the following problem will occur : a time interval exists in which both switch transistors are simultaneously in the off-state. Since the current source transistor is still delivering current, the capacitance at its drain node will discharge. At the moment one of the switches starts conducting, an extra amount of current will flow through these transistors as to restore the DC voltage at that node. This will result in a glitch error at the output of the D/A converter leading to a deterioration of the dynamic performance. This problem can be solved by the use of a special switch driver circuit [2]. However, also for this building block a trend exists towards an integration with the synchronization circuit [4,5].
4.5 The influence of the output impedance
As is generally known, the output impedance $R_{imp}$ (fig.4.a) of each current cell has to be made large so that its influence on the INL (integral non-linearity) specification of the D/A converter is negligible. The relation between this output resistance and the achievable INL specification is given by [15]:
$$\mathrm{INL} = \frac{I_{LSB}\,R_L^2\,T^2}{4\,R_{imp}} \qquad (9)$$
with $R_L$ the load resistor, $I_{LSB}$ the LSB current and T the total number of unit current sources. In most cases the cascode configuration of the switch and current source achieves the INL specification. However, this is only true over a limited frequency bandwidth, as can be concluded from the following calculation. Fig.4.a shows the unit current cell of a current-steering D/A converter where the parasitic capacitance is indicated.
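The INL relation can be checked against a direct calculation. The sketch below assumes a single-ended output in which every switched-on cell places its output resistance in parallel with the load, so that the output voltage is $I_{LSB}\,n\,R_L/(1 + nR_L/R_{imp})$ for n cells. Under that assumption the end-point INL in LSB is approximately $T^2 R_L/(4R_{imp})$. The symbol names and the numerical values are illustrative.

```python
def output_voltage(n_on, i_lsb, r_load, r_imp):
    """Output voltage with n_on unit cells on, each adding r_imp in parallel with the load."""
    return i_lsb * n_on * r_load / (1.0 + n_on * r_load / r_imp)

def inl_exact_lsb(total, r_load, r_imp, i_lsb=1.0):
    """Largest deviation from the end-point line, expressed in LSB."""
    v_full = output_voltage(total, i_lsb, r_load, r_imp)
    lsb = v_full / total
    return max(abs(output_voltage(n, i_lsb, r_load, r_imp) - n * lsb) / lsb
               for n in range(total + 1))

def inl_approx_lsb(total, r_load, r_imp):
    """First-order approximation: T^2 * R_L / (4 * R_imp), in LSB."""
    return total ** 2 * r_load / (4.0 * r_imp)

if __name__ == "__main__":
    total, r_load, r_imp = 1023, 25.0, 10e6  # hypothetical 10 bit example
    print("approximate INL:", inl_approx_lsb(total, r_load, r_imp), "LSB")
    print("exact INL:      ", inl_exact_lsb(total, r_load, r_imp), "LSB")
```

The two values agree to well within one percent for this example, and the worst deviation occurs at mid-scale.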
The impedance seen from the output node into the drain of the switch transistor can be calculated (fig.4.b) and equals:
This formula indicates that the impedance has a pole and a zero at the following frequencies:
The possibility to shift this pole and zero to a higher frequency is determined by the flexibility in adjusting the following four parameters: the output resistance of the current source transistor, the output resistance of the switch transistor, the transconductance of the switch and the parasitic capacitance. According to eq.(11) the pole can be shifted towards a higher frequency by minimizing the output resistance of the current source transistor. However, the value of this resistance cannot be freely adjusted since the gate-length L of the current source transistor is dictated by matching considerations [11] and the current through this transistor is determined by the full scale output signal. Since the current through the switches equals the current through the current source transistors and the gate-length L of the switch transistor is chosen to be minimal for speed reasons, nothing can be gained from the output resistance of these transistors. The transconductance is also fully determined, since the gate overdrive voltage of the switches is the result of an optimization between the area occupied by the current sources and the optimum settling time of the D/A converter.
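The frequency behaviour described above can be reproduced with a standard small-signal model. The sketch below is an assumption and not the paper's own expression: it treats the switch as a cascode on top of the current source, with the parasitic capacitance in parallel with the current source output resistance, which gives $Z(s) = r_{sw} + (1 + g_m r_{sw})\,r_{cs}/(1 + s\,r_{cs}C_0)$. All symbol names and element values are hypothetical.

```python
import math

def cell_impedance(freq_hz, r_cs, r_sw, gm_sw, c0):
    """Complex impedance looking into the drain of the switch (assumed cascode model)."""
    s = 2j * math.pi * freq_hz
    z_cs = r_cs / (1.0 + s * r_cs * c0)  # current source output resistance in parallel with C0
    return r_sw + (1.0 + gm_sw * r_sw) * z_cs

def pole_frequency(r_cs, c0):
    """Pole set by the current source output resistance and the parasitic capacitance."""
    return 1.0 / (2.0 * math.pi * r_cs * c0)

def zero_frequency(r_cs, r_sw, gm_sw, c0):
    """Zero of the same model; roughly gm/(2*pi*C0) when gm*r_sw*r_cs dominates."""
    z_dc = r_sw + (1.0 + gm_sw * r_sw) * r_cs
    return z_dc / (2.0 * math.pi * r_sw * r_cs * c0)

if __name__ == "__main__":
    R_CS, R_SW, GM_SW, C0 = 1e6, 50e3, 200e-6, 100e-15  # illustrative values only
    print(f"pole: {pole_frequency(R_CS, C0) / 1e6:.2f} MHz")
    print(f"zero: {zero_frequency(R_CS, R_SW, GM_SW, C0) / 1e6:.1f} MHz")
    for f in (1e5, 1e6, 1e7, 1e8):
        z = abs(cell_impedance(f, R_CS, R_SW, GM_SW, C0))
        print(f"|Z| at {f:.0e} Hz: {z / 1e6:.3f} Mohm")
```

In this model a smaller current source output resistance moves the pole up in frequency, in line with the remark on eq.(11), but it also lowers the low-frequency impedance by the same factor.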
At this point, the frequency dependency of the output impedance has been discussed in detail, but the question remains whether this impedance has a significant effect on the dynamic performance of the D/A converter. In the remainder of this paragraph the value for the required minimal output impedance for a unit current switch will be discussed as a function of the resolution of the D/A converter. It will then become clear that for high resolutions and designs with a large interconnect capacitance the non-linearity introduced by the output impedance severely limits the output signal bandwidth. For the mathematical derivation of the required impedance, the reader is referred to [16]. Here the resulting formulas will be presented and evaluated. The ratio Q gives a value for the SFDR determined by the second order harmonic:
$$Q = \frac{4\,|Z_{imp}|}{N\,R_L} \qquad (12)$$
From this formula the value for the required $|Z_{imp}|$ for a given resolution can be easily determined and equals:
$$|Z_{imp}| = \frac{N\,R_L\,Q}{4} \qquad (13)$$
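The required impedance can be evaluated numerically. The sketch below assumes that the second-harmonic ratio takes the form $Q = 4|Z_{imp}|/(N R_L)$, which is what a parallel-impedance model of a single-ended output gives for a full-scale sine wave. The 25 ohm load is also an assumption, corresponding to a double terminated 50 ohm cable. The function names are illustrative.

```python
def db_to_ratio(db):
    """Convert an amplitude ratio in dB to a linear ratio."""
    return 10 ** (db / 20)

def required_impedance(q_db, n_units, r_load):
    """Minimum unit cell impedance for a second-harmonic ratio of q_db (assumed model)."""
    return n_units * r_load * db_to_ratio(q_db) / 4.0

if __name__ == "__main__":
    R_LOAD = 25.0  # ohm, assumed double terminated 50 ohm cable
    z_12bit = required_impedance(72.0, 4095, R_LOAD)
    print(f"12 bit, Q = 72 dB: |Z| >= {z_12bit / 1e6:.0f} Mohm")
```

Under these assumptions the 12 bit case needs roughly 100 Mohm per unit cell, and this value has to be held up to the highest signal frequency of interest rather than only at DC.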
Eq.(13) is plotted in fig.5 for a D/A converter with a resolution between 8 and 16 bits. For a resolution of 10 bits the required impedance is still relatively easy to implement. However, for a 12 bit current steering DAC the ratio Q has to be at least equal to 72 dB. If the load resistor is a double terminated cable and N equals 4095, the required impedance becomes very large and has to be met over the whole Nyquist frequency range. This is no longer a straightforward design specification since for high speed, high accuracy circuits the effect of the interconnect capacitances on the output impedance can no longer be neglected.
5. LAYOUT ISSUES
Having a good D/A converter in the design phase does not necessarily lead to a good D/A converter at the measurement stage if not enough attention has been paid to the layout of the circuit. Several aspects that are worth mentioning are the following:
The coupling between the digital and the analog part of the chip has to be minimized. This is not only done by using different power supply lines but also by placing guard rings around the analog and the digital part of the chip and by using a separate array for the switches together with their drivers. Another advantage of these separate arrays is that the layout area of a unit cell in the current source array can be minimized. In this way the distances between the transistors are reduced, resulting in improved matching properties.
To reduce the voltage drop in the ground line of the current source transistors, wide supply lines are used. These are drawn on top of the transistors together with the interconnections needed to implement the switching scheme. In this way, a very compact current source array can be realized.
To avoid any edge effects, the current source array has to be expanded with a number of additional rows and columns, as was already mentioned earlier (section 3.2).
Multiple bonding pads are used at the output of the D/A converter so as to lower the inductance of the wire bonding and as a result minimize any ringing effects that could otherwise occur.
Wherever possible, all interconnections have been made identical. In this way, no timing and/or load differences have been introduced.
6. DESIGN EXAMPLE : A 12-BIT CURRENT STEERING CMOS D/A CONVERTER
In this paragraph, a high speed, 12 bit CMOS current-steering D/A converter with a segmented architecture is presented [22]. Fig.6 shows the floorplan of the realised chip. The 5 MSBs are converted in a unary way while the 7 LSBs are converted using the binary approach, where the digital input bits directly control the switches. To minimise any latency problems and to optimise the dynamic performance of the D/A converter, a dummy decoder has been inserted between the inputs and the switch transistors.
Based on the combination of a 99.7% yield specification for the D/A converter and the transistor mismatch equations [11], the dimensions of the unit current source have been determined. Apart from the random matching errors, the systematic errors caused by technological, electrical and temperature gradients over the die have been compensated by the implementation of a special triple centroid switching scheme. Since the first 7 LSBs are implemented in a binary way, the value of the unary current source equals 128 times the LSB current. This unary current source has been split up into 16 current sources with a value of 8 times the LSB current each. The current source array has been divided into 16 squares and the current sources are placed symmetrically around the center of each square as is indicated in fig.7. As a result, any two dimensional symmetrical or graded error is fully compensated. Four additional dummy rows and columns have been added to create identical surroundings for the current sources situated at the edge of the current source array.
The dynamic performance of the D/A converter has been obtained by the use of a well designed synchronised switch driver and by a careful design of the DAC's output impedance so as to minimise any non-linearity caused by its frequency dependent value. To obtain a second order harmonic distortion that is better than 72 dB, the output impedance of the D/A converter has to meet the requirement derived in section 4.5. The chip has been realised in a single-poly five-metal layer standard CMOS technology with a total active area of only 1 mm². Extra attention has been paid to the layout. All measurements are single ended and have been performed with a 3 V analog power supply and a 2.2 V digital power supply. The measured INL error is better than 0.3 LSB, proving the 12-bit accuracy. To give a more complete image of the dynamic performance of the presented 12 bit current steering D/A converter, fig.9 is given. The first part of this figure shows the SFDR as a function of the update rate for a 1 MHz output signal. The SFDR for the 1 MHz output signal remains above 70 dB up to a 700 MS/s update rate for the presented DAC, where previous designs reach this limit for update rates smaller than 300 MS/s [1] and 200 MS/s [5], respectively. The second part of fig.9 shows the SFDR as a function of the output signal frequency for an update rate of 300 MS/s. Figures 8 and 9 clearly show the good static and dynamic performance of the presented DAC.
7. THE FIGURE OF MERIT
To be able to compare the performance of the presented D/A converter with recently presented current-steering D/A converters, a figure of merit is introduced.
Here N is the resolution, P is the power consumption of the D/A converter and the frequency term is the output signal frequency at which the SFDR has dropped by 6 dB (= 1 bit) in comparison with the expected result. For a 12 bit DAC, this is the output signal frequency where the SFDR equals 66 dB. In fig. 10 this figure of merit is plotted versus the inverse of the normalized area. On the same figure, the lines of equal FOM/normalized area ratio are shown. It can be concluded from this figure that the presented 12 bit D/A converter achieves state-of-the-art performance in comparison to recently published 10, 12 and 14-bit D/A converters [1,6,17,18,19,20,21].
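A figure of merit built from these three quantities can be sketched as follows. The exact expression is not recoverable from the text, so the form $\mathrm{FOM} = 2^N f / P$ used here is an assumption that is merely consistent with the stated definitions. The 6 dB-per-bit threshold follows the 12 bit example (66 dB), and the input numbers are hypothetical.

```python
def sfdr_threshold_db(n_bits):
    """SFDR level 6 dB (one bit) below the ideal ~6 dB-per-bit value."""
    return 6 * (n_bits - 1)

def figure_of_merit(n_bits, f_6db_hz, power_w):
    """Assumed form: resolution steps times usable signal frequency per watt."""
    return 2 ** n_bits * f_6db_hz / power_w

if __name__ == "__main__":
    print("12 bit threshold:", sfdr_threshold_db(12), "dB")
    # Hypothetical converter: SFDR drops to the threshold at 100 MHz, 100 mW power.
    print("FOM:", figure_of_merit(12, 100e6, 0.1))
```

With such a definition, a converter gains by holding its SFDR to a higher signal frequency or by spending less power, while an extra bit of resolution only counts if the SFDR actually supports it.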
8. CONCLUSION
Since high resolution current-steering D/A converters are strongly dependent on the matching characteristics of the technology in which they are processed, it is important to know the number of functional chips in a set of fabricated devices. It is shown in the first part of this paper that time consuming Monte Carlo simulations are no longer necessary to obtain results for the INL_yield with a good accuracy. An accurate formula has been presented that directly gives the INL_yield of a current-steering D/A converter as a function of the transistor mismatch parameters of the current sources without any loss of design time. In the second part of this paper the SFDR-bandwidth limitations encountered with high resolution D/A converters have been analyzed. A main fundamental limitation is identified to be the dynamic output impedance of the circuit. The impact of this output impedance on the SFDR has been calculated. Based on this analysis the requirements for the value of the output impedance of each unit current branch have been derived.
The implementation of the presented analysis results has led to an important performance improvement of our recently developed current-steering CMOS D/A converters [20,22].
9. REFERENCES
[1] J. Bastos et al., "A 12-bit Intrinsic Accuracy High-Speed CMOS DAC," IEEE Journal of Solid-State Circuits, Vol. 33, No. 12, pp. 1959-1969, Dec. 1998
[2] H. Kohno, Y. Nakamura et al., "A 350-MS/s 3.3-V 8-bit CMOS D/A Converter Using a Delayed Driving Scheme," Proc. IEEE CICC 1995, pp. 10.5.1-10.5.4
[3] A. Marques, J. Bastos et al., "A 12-bit Accuracy 300 MS/s Update Rate CMOS DAC," Proc. IEEE 1998 Int. Solid-State Circuits Conf. (ISSCC), pp. 216-217, Feb. 1998
[4] N. Van Bavel, "A 325 MHz 3.3V 10-bit CMOS D/A Converter Core with Novel Latching Driver Circuit," Proc. of the IEEE Custom Integrated Circuits Conf. (CICC), pp. 11.6.1-11.6.4, May 1998
[5] A. Van den Bosch et al., "A 12 bit 200MHz Low Glitch CMOS D/A Converter," Proc. IEEE CICC 1998, pp. 11.7.1-11.7.4
[6] C.-H. Lin and K. Bult, "A 10-b, 500-MSample/s CMOS DAC in 0.6 mm²," IEEE Journal of Solid-State Circuits, Vol. 33, No. 12, pp. 1948-1958, Dec. 1998
[7] C. Conroy, W. Lane and M. Moran, "A Comment on 'Characterization and Modeling of Mismatch in MOS Transistors for Precision Analog Design'," IEEE Journal of Solid-State Circuits, Vol. 23, pp. 294-296, Feb. 1988
[8] K. Lakshmikumar et al., "Characterization and Modeling of Mismatch in MOS Transistors for Precision Analog Design," IEEE Journal of Solid-State Circuits, Vol. 21, pp. 1057-1066, Dec. 1986
[9] K. Lakshmikumar et al., "Reply to 'A Comment on: Characterization and Modeling of Mismatch in MOS Transistors for Precision Analog Design'," IEEE Journal of Solid-State Circuits, Vol. 23, pp. 296, Feb. 1988
[10] A. Van den Bosch, M. Steyaert and W. Sansen, "An Accurate Statistical Yield Model for CMOS Current Steering D/A Converters," Proc. IEEE 2000 Int. Symposium on Circuits and Systems (ISCAS), pp. IV.105-IV.108, May 2000
[11] M. J. M. Pelgrom et al., "Matching Properties of MOS Transistors," IEEE Journal of Solid-State Circuits, Vol. SC-24, pp. 1433-1439, Oct. 1989
[12] S. Wong, J. Ting and S. Hsu, "Characterization and Modelling of MOS Mismatch in Analog CMOS Technology," Proc. of the IEEE Int. Conference on Microelectronics Test Structures (ICMTS), pp. 171-176, March 1995
[13] T. Miki, Y. Nakamura et al., "An 80-MHz 8-bit CMOS D/A Converter," IEEE Journal of Solid-State Circuits, Vol. 21, pp. 983-988, Dec. 1986
[14] Y. Nakamura, T. Miki et al., "A 10-b 70-MS/s CMOS D/A Converter," IEEE Journal of Solid-State Circuits, Vol. 26, pp. 637-642, April 1991
[15] B. Razavi, "Principles of Data Conversion System Design," IEEE Press, ISBN 0-7803-1093-4, 1995
[16] A. Van den Bosch, M. Steyaert and W. Sansen, "SFDR-Bandwidth Limitations for High Speed High Resolution Current Steering CMOS D/A Converters," Proc. IEEE 1999 Int. Conf. on Electronics, Circuits and Systems (ICECS), pp. 1193-1196, Sept. 1999
[17] G. Van der Plas et al., "A 14-bit Intrinsic Accuracy Random Walk CMOS DAC," IEEE Journal of Solid-State Circuits, Vol. 34, No. 12, pp. 1708-1718, Dec. 1999
[18] A. Bugeja et al., "A 14b 100Msample/s CMOS DAC Designed for Spectral Performance," IEEE Journal of Solid-State Circuits, Vol. 34, No. 12, pp. 1719-1732, Dec. 1999
[19] A. Bugeja and Bang-Sup Song, "A Self-Trimming 14b 100MS/s CMOS DAC," Proc. IEEE ISSCC, Feb. 2000
[20] A. Van den Bosch et al., "A 10-bit 1GSample/s Nyquist Current-Steering D/A Converter," Proc. of IEEE CICC 2000, pp. 11.6.1-11.6.4, May 2000
[21] K. Khanoyan et al., "A 10b, 400 MS/s Glitch-Free CMOS D/A Converter," Symp. VLSI Circuits Dig. Tech. Papers, paper 8-1, 1999
[22] A. Van den Bosch, M. Borremans et al., "A 12b 500 Msample/s Current-Steering CMOS D/A Converter," IEEE Proc. Int. Solid-State Circuits Conference (ISSCC01), pp. 366-367, Feb. 2001
HIGH SPEED DIGITAL-ANALOG CONVERTERS - THE DYNAMIC LINEARITY CHALLENGE
Alex R. Bugeja
Texas Instruments, Dallas, TX 75243, USA.

ABSTRACT
In this paper we examine the need for high dynamic linearity in high speed digital-analog converters for communications applications, and the challenges facing DAC designers attempting to maximize it. A brief discussion of a DAC designed for high dynamic linearity is then presented, followed by some predictions of future trends.
1. INTRODUCTION
High dynamic linearity is crucial for the digital-analog converters (DACs) used in communications applications, in particular in the transmission paths of modern cellular and wireless LAN basestations. Such DACs typically exhibit significant roll-off of their SFDR performance with increasing input frequency for a given clock rate, introducing spurs in the output spectrum which limit their use in such environments. In this paper we focus on current switched digital-analog converters, which have generally been demonstrated to be the most feasible architecture for high speed operation, are capable of driving resistive loads and passive filters directly without the need for any high speed output buffers, and may also be easily reduced to minimal power consumption designs. This paper first examines the challenges facing designers attempting to maximize dynamic linearity in current mode DACs. A practical case study from the authors’ own research is then presented, followed by extrapolation to some future trends which may be anticipated.
J. H. Huijsing et al. (eds.), Analog Circuit Design, 211-231. © 2002 Kluwer Academic Publishers. Printed in the Netherlands.
2. PRACTICAL DYNAMIC LINEARITY ISSUES
For high speed and high resolution applications (>10 bits, >50MHz), the current source switching architecture is preferred since it can drive a resistive load directly without the need for a voltage buffer. Such architectures can also be reduced to minimal power designs whereby most of the power consumption is actually the signal current [1]. A conventional high performance DAC architecture as used in such applications is shown in Fig. 1. The DAC consists of $2^m-1$ thermometer (linearly) decoded most significant bits (MSBs), $2^u-1$ thermometer decoded upper least significant bits (ULSBs) and $l$ binary decoded lower least significant bits (LLSBs). The current sources, which are implemented differentially, are taken directly to a pair of differential resistive loads. Modern high speed and high resolution DACs all use variations of this basic architecture [1-10]. The ULSB/LLSB array is sometimes driven by a $2^m$th MSB to ensure the sum of the LSBs is one MSB. Also, the ULSBs are sometimes omitted, so that the DAC has an upper array of thermometer decoded bits (MSBs) and a lower array of binary decoded bits (LSBs). Thermometer decoding has the well known advantages of monotonicity and reduction of glitch at major carries, but fully thermometer decoded architectures are impractical to implement for high resolution [3].
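The glitch advantage of thermometer decoding at a major carry can be illustrated by counting how many current sources change state for a one-LSB step. The following is a minimal Python sketch; the 4-bit width and all function names are illustrative assumptions and are not taken from any of the cited designs.

```python
def binary_sources(code: int, bits: int) -> list[int]:
    """On/off state of each binary-weighted source, LSB first."""
    return [(code >> i) & 1 for i in range(bits)]


def thermometer_sources(code: int, bits: int) -> list[int]:
    """On/off state of each of the 2**bits - 1 unit sources."""
    return [1 if i < code else 0 for i in range(2 ** bits - 1)]


def toggles(a: list[int], b: list[int]) -> int:
    """Number of sources whose state differs between two codes."""
    return sum(x != y for x, y in zip(a, b))


BITS = 4                      # illustrative resolution only
MID = 2 ** (BITS - 1)         # major carry: 0111 -> 1000

binary_toggles = toggles(binary_sources(MID - 1, BITS),
                         binary_sources(MID, BITS))
thermo_toggles = toggles(thermometer_sources(MID - 1, BITS),
                         thermometer_sources(MID, BITS))

print(f"binary: {binary_toggles} sources switch, "
      f"thermometer: {thermo_toggles} source switches")
```

In the binary array every source changes state at mid-scale, whereas the thermometer array switches a single unit source, at the cost of needing $2^N-1$ sources and drivers.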
The static performance of such DACs is well characterized by traditional measures such as integral non-linearity (INL) and differential non-linearity (DNL), and various techniques have been used to attain full n-bit static linearity for n-bit DACs. In particular such techniques have included sizing the devices appropriately for intrinsic matching and utilizing certain layout techniques [2, 6, 11], trimming [8, 9], calibration [12, 13], and dynamic element matching/averaging techniques [14]. The dynamic performance of current-switched DACs, however, has not scaled in proportion to the number of their bits. In particular, examination of the references will show dynamic performance as measured by SFDR falling off rapidly with increasing signal frequency. Effectively the larger number of bits only gives lower quantization noise at higher signal frequencies, not higher SFDR. There are several causes for this behavior; the major ones are summarized below:
1. Code-dependent settling time constants: The time constants of the MSBs, ULSBs, and LLSBs are typically not proportional to the currents switched, owing to voltage headroom and parasitic capacitance considerations in the switch devices; the problem is worse if R/2R ladders are employed in place of current dividers [15].
2. Code-dependent switch feedthrough: This results from signal feedthrough across switches which are not sized proportionately to the currents they are carrying, again owing to voltage headroom and parasitic capacitance considerations, and therefore shows up as code-dependent glitches at the output.
3. Timing skew between current sources: Imperfect synchronization of the control signals of the switching transistors will cause dynamic nonlinearities [6]. Synchronization problems occur both because of delays across the die and because of improperly matched switch drivers. Thermometer decoding can actually make the time skew worse because of the larger number of segments [9].
4. Major carry glitch: This glitch, which occurs when switching an MSB in or out of circuit in place of a bank of LSBs, can be minimized by thermometer decoding, but in higher resolution designs where full thermometer decoding is not practical, it cannot be entirely eliminated [3]. Increased thermometer decoding also brings other problems, such as timing skew.
5. Current source switching: Voltage fluctuations occur at the internal switching node at the sources of the switching devices during the switching process. Since the size of the fluctuation is not proportional to the current being switched, and is particularly dependent on second-order nonlinearities arising from the switching device physics, it again gives rise to a nonlinearity proportional in size to the parasitic capacitance at the switching device sources.
6. On-chip passive analog components: Drain/source junction capacitances are nonlinear, and any on-chip analog resistors also exhibit nonlinear voltage transfer characteristics. These devices therefore cause dynamic nonlinearities when they occur in analog signal paths. ESD protection on the output pads typically contributes substantial additional nonlinear parasitic capacitance.
7. Mismatch considerations: Device mismatch is usually considered in discussions of static linearity, but it also contributes to dynamic nonlinearity because switching behavior is dependent on switch transistor parameters such as threshold voltage and oxide thickness. These differ for devices at different points on the die [16], introducing code dependencies in the switching transients.
Of course, any static nonlinearities (in the current-generating transistors) will also show up as dynamic nonlinearities. Dynamic nonlinearities increase in magnitude with increasing signal frequency since the outputs change value more frequently and a larger proportion of the clock cycles is occupied by nonlinear switching transients. This explains the pronounced frequency degradation of SFDR observed for the DACs cited above.
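The frequency dependence described above can be made concrete with a rough Python sketch that computes the average code change per clock period for an ideal quantized full-scale sine. It assumes, purely for illustration, that the amount of switching activity (and hence the switching-related disturbance) grows with the code step; the function name, the 14-bit resolution and the 100MHz clock are assumptions chosen to match the case study later in this paper.

```python
import math


def mean_abs_step(f_in: float, f_clk: float, bits: int = 14, n: int = 4096) -> float:
    """Average absolute code change per clock for a full-scale sine input."""
    full_scale = 2 ** bits - 1
    codes = [
        round(full_scale * 0.5 * (1.0 + math.sin(2.0 * math.pi * f_in * k / f_clk)))
        for k in range(n + 1)
    ]
    return sum(abs(codes[k + 1] - codes[k]) for k in range(n)) / n


F_CLK = 100e6  # assumed clock rate, Hz

for f_in in (1e6, 10e6, 42.5e6):
    step = mean_abs_step(f_in, F_CLK)
    print(f"f_in = {f_in / 1e6:5.1f} MHz: mean |code step| = {step:8.1f} LSB")
```

The average step grows roughly as $\sin(\pi f_{in}/f_{clk})$, so near Nyquist almost every clock edge switches a large fraction of the array.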
Alternatives to the current-mode DAC have been proposed in the literature [17], but they are limited by the use of opamps and/or low impedance followers as output buffers. Opamps introduce several dynamic nonlinearities of their own, owing to their nonlinear transconductance transfer functions (slew limiting in the extreme case). High gain opamps connected in feedback configurations also require buffers to drive lower impedance resistive loads. Buffers introduce further distortion, due to factors such as signal dependence of the bias current in the buffer devices and nonlinear buffer output resistance. One conceptual solution to the dynamic linearity problem is to eliminate the dynamic nonlinearities of the DAC, all of which are associated with the switching and subsequent settling behavior, by placing a track/hold circuit at the DAC output. The track/hold would hold the output constant whilst the switching is occurring, and track only once the current sources have settled to their dc value. Thus only the static characteristics of the DAC would show up at the output, and the dynamic ones would be attenuated or eliminated. The problem with this approach is that the track/hold circuit in practice introduces dynamic nonlinearities of its own which tend to be comparable to or worse than those of the DAC alone. These problems include track-to-hold step, droop rate error, hold mode feedthrough, and track mode errors. A more detailed discussion of these nonlinearity sources is given in [4]. Because of these nonlinearities, track/hold circuits are not commonly used at the outputs of high speed DACs.
3. CASE STUDY – A 14b 100MS/s DAC
A – Introduction
In this section a 14b 100MS/s CMOS DAC designed for both high static and dynamic linearity is briefly presented as a case study. The DAC is composed of a segmented current-source core driving a specialized track/attenuate output circuit. Static linearity of the DAC core is enhanced by means of a calibration technique. The track/attenuate circuit is designed to enhance the dynamic linearity of the outputs. A more detailed discussion is given in [5]. The chip architecture is outlined in Fig. 3. The main DAC is a segmented current source design with 4 most significant bits (MSBs), 5 upper least significant bits (ULSBs), and 5 lower least significant bits (LLSBs). After passing through a 14b wide input latch array, the MSBs and ULSBs are thermometer decoded and used to drive 15 MSB current sources and 31 ULSB current sources respectively. The LLSBs are left in binary format since the LLSB current sources are binary weighted. An additional bank of latches resynchronizes the data prior to the current sources, and is followed by switch drivers/buffers to drive the current source switches. Thermometer decoding of the MSBs makes calibration straightforward whilst thermometer decoding of the ULSBs enhances static linearity and guarantees monotonicity within the ULSB range. The MSBs are calibrated but the ULSBs are not, so that intrinsic matching at the 10b level is required and built into the ULSB circuit by careful layout techniques. A 16th MSB current source is used to drive the ULSB array, and a 32nd ULSB current source drives the LLSB array.
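The 4/5/5 segmentation described above can be sketched in a few lines of Python. This is only a behavioral illustration of the decoding, with hypothetical function names; it says nothing about how the on-chip decoders or latches are actually implemented.

```python
def segment_code(code: int) -> tuple[list[int], list[int], list[int]]:
    """Split a 14-bit code into 15 MSB, 31 ULSB and 5 binary LLSB controls."""
    if not 0 <= code < 2 ** 14:
        raise ValueError("code must fit in 14 bits")
    msb = code >> 10            # upper 4 bits
    ulsb = (code >> 5) & 0x1F   # middle 5 bits
    llsb = code & 0x1F          # lower 5 bits
    msb_therm = [1 if i < msb else 0 for i in range(15)]    # thermometer code
    ulsb_therm = [1 if i < ulsb else 0 for i in range(31)]  # thermometer code
    llsb_bin = [(llsb >> i) & 1 for i in range(5)]          # binary, LSB first
    return msb_therm, ulsb_therm, llsb_bin


def output_in_lsbs(code: int) -> int:
    """Ideal output in LSB units: MSB = 1024 LSB, ULSB = 32 LSB."""
    msb_therm, ulsb_therm, llsb_bin = segment_code(code)
    llsb_value = sum(bit << i for i, bit in enumerate(llsb_bin))
    return 1024 * sum(msb_therm) + 32 * sum(ulsb_therm) + llsb_value


print(output_in_lsbs(0x2ABC) == 0x2ABC)
```

With ideal current sources the summed output reproduces the input code for every one of the 16384 codes, which is a convenient sanity check on the segment weights.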
The self-trimming circuit is composed of a number of measurement resistors and a sigma-delta modulator for accurate dc voltage measurement, a digital correction circuit which includes memory storage of the calibration error corrections, and a 12b calibration DAC (CALDAC) which reads these calibration corrections and converts them to an analog form in such a way that they can be used to trim the MSB current sources. Two measurement resistors convert the summed ULSB current and the MSB current under test, respectively, into voltages which the sigma-delta modulator can measure. The dummy resistors are connected to MSBs not being measured, since only one MSB can be connected to the single measurement resistor at a time. The digital output of the sigma-delta modulator is analyzed in the digital correction circuit, which uses an iterative measurement process to compute appropriate corrections for each of the MSBs so as to change them to the value of the sum of the ULSBs. The detailed design and operation of the self-trimming circuit [5] is beyond the scope of this paper, but the motivation for it is strong in terms of dynamic linearity. Although static linearity can be obtained by means of transistor sizing alone (e.g. [2, 6]), such designs result in considerably larger MSB cells which exhibit larger parasitics. More seriously, owing to the need to spread the MSB cells around the die, and couple them together with metal wires, a large degree of parasitic crosstalk between MSBs results at the switching instants. These factors contribute to significant dynamic linearity degradation. In the design presented here, the MSB cells are small and isolated from each other, thus taking advantage of the calibration circuit to improve dynamic performance.
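The idea of iteratively trimming each MSB towards the summed ULSB current can be sketched as a simple feedback loop. The Python below is not the algorithm of [5]; the trim range, the measurement gain error, the loop structure and all names are assumptions made only to show why several measure-and-correct passes may be needed when the measurement itself is imperfect. Only the 12b CALDAC resolution is taken from the text.

```python
CALDAC_BITS = 12                       # from the text: 12b calibration DAC
TRIM_SPAN = 0.02                       # assumed: CALDAC spans 2 % of one MSB
STEP = TRIM_SPAN / (2 ** CALDAC_BITS - 1)
MID_CODE = 2 ** (CALDAC_BITS - 1)      # mid-scale code = zero correction
MEAS_GAIN = 0.8                        # assumed gain error of the measurement


def trimmed_current(raw: float, code: int) -> float:
    """MSB current (in units of the nominal MSB) after CALDAC correction."""
    return raw + (code - MID_CODE) * STEP


def trim_msb(raw: float, reference: float, max_iter: int = 20) -> int:
    """Iteratively adjust the CALDAC code until the measured error is small."""
    code = MID_CODE
    for _ in range(max_iter):
        # Stand-in for the sigma-delta measurement of (MSB - sum of ULSBs).
        measured = MEAS_GAIN * (trimmed_current(raw, code) - reference)
        if abs(measured) <= STEP / 2:
            break
        code -= round(measured / STEP)
        code = max(0, min(2 ** CALDAC_BITS - 1, code))
    return code


for raw in (0.995, 1.004, 1.009):
    code = trim_msb(raw, reference=1.0)
    residual = trimmed_current(raw, code) - 1.0
    print(f"raw = {raw:.3f}: CALDAC code = {code}, residual = {residual:+.2e}")
```

Because the assumed measurement under-reads the error, a single correction leaves a residue, and the loop converges over a few passes to within about one CALDAC step.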
As also shown in Fig. 3, the current sources are taken to the output of the DAC by means of a current folding stage which folds the total current and makes the n-type DAC current sources capable of driving a switching stage composed of n-type switches. This allows the use of fast n-type switches in both the current sources and the output stage, and reduces the power supply voltage requirement, at the expense of an extra 60mW in power consumption. The folding circuit also includes a feedback loop to regulate the current source outputs, both enhancing their static linearity, and isolating them from the switching waveforms at the outputs which would otherwise disturb their settled currents.
The current folding stage drives a track/attenuate circuit composed of a number of switches which attenuate the current outputs during the first half of the clock cycle while the current sources settle, and track them during the second half of the clock cycle. The design of the switches in this track/attenuate stage is optimized as described in the next sub-section. In a similar way to return-to-zero, the dynamic nonlinearities associated with current source switching are therefore greatly reduced. The track/attenuate stage drives a pair of differential current outputs to which resistive loads of 50Ω or lower ohmic value may be connected as with conventional current mode DACs. The DAC full scale current is 20mA, making for a 2Vp-p differential output signal when the two differential outputs are combined.
B – Track/Attenuate Circuit
The track/attenuate concept is illustrated in Fig. 4. Conventional DAC outputs are full cycle as shown in the top half of the figure, with the DAC output being valid for the whole clock period T, albeit corrupted by dynamic nonlinearities at the switching instant at the start of T. The track/attenuate output stage modifies the output waveform to that shown in the lower half of the figure.
The output is attenuated during the first half of the clock cycle by lowering the effective output impedance through the parallel connection of a low impedance with the output load. Although the output signal on the load, including the dynamic nonlinearities, is not reduced entirely to zero (the low impedance load still has a finite impedance), it is greatly reduced, hence improving the SFDR. During the second half of the clock cycle the low impedance load is removed and the output tracks the DAC output. The effect is similar to return-to-zero (RZ) in that the SFDR is improved (as well as the sin(x)/x rolloff) at the cost of halving the signal power [4]. RZ implementations inherently also increase the effect of clock jitter, but assuming that this is random in nature and not related to the signal source, the effect is only to raise the noise floor and degrade SNR, not SFDR. This is acceptable for communications applications involving the Nyquist baseband being split into several channels, as is typically the case, since no one channel has a high noise floor. A single spur in such a channel due to poor SFDR, on the contrary, would effectively wipe out the channel information and has to be avoided.
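The improvement in sin(x)/x rolloff mentioned above can be checked numerically. The sketch below assumes ideal rectangular output pulses: a full-period hold for the conventional output and an ideal half-period pulse standing in for the track phase. It deliberately reports only the rolloff relative to each waveform's own dc gain; it does not model the finite attenuation of the real track/attenuate stage or the absolute signal level. Function names and the chosen frequencies are illustrative.

```python
import math


def sinc(x: float) -> float:
    """Normalized sinc, sin(pi*x)/(pi*x)."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)


def rolloff_db(f_in: float, f_clk: float, duty: float) -> float:
    """Gain at f_in relative to dc for a rectangular pulse lasting duty*T."""
    return 20.0 * math.log10(abs(sinc(duty * f_in / f_clk)))


F_CLK = 100e6   # clock rate used in the case study, Hz
F_IN = 42.5e6   # output frequency close to Nyquist, Hz

full_cycle = rolloff_db(F_IN, F_CLK, duty=1.0)
half_cycle = rolloff_db(F_IN, F_CLK, duty=0.5)

print(f"full-cycle output rolloff: {full_cycle:.2f} dB")
print(f"half-cycle output rolloff: {half_cycle:.2f} dB")
```

Under these assumptions the half-cycle waveform loses well under 1dB near Nyquist, compared with nearly 3dB for the full-cycle waveform.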
The track/attenuate output stage is shown in Fig. 5 in its differential implementation. It consists of 3 attenuate switches which function as low impedance loads in parallel with the external output load during the first half of the clock cycle, when the ATTEN signal is brought high. Two of the switches are single-ended, each shorting one output to signal ground, whilst the third is differential and shorts the outputs together. The use of all nMOS switches makes the track/attenuate action unipolar, since only a single clock signal (ATTEN) is required to drive the switches. This avoids problems with matching rising and falling clock waveforms. The analysis behind optimizing the performance of the switches will be presented in this section. The folding current sources, not strictly a part of the track/attenuate action, are also shown in Fig. 5, as well as the regulated cascode circuit that keeps the drain of these folding sources and the outputs of the DAC current sources at approximately constant potential as required for correct static linearity. The unity gain bandwidth of the regulated cascode is kept in excess of 600MHz for all values of output current by forcing a fixed dc current component through each side of the differential circuit. This dc component is obtained by excess biasing of the folding sources, and maintains the minimum acceptable bandwidth for settling the regulated nodes, even at the zero DAC current position. Finally the impedance looking into the regulated cascode is made sufficiently large so as to ensure that the DAC current sources are adequately isolated from the switching at midcycle and are not disturbed significantly from their settled position at that point. Special sizing of the switches is carried out to maximize the dynamic linearity performance. Consider first the circuit shown in Fig. 6(a). This shows the differential current outputs of the current mode DAC being sent to output loads on either side.
The track/attenuate circuit in this first case is composed simply of the two MOS switches connected across the loads to ground, and we call this circuit the two-switch circuit for convenience. Also, since the circuit is shown in the attenuate phase we represent the switches, which are operated in linear mode, by their on-resistance as shown. We consider the effects of other parameters such as channel charge and threshold voltage shortly. During the track phase the switches are turned off and the DAC currents flow solely to the loads. We are concerned with the attenuation introduced in the DAC signal by the switches during the attenuate phase as compared to the track phase. For maximum dynamic linearity this attenuation should be as large as possible. We define the attenuation factor, or AF, as the resistance during the attenuate phase for the differential output divided by the load resistance. For the two-switch circuit, with $R_{sw2}$ the on-resistance of each switch and $R_L$ the load resistance on each side, $AF_2 = R_{sw2}/(R_{sw2}+R_L)$. For $R_{sw2} \ll R_L$ we get that $AF_2 \approx R_{sw2}/R_L$, so that for an attenuation of 50 times we require that $R_{sw2} \approx R_L/50$. Consider now the single switch circuit shown in Fig. 6(b). With $R_{sw1}$ the on-resistance of the single switch, it can be shown that $AF_1 = R_{sw1}/(R_{sw1}+2R_L)$ and that $AF_1 \approx R_{sw1}/(2R_L)$ for $R_{sw1} \ll R_L$. Comparing the 2-switch scheme and the 1-switch scheme, furthermore, we see that if we consider the same total amount of MOS switch size, we can make the MOS switch twice as large in the single switch scheme (without increasing the clock driver load necessary to drive the switches). We therefore get that $R_{sw1} = R_{sw2}/2$ and hence $AF_1 \approx AF_2/4$. On the basis of attenuation factor alone, the choice is clearly in favor of the single switch scheme over the two-switch scheme, although so far factors such as charge injection and threshold voltage, which degrade the intrinsic linearity of the output stage, have not yet been compared. Consider now the three-switch scheme formed by combining the single and two-switch schemes as shown in the circuit of Fig. 6(c). During the attenuate phase the two outputs are connected together by means of the differential switch (resistance $R_d$) and also each connected to ground by means of the single-ended switches (resistance $R_s$). The attenuation factor is now given by $AF_3 \approx R_{eq}/(2R_L)$, where $R_{eq} = R_d \parallel 2R_s$. We have already seen that for the same total switch size $AF_1 \approx AF_2/4$. In the three-switch scheme, for the same total switch size, we have allocated some portion of the MOS switch size to the single-ended switches and some portion to the differential switch, so that $AF_1 < AF_3 < AF_2$. Based on this simple analysis alone, there is no motivation to use anything but the single switch scheme, to minimize the attenuation factor. This simple analysis however ignores the effect on differential switch resistance of adding the single-ended switches as in the three-switch scheme. In particular, this addition will result in the common-mode voltage of the output nodes being reduced from its value in the single switch scheme to a much lower value in the three-switch scheme. Since we design $R_s \ll R_L$, the common-mode voltage is close to zero during the attenuate phase. In a p-well CMOS process as used here, and for common mode voltages of around 0.5-1V in the single-switch scheme, as also resulting from the DAC output currents in this design, this reduction in the common mode voltage increases the gate drive of the differential switch by 0.5-1V, decreases the threshold voltage because of the lower body-effect, and hence reduces the switch resistance by a factor of approximately 1/3. When we factor this into the analysis we get that the attenuation factor of the 3-switch scheme is approximately the same as that of the single switch scheme for the same total switch size. In general if W was the original single differential switch width corresponding to $AF_1$, and we split the switch to obtain a three-switch scheme by keeping kW width in the differential switch and creating two single-ended switches of width lW each (such that k + 2l = 1), we then obtain $AF_3$ as a function of k and l. This expression is valid so long as enough switching capacity is allocated to the single-ended switches to obtain the 1/3 improvement in differential switch resistance as described above; in practice this is satisfied so long as l is not very small. We will examine optimal allocation shortly, and quantify how small l can be. For k=0.5 and l=0.25, we get $AF_3 \approx AF_1$ again.
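The comparison of the three switch arrangements can be reproduced with elementary resistor arithmetic. The Python below is a sketch under stated assumptions: it applies the definition of AF given above (differential resistance during attenuation divided by the differential load resistance), uses an assumed 50Ω load and an assumed 1Ω reference switch, and models the common-mode benefit in the three-switch case as a flat 2/3 scaling of the on-resistance of all three switches. That last modelling choice, and every name and number in the code, is an assumption of this sketch rather than something taken from the paper.

```python
def parallel(*resistances: float) -> float:
    """Equivalent resistance of resistors connected in parallel."""
    return 1.0 / sum(1.0 / r for r in resistances)


def af_two_switch(r_sw: float, r_load: float) -> float:
    # Each output: one switch to ground in parallel with its load.
    return parallel(r_sw, r_load) / r_load


def af_single_switch(r_sw: float, r_load: float) -> float:
    # One switch across the outputs, in parallel with the 2*r_load path.
    return parallel(r_sw, 2.0 * r_load) / (2.0 * r_load)


def af_three_switch(r_diff: float, r_single: float, r_load: float) -> float:
    # Differential switch plus a grounded switch on each output.
    return parallel(r_diff, 2.0 * r_single, 2.0 * r_load) / (2.0 * r_load)


R_LOAD = 50.0      # assumed load per side, ohm
R_SW2 = 1.0        # assumed on-resistance of each switch in the two-switch case
R_SW1 = R_SW2 / 2  # same total width, so the single switch is twice as wide
CM_FACTOR = 2 / 3  # assumed resistance scaling from the lower common mode

K, L = 0.5, 0.25   # width split, with K + 2*L = 1

af2 = af_two_switch(R_SW2, R_LOAD)
af1 = af_single_switch(R_SW1, R_LOAD)
af3 = af_three_switch(CM_FACTOR * R_SW1 / K, CM_FACTOR * R_SW1 / L, R_LOAD)

print(f"AF2 = {af2:.5f}, AF1 = {af1:.5f}, AF3 = {af3:.5f}")
print(f"AF2/AF1 = {af2 / af1:.2f}, AF3/AF1 = {af3 / af1:.2f}")
```

With these assumptions the single-switch scheme attenuates about four times better than the two-switch scheme, and the k=0.5, l=0.25 three-switch split comes out within a few percent of the single-switch value.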
This being the case, AF alone narrows down the selection to either the single switch or the three-switch scheme but is insufficient to choose between the two. To decide between these schemes we now consider other factors, in particular charge injection and threshold voltage effect, which introduce nonlinearities in the output stage dynamic performance. The first order models for the threshold voltage $V_T$, the channel charge of the switches in linear region, and the switch resistance in linear region are used here since they are fairly accurate for large switches. From these models, comparing the 2-switch scheme with the single switch scheme, we observe that the former will have superior channel charge injection and switch resistance characteristics from the linearity viewpoint. In the two-switch scheme, the switches have their source node grounded. To first order, therefore, the channel charge and the switch resistance remain constant, since the gate-source voltage $V_{GS}$ is a constant dependent only on the clock waveform voltage and $V_T$ is constant and signal independent because there is no signal on the source node. Therefore the charge injection when the switches are turned off, as well as the charge uptake when they are turned on (significant in a current-limited DAC output), are both constant, and the switch resistance is also constant. In the single switch case, however, there is a signal component on both the switch nodes. The channel charge is therefore signal-dependent, as is the threshold voltage due to the backgate effect, thus reducing the linearity of the output stage. It therefore makes sense to allocate as much switch size as possible to the single-ended switches instead of the differential switch, so long as the attenuation is not degraded significantly. This suggests an optimization process. A computer program was written for this purpose; this program tracks the switch resistance of the differential switch as k is reduced and l is increased and calculates AF for each position accordingly.
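A sweep of this kind can be sketched with textbook first-order MOS expressions: a threshold voltage with body effect, and a linear-region on-resistance and channel charge proportional to the gate overdrive. The Python below is not the author's program. Every device value (threshold, body-effect coefficient, gate voltage, transconductance parameter, common-mode current) is an assumption picked only so that the numbers land in a plausible range, and the drain-source voltage is neglected.

```python
import math

# All device values are illustrative assumptions, not data from the chip.
VT0 = 0.7        # zero-bias threshold voltage, V
GAMMA = 0.5      # body-effect coefficient, V**0.5
PHI = 0.7        # surface potential term 2*phi_F, V
V_GATE = 3.3     # gate voltage while ATTEN is high, V
BETA_W = 1.2     # mu*Cox*W/L for a switch using the full width W, A/V^2
R_LOAD = 50.0    # load resistance per side, ohm
I_CM = 0.015     # common-mode current per side during attenuation, A


def threshold(v_sb: float) -> float:
    """First-order threshold voltage including the body effect."""
    return VT0 + GAMMA * (math.sqrt(PHI + v_sb) - math.sqrt(PHI))


def on_resistance(beta: float, v_s: float) -> float:
    """Linear-region resistance of an nMOS switch with its source at v_s."""
    return 1.0 / (beta * (V_GATE - v_s - threshold(v_s)))


def channel_charge(c_gate: float, v_s: float) -> float:
    """Linear-region channel charge magnitude for gate capacitance c_gate."""
    return c_gate * (V_GATE - v_s - threshold(v_s))


def parallel(*resistances: float) -> float:
    return 1.0 / sum(1.0 / r for r in resistances)


def attenuation_factor(l_frac: float) -> float:
    """AF for single-ended width l_frac*W and differential width (1-2*l_frac)*W."""
    k_frac = 1.0 - 2.0 * l_frac
    r_single = on_resistance(BETA_W * l_frac, 0.0) if l_frac > 0 else math.inf
    v_cm = I_CM * parallel(R_LOAD, r_single)       # common-mode output voltage
    r_diff = on_resistance(BETA_W * k_frac, v_cm)  # switch rides on v_cm
    return parallel(r_diff, 2.0 * r_single, 2.0 * R_LOAD) / (2.0 * R_LOAD)


for l_frac in (0.0, 0.05, 0.15, 0.25, 0.35):
    print(f"l = {l_frac:.2f}, k = {1 - 2 * l_frac:.2f}, "
          f"AF = {attenuation_factor(l_frac):.5f}")
```

In this model the grounded switches always see a zero source voltage, so their resistance and channel charge are signal independent, while the differential switch resistance depends on the voltage it rides on. With the assumed values, moving half of the width to the single-ended switches (l=0.25) leaves AF within a few percent of the single-switch value.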
Based on this optimization, the current-switching output stage implemented for this chip was a track/attenuate three-switch circuit as shown in Fig. 5 with k=0.5, i.e. the differential switch size is twice as large as the single-ended switch size.
C – Measurement Results
Fig. 7 shows a die photo of the fabricated chip. The die occupies an area of 3.44mm x 3.44mm in a CMOS process. The main DAC occupies the central third of the die, and is composed of the MSBs, ULSBs, and LLSBs current sources and their associated latches, buffers, and bias circuitry. Their current outputs are collected and taken to the folding sources and output stage on the right side of the die. The self-trimming circuitry, composed of the sigma-delta modulator circuit, the calibration DAC, and the digital calibration logic is shown on the left side of the die. A summary of the measured chip characteristics is given in table form in Table 1. It can be seen that after calibration the INL and DNL are within the 14b specification as designed for. The dynamic linearity at the design clock rate of 100MHz is around 6dB higher at frequencies close to Nyquist (42.5MHz) than similar DACs without a track/attenuate output stage. The effectiveness of the circuit however falls off for higher clock rates where nonlinearities due to the current mode DAC driving the circuit can no longer be expected to settle completely by mid-cycle.
5. FUTURE TRENDS
The importance of the communications DAC market is such that interest in this area of development can only be expected to grow significantly in the next few years. The ultimate goal, so far unrealizable, is the full software radio with the data converter being the only component between the antenna and the digital signal processing circuitry. Certainly, new circuits and architectures will be needed to meet even subsets of this challenge. Some trends can be predicted with what the author hopes is reasonable accuracy:
(1) Basic DAC cores will move towards more thermometer decoding as experience in dealing with the practical layout complexities grows. As the least significant bits are pushed into lower significance compared to the most significant ones, any dynamic mismatches between the two have a reduced impact on the overall DAC linearity.
(2) Dynamic Element Matching (DEM), currently mostly used only in unit element (full thermometer) implementations in multibit sigma delta feedback loops, will move into a position of greater importance in segmented communications DAC implementations. DEM cannot correct for MSB-LSB mismatch, but thermometer coding of more MSBs as in (1) makes this less important. The advantage of DEM is that it averages out dynamic mismatches between switches and parasitic capacitances in current sources, besides static mismatches, again improving dynamic linearity.
(3) Calibration will continue to be used to correct for static mismatches because of its advantages over intrinsic transistor-based matching in terms of lower parasitics and MSB crosstalk, and its greater degree of process independence. Calibration remains advantageous even in a DEM environment, where it reduces the random white noise floor otherwise introduced by the DEM process due to static mismatches. New current measurement techniques such as the use of accurate sigma delta modulators open up new calibration possibilities in terms of measurement accuracy obtainable. As regards current correction, the implementation convenience of gate-charge storage indicates that such methods will retain preference over correction DACs tied to the outputs of the main DAC. Such correction DACs are severely dynamically mismatched to the main DAC and are thus unsuitable in communications applications.
(4) The use of output stages correcting in some way the current outputs of DAC cores remains a possibility in areas where the higher power consumption and complexity can be afforded; such methods however inherently carry the disadvantages of increased noise due to clock jitter and a drop in performance at higher clock rates.
(5) On-chip isolation of signals, both analog-analog and analog-digital, will have to be emphasized for better dynamic linearity. Analog-analog isolation is particularly important in the MSB array and can be addressed by calibration, increased layers of metal providing extra shielding, higher resistivity substrates, etc. Analog-digital isolation will likely be addressed by techniques such as higher resistivity substrates, custom-specific digital coding schemes, and differential digital encoding close to the MSBs. Digital input signals are correlated to the output signal and can produce harmonic distortion besides noise if care is not exercised.
(6) From the process standpoint, BiCMOS and pure CMOS implementations appear to be the communications DAC technologies of choice for the future. CMOS obviously has strong cost considerations driving it, whereas BiCMOS offers the possibility of retaining CMOS DAC cores unchanged to a large extent, but exchanging the CMOS switches for bipolar ones to increase the switching speeds.
(7) Packaging and board design will become increasingly important. Low inductance packages which offer little “voltage kickback” will be necessary in fast current mode DACs where full scale current changes can push the current sources into low impedance ranges of operation and thus adversely impact dynamic linearity. Fortunately modern BGA packages and chip-on-board (COB) implementations are now starting to approach the sub-nH/pin specification. From the board standpoint, the greatest challenge will likely remain that of isolating the digital inputs and clocks from the analog outputs; it appears that new driver schemes such as LVDS will become helpful here.
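Trend (2) can be illustrated with a small numerical experiment on static mismatch only. The Python sketch below uses purely random element selection, which is just one possible form of DEM; the 15-element array, the 1% mismatch, the seed and all names are assumptions made for illustration, and dynamic mismatch is not modelled.

```python
import random


def dem_select(code: int, n_elements: int, rng: random.Random) -> list[int]:
    """Randomly choose which `code` of the n unit elements are switched on."""
    return rng.sample(range(n_elements), code)


rng = random.Random(1)   # fixed seed so the sketch is repeatable
N_ELEMENTS = 15          # assumed unit-element (thermometer) array size
SIGMA = 0.01             # assumed 1 % static mismatch per unit element

elements = [1.0 + rng.gauss(0.0, SIGMA) for _ in range(N_ELEMENTS)]
mean_element = sum(elements) / N_ELEMENTS

CODE = 7
ideal = CODE * mean_element                  # gain-corrected ideal output

# Conventional thermometer decoding always uses the same first elements,
# so its mismatch error is a fixed, code-dependent (distortion-like) term.
fixed_error = sum(elements[:CODE]) - ideal

# With DEM the same code uses a different element subset on each sample.
samples = [
    sum(elements[i] for i in dem_select(CODE, N_ELEMENTS, rng))
    for _ in range(20000)
]
dem_mean_error = sum(samples) / len(samples) - ideal
dem_rms = (sum((s - ideal) ** 2 for s in samples) / len(samples)) ** 0.5

print(f"fixed selection error : {fixed_error:+.5f}")
print(f"DEM mean error        : {dem_mean_error:+.5f}")
print(f"DEM rms error (noise) : {dem_rms:.5f}")
```

The average DEM error tends towards zero while the sample-to-sample scatter remains, which is the sense in which DEM trades a static, code-dependent error for a raised noise floor; calibration, as noted in (3), reduces that scatter at its source.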
6. CONCLUSIONS
The sources of dynamic nonlinearities in high speed and high resolution DACs as required for modern communications applications have been summarized. A case study of a DAC which uses a special track/attenuate stage to improve dynamic performance has been reviewed. The communications market ensures that the requirements placed on the DAC component will continue to increase in the coming years. Future trends which the author expects will be visible in this area over the next few years have been presented.

7. REFERENCES
[1] M.P. Tiilikainen, “A 1.8V 20mW 14b 100MS/s CMOS DAC”, Proceedings of the European Solid State Circuits Conference, June 2000.
[2] G. Van der Plas et al., “A 14-bit Intrinsic Accuracy Random Walk CMOS DAC”, IEEE Journal of Solid-State Circuits, vol. 34, pp. 1708-1718, Dec. 1999.
[3] C. Lin and K. Bult, “A 10bit 500Ms/s CMOS DAC in 0.6 mm²”, IEEE Journal of Solid-State Circuits, vol. 33, pp. 1948-1958, Dec. 1998.
[4] A.R. Bugeja et al., “A 14-b 100-MS/s CMOS DAC Designed for Spectral Performance”, IEEE Journal of Solid-State Circuits, vol. 34, pp. 1719-1732, Dec. 1999.
[5] A.R. Bugeja and B.-S. Song, “A Self-Trimming 14-b 100-MS/s CMOS DAC”, IEEE Journal of Solid-State Circuits, vol. 35, pp. 1841-1852, Dec. 2000.
[6] J. Bastos et al., “A 12bit Intrinsic Accuracy High Speed CMOS DAC”, IEEE Journal of Solid-State Circuits, vol. 33, pp. 1959-1969, Dec. 1998.
[7] A. Van den Bosch et al., “A 10bit 1Gsample/s Nyquist Current Steering CMOS D/A Converter”, Proceedings of the IEEE 2000 Custom Integrated Circuits Conference, pp. 265-268.
[8] B. Tesch and J. Garcia, “A Low Glitch 14bit 100MHz D/A Converter”, IEEE Journal of Solid-State Circuits, vol. 32, pp. 1465-1469, Sept. 1997.
[9] D. Mercer, “A 16bit D/A Converter with Increased Spurious Free Dynamic Range”, IEEE Journal of Solid-State Circuits, vol. 29, pp. 1180-1185, Oct. 1994.
[10] D. Mercer and L. Singer, “12bit 125Ms/s CMOS D/A Designed for Spectral Performance”, International Symposium on Low Power Electronics and Design, pp. 243-246, 1996.
[11] G. Van der Plas et al., “Systematic Design of a 14b 150MS/s CMOS Current Steering D/A Converter”, Proceedings of the 2000 Design Automation Conference, pp. 452-457.
[12] R. Hester et al., “CODEC for Echo-Canceling, Full-Rate ADSL Modems”, ISSCC Digest of Technical Papers, pp. 242-243, 1999.
[13] D. Groeneveld et al., “A Self-Calibration Technique for Monolithic High-Resolution D/A Converters”, IEEE Journal of Solid-State Circuits, vol. 24, pp. 1517-1522, Dec. 1989.
[14] M. Moyal et al., “A 25kft 768kb/s CMOS Transceiver for Multiple Bit-Rate DSL”, ISSCC Digest of Technical Papers, pp. 244-245, 1999.
[15] P. Hendriks, “Specifying Communications DACs”, IEEE Spectrum, vol. 34, pp. 58-69, July 1997.
[16] M. Pelgrom et al., “Matching Properties of MOS Transistors”, IEEE Journal of Solid-State Circuits, vol. 24, pp. 1433-1440, Oct. 1989.
[17] K. Khanoyan et al., “A 10b, 400MS/s Glitch-Free CMOS D/A Converter”, 1999 Symposium on VLSI Circuits, Digest of Technical Papers.
A 400-MHz, 10-bit Charge Domain CMOS D/A Converter for Low-Spurious Frequency Synthesis
K. Khanoyan, F. Behbahani, and A. A. Abidi
Electrical Engineering Department
University of California
Los Angeles, CA 90095-1594
J. H. Huijsing et al. (eds.), Analog Circuit Design, 233-246. © 2002 Kluwer Academic Publishers. Printed in the Netherlands.
Introduction
Modern integrated wireless transceivers increasingly use digital circuits in critical building blocks. One example is the direct digital synthesis of sinewaves in a frequency-agile transmitter [1]. A discrete-time sinewave and its quadrature phase are generated as a sequence of digital words by table lookup in a ROM. The accumulation rate, programmed by a control word, on the left of the block diagram, sets the sinewave frequency. The frequency can be instantly changed to any arbitrary value. Two D/A converters (DACs) convert the output words into discrete-time analog waveforms. These DACs must be high-speed, compact, and, most importantly for communication systems, free of dynamic nonlinearity. Such a DAC is the subject of this paper [2].
Sources of Dynamic Distortion
It is widely believed that the current-steering DAC is the only feasible circuit for operation at 100’s of MHz. This DAC is fast because the input data, after being latched, is merely required to steer an array of binary-weighted currents into a differential line, where the currents sum to form the analog output. A binary-weighted DAC (Figure 1(a)) needs only N latches to convert an N-bit word. However, its DC linearity is limited by the accuracy of the most significant current source relative to the sum of all other current sources. Segmenting the current source array into units of independently switched least significant currents (Figure 1(b)) greatly relaxes the accuracy required of the individual current source. This arrangement now needs 2^N - 1 latches to convert an N-bit word, and the binary input word must be expanded into a thermometer code to drive these latches. In practice, the explosion in the number of latches limits use of the segmented DAC to only a few bits. To satisfy DC accuracy, most high-resolution DACs segment the upper few bits and binary weight the remaining lower bits.
Dynamic accuracy is, however, another matter. The problem stems from the fact that a clock edge must latch the data word at every current switch. It is fundamentally impossible to switch an array of current sources simultaneously. In practice, because of distributed RC delay in the clock lines, the clock edge arrives at some current cells a few picoseconds later than at others (Figure 2(a)). As a result, the DAC output current momentarily overshoots or undershoots its final value. This current glitch is worst at the mid-scale transition. An actual glitch waveform is shown in Figure 2(b). What is worse, the glitch dynamics are code dependent, and are largely unaffected by efforts to improve the DAC’s static accuracy.
The waveform of the discrete-time sinewave synthesized by the DAC now contains code-dependent glitches, as shown in Figure 2(c). Departure from a linear setting constitutes dynamic distortion. When the synthesized sinewave frequency exceeds half Nyquist, the jumps between successive samples are larger and so are the glitches. As the clock rate rises, the glitch transient occupies a larger fraction of the clock period. Thus, the worst-case distortion arises when synthesizing sinewaves above half Nyquist at high clock rates. Figure 2(d) shows the spectrum of a commercial 10b DAC as it synthesizes a 65 MHz sinewave at 2/3 Nyquist. The glitch-produced harmonics are aliased in-band, and here the largest harmonic is only 45 dB below the fundamental tone. This is unacceptable for most wireless applications, which require SFDR of more than 60 dB.
Charge Domain D/A Conversion
An entirely different approach to D/A conversion is by charge redistribution [3,4] (Figure 3(a)).
Depending on the data bit, a capacitor is precharged either to full scale or to zero, and on the next clock phase an equal capacitor bisects the charge by redistribution. A three-phase clock forces charge to flow from left to right. Progressing to the right, each capacitor is precharged by increasingly significant bits of the input word. Therefore, charge introduced on later stages is bisected fewer times. This means that charge arriving on the last stage represents the required binary D/A conversion. Because charge is naturally sampled and held at each stage before being passed to the
next stage, this DAC is glitch free. Furthermore, the operation may be pipelined so that a new conversion completes every clock period. The simple circuit shown here is a binary-weighted DAC. We reported a 100 MHz prototype of this DAC at ISSCC ’94 [4]. This DAC can be segmented by replacing the series pipeline with a parallel set of equal capacitors switched into a summing node.
The charge output at the last, most-significant stage of this DAC must be buffered to voltage to drive the later circuits. An op amp-based switched-capacitor amplifier is used for this purpose (Figure 3(b)). The D/A conversion capacitors are actually two quasi-differential arrays, which are differentially sampled by the balanced amplifier. This op amp is intended to drive an on-chip capacitor load. Over two of the three clock phases it acquires and holds the DAC output charge, and it resets during the third clock phase. Reset after every sample means that the amplifier dynamics, and therefore dynamic distortion, remain the same whether the DAC produces a DC output or a high-frequency sinewave. Uniform behaviour under all conditions is desirable.
Let us now turn to practical aspects of realizing 10b accuracy in a charge-redistribution DAC. Clearly, matching between unit capacitors limits achievable accuracy. Conversion involves precharge to one of two possible values, followed by charge bisection. Non-zero voltage coefficient on two matched capacitors means that charge bisects, but voltages do not (Figure 4(a)). However, voltage coefficient is unimportant in the core of the DAC because conversion takes place in the charge domain. When the converted charge is buffered as voltage at the output, voltage coefficient in the feedback capacitor of the op amp distorts the output voltage (Figure 4(b)). All capacitors in this circuit are poly over thin oxide over heavy diffusion. The voltage coefficient is 0.1 %/V.
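As an illustration of the charge-bisection principle described above, the following minimal sketch models the conversion serially in Python. It is an idealized behavioural model, not the authors' circuit: the function name `charge_redistribution_dac`, the `caps` argument, and the normalised 1.0 V reference are assumptions made for illustration, and switch parasitics, the three-phase clocking, pipelining, and the output buffer are all ignored.

```python
def charge_redistribution_dac(code, n_bits=10, v_ref=1.0, caps=None):
    """Idealized serial model of a charge-redistribution pipeline DAC.

    caps[0] is the first holding capacitor (initially discharged);
    caps[i + 1] is the capacitor precharged by bit i (bit 0 = LSB).
    All values are hypothetical and only their ratios matter.
    """
    if not 0 <= code < 2 ** n_bits:
        raise ValueError("code out of range")
    if caps is None:
        caps = [1.0] * (n_bits + 1)       # perfectly matched unit capacitors
    if len(caps) != n_bits + 1:
        raise ValueError("need n_bits + 1 capacitor values")

    v = 0.0                # voltage on the capacitor carrying the running result
    c_prev = caps[0]
    for i in range(n_bits):               # LSB first, MSB last
        bit = (code >> i) & 1
        c_i = caps[i + 1]
        v_pre = v_ref if bit else 0.0     # precharge to full scale or to zero
        # Shorting the two capacitors conserves total charge;
        # with equal capacitors the charge is bisected.
        v = (c_prev * v + c_i * v_pre) / (c_prev + c_i)
        c_prev = c_i
    return v


if __name__ == "__main__":
    for code in (0, 1, 512, 1023):
        print(code, charge_redistribution_dac(code))
```

With equal capacitors the result equals `v_ref * code / 2**n_bits`, because charge from bit i is bisected once for each remaining stage. Passing slightly unequal values in `caps` gives a rough feel for how capacitor mismatch perturbs the transfer curve.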
For –60 dB THD, this voltage coefficient limits the maximum voltage swing to 0.5 V, which is now the full-scale output. Unit capacitors of 0.5 pF guarantee an RMS spread of < 0.1% [5] and noise < ½ LSB at 10b.
Accuracy Issues in Charge Domain DAC
Let’s take a closer look at one cell in the DAC core (Figure 5(a)). At every node there is a stray capacitance to ground. As long as the cell capacitance, including the stray, matches well at every node, DAC
accuracy is preserved. However, stray capacitance between cells is troublesome. For example, while node n1 precharges to a reference voltage and node n2 bisects charge with the cell to its right, charge leaks through the inter-cell stray into node n2 and corrupts the bisected charge. The resulting distortion worsens as the frequency of a synthesized sinewave approaches Nyquist. This stray originates mainly in the fringing capacitance across the inter-cell switch. The photomicrograph and graphic in Figure 5(b) show an unconventional layout of the switch FET to alleviate this. Metal contacts opposite halves of the source and drain diffusions, lowering fringing capacitance between the metal sidewalls to an estimated 0.2 fF, which is less than 0.1% of the unit DAC capacitor.
To first order, charge injection by the switches does not contribute distortion. This is because switches connected to the reference voltages always turn off at the same voltage, independent of the eventual sample value, and therefore inject the same signal-independent charge. The switch shorting adjacent cells, on the other hand, connects to capacitors only, so whatever charge it injects through its inversion layer at turn-on is almost all removed at turn-off. There is a second-order error because the switch FET’s source and drain voltages are unequal at the onset of turn-on but equal at turn-off.
Compared to the previously reported prototype, this DAC improves on the precharge logic as well. Instead of using two pass gates in series (Figure 6(a)) to select the precharge reference voltage and then enable precharge in a particular clock phase, the output of separate AND gates now drives a single pass transistor (Figure 6(b)). This lowers the RC time constant at precharge to about 65 ps, guaranteeing 10τ settling at 400 MHz. It also eliminates a troublesome interline stray capacitance in the previous design, which coupled clocks into the DAC cell.
The reference voltage to precharge the DAC cells is provided from off-chip.
The parasitic LC network formed by the bondwire and package inductance and the cell capacitance is in fact underdamped by the very small ON resistance of the precharge switch. Simulations show that at 300 MHz clock rate the cell voltage can be in error by 5-10 LSBs because of ringing (Figure 7(a)). After considering several methods to damp the ringing, a 500-pF on-chip decoupling capacitor was found to work best at this high clock rate. In place of the external voltage source, this now delivers charge to the DAC unit cells.
The reference voltages are differential, so the charge flowing through the decoupling capacitor circulates entirely on chip. The decoupling capacitor is built into vacant parts of the chip. Connecting the external reference through two bondwires and package pins also halves the series inductance. Simulations show that the settling error is now lower than 1 LSB at the end of the precharge phase (Figure 7(b)).
At these high conversion rates, clock skews are also a concern. The waveforms in Figure 8(a) illustrate skew between the timing of the latch carrying data to the DAC and the three-phase clock, which samples this data into the DAC cell. One clock phase straddles a clock transition in the latch, which means that the DAC cells that precharge on this phase may be corrupted. As shown in Figure 8(b), delaying the pipeline register clocks to synchronize them to the rising edge of the appropriate clock phase eliminates the skew. The waveforms at the bottom show that this guarantees a safety margin between the conclusion of DAC cell precharge and the update in the latch contents.
In this application, the op amp used in the output buffer must be fast, and should not slew-rate limit; otherwise the discrete-time output waveform will distort. A standard single-cascode op amp is used (Figure 9(a)). The gate overdrive voltage chosen for the input stage FETs ensures that the 0.5 V ptp differential voltage applied to the input stage does not drive it into slew-rate limiting. The output swing of the DAC is also 0.5 V ptp. The continuous-time common-mode feedback circuit (Figure 9(b)) is designed to operate with this signal swing. The plot in Figure 9(c) shows the decay in time of the op amp differential input voltage on a logarithmic axis. The dashed lines correspond to perfect exponential settling with different time constants in different clock phases. The 50 dB DC gain determines the steady-state error. This graph shows that the op amp settling is close to a piecewise exponential, which means low dynamic distortion at the DAC output.
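Several of the figures quoted in the preceding sections can be cross-checked with a few lines of arithmetic. The Python sketch below does so under assumptions that the text does not state: a temperature of 300 K, a single kT/C sampling event on one 0.5 pF unit capacitor, and three clock phases that each last one third of the 400 MHz clock period. All function and variable names are illustrative.

```python
import math

K_BOLTZMANN = 1.380649e-23    # J/K


def ktc_noise_vrms(c_farads, temp_kelvin=300.0):
    """RMS thermal noise voltage sampled onto a capacitor, sqrt(kT/C)."""
    return math.sqrt(K_BOLTZMANN * temp_kelvin / c_farads)


def lsb_volts(full_scale_volts, n_bits):
    """Size of one LSB for a given full-scale range and resolution."""
    return full_scale_volts / 2 ** n_bits


def settling_error(n_tau):
    """Residual relative error after n_tau time constants of RC settling."""
    return math.exp(-n_tau)


# Values quoted in the text.
c_unit = 0.5e-12            # unit capacitor, F
full_scale = 0.5            # full-scale output, V
n_bits = 10
c_fringe = 0.2e-15          # estimated inter-cell fringing capacitance, F
tau = 65e-12                # precharge RC time constant, s
f_clk = 400e6               # clock rate, Hz

half_lsb = 0.5 * lsb_volts(full_scale, n_bits)
noise = ktc_noise_vrms(c_unit)
fringe_ratio = c_fringe / c_unit
phase_time = 1.0 / (3 * f_clk)     # assumption: three equal clock phases

print(f"kT/C noise      : {noise * 1e6:.0f} uV rms (half LSB = {half_lsb * 1e6:.0f} uV)")
print(f"fringe / unit C : {fringe_ratio * 100:.2f} %")
print(f"10 tau          : {10 * tau * 1e12:.0f} ps of a {phase_time * 1e12:.0f} ps phase")
print(f"error at 10 tau : {settling_error(10):.1e} (half LSB = {0.5 / 2 ** n_bits:.1e})")
```

Under these assumptions the sampled noise on one unit capacitor is roughly 90 µV rms against a half LSB of about 244 µV, the fringing capacitance is 0.04% of the unit capacitor, and ten time constants fit within one clock phase with a residual error well below half an LSB. The real noise and settling budgets depend on the full pipeline and the buffer, which this sketch does not model.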
Experimental Results The DAC is integrated in a CMOS process with linear MOS capacitors (Figure 10). The multiphase clock generation is on-chip. The total active area is 1.2 sq. mm. A digital frequency synthesizer to drive the DAC is also integrated on the same chip, although it is not shown in the photomicrograph. During testing, the chip is mounted in a standard large cavity ceramic package, which in turn is attached to
a PC board with a large ground plane. The externally supplied reference voltages are capacitively decoupled to ground via low inductance connections. Figure 11 shows measured spectra of sinewaves synthesized at the DAC output as it clocks at 250 MHz. A 12 MHz synthesized sinewave is accompanied by a harmonic 58 dB below the fundamental. When the synthesized frequency goes up to 112 MHz, the largest spurious tone rises by only 3 dB, to 55 dB below the fundamental. Figure 12 summarizes the measured spurious-free dynamic range (SFDR) as a function of synthesized frequency over the Nyquist band, at conversion rates ranging from 50 to 300 MS/s. At low clock rates and synthesized frequencies, the peak SFDR is 64 dB. Two trends are apparent in this plot. At any conversion rate, the SFDR declines by only 3 dB over the full Nyquist band. This is proof that the D/A conversion is glitch free. Also, the per-sample resetting action of the DAC output buffer guarantees uniformity in the output waveform independent of synthesized frequency. On the other hand, when the clock rate is raised from 50 to 300 MS/s, peak SFDR falls by 10 dB. This is most likely due to small departures from perfect exponential settling in the op amp, which at high conversion rates take up an increasing fraction of each clock cycle. This DAC’s SFDR is compared with two other CMOS DACs. Figure 13 (a) compares SFDR at 300 MS/s rate with a 12b current steering DAC [6]. At low synthesized frequencies the comparison DAC shows superior SFDR, although it falls quickly with sinewave frequency. Our DAC’s SFDR is better for synthesized frequencies greater than 15 MHz. Figure 13 (b) compares the performance with a 10 b DAC clocked at 250 MS/s [7,8]. The comparison DAC, also current steering, was more carefully segmented and laid out for good dynamic performance. In this case our DAC shows higher SFDR for sinewave frequencies beyond 50 MHz. 
Figure 13(c) plots the same comparison DAC’s worst-case SFDR over Nyquist versus clock rate against our DAC’s. Beyond 100 MS/s, our DAC shows higher SFDR.
Summary
This paper describes a 10b DAC implemented in CMOS, which converts at rates up to 400 MS/s. The DAC and associated circuits occupy DNL is less than 0.25 LSB and INL less than
0.35 LSB. The DAC consumes 95 mW total from 3.3 V, of which 25 mW is in the buffer op amp. This DAC’s unique feature is its relatively flat SFDR over the full Nyquist range of synthesized frequencies. At conversion rates beyond 100 MHz, op amp dynamics limit peak SFDR. This work shows that for communication applications sensitive to SFDR, D/A conversion in the charge domain is an important alternative to conventional conversion in the current domain.
References
[1] A. Rofougaran, G. Chang, J. J. Rael, J. Y.-C. Chang, M. Rofougaran, P. J. Chang, M. Djafari, M. K. Ku, E. Roth, A. A. Abidi, and H. Samueli, “A Single-Chip 900 MHz Spread-Spectrum Wireless Transceiver in CMOS (Part I: Architecture and Transmitter Design),” IEEE J. of Solid-State Circuits, vol. 33, no. 4, pp. 515-534, 1998.
[2] K. Khanoyan, F. Behbahani, and A. A. Abidi, “A 10 b, 400 MS/s glitch-free CMOS D/A converter,” in Symp. on VLSI Circuits, Kyoto, Japan, pp. 73-76, 1999.
[3] F.-J. Wang, G. C. Temes, and S. Law, “A Quasi-Passive CMOS Pipeline D/A Converter,” IEEE J. of Solid-State Circuits, vol. 24, no. 6, pp. 1752-1756, 1989.
[4] G. Chang, A. Rofougaran, M. K. Ku, A. A. Abidi, and H. Samueli, “A Low-Power CMOS Digitally Synthesized 0-13 MHz Agile Sinewave Generator,” in Int’l Solid-State Circuits Conf., San Francisco, pp. 32-33, 1994.
[5] M. J. McNutt, S. LeMarquis, and J. L. Dunkley, “Systematic Capacitance Matching Errors and Corrective Layout Procedures,” IEEE J. of Solid-State Circuits, vol. 29, no. 5, pp. 611-616, 1994.
[6] A. Marques, J. Bastos, A. Van den Bosch, J. Vandenbussche, M. Steyaert, and W. Sansen, “A 12 b Accuracy 300 Msample/s Update Rate CMOS DAC,” in Int’l Solid-State Circuits Conf., San Francisco, CA, pp. 216-217, 1998.
[7] C.-H. Lin and K. Bult, “A 10b, 250 MS/s CMOS DAC in ,” in Int’l Solid-State Circuits Conf., San Francisco, CA, pp. 214-215, 1998.
[8] C.-H. Lin, A 10b 500MSamples/s CMOS DAC in , PhD Thesis in Electrical Engineering. University of California, Los Angeles: 1998.
Design Considerations for RF Power Amplifiers demonstrated through a GSM/EDGE Power Amplifier Module
Peter Baltus and André van Bezooijen
Philips Semiconductors
Gerstweg 2
6534 AE Nijmegen
Abstract
This paper describes the design considerations for RF power amplifiers in general, including trends in systems, linearity and efficiency, the PA environment, implementation issues and technology. As an example a triple-band (900/1800/1900MHz) dual mode (GSM/Edge) power amplifier module is described in this article. The RF transistors and biasing circuitry are implemented in silicon bipolar technology. A multi-layer LTCC substrate is used as carrier.
1. Introduction
Currently, many cellular systems are in use in different regions of the world, and in many places more than one system is in use simultaneously. In Europe and Asia the dominant system is currently GSM; in the US it is IS95, although AMPS and GSM-like systems also co-exist; and in Japan PHS, PDC and IS95 coexist. The handsets for these systems use a low-power transceiver to communicate with a network of base stations using radio transmissions in the frequency range of 800 MHz to 2500 MHz, and transmit power levels in the range of 10 mW to 2 W. Table 1 below shows an overview of important cellular systems.
J. H. Huijsing et al. (eds.), Analog Circuit Design, 249-268. © 2002 Kluwer Academic Publishers. Printed in the Netherlands.
1.1 Trends in cellular systems
New so-called third generation (3G) systems are being introduced which will allow for higher capacity (more users and higher user bit rates), and better compatibility across the world. Several of these systems are covered by the IMT2000 standard and include W-CDMA/UMTS (Japan and Europe) and CDMA2000 (USA).
In most countries, there will be a gradual change from existing second generation (2G) systems to future 3G systems, with both systems coexisting for at least a few years. For this reason, and because no single standard will achieve world-wide coverage in the near future, handsets that can connect to multiple systems will be required. Such handsets will provide access to the features of the 3G networks where available, while still providing access to 2G networks where this is not the case. This implies that there will be a need for multi-mode and multi-band power amplifiers as well.
1.2 Challenges for PA design
The desired functionality for these RF power amplifiers is easily described and modeled: it should accurately amplify an incoming RF signal by a fixed (or programmable) gain:

$$P_{out} = G \cdot P_{in}$$

with $P_{out}$ the output power, $P_{in}$ the input power, and $G$ the power gain.
The simplicity of this desired functionality is apparent from many modern implementations, which consist of relatively few active devices (starting at 2 transistors). Therefore, it might seem that the design of such a simple function requires little effort and deserves little attention. As with many other RF circuits in cellular phones, the justification for all the effort that goes into their design is derived from the combination of:
- the importance of these circuits to the overall performance of the handset
- the many specifications that need to be achieved simultaneously
The power amplifier is important to the overall performance of the handset since it typically consumes the largest part of the power in a handset when active, and is therefore the most important factor in the talk time of a handset. For that reason, power efficiency is a very important specification of a PA. This efficiency has to be achieved while still meeting the many specifications required to have the handset work well within the system:
- Linearity is becoming an important issue, especially in newer systems that use advanced modulation schemes to achieve better bandwidth efficiency.
- Robustness is important since the handset is part of a rather variable environment, in which power supply voltage, load impedance, temperature, transmit frequency, and output power can vary quickly and sometimes over large ranges. Since the optimization of efficiency often results in voltages and currents close to the reliability limits of the technology, significant changes in any of these parameters can result in performance degradation or even complete failure of the device. Conversely, preventing such robustness problems often results in designs with voltages and currents that cannot be optimized for efficiency.
- Stability, especially under load mismatch conditions. Such conditions can for example arise when the antenna environment changes.
- Noise, especially in the receive band of the system, since this affects the sensitivity of receivers in the system.
- Spurious emissions, which can interfere with other electronic equipment or with transmissions from other handsets or from base stations in the same system.
- Thermal behavior, including performance impact of temperature changes on the handset, and impact on reliability.
- Multi-mode multi-band functionality, which requires adjustable properties of the power amplifier, and/or switches and adjustable circuits around a number of individual power amplifiers. This added complexity in turn affects the other specifications such as gain, linearity, output power, etc.
This paper will give an overview of these issues. In the next section, the relation between bandwidth efficiency, power amplifier linearity and power efficiency will be discussed at system and circuit level. Section 3 describes the environment of the power amplifier, which is the basis for relating the systems considerations of the previous section (2) to the PA issues in the next section (4). Implementing the power amplifier is discussed in section 5, using a Si/LTCC integrated GSM/EDGE power amplifier module to demonstrate the relevant issues.
2. Bandwidth Efficiency, Power Efficiency and PA Linearity
In the past more capacity could be found by moving up in frequency with improvements in device technology (recently into the 2GHz range). This trend is not likely to continue in the future because the link budget is unfavorable for such higher frequencies (fig. 1). Instead, the increased capacity is achieved by more efficient use of available bandwidth through advanced modulation schemes and access methods. An important consequence for power amplifiers is that these efficient
modulation schemes (such as QPSK, QAM, etc.) are no longer constant envelope, and therefore require more linearity from the power amplifier. Customers are unlikely to accept significant reductions in talk time, or larger and more expensive batteries and handsets, so the efficiency of the PA cannot be compromised too much by the new linearity requirements.
At the system level these amplifier non-linearities result in so-called spectral re-growth: in-band energy is transformed into out-of-band energy that might disturb reception in adjacent frequency channels. Figure 2 and Figure 3 show the effect of non-linearity (in this case hard limiting) on vector diagrams and transmission spectra (spectral re-growth) of advanced modulation schemes (in this case UMTS). Spectral re-growth is quantified through a parameter called adjacent channel power rejection, or ACPR. This parameter defines the amount of energy in the adjacent channel relative to the energy of the transmitted signal. Non-linearities also have an impact on the wanted signal, in the sense that the amplitude and phase information modulated onto the carrier is disturbed, so that demodulation on the receiving side can result in incorrect data at base-band. This data is often visualised as a set of amplitude normalised
discrete I and Q values that represent the symbols being transmitted. Due to distortion the I and Q values of each symbol shift. The Error Vector Magnitude (EVM) is used to quantify this shift.
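As a minimal sketch of how such a shift can be turned into a single number, the Python example below computes an RMS EVM from ideal and received symbols. The text does not say which normalisation the authors use; the sketch assumes one common convention, in which the RMS error vector is divided by the RMS magnitude of the ideal constellation. The function name `evm_rms_percent` and the example distortion (5% gain compression and 2 degrees of phase rotation on an 8-PSK constellation) are hypothetical.

```python
import cmath
import math


def evm_rms_percent(reference, measured):
    """RMS error vector magnitude in percent.

    reference, measured: equal-length sequences of complex I + jQ symbols.
    The error power is normalised to the mean power of the reference
    symbols (an assumed convention; standards differ in the details).
    """
    if len(reference) != len(measured) or len(reference) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    err_power = sum(abs(m - r) ** 2 for r, m in zip(reference, measured))
    ref_power = sum(abs(r) ** 2 for r in reference)
    return 100.0 * math.sqrt(err_power / ref_power)


# Ideal unit-amplitude 8-PSK constellation.
ideal = [cmath.exp(2j * math.pi * k / 8) for k in range(8)]

# Hypothetical distortion: every symbol is compressed by 5 % in amplitude
# and rotated by 2 degrees, as a crude stand-in for AM-to-AM and AM-to-PM.
distortion = 0.95 * cmath.exp(1j * math.radians(2.0))
received = [distortion * s for s in ideal]

print(f"EVM = {evm_rms_percent(ideal, received):.2f} %")
```

Because every symbol here receives the same gain and phase error, the EVM reduces to the magnitude of `distortion - 1`. In a real amplifier the error depends on the instantaneous envelope, so the symbols shift by different amounts.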
ACPR and EVM are determined by the amplifier non-linearities in combination with the signal. For the various systems both the requirements on ACPR and EVM as well as the properties of the signal (Pout, peak-to-average, power density distribution, ...) are different. This makes comparison of linearity requirements for the various standards difficult.
On the other hand, AM-to-AM and AM-to-PM conversion are inherent properties of the circuit. For a given protocol, ACPR and EVM are related to the combination of AM-to-AM and AM-to-PM. Therefore a maximum allowable AM-to-AM requirement cannot be defined independently of the maximum allowable AM-to-PM, and vice versa. In practice a whole set of AM-to-AM and AM-to-PM combinations can be found that fulfil the ACPR and EVM requirements, and a much larger set that does not. Consequently, PA circuit optimisation can best be done by optimising for the system parameters ACPR and EVM rather than for AM-to-AM and AM-to-PM [2].
A single-stage bipolar amplifier has several causes of non-linearity. We can distinguish contributions due to voltage saturation at the collector, transistor current density variations, supply voltage variations at the transistor base and supply voltage variations at the collector.
2.1 Voltage saturation
Power amplifiers are optimised for power added efficiency and linearity. The collector load impedance at the fundamental frequency and harmonics is chosen such that at the maximum required output power the complete signal voltage headroom at the collector is used. Consequently, the transistor is driven into saturation as far as possible, up to the point where saturation becomes too severe. This trade-off is limited by the emitter ballast resistors needed for thermal stability and by the transistor collector resistance and the quasi-saturation behaviour related to it.
2.2 Current density variations
Although power amplifiers for applications like cdmaOne, Edge, W-CDMA etc. are often referred to as linear amplifiers, their behaviour is non-linear. Due to class A/B operation the current through an RF transistor, biased typically at 50 mA, can increase up to an average of 500 mA at maximum output power. Under influence of the carrier envelope the transistor operating point changes drastically. As a result the transistor input impedance varies with the carrier envelope [3]. The impedance match with the source is power dependent and thus varies with the envelope. This effect can be used advantageously. At power levels close to saturation the gain tends to drop. When the source impedance match is optimal around this power level and less optimal at lower power levels, the gain can be flattened over a wider range of power levels [4].
2.3 Supply voltage variations
Non-linearity of the RF-transistor results in low frequency components in the collector and base current. These low frequency components are related to the data modulated onto the carrier. For Edge the modulation bandwidth is approximately 100 kHz. Any resistance in the power supply of the collector will result in low frequency supply voltage variations at the collector. At high power levels this drives the RF-transistor further into saturation. Therefore, proper LF-decoupling of the collector supply voltage is necessary for achieving maximum linearity.
In the base of the RF-transistor, low frequency current components are present that are beta times smaller than in the collector. Any resistance in the voltage supply of the base (the output resistance of the biasing circuit) results in low frequency supply variations at the base. The voltage drop due to the output resistance of the biasing circuit changes the operating point of the RF-transistor, which results in additional distortion. In a TDMA system, like GSM, the amplifier is switched on and off in bursts by switching the biasing circuitry. Consequently, LF-decoupling of the base cannot be applied because it would disturb the amplifier turn-on and turn-off behaviour. Together, this behavior results in a trade-off between linearity and efficiency as shown in the figure below (Figure 5):
3. Environment
The relation between the system requirements and the PA requirements is determined by the environment of the PA. The environment of the PA typically consists of:
- a transceiver IC at the input. This transceiver IC generates the RF transmitter signal at a low power level, often around 0 dBm. A filter is often placed between the transceiver IC output and the PA input to eliminate noise from the transmitter IC outside the transmission band.
- antenna interface circuits that can include matching circuits, isolator, duplexer, switches, and diplexers to connect the PA to one or more antennas
- a control IC that sets gain, biasing, and/or output power levels through a number of control pins
- a power supply, which often comes directly from the battery, but can also be provided through a DC/DC converter
A typical PA environment is shown in the figure below (Fig. 6):
The duplexer is responsible for connecting the antenna to both the transmitter and receiver in such way that the energy from the transmitter is sent to the antenna only (and not the receiver), whereas the energy received by the antenna is sent to the receiver only (and not the transmitter). Depending on the access and duplex methods of the system, the duplexer can be implemented either as a traditional duplexer filter and/or through switches. The diplexer connects transceivers for different systems and/or bands to the antenna. Again, depending on the properties of the systems, this can be
implemented through filters and/or switches. The isolator serves to protect the power amplifier from impedance mismatches at the duplexer input. This is not always necessary, e.g. for GSM type systems this component is typically left out. By presenting the PA a fixed and nominal load impedance independent of the actual input impedance of the duplexer, the design of the PA can be further optimized since the influence of the load impedance on linearity, reliability and stability does not have to be taken into account. To give a first impression of the type of load change that can be expected from an antenna through changes in the environment, the figure below (Figure 7) shows simulation results of a dipole with and without a conducting body at 2cm distance, not untypical for a handset antenna near the head or in free space.
As shown by the simulation results, the impedance change of the antenna is
quite dramatic and results in large changes of the return loss, e.g. from –10dB to –2dB around 1.37GHz. This results in a reduction of transmitted power from 90% to 37%. Considering all the effort spent on optimizing the efficiency of the PA, these numbers are very significant. Measurements on various antennas show that these numbers do occur in practice as well, and in some cases can be even worse. Since efficiency is such an important parameter, it is very useful to find out where power is lost in the total system. The figure below shows a typical situation for a GSM PA in a multi-mode system. The numbers represent the power consumption in Watt.
From this figure, it becomes clear that there is a very large power loss between the power drained from the battery and the power ultimately delivered to the electromagnetic field: the overall efficiency in this not so unrealistic scenario is 8%, and is composed of the following major items:
- PA proper: 58% efficiency
- Antenna interface (matching, duplexer): 43% efficiency
- Antenna: 37% efficiency
Considering that the theoretical efficiency of an ideal class A/B amplifier is
78%, it is obvious that the potential for improving overall efficiency by improving the PA proper (e.g. by going to more expensive active devices) is limited. Instead, passive devices and the antenna are more obvious candidates for overall efficiency improvement.
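The numbers in this section can be reproduced with a short calculation. The Python sketch below converts a return loss into the fraction of power accepted by the antenna and multiplies the stage efficiencies quoted above. The function names are illustrative, and the return loss is taken as a positive magnitude, so "–10 dB" in the text is entered as 10. The class-B limit of π/4 is the textbook value, included only for comparison with the 78% quoted.

```python
import math


def transmitted_fraction(return_loss_db):
    """Fraction of incident power accepted by the load.

    return_loss_db: magnitude of the return loss in dB (10 means "-10 dB").
    """
    reflected = 10.0 ** (-abs(return_loss_db) / 10.0)
    return 1.0 - reflected


def cascade_efficiency(stage_efficiencies):
    """Overall efficiency of stages in series (product of the stages)."""
    total = 1.0
    for eta in stage_efficiencies:
        total *= eta
    return total


free_space = transmitted_fraction(10.0)   # antenna without nearby body
near_body = transmitted_fraction(2.0)     # antenna 2 cm from a conducting body
overall = cascade_efficiency([0.58, 0.43, 0.37])
class_b_limit = math.pi / 4               # textbook ideal class-B efficiency

print(f"transmitted at 10 dB return loss: {free_space:.0%}")
print(f"transmitted at  2 dB return loss: {near_body:.0%}")
print(f"product of the three major items: {overall:.1%}")
print(f"ideal class-B efficiency        : {class_b_limit:.1%}")
```

The return-loss conversion reproduces the 90% and 37% figures. The three major items alone multiply to roughly 9%, close to the 8% overall figure quoted, which may include smaller losses or rounding that the list does not itemise.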
4. Power amplifier
After taking into account the environment of the PA, what remains is a number of issues and specifications that need to be achieved in the PA itself, through careful choices in the partitioning, implementation and technologies. It is rather common to implement GSM power amplifiers as hybrids. This allows the best combinations of active and passive technologies to be used in order to meet severe specifications on reliability, ruggedness, stability, power added efficiency, size and cost. Moreover, a hybrid power amplifier solution is attractive because, due to the matching networks at input and output and the on-module power supply decoupling, the amplifier function is well defined and therefore easily applicable.
Reliability (life-time) of a GSM amplifier is mainly related to the maximum temperatures that occur. Especially for the recently defined class 12 operation, with an on/off duty cycle of 50%, the solder between PCB and hybrid module, the glue to attach the die on the LTCC substrate and the Aluminium interconnect of the die might approach critical temperature values. Moreover, the amplifier has to survive very severe conditions that might happen occasionally. For instance, the amplifier should not be damaged when the antenna is disconnected while the battery is being charged, when collector voltages up to 20 V may occur. This poses rather severe requirements on the collector-base breakdown voltage.
In a GSM handset the power amplifier dissipates a significant amount of power and thus determines the standby and talk time to a great extent. In particular the final RF-transistor geometry and the output matching network
have to be designed for maximum power added efficiency [1]. The output matching network provides an optimum collector load impedance for the fundamental frequency as well as for the harmonics. It is realised by means of high-Q microstrip lines, integrated on LTCC, and high-Q SMD capacitors in order to minimise insertion losses. For a dual mode GSM/Edge power amplifier additional requirements with respect to linearity have to be met. There is not much design freedom for optimising the linearity of a GSM/Edge amplifier when typical GSM specifications have to be met anyway. In this example, the biasing of the RF transistor in GSM mode has been made independent of the biasing in Edge mode. Optimum linearity is achieved by optimising the DC operating point of each of the three cascaded RF transistors.
5. Implementation
In this section we will discuss implementation details of the GSM/EDGE PA. This PA is relevant for a number of reasons:
- This combination of systems in a single handset is likely to become popular.
- It is an optimised combination of a saturated, strongly non-linear PA for GSM mode and a linear PA for EDGE mode, with integrated mode switching.
- It is typical for many of the multi-mode multi-band power amplifiers that will be needed in the transition period between 2G and 3G systems.
The Edge protocol has been adopted as an evolutionary path for enhanced data-rates and increased capacity in GSM. Edge is compatible with GSM in the sense that it operates in the same frequency bands and makes use of the same channel bandwidth and channel spacing. The data-rate, however, has been made a factor 3 higher by applying offset 8-PSK (non-constant envelope) modulation and appropriate modulation filtering. The amplifier module consists of two fully independent RF line-ups (see Figure 9). Each line-up consists of a 50 ohm input matching circuit, three
cascaded RF transistors with interstage matching circuits in between and a 50 ohm output matching circuit. The module can operate in either GSM mode or Edge mode by activating the GSM biasing circuits or the Edge biasing circuits respectively. In GSM mode the output power can be controlled with Vcntrl. In Edge mode, however, the gain of the amplifier is constant and the output power is determined by the input power. The biasing circuits of Edge mode are activated by applying a stabilised voltage Vstab. As shown in the block diagram, the module contains an output power detector for 900MHz and one for 1800/1900MHz. These outputs can be used to close a power control loop for smooth up and down ramping of the power. To study and optimise the linearity in Edge mode, the bias current of the 2nd and 3rd RF transistor can be increased by applying a current Iref2 and Iref3 respectively.
Figure 10 shows a photograph of the triple-band GSM/Edge power amplifier. The 900MHz line-up is visible at the left hand side and the wide-band 1800/1900MHz line-up at the right hand side. The die at the bottom side forms the driver IC for the final stage that is positioned at the top side. 0402 SMDs are used for decoupling of the power supply lines feeding the RF transistors and biasing circuits. Input, interstage and output matching networks are built up with discrete capacitors and microstrip line inductors on top of the ceramic substrate. In order to reduce DC and/or RF losses, relatively wide traces are used for the RF choke that feeds the final stage, as well as for the output matching microstrip lines. In a final product the module is encapsulated with a 0.25mm thin plastic cap. The module size is 11x13.75x1.8mm.
5.1 Output matching network
On the top left side of the module the RF choke that feeds the 900MHz final stage is visible. The choke is RF decoupled at the supply side and made resonant using a capacitor located close to the collector bondwires. The output matching network, located immediately to the right of the RF choke, consists of several sections that transform the 50 ohm load, in several steps, to an optimum collector impedance of about 2 Ohm at the fundamental frequency and 0.5+10j Ohm at the second harmonic. Rejection of the second and third harmonic is obtained by series resonance of the matching capacitors with their series self inductance plus via inductance, which gives notches in the transfer function. In simulations a typical insertion loss of 0.8dB can be obtained. The attenuation at the second and third harmonic is typically 25dB and 35dB respectively.
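To put the decibel figures above in terms of power, a minimal sketch is given below. It only converts the quoted values; the function name is illustrative and not taken from any design tool. A 0.8dB insertion loss means that about 83% of the power leaving the collector reaches the 50 ohm load.

```python
def db_to_power_fraction(loss_db):
    """Fraction of power remaining after an attenuation of loss_db decibels."""
    return 10 ** (-loss_db / 10)

# Values quoted in the text for the simulated output matching network
insertion_loss_db = 0.8
second_harmonic_db = 25.0
third_harmonic_db = 35.0

print(f"power passed at the fundamental: {db_to_power_fraction(insertion_loss_db):.1%}")
print(f"2nd harmonic power passed: {db_to_power_fraction(second_harmonic_db):.3%}")
print(f"3rd harmonic power passed: {db_to_power_fraction(third_harmonic_db):.4%}")

# Impedance transformation ratio from the 50 ohm load to the ~2 ohm collector load
print(f"transformation ratio: {50 / 2:.0f}:1")
```

The 25:1 transformation ratio helps explain why the network is built from several sections and why the Q of its lines and capacitors matters for the insertion loss.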
5.2 Thermal design
Under nominal operating conditions the amplifier module dissipates, during the power burst, approximately 3.5W when the amplifier output power is at its maximum of approx. 3.5W. Under antenna mismatch conditions combined with high battery supply voltage the power dissipation can be as much as twice that value. Thermal stability is ensured by applying emitter ballast resistors.
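As a rough illustration of these numbers, the sketch below computes the efficiency during the burst and the time-averaged dissipation for a few duty cycles. It assumes that dissipation between bursts is negligible and that efficiency is simply output power over output power plus dissipation (RF drive power ignored); the duty-cycle values are example inputs.

```python
def burst_efficiency(p_out_w, p_diss_w):
    """Efficiency during the burst, taking supply power as output plus dissipation."""
    return p_out_w / (p_out_w + p_diss_w)

def average_dissipation(p_diss_burst_w, duty_cycle):
    """Time-averaged dissipation, assuming no dissipation between bursts."""
    return p_diss_burst_w * duty_cycle

nominal = burst_efficiency(p_out_w=3.5, p_diss_w=3.5)
print(f"nominal burst efficiency: {nominal:.0%}")

# Worst case from the text: dissipation up to twice the nominal 3.5 W
worst_case_w = 2 * 3.5
for duty in (0.125, 0.25, 0.5):  # example duty cycles
    avg = average_dissipation(worst_case_w, duty)
    print(f"duty cycle {duty:.1%}: average dissipation {avg:.2f} W")
```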
The heat, mainly generated in the emitters of the final stage, is spread by the 200um thick silicon die and flows through the glue toward the die attach area on top of the LTCC. Internal LTCC layers are used to partly spread out the heat horizontally. The heat flows further through the copper filled vias of the LTCC substrate towards the PCB, which contains several layers of copper to further spread out the heat into the telephone set. The module is designed for a thermal resistance of less than 30 K/W in order to keep the maximum die temperature below 125°C, for a maximum mounting base temperature of 85°C and a power dissipation of 7W in the pulse with an on/off duty cycle of 25%.
5.3 Biasing circuit topology
Figure 11 shows the circuit topology for biasing the RF transistor in Edge
mode. The reference current drives, via the PNP mirror T60/T61, the NPN current mirror formed by T62 and T1. Emitter degeneration resistances R60/R61 are added to increase the output resistance of the PNP current mirror T60/T61 in order to reduce the supply voltage dependency and to improve matching of this current mirror [5]. T63 improves the accuracy of the NPN current mirror ratio with its current gain of beta. The resistors R63 and R62 are added to make the topology of the left hand side of the NPN mirror equal to that of the right hand side, where R16 provides RF isolation between T1 and the biasing circuit and R1 degenerates T1. Summing voltages around the loop including T62 and T1 we find
Making the assumption that
Since
we find that
defined by emitter area ratios, the solution to (2) is
which is achieved by
and
As a result the last term in (2) goes to zero which makes the biasing of T1 almost temperature independent.
To guarantee stability of the circuit, a resistor Rdamp is added to provide resistive loading of the high-ohmic node at RF frequencies. The dotted transistor T13 is part of the GSM biasing circuit and is not active in Edge mode.
Conclusions
Power amplifiers are very relevant components of a handset transmitter, since they consume a large part of the total power dissipation. Moreover, the linearity requirements in newer systems are difficult to combine with high efficiency. The total performance depends for a larger part on the passive components and antenna than on the active part. The difficulty in designing a good PA is in achieving good efficiency while meeting many other specifications (stability, reliability, linearity, gain, power, etc.) simultaneously. The GSM/Edge power amplifier module used as an example throughout this paper illustrates that multi-mode multi-band power amplifiers can be realised well in Si-bipolar technology combined with multi-layer LTCC substrate.
Acknowledgements
This paper is based on insights built up in several teams throughout Philips, including the Philips Semiconductor PA development teams in Sagamihara (Japan), Mansfield (U.S.A.) and Nijmegen (The Netherlands), as well as the Philips Research Integrated Transceiver group in Eindhoven (The Netherlands). The GSM/Edge power amplifier module could only be realised with the help of enthusiastic team members. Being aware that their contributions were essential to the success of this project, I would like to thank Dima Prikhodko for his work on IC circuit development, Skule Pramm and Gerd Kahmen for designing the substrate and optimising the module, and Christophe Chanlo for simulating ACPR and EVM. Last but not least I would like to thank Reza Mahmoudi for the enlightening discussions we had and for developing dedicated ACPR and EVM simulation tools.
References:
[1] F. van Rijs, R. Dekker, H.A. Visser, H.G.A. Huizing, D. Hartskeerl, P.H.C. Magnee, R. Dondero, “Influence of output impedance on power added efficiency of Si-bipolar power transistor”, International Microwave Symposium Digest, Volume 3, June 11-16, 2000.
[2] Private communication with Reza Mahmoudi.
[3] Keng Leong Fong and Robert G. Meyer, “High-Frequency Nonlinearity Analysis of Common-Emitter and Differential-Pair Transconductance Stages”, IEEE Journal of Solid-State Circuits, vol. 33, no. 4, April 1998.
[4] R. Mahmoudi, “Multi-Disciplinary design method for 2.5 generation of mobile communication systems”, to be published in September 2001, Twente University Press.
[5] Paul R. Gray and Robert G. Meyer, “Analysis and Design of Analog Integrated Circuits”, second edition, John Wiley & Sons, 1984.
CLASS-E HIGH-EFFICIENCY RF/MICROWAVE POWER AMPLIFIERS: PRINCIPLES OF OPERATION, DESIGN PROCEDURES, AND EXPERIMENTAL VERIFICATION
Nathan O. Sokal, IEEE Life Fellow
Design Automation, Inc.
4 Tyler Road
Lexington, MA 02420-2404
U. S. A.
ABSTRACT
Class-E power amplifiers [1]-[6] achieve significantly higher efficiency than conventional Class-B or -C amplifiers. Class E operates the transistor as an on/off switch and shapes the voltage and current waveforms to prevent simultaneous high voltage and high current in the transistor; that minimizes the power dissipation, especially during the switching transitions. In the published low-order Class-E circuit, a transistor performs well at frequencies up to about 70% of its frequency of good Class-B operation (an unpublished higher-order Class-E circuit operates well up to about double that frequency). This paper covers circuit operation, improved-accuracy explicit design equations for the published low-order Class-E circuit, optimization principles, experimental results, tuning procedures, and gate/base driver circuits. Previously published analytically derived design equations did not include the dependence of output power (P) on the load-network loaded Q (QL); as a result, the output power was 38% to 10% less than expected, for QL values in the usual range of 1.8 to 5. This paper includes an accurate new equation for P that includes the effect of QL.
J. H. Huijsing et al. (eds.), Analog Circuit Design, 269-301. © 2002 Kluwer Academic Publishers. Printed in the Netherlands.
1. "WHAT CAN CLASS E DO FOR ME?"
Typically, Class-E amplifiers [1]-[6] can operate with power losses smaller by a factor of about 2.3, as compared with conventional Class-B or -C amplifiers using the same transistor at the same frequency and output power. For example, a Class-B or -C power stage operating at 65% collector or drain efficiency (losses = 35% of input power) would have an efficiency of about 85% (losses = 15% of input power) if changed to Class E (35%/15% = 2.3). Class-E amplifiers can be designed for narrow-band operation or for fixed-tuned operation over frequency bands as wide as 1.8:1, such as 225-400 MHz. (If harmonic outputs must be well below the carrier power, any amplifier other than Class A or push-pull Class AB cannot operate over a band wider than about 1.8:1 with only one fixed-tuned harmonic-suppression filter.) Harmonic output of Class-E amplifiers is similar to that of Class-B amplifiers. Another benefit of using Class E is that the amplifier is a priori designable; explicit design equations are given here. The effects of component and frequency variations are defined a priori [4, Figs. 5 and 6] and [7], and are small. When the amplifier is built as designed, it works as expected, without need for "tweaking" or "fiddling."
2. PHYSICAL PRINCIPLES FOR ACHIEVING HIGH EFFICIENCY
Efficiency is maximized by minimizing power dissipation, while providing a desired output power. In most RF and microwave power amplifiers, the largest power dissipation is in the power transistor: the product of transistor voltage and transistor current at each point in time during the RF period, integrated and averaged over the RF period. Although the transistor must sustain high voltage during part
of the RF period, and must also conduct high current during part of the RF period, the circuit can be arranged so that high voltage and high current do not exist at the same time. Then the product of transistor voltage and current will be low at all times during the RF period. Fig. 1 shows conceptual "target" waveforms of transistor voltage and current that meet the high-efficiency requirements. The transistor is operated as a switch. The voltage-current product is low throughout the RF period because:
1. "On" state: The voltage is nearly zero when high current is flowing, i.e., the transistor acts as a low-resistance "on" switch during the "on" part of the RF period.
2. "Off" state: The current is zero when there is high voltage, i.e., the transistor acts as an "off" switch during the "off" part of the RF period.
Switching transitions: Although the designer makes the on/off switching transitions as fast as feasible, a high-efficiency technique must accommodate the transistor's practical limitation for RF and microwave applications: the transistor switching times will, unavoidably, be appreciable fractions of the RF period. We avoid a high voltage-current product during the switching transitions, even though the switching times can be appreciable fractions of the RF period, by the following two strategies:
3. The rise of transistor voltage is delayed until after the current has reduced to zero.
4. The transistor voltage returns to zero before the current begins to rise.
The timing requirements of 3 and 4 are fulfilled by a suitable load network (the network between the transistor and the load that receives the RF power), to be examined shortly. Two additional waveform features reduce power dissipation:
5. The transistor voltage at turn-on time is nominally zero (or is the saturation offset voltage for a bipolar junction transistor, hereafter "BJT"). Then the turning-on transistor does not discharge a charged shunt capacitance (C1 of Fig. 2), thus avoiding dissipating the capacitor's stored energy of C1V²/2, f times per second, where V is the capacitor's initial voltage at transistor turn-on and f is the operating frequency. (C1 comprises the transistor output capacitance and any external capacitance in parallel with it.)
6. The slope of the transistor voltage waveform is nominally zero at turn-on time. Then the current injected into the turning-on transistor by the load network rises smoothly from zero at a controlled moderate rate, resulting in low power dissipation while the transistor conductance is building up from zero during the turn-on transition, even if the turn-on transition time is as long as 30% of the RF period.
Result: The waveforms never have high voltage and high current simultaneously. The voltage and current switching transitions are time-displaced from each other, to accommodate transistor switching transition times that can be substantial fractions of the RF period, e.g., turn-on transition up to about 30% of the period and turn-off transition up to about 20% of the period. The low-order Class-E amplifier of Fig. 2 generates voltage and current waveforms that approximate the conceptual "target" waveforms in Fig. 1; Fig. 3 shows the actual waveforms in that circuit. Note that those actual waveforms meet all six criteria listed above and illustrated in Fig. 1. Unpublished higher-order versions of the circuit approximate more closely the target waveforms of Fig. 1, making the circuit even more tolerant of component parasitic resistances and nonzero switching transition times.
Differences from conventional Class B and C: The load network is not intended to provide a conjugate match to the transistor output impedance.
The load-network design equations come from the solution of a set of simultaneous equations for the steady-state periodic time-domain response of a network containing non-ideal inductors and capacitors to periodic operation of a non-ideal switch at the load-network input port, at frequency f. The solution provides (a) an input-port voltage of zero value and zero slope at transistor turn-on time, (b) a first-order approximation to a time delay of the voltage rise at transistor turn-off, and (c) a nearly sinusoidal voltage across the load resistance R, delivering a specified RF power P from a specified dc
supply voltage Vcc. The transistor's operating locus on the voltage-current plane is not a tilted straight line (resistance) or a tilted ellipse (resistance + reactance). The operation during the "on" state of the switch is a nearly vertical line whose lower end is at the origin (0, 0); the "off" state of the switch is a horizontal line whose left end is at the origin. By design, the operating locus avoids the remainder of the plane, the region of simultaneous high voltage and high current, i.e., of high power dissipation and consequent reduced efficiency; that region is where conventional Class B and C circuits operate.
3. ANALYTICAL AND NUMERICAL DERIVATIONS OF DESIGN EQUATIONS
Analytical derivations of design equations for the circuit of Fig. 2 can be made only by assuming that the current in L2-C2 is sinusoidal. That assumption is strictly true only if the load network has infinite loaded Q (QL, defined as 2πfL2/R), and yields progressively less-accurate results for QL values progressively lower than infinity. (QL is a free-choice design variable,² subject to the condition QL ≥ 1.7879 (obtained from exact numerical analysis [4], [6]) to be able to obtain the nominal³ switch-voltage waveform, for the usual choice of the switch “on” duty ratio⁴ D being 50%.) The amplifier's output power P depends primarily (derivable analytically) on the collector/drain dc-supply voltage Vcc and the load resistance R, but secondarily (not derivable analytically) on the value chosen for QL. Previously published analytically derived design equations did not include the dependence of P on QL. As a result, the output power is 38% to 10% less than had been expected, for QL values in the usual range of 1.8 to 5. This paper includes an accurate new equation for P that includes the effect of QL. Similar restrictions apply to the analytical derivations of design equations for C1, C2, and R. However, the needed component values can be found by numerical methods. Table I gives normalized exact numerical solutions for output power (hence the needed value of R), C1, and C2, for eight values of QL over the
entire possible range of 1.7879 to infinity, for the usual choice of D = 50%.
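The size of the loaded-Q effect on output power can be illustrated with a short sketch. The polynomial coefficients below are NOT printed in this copy of the text; they are the second-order fit coefficients commonly quoted for Sokal's Class-E power equation and should be treated as an assumption to be checked against Table I. The symbol QL for the loaded Q is likewise assumed.

```python
Q_L_MIN = 1.7879  # minimum loaded Q for nominal waveforms at D = 50% (from the text)

def power_factor(q_l):
    """Ratio of output power at loaded Q q_l to the infinite-Q analytical value.

    ASSUMPTION: fit coefficients as commonly quoted for Sokal's equation;
    they do not appear in this text and should be verified against Table I.
    """
    if q_l < Q_L_MIN:
        raise ValueError("nominal waveforms require QL >= 1.7879")
    return 1.001245 - 0.451759 / q_l - 0.402444 / q_l ** 2

for q_l in (1.8, 3.0, 5.0, 10.0):
    shortfall = 100 * (1 - power_factor(q_l))
    print(f"QL = {q_l:4.1f}: P is {shortfall:4.1f}% below the infinite-Q prediction")
```

With these assumed coefficients the shortfall is a little under 38% at QL = 1.8 and about 10% at QL = 5, in line with the range quoted above.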
The design equations in the next section are continuous mathematical functions fitted to those eight sets of data. (Having the numerical values of Table I, readers can derive other mathematical functions to fit the data, if they wish, to substitute for the equations given below.) Kazimierczuk and Puczko [5] published a tabulation similar to Table I here (using a different mathematical technique, but the two sets of tables agree well; see Section 5, below), but they did not include continuous-function design equations based on their tabular data. As a result, a designer using [5] can produce an accurate design at any chosen tabulated value of QL, but the designer lacks accurate design information for use at values of QL in between the tabulated values. Avratoglou and Voulgaris [8] gave an analysis, and numerical solutions as graphs, but no tables of computed values and no design equations fitted to the numerical results. Precise design values cannot be read from the graphs. To be able to make accurate circuit designs and a priori design
evaluations at any arbitrary value of QL, the designer needs design equations comprising continuous mathematical functions, rather than a set of tabulated values as in Table I or [5]. The equations should give accurate results, and should be simple enough to be easy for the designer to manipulate. Such equations are given below, for lossless components. The losses are accounted for in [2], [4], [9], [10], and unpublished notes; the author intends to publish equations for all components of power loss and the resulting collector/drain efficiency. Briefly: Calculate R from (6) or (6a), using for P the desired output power divided by the expected collector/drain efficiency (see (2) below for collector/drain efficiency). Then the needed load resistance is
where Ron is the "on" resistance of the transistor. Ron is a generic term that represents RDS(on) of a MOSFET or a MESFET, or RCE(sat) of a BJT. The expected collector/drain efficiency is approximately
where tf is the 100%-to-0% fall time of the assumed linear fall of the collector/drain current at transistor turn-off, T = 1/f is the period of the operating frequency f, and "0.01" allocates about 1% loss of efficiency for the power losses in the dc and RF resistances of the dc-feed choke (substitute a different numerical value, if you wish).
4. EXPLICIT DESIGN EQUATIONS
The explicit design equations given below yield the low-order lumped-element Class-E circuit that operates with the nominal waveforms of Fig. 3. (Distributed-element circuits are discussed briefly at the end of Section 9.) In the equations below, Vcc is the dc supply voltage, P is the power delivered to the total effective circuit
resistance lumped into a single resistor R (see (1) above), f is the operating frequency, L1 (dc-feed choke), C1, C2, L2, and R are the load network shown in Fig. 2, and QL is the network loaded Q, chosen by the designer as a trade-off among competing evaluation criteria.² In a nominal-waveforms circuit operating with the usual choice of D = 50%, the minimum possible value of QL is 1.7879 (the circuit can work well with lower values of QL, but the transistor-voltage waveform will be off-nominal: larger than zero at the transistor turn-on time); the maximum possible value of QL is less than the network's unloaded Q. The design procedure is as follows:
The chosen safety factor (e.g., 0.75) allows for not exceeding the transistor's breakdown voltage by a higher-than-nominal peak voltage (in this example, up to 1/0.75 = 133% of nominal) that could result from off-nominal load impedance. Choose Vcc as determined by the transistor's breakdown voltage or the available power-supply voltage. The relationship among P, R, QL, Vcc, and the transistor saturation offset voltage is least-squares fitted to the data in Table I, over the entire range of QL from 1.7879 to infinity, within a deviation of ±0.15%, by a second-order polynomial function of QL:
Hence
Alternatively, a third-order polynomial in QL gives a closer least-squares fit to the data, to within -0.0089% to +0.0072%:
Hence
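Solving the fitted power relation for R, and then sizing the series inductor from the chosen loaded Q, might look as follows. This is a sketch under stated assumptions: the fit coefficients are the second-order values commonly quoted for Sokal's equation (they are not printed in this copy of the text), the factor 8/(π² + 4) is the classical infinite-Q Class-E result, the loaded Q is taken as QL = 2πfL2/R, and all numerical inputs in the example are hypothetical.

```python
import math

def design_class_e(p_out_w, v_cc, q_l, f_hz, v_offset=0.0):
    """Return (R in ohms, L2 in henries) for a nominal-waveform Class-E stage.

    ASSUMPTIONS: second-order fit coefficients as commonly quoted for
    Sokal's equation (not printed in this text); QL defined as 2*pi*f*L2/R.
    v_offset is the transistor saturation offset voltage (0 for a FET).
    """
    if q_l < 1.7879:
        raise ValueError("nominal waveforms require QL >= 1.7879")
    v_eff = v_cc - v_offset                 # effective dc-supply voltage
    k_inf = 8 / (math.pi ** 2 + 4)          # infinite-Q factor, about 0.5768
    fit = 1.001245 - 0.451759 / q_l - 0.402444 / q_l ** 2
    r = (v_eff ** 2 / p_out_w) * k_inf * fit
    l2 = q_l * r / (2 * math.pi * f_hz)
    return r, l2

# Hypothetical example: 10 W from 12 V at 10 MHz with QL = 5, FET switch
r, l2 = design_class_e(p_out_w=10.0, v_cc=12.0, q_l=5.0, f_hz=10e6)
print(f"R = {r:.2f} ohm, L2 = {l2 * 1e9:.0f} nH")
```

Here P should be the desired output power divided by the expected collector/drain efficiency, as described in Section 3; the shunt and series capacitors are not computed in this sketch.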
The effective dc-supply voltage is the actual voltage Vcc less the transistor saturation offset voltage; that offset is zero for a field-effect transistor. For a BJT, it is of the order of 0.1 V at low frequencies, and up to a few volts (depending on the transistor fabrication) at frequencies higher than about The design equations for C1 and C2 that fit the data in Table I are given below. The last terms in (7), (8), and (9) are adjustments to the expressions fitted to the Table-I data, to account for the small effects of the nonzero susceptance of L1. The numerical coefficients in those last terms depend slightly on QL, and those dependencies will be the subject of a planned future article. For the example case of and the usual choice of the reactance of L1 being 30 or more times the unadjusted reactance of C1, the adjustments for the susceptance of L1 add 2% or less to the unadjusted value of C1 and subtract 0.5% or less from the unadjusted value of C2.
Finally, L2 is determined by (a) the designer's choice² for QL and (b) the value of R from (3) or (3a):
Equations (4) through (9) are more accurate than the older versions in [1], [2], [4], and [6].
5. ACCURACY OF DESIGN EQUATIONS
The maximum deviations of (5) from the tabulated values in Table I are ±0.15%; those of (5a) are -0.0089% and +0.0072%; those of (7) and (8) are ±0.13%; and those of (9) are ±0.072%. Kazimierczuk and Puczko [5] give tables of numerical data (similar to Table I here), obtained by a Newton's-method numerical solution of a system of analytical circuit equations they derived, and other useful numerical and graphical data. The tabulated values of P in [5] are within -0.13% to +0.47% of the values obtained from the continuous function (3) above. Those differences include (a) the error in the fitting of the continuous function in (3) to the discrete values in Table I (±0.15%) and (b) the differences (if any) between the numerical results of [5] and of Table I here. Those two sets of tabulated values can be compared directly at only their two values of QL in common: infinity (identical results) and 1.7879 ([5] has the same capacitance values and 0.28% lower P). The independently computed sets of data here and in [5] agree well (a maximum difference of about 0.3%), giving confidence in the validity of both.
6. HARMONIC FILTERING AND ASSOCIATED CHANGES TO DESIGN EQUATIONS
The power in (5) or (5a) is the total output power, at the fundamental and harmonic frequencies. Most of the power is at the fundamental frequency. The strongest harmonic is the second, with a voltage or current amplitude at R of relative to the fundamental. For example, with the second-harmonic power is -20 dBc (1% of the fundamental power) without any filtering. Even-order harmonics
can be canceled with a push-pull circuit, if desired. In that case, the strongest harmonic is the third, at an amplitude of relative to the fundamental, hence -36 dBc (0.025% of the fundamental power) without filtering, for the same example of 5.1 . Sokal and Raab [11] give the harmonic spectrum as a function of the chosen QL. If the circuit includes a low-pass or band-pass filter between R and the L2-C2 branch, instead of the direct connection shown in Fig. 2, the fractions of the output power contained in each of the harmonics will decrease, according to the transmission function of the filter at the harmonic frequencies. As a small side-effect, the total output power and the waveforms of switch voltage and current will change slightly, requiring small changes to the numerical coefficients in (6) through (9) above, and in Table I and [5]. New sets of numerical values can be calculated quickly with the help of a computer program such as HEPA-PLUS [7], described briefly in Sections 7 and 8 below, and available from the author's employer.
7. OPTIMIZING EFFICIENCY
The highest efficiency is obtained by minimizing the total power dissipated while the amplifier is delivering a desired output power. That can be done by modifying the waveforms slightly away from the nominal ones shown in Fig. 3, allowing some of the components of power dissipation to increase, while having other components of power dissipation decrease by larger amounts. For example, allowing the minimum of the voltage waveform to be at about 20% of the peak voltage, instead of at zero, increases the power loss, but it reduces the rms/average ratio of the current waveform and the peak/average ratio of the voltage waveform. Both of those effects can be exploited to obtain a specified output power with a specified safe peak transistor voltage, with lower rms currents in the transistor,
and the load-network components. That reduces their dissipations. If their series resistances are large enough, the decrease in their power losses can outweigh the increase of power loss. The power losses in the transistor and in discharging a partially charged C1 are not functions of the design frequency (C1 is inversely proportional to frequency, so the product of C1 and f is independent of frequency). For given types of C or L components, losses in capacitor ESRs (including that in the transistor's output capacitance) increase with design frequency, inductor core losses increase, and inductor winding losses decrease. The optimum trade-off depends on the specific combination of parameter values of the types of components being considered in a particular design. (It does not vary appreciably from one unit to another of a given design.) No a priori explicit analytical method yet exists for achieving the optimum trade-off among all of the components of power loss. Optimization is a numerically intensive task, too difficult to do by explicit analytical methods. But computerized optimization is practical. For example, running on an IBM-PC-compatible computer with a Pentium III/667-MHz processor, a commercially available program HEPA-PLUS [7], developed specifically for high-efficiency power amplifiers, designs a nominal-waveforms Class-E amplifier in a time too short to observe, simulates the circuit in 0.008 seconds, and optimizes the design automatically, according to user-specified criteria, in about 2.4 seconds. The program uses double-precision computation for accuracy and robustness, yielding the circuit voltage and current waveforms and their spectra, dc input power, RF output power, and all components of power dissipation.
8. EFFECTS OF NON-IDEAL COMPONENTS
Many of the non-idealities of the circuit components can be included in an analytical solution if the circuit is operating with the nominal switch-voltage waveform, but the task becomes progressively more difficult as
one attempts to include more of those effects simultaneously, and becomes impossible if the circuit is not operating at the nominal-waveforms conditions. The HEPA-PLUS computer program [7], mentioned above, simulates an expanded version of the circuit of Fig. 2 in any arbitrary operating condition (nominal or non-nominal waveforms). It includes all important "real-world" non-idealities of the transistor, the finite-Q power losses of all inductors and capacitors, and parasitic wiring inductances in series with C1 and in series with the transistor. Details are available from the author's employer.
9. APPLICABLE FREQUENCY RANGE IS ABOUT 3 MHz TO 10 GHz (as of 1999)
The Class-E amplifier can operate at arbitrarily low frequencies, but below about 3 MHz, one of the three types of switching-mode Class-D amplifier might be preferred because it can provide as good efficiency as the Class-E, with about 1.6 times as much output power per transistor, but with the possible disadvantage that transistors must be used in pairs, vs. the single Class-E transistor. Class E is preferable to Class D at frequencies higher than about 3 MHz, for its higher efficiency, easier driving of the transistor input port, and less-detrimental effects from parasitic inductance in the output-port circuit. The upper end of the useful frequency range for the low-order Class E is the frequency at which the achievable turn-off switching time is of the order of 17% of the RF period. In a Class-B amplifier, the turn-off transition time is 25% of the period. Therefore a low-order Class-E circuit will work well with a particular transistor at frequencies up to about 17%/25% = 70% of the frequency at which that transistor works well in a Class-B amplifier. (Unpublished higher-order Class-E circuits can operate efficiently at frequencies up to about double that of the low-order version.) Class-E circuits have been made successfully at frequencies up to 10 GHz [42]. Several microwave designers have reported
achieving remarkably high efficiency by driving the amplifier into saturation and using a favorable combination of series inductance to the load resistance [13] or fundamental and harmonic load impedances [14]-[20]. (The authors of [13]-[20] found the favorable tuning condition by using an automatic tuner and/or a circuit-simulation program to make an exhaustive search over the multi-dimensional impedance space to discover a favorable combination of circuit-element values, rather than by using a priori explicit design equations.) Secchi [13] and Mallet et al. [14] provided oscillograms of their drain-voltage and collector-voltage waveforms. Inspection of the waveform in [13, Fig. 2] shows a nominal Class-E waveform with The waveforms in Fig. 2(b) of [14] are Class E, but with an unusually small conduction angle. Probably higher output power could be obtained by increasing the conduction angle and modifying the load-network impedance accordingly. This author does not know the operating mode of [15]-[20]; very likely those amplifiers are distributed-elements versions (see below) of Class E, achieved empirically.
Distributed vs. lumped elements: High-efficiency waveforms similar to those in Figs. 1 or 3 can be generated with lumped and/or distributed elements. At a given frequency, the choice depends on the available components and the trade-offs among their sizes, costs, quality factors, and parasitic effects. [12], [21]-[23], [41], and [42] were transmission-line versions of Class E, operating at 10, 8.35, 5, 2, 1, and 0.5 GHz. The 5-, 2-, and 1-GHz circuits were described as having been designed a priori by explicit design procedures, worked as expected, and were operated and measured without making any experimental adjustment.
10. EXPERIMENTAL RESULTS
Table II summarizes representative reported Class-E performance (as of 1999), from 44 kW PEP at 0.52-1.7 MHz to 1.41 W at 8.35 GHz and 100 mW at 10 GHz.
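The upper-frequency rule of thumb stated in Section 9 can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name and the 1-GHz Class-B example frequency are assumptions, and the factor of two for higher-order circuits is the approximate figure quoted in the text.

```python
# Turn-off transition time as a fraction of the RF period (figures from Section 9).
CLASS_E_TURN_OFF_FRACTION = 0.17  # low-order Class E, upper end of useful range
CLASS_B_TURN_OFF_FRACTION = 0.25  # Class B


def class_e_frequency_limit(class_b_frequency_hz, higher_order=False):
    """Estimate the highest useful Class-E frequency for a given transistor.

    class_b_frequency_hz: frequency at which the transistor works well in
        Class B (a hypothetical input supplied by the user).
    higher_order: if True, apply the "about double" figure quoted for
        higher-order Class-E circuits.
    """
    ratio = CLASS_E_TURN_OFF_FRACTION / CLASS_B_TURN_OFF_FRACTION  # 0.68, "about 70%"
    limit = ratio * class_b_frequency_hz
    return 2.0 * limit if higher_order else limit


if __name__ == "__main__":
    f_class_b = 1.0e9  # assumed example: transistor works well in Class B at 1 GHz
    print(f"low-order Class E:    {class_e_frequency_limit(f_class_b) / 1e6:.0f} MHz")
    print(f"higher-order Class E: {class_e_frequency_limit(f_class_b, True) / 1e6:.0f} MHz")
```

The exact ratio is 0.68, which the text rounds to "about 70%".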
11. TUNING PROCEDURE
Fig. 3 shows the nominal Class-E transistor-voltage waveform in the low-order circuit of Fig. 2: at the transistor's turn-on time, the waveform has zero slope, and has zero voltage for a FET or for a BJT. An actual circuit, or a circuit in the HEPA-PLUS computer program [7], can be brought from an off-nominal condition to that nominal-waveform condition by adjusting and/or and the load resistance R if R is not already the value that will provide the desired output power. The desired value of R is obtained from (6) or (6a) after having applied the allowance for parasitic resistances discussed in the last paragraph of Section 3 above (see footnote 6).
After adjusting the antenna tuner or the load-impedance-transforming network (located between the antenna or other load and the right-hand end of in Fig. 2) so as to provide an input-port resistance of R, there might be residual series inductive and/or capacitive reactances in series with R. The series inductive reactance adds to the reactance of and the series capacitive reactance adds to the reactance of Then the amplifier would operate with an off-nominal waveform, and possibly an off-nominal value of output power, because the effective values of and would differ from the design values. To correct for that, the reactances of and should be reduced by the amounts of the residual inductive and capacitive series reactances of the input-port impedance of the tuner or impedance-transforming network.
The following text and figures explain how to make those adjustments to the circuit, if needed, without needing to know, a priori, the values of those residual series reactances. The text is in terms of a BJT; for a FET, substitute for The circuit parameters were chosen, via equations (2) through (10), to meet a chosen set of requirements. The circuit will operate with the nominal Class-E waveform, while delivering the specified output power at the specified frequency, if the chosen parameter values are installed in the actual hardware.
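If the residual series reactances were known, the correction described above (reducing the reactances of the series inductor and the series capacitor of the load network by the residual amounts) would be simple arithmetic. The following Python sketch shows that arithmetic only as an illustration. The function names, the 10-MHz frequency, the component values, and the residual reactances are all assumptions chosen for the example, not values from this chapter; the tuning procedure in this section exists precisely because the residual reactances are usually not known in advance.

```python
import math


def compensated_inductance(l_design, x_residual_inductive, freq_hz):
    """Series inductance (H) after subtracting a residual inductive reactance (ohms)."""
    omega = 2.0 * math.pi * freq_hz
    x_target = omega * l_design - x_residual_inductive
    if x_target <= 0.0:
        raise ValueError("residual inductive reactance exceeds the design reactance")
    return x_target / omega


def compensated_capacitance(c_design, x_residual_capacitive, freq_hz):
    """Series capacitance (F) after subtracting a residual capacitive reactance.

    x_residual_capacitive is the magnitude (ohms) of the residual capacitive
    reactance; reducing the capacitor's reactance means increasing its capacitance.
    """
    omega = 2.0 * math.pi * freq_hz
    x_target = 1.0 / (omega * c_design) - x_residual_capacitive
    if x_target <= 0.0:
        raise ValueError("residual capacitive reactance exceeds the design reactance")
    return 1.0 / (omega * x_target)


if __name__ == "__main__":
    f = 10.0e6  # assumed operating frequency, Hz
    l_new = compensated_inductance(1.0e-6, 5.0, f)      # 1 uH design, 5 ohm residual
    c_new = compensated_capacitance(500e-12, 3.0, f)    # 500 pF design, 3 ohm residual
    print(f"series inductor:  {l_new * 1e6:.4f} uH")
    print(f"series capacitor: {c_new * 1e12:.1f} pF")
```

After the correction, the component reactance plus the residual reactance equals the original design reactance, so the load network seen by the transistor is unchanged.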
The possible need for tuning results from (a) tolerances on the component values (normally not a problem, because Class E has low sensitivity to component tolerances) and (b) the possibility of unknown-value inductive and capacitive reactances being inserted in series with R (hence in series with and) after the load resistance has been transformed to the chosen value of R. Those series reactances require that the reactances of and be reduced by the amounts of the unknown inserted inductive and capacitive series reactances. But how to do that when those inserted reactances are unknown?
Fig. 4 shows a waveform for an amplifier with off-nominal tuning, with the waveform features labeled for subsequent reference in the text. If we know a priori how changes of and will affect that waveform, we can adjust two parameters and so as to meet two criteria at the operating frequency: (a) achieve the nominal waveform of Fig. 3 and (b) deliver the specified value of output power. Fig. 5 shows how and affect the waveform. We know also that increasing reduces the output power, and vice versa. With the preceding information, and with (a) an oscilloscope displaying the waveform and (b) a directional power meter indicating the power being delivered to the load, we can adjust and so as to fulfill simultaneously the two desired conditions (nominal waveform and desired output power) even if the inductive and capacitive reactances in series with R are unknown.
If (comprising the transistor output capacitance and the external capacitor connected in parallel with it) is within about 10% of the intended value, will normally not need adjustment. But in case of a possible large deviation from the design value, can also be adjusted so as to achieve the nominal waveform, using the information in Fig. 5 about the effects of on the waveform.
In that case, the three components and can be adjusted so as to achieve three conditions simultaneously at the operating frequency: desired output power, transistor voltage of just before transistor turn-on, and zero slope of the waveform just before turn-on. The following diagrams and text explain how to adjust and R (if desired) to adjust the shape of the waveform.
Changes in the values of the load-network components affect the waveform as follows, illustrated in Fig. 5:
Increasing moves the trough of the waveform upwards and to the right.
Increasing moves the trough of the waveform downwards and to the right.
Increasing moves the trough of the waveform downwards and to the right.
Increasing R moves the trough of the waveform upwards (R is not normally an adjustable circuit element).
Knowing these effects, you can adjust the load network for nominal Class-E operation by observing the waveform. (Depending on the settings of the circuit component values, the zero-slope point and/or the negative-going jump at transistor turn-on might be hidden from view, as in some of the waveforms in Fig. 6. If that occurs, the locations of those features on the waveform can be estimated by extrapolating from the part of the waveform that can be seen.)
The adjustment procedure is as follows:
1. Set R to the desired value or accept what exists.
2. Set for the desired or accept what exists.
3. Set the frequency as desired.
4. Set the duty ratio to the desired value (usually 50%), with set to approximately 20% of the intended final value. If the transistor turn-on is visible on the waveform (as in Fig. 4), measure the duty ratio. Otherwise, observe the waveform and assume that turn-on occurs when the positive-going edge of reaches +0.8 V and turn-off occurs when the negative-going edge of reaches 0 V. (The preceding voltage values are for a silicon NPN transistor at room temperature. For other types of transistors, make appropriate modifications to the voltage values.)
5. Observe the trough of the waveform:
A. At the zero-slope point: What is the voltage relative to More positive, more negative, or equal?
B. At transistor turn-on: What is the slope? Positive, negative, or zero?
If these points are unobservable because they lie below the zero-voltage axis, the voltage at zero slope is “more negative.” Estimate the slope at turn-on by extrapolation of the waveform. If the voltage at zero slope is unobservable because transistor turn-on occurs before zero slope is reached, the slope at turn-on is “negative.” Estimate the voltage at zero slope by extrapolation of the waveform. If you cannot estimate the or the slope by extrapolation, assume that is “equal” or that the slope is “zero.”
6. Adjust and/or as shown in Fig. 5, and in expanded form in Fig. 6.
7. If is now the desired value, go to Step 8. If is less than the desired value, increase by up to 50% and readjust the duty ratio, and as needed. (The increase will decrease the effective value of the voltage-dependent causing the effective value of to be reduced. Therefore will need to be increased slightly.)
8. For a final check of the adjustments, increase slightly to generate an easily visible marker of transistor turn-on: the small negative-going step of Verify that the duty ratio is the desired value (usually 50%) and that the waveform slope is zero at turn-on time. Now return to the value that brings the waveform to at turn-on time (and also eliminates the marker).
12. GATE- AND BASE-DRIVER CIRCUITS
A simplistic view of the driver stage is that its design is much less important than the design of the output stage, because the power level at the driver stage is much lower than that at the output stage, by a factor equal to the power gain of the output stage, typically a factor of about 10 to 100. That simplistic view is not correct, because the output transistor will not operate as intended if its input is not driven properly. If the output transistor does not operate as intended, the output stage will not operate as intended, either. The resulting output-stage performance might or might not be acceptable.
The output-stage transistor will operate properly as a switch, as intended, if its input port (Gate-Source of a FET or Base-Emitter of a BJT) is driven properly by the output of its driver stage. The driver stage must provide the output specified below. Symbols for FETs are used below; you can convert to BJT symbols if you wish.
1. Enough "off" bias during the "off" interval to maintain the drain or collector current at an acceptably small value. If you are willing to tolerate a power loss of a fraction x of the normal dc input power due to non-zero "off"-state current, the drain or collector current during the "off" interval can be up to where is the dc current drawn from the dc drain-voltage supply, and D is the output-transistor's "on" duty ratio (usually 0.50, but it can be any value you choose and provide for in the choice of R, L, and C values in the load network). Example: If you are willing to tolerate 1% additional power consumption from the voltage supply caused by the non-zero "off"-state current, if is 5 A, and if D is the usual value of 0.50, you can tolerate an "off"-state drain current of 0.01 [5 A] [1/(1 - 0.50)] = 0.1 A = 100 mA. That's easy. For example, the International Rectifier IRF540 (rated 100 V, 28 A) is specified for 0.25 mA maximum at and V, at a factor of 400 smaller than the 100 mA you are willing to accept in this example.
2. Enough "on" drive during the latter 3/4 of the "on" interval to maintain a low-enough You can choose what is "low-enough" for your purposes; refer to the last three sentences of Section 3. Why "the latter 3/4 of the 'on' interval": The current i(t) during the first 1/4 of the "on" interval is small enough that can be acceptably small for a fairly high because the small i(t) during the first 1/4 of the "on" interval causes an even smaller (the square of a small number is even smaller than the number before squaring).
3. Enough turn-off drive to turn off the drain or collector current from 100% to 0% in a fall-time fast enough to make the turn-off power dissipation an acceptably small fraction of the output power. That fraction is where and is the period of the operating frequency Choose the acceptable fraction of the output power to be dissipated during the non-zero turn-off switching time. Then calculate the required drain- or collector-fall time that must result from the "enough turn-off drive." Then provide sufficient turn-off drive to accomplish your chosen objective, according to the characteristics of the chosen output transistor. (That is the subject of an intended future publication.) For example, if you are willing to have the turn-off power dissipation be 6% of the output power, and if the allowable value for i.e., can be as large as 10.6% of the period.
4. Enough turn-on drive to turn on the output transistor fast enough to make an acceptably small power dissipation during the turn-on switching. That has never been a problem with any of the drivers I have seen. Most driver circuits turn the transistor "on" and "off" with about the same switching times. If the more-important turn-off switching time is fast enough, the accompanying turn-on switching time will be more than fast enough.
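The "off"-state current budget in requirement 1 can be reproduced numerically. The sketch below uses the relation implied by the worked example in the text, namely tolerable current = x × (dc supply current) / (1 - D). The function and variable names are hypothetical, and the 0.25-mA figure is the IRF540 maximum quoted above.

```python
def max_off_state_current(loss_fraction, dc_supply_current_a, on_duty_ratio=0.50):
    """Largest tolerable "off"-state drain or collector current, in amperes.

    loss_fraction: tolerated extra power loss as a fraction of dc input power (x).
    dc_supply_current_a: dc current drawn from the drain-voltage supply, in A.
    on_duty_ratio: output transistor's "on" duty ratio D (0 <= D < 1).
    """
    if not 0.0 <= on_duty_ratio < 1.0:
        raise ValueError("on_duty_ratio must satisfy 0 <= D < 1")
    return loss_fraction * dc_supply_current_a / (1.0 - on_duty_ratio)


if __name__ == "__main__":
    limit_a = max_off_state_current(0.01, 5.0, 0.50)  # example values from the text
    device_max_a = 0.25e-3                            # IRF540 figure quoted in the text
    print(f"tolerable off-state current: {limit_a * 1e3:.0f} mA")
    print(f"margin over device maximum:  {limit_a / device_max_a:.0f}x")
```

With the text's values (x = 1%, 5 A, D = 0.50) this gives the same 100 mA limit and the same factor-of-400 margin.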
The input-port characteristics of BJTs, MOSFETs, and MESFETs are different enough that different types of driver circuits should be used to drive those three different types of transistors (see footnote 7). I intend to publish one or more future articles that discuss in detail driver circuits that meet criteria 1-4 above, for MOSFETs, MESFETs, and BJTs. A brief summary of driving a MOSFET or a MESFET follows. The polarity descriptions assume N-channel or NPN; reverse the polarity descriptions for P-channel or PNP.
The best gate-voltage drive is a trapezoid waveform, with the falling transition occupying 30% or less of the period. (Trade-off: The shorter the turn-off transition time, the smaller will be the power dissipation in the output transistor during turn-off switching, but the larger will be the power consumption of the driver stage. For both MOSFETs and MESFETs, the optimum drive minimizes the sum of the output-stage power dissipation and the driver-stage power consumption.) The upper level of the drive waveform should be safely below the MOSFET's gate-source maximum voltage rating, or the MESFET's gate-source voltage at which the gate-source diode conducts enough current to cause either of two undesired effects: (a) metal migration of the gate metalization at an undesirably rapid rate (making the transistor operating lifetime shorter than desired) or (b) enough power dissipation to reduce the overall efficiency more than the efficiency is increased by the lower dissipation in the lower that results from a higher upper level of the drive waveform. The lower level of the trapezoid should be low enough to result in a satisfactorily small current during the transistor's “off” state, discussed in requirement 1 above.
A sine-wave is a usable (but not optimum) approximation to the trapezoid waveform described above. To obtain an output-transistor “on” duty ratio of 50% (usually the best choice, but a larger or smaller duty ratio can be used if appropriate component values are used in the load network), the zero-level of the sine-wave should be positioned slightly above the FET's turn-on threshold voltage. A better approximation is to remove the part of the sine-wave that goes below the value that ensures fully “off” operation, replacing it with a constant voltage at that value. This reduces the input-drive power by slightly less than 50%, almost doubling the power gain of the output stage. A planned future article will discuss in detail a simple circuit that generates such a waveform.
ACKNOWLEDGEMENTS
The author thanks Prof. Alan D. Sokal of the Physics Department, New York University, for many helpful discussions and for producing the numerical solutions in Table I and the initial set of equations that fit the data in Table I; John E. Donohue, formerly of Design Automation, Inc., for computing the coefficients of in (4) to fit the data in Table I, yielding (5) and (6); and Dr. Richard Redl of ELFI S.A., for computing the improved-accuracy functions in (5a), (7), and (9) that fit the P, and data of Table I. This text is an edited version [including correction of a typographical error in (1)], of “Class E RF Power Amplifiers,” published in QEX magazine, Jan./Feb. 2001, Issue No. 204, pp. 9-20, copyright 2000,
American Radio Relay League, Inc. That article added significant new information to text taken from “Class-E High-Efficiency Power Amplifiers, from HF to Microwave,” presented by this author at the IEEE International Microwave Symposium, June 1998, Baltimore, Maryland, U.S.A., and “Class-E Switching-Mode High-Efficiency Tuned RF/Microwave Power Amplifier: Improved Design Equations,” presented by this author at the IEEE International Microwave Symposium, June 2000, Boston, Massachusetts, U.S.A. The texts of the presented papers are included in the printed and CD-ROM versions of the Proceedings of the conferences, copyright 1998 and 2000, respectively, by IEEE. The author thanks ARRL and IEEE for permission to use the previously published material.
FOOTNOTES
1 Most papers on the Class E amplifier of Fig. 2 (including this one) define as A few papers, e.g., [3], define as Kazimierczuk and Puczko [5], to their credit, give both values in their tabulations, as and as respectively.
2 The choice of involves a trade-off among operating bandwidth (wider with low harmonic content of the output power [11] (lower with high and power loss in the parasitic resistances of the load-network inductor and capacitor (lower with low
3 The nominal switch-voltage waveform has zero voltage and zero slope at the time the switch will be turned on. [1]-[4], and papers by other authors, referred to that nominal waveform as the “optimum” waveform, a misnomer. That waveform is “optimum” for yielding high efficiency in the case of a switch with negligibly small series resistance. But if the switch has appreciable resistance, the efficiency can be increased by moving away slightly from the nominal waveform, to a waveform whose voltage at the switch turn-on time is of the order of 20% of the peak voltage. No analytical optimization procedure yet exists, but the circuit can be optimized numerically, by a computer program such as HEPA-PLUS [7], discussed briefly in Sections 7 and 8.
4 Beware: A few publications define D as the fraction of the period that the switch is off.
5 Updates to [11]: (a) Delete the column in Table I for because must be 1.7879 to obtain the nominal Class-E collector/drain-voltage waveform in the circuit described in [1]-[6], when the switch duty ratio D is 50%. (b) In (2), change the factor 1.42 to 1.0147; the factor 2.08 to 1.7879; and the factor 0.66 to 0.773. (c) Recalculate the numerical values of using (2) with the revised factors.
6 The 1997 two-part QST article [43] by Eileen Lau (KE6VWU) et al., about 300-watt and 500-watt 40-metre transmitters, discussed tuning in Part 2, but without a description of how to adjust the load-network component values to obtain the nominal Class-E voltage waveform, as is included in Section 11 here.
7 In the early 1980s, I made a driver circuit that would drive a BJT or a MOSFET interchangeably, with no change needed in the driver or in the power-amplifier’s input circuit. That driver was used in a Class-E demonstrator circuit, so that a person evaluating Class-E technology could insert any type of transistor for test purposes, be it an NPN BJT or an N-channel MOSFET, and observe that the changes of power-amplifier output power and efficiency were almost unnoticeably small, when inserting any of thirty transistors of different type numbers and manufacturers, some BJT and some MOSFET. Some of those people, accustomed to working with conventional Class-C power amplifiers, were astonished when they witnessed the results of that test.
REFERENCES
[1] N. O. Sokal and A. D. Sokal, "High-efficiency tuned switching power amplifier," U. S. Patent 3,919,656, Nov. 11, 1975 (now expired). [Includes a detailed technical description.]
[2] N. O. Sokal and A. D. Sokal, "Class E – a new class of high-efficiency tuned single-ended switching power amplifiers," IEEE J. Solid-State Circuits, vol. SC-10, no. 3, pp. 168-176, June 1975. [The text of [1] cut to half-length; retains the most-useful information. Text corrections are available from N. O. Sokal.]
[3] F. H. Raab, "Idealized operation of the Class E tuned power amplifier," IEEE Trans. Circuits and Systems, vol. CAS-24, no. 12, pp. 725-735, Dec. 1977.
[4] N. O. Sokal and A. D. Sokal, "Class E switching-mode RF power amplifiers – low power dissipation, low sensitivity to component tolerances (including transistors), and well-defined operation," Proc. 1979 IEEE ELECTRO Conference, Session 23, New York, NY, 25 April 1979; reprinted in R.F. Design, vol. 3, no. 7, pp. 33-38, 41, July/Aug. 1980. [Includes plots of efficiency vs. frequency with as a parameter and efficiency vs. variations of all circuit parameters.]
[5] M. K. Kazimierczuk and K. Puczko, "Analysis of Class E tuned power amplifier at any Q and switch duty cycle," IEEE Trans. Circuits and Systems, vol. CAS-34, no. 2, pp. 149-159, Feb. 1987.
[6] N. O. Sokal, "Class E high-efficiency switching-mode power amplifiers, from HF to microwave," 1998 IEEE MTT-S International Microwave Symposium Digest, June 1998, Baltimore, MD, CD-ROM IEEE Catalog No. 98CH36192 and also 1998 Microwave Digital Archive, IEEE Microwave Theory and Techniques Society, CD-ROM IEEE Product # JP-180-0-081999-C-0.
[7] HEPA-PLUS computer program, available from the author's employer, Design Automation, Inc., 4 Tyler Rd., Lexington, MA 02420-2404, U.S.A.
[8] Ch. P. Avratoglou and N. C. Voulgaris, "A new method for the analysis and design of the Class E power amplifier taking into account the factor," IEEE Trans. Circuits & Systems, vol. CAS-34, no. 6, pp. 687-691, June 1987.
[9] F. H. Raab and N. O. Sokal, "Transistor power losses in the Class E tuned power amplifier," IEEE J. Solid-State Circuits, vol. SC-13, no. 6, pp. 912-914, Dec. 1978.
[10] N. O. Sokal and R. Redl, "Power transistor output port model for analyzing a switching-mode RF power amplifier or resonant converter," RF Design, June 1987, pp. 45-48, 50-53.
[11] N. O. Sokal and F. H. Raab, "Harmonic output of Class-E RF power amplifiers and load coupling network design," IEEE J. Solid-State Circuits, vol. SC-12, no. 1, pp. 86-88, Feb. 1977. [Text corrections are available from N. O. Sokal.]
[12] E. W. Bryerton, W. A. Shiroma, and Z. B. Popovic', "A 5-GHz high-efficiency Class-E oscillator," IEEE Microwave and Guided Wave Letters, vol. 6, no. 12, Dec. 1996, pp. 441-443. [300 mW to external load at 5 GHz at 59% conversion efficiency (remaining RF output from transistor was used for input-drive to oscillator).]
[13] F. N. Sechi, "High efficiency microwave FET amplifiers," Microwave J., Nov. 1981, pp. 59-62, 66. [Several "saturated Class B and Class AB" amplifiers at 2.45 GHz, using several types of GaAs MESFETs: 0.97 W at 71% PAE, 1.2 W at 72% PAE, 1.27 W at 72% PAE. The waveform in Fig. 2 is a low-order Class-E waveform with apparently = (2.7 V)/(0.688 A) = 3.9 ohms. All drain-current waveforms are sinusoidal; that seems to be inconsistent with the non-sinusoidal drain-voltage waveforms. Perhaps the bandwidth of the current-sensing instrumentation was sufficient to display only the fundamental component of the probably non-sinusoidal current waveforms.]
[14] A. Mallet, D. Floriot, J. P. Viaud, F. Blache, J. M. Nebus, and S. Delage, "A 90% power-added-efficiency GaInP/GaAs HBT for L-band and mobile communication systems," IEEE Microwave and Guided Wave Letters, vol. 6, no. 3, pp. 132-134, March 1996. [Fig. 1 is well-annotated with the HBT parameter values, but it omits values for]
[15] S. R. Mazumder, A. Azizi, and F. E. Gardiol, "Improvement of a Class-C transistor power amplifier by second-harmonic tuning," IEEE Trans. MTT, vol. MTT-27, no. 5, pp. 430-433, May 1979. [800 mW output at 865 MHz, 53.3% collector efficiency, coupled-TEM-bar circuit. In a similar paper at the 9th European Microwave Conference, Sept. 1979, the same authors reported 64% collector efficiency at 800 mW output at 850 MHz.]
[16] J. J. Komiak, S. C. Wang, and T. J. Rogers, "High efficiency 11 watt octave S/C-band PHEMT MMIC power amplifier," Proc. IEEE 1997 MTT-S International Microwave Symp., Denver, CO, June 8-13, 1997, IEEE Catalog No. 0-7803-3814-6/97, pp. 1421-1424. [17 W at 5.1 GHz, 54.5% PAE, harmonic tuning]
[17] J. J. Komiak, L. W. Yang, "5 watt high efficiency wideband 7 to 11 GHz HBT MMIC power amplifier," Proc. IEEE 1995 Microwave and Millimeter-Wave Monolithic Circuits Symp., Orlando, FL, May 15-16, 1995, IEEE Catalog No. 95CH3577-7, pp. 17-20.
[18] W. S. Kopp and S. D. Pritchett, "High efficiency power amplification for microwave and millimeter frequencies," 1989 IEEE MTT-S Digest, IEEE Catalog No. CH2725-0/89/0000, pp. 857-858.
[19] Bill Kopp and D. D. Heston, "High-efficiency 5-watt power amplifier with harmonic tuning," 1988 IEEE MTT-S Digest, pp. 839-842. [12 FETs in parallel produced (from Table 3) 5.27 W output (apparently 0.27 W of that is lost in power-combining network) at 10 GHz with 35.3% PAE (Abstract says 5 W at 36% PAE). Exhaustive search for best combination of impedance vs. frequency. Built with distributed elements.]
[20] L. C. Hall and R. J. Trew, "Maximum efficiency tuning of microwave amplifiers," 1991 IEEE MTT-S Digest, IEEE Catalog No. CH2870-4/91/0000, pp. 123-126. [Circuit simulations of optimum design found by exhaustive search of 12-dimensional parameters-space; the resulting design appears to be higher-order Class E with 3rd-harmonic resonator (Class F3).]
[21] T. Mader, M. Markovic', Z. B. Popovic', and R. Tayrani, "High-efficiency amplifiers for portable handsets," Conference Record, IEEE PIMRC'95 (Personal, Indoor, & Mobile Radio Communications), Sept. 1995, Toronto, Ontario, Canada, IEEE publication 0-7803-3002-1/95, pp. 1242-1245. [Class E, 0.94 W at 1 GHz, at 75% drain efficiency, 73% PAE, Siemens CLY5 GaAs MESFET]
[22] T. B. Mader and Z. B. Popovic', "The transmission-line high-efficiency Class-E amplifier," IEEE Microwave and Guided Wave Letters, vol. 5, no. 9, Sept. 1995, pp. 290-292. [0.94 W at 1 GHz at 75% drain efficiency, 73% PAE; 0.55 W at 0.5 GHz at 83% drain efficiency, 80% PAE; Siemens CLY5 GaAs MESFET]
[23] T. B. Mader, "Quasi-optical Class-E power amplifiers," PhD thesis, 1995, Univ. of Colorado, Boulder, CO. [Class E with transmission lines: 0.55 W at 0.5 GHz at 83% drain efficiency, 80% PAE from Siemens CLY5 MESFET; 0.61 W at 5 GHz at 81% drain efficiency, 72% PAE from Fujitsu FLK052WG MESFET; four of the latter into a quasi-optical power combiner gave 2.4 W at 5.05 GHz at 74% efficiency, 64% PAE.]
[24] T. Sowlati, C. A. T. Salama, J. Sitch, G. Rabjohn, and D. Smith, "Low voltage, high efficiency GaAs Class E power amplifiers for wireless transmitters," IEEE J. Solid-State Circuits, vol. 30, no. 10, pp. 1074-1080, Oct. 1995; same authors and almost-identical title and text in Proc. IEEE GaAs IC Symposium, Philadelphia, PA, Oct. 18-19, 1994, IEEE Catalog No. 0-7803-1975-3/94, pp. 171-174. [24 dBm = 0.25 W output at 835 MHz, at >50% power-added efficiency using integrated impedance-matching networks (PAE would be 75% with hybrid matching networks), from a GaAs IC at 2.5 Vdc]
[25] T. Sowlati, Y. Greshishchev, C. A. T. Salama, G. Rabjohn, and J. Sitch, "Linear transmitter design using high efficiency Class E power amplifier," Conference Record, IEEE PIMRC'95 (Personal, Indoor, & Mobile Radio Communications), Sept. 27-29, 1995, Toronto, Ontario, Canada, IEEE publication 0-7803-3002-1/95, pp. 1233-1237. [24 dBm = 251 mW at 835 MHz, 65% PAE]
[26] J. Imbornone, R. Pantoja, and W. Bosch, "A novel technique for the design of high efficiency power amplifiers," European Microwave Conference, Cannes, France, Sept. 1994. [32.1 dBm = 1.6 W output at 850 MHz, at 62.3% power-added efficiency, from a GaAs IC (output stage and driver stage) with high-Q lumped elements, at 5 Vdc. Simulated and waveforms for optimized output stage are Class E with V/27.4 V = 18%, as discussed in Section 7.]
[27] K. Siwiak, "A novel technique for analyzing high-efficiency switched-mode amplifiers," Proc. RF Expo East '90, Nov. 1990, pp. 49-56. [higher-order Class E with 3rd-harmonic resonator (Class F3)]
[28] C. Duvanaud, S. Dietsche, G. Pataut, and J. Obregon, "High-efficient Class F GaAs FET amplifiers operating with very low bias voltages for use in mobile telephones at 1.75 GHz," IEEE Microwave and Guided Wave Letters, vol. 3, no. 8, pp. 268-270, Aug. 1993. [higher-order Class E with 3rd-harmonic resonator (Class F3)]
[29] R. M. Porter and M. L. Mueller, "High power switch-mode radio frequency amplifier method and apparatus," U. S. Patent 5,187,580, Feb. 16, 1993. [Class E with substantial voltage at turn-on, as in Section 4 here.]
[30] Y-O Tam and C-W Cheung, "High efficiency power amplifier with travelling-wave combiner and divider," Int. J. Electronics, vol. 82, no. 2, pp. 203-218, 1997. [Class E 450 MHz/5 W with 89.4% collector efficiency. The outputs of four such amplifiers were combined with a traveling-wave power-combiner, yielding 14.96 W output at 89.5% collector efficiency.]
[31] J. E. Mitzlaff, "High efficiency RF power amplifier," U. S. Patent 4,717,884, Jan. 5, 1988. [1.6 W at 76% drain efficiency at 840 MHz. At least 1.5 W output with [at least?] 74% efficiency over 50-MHz band centered at 840 MHz (6% band). Described as Class F. Appears to be high-order Class E with lumped and transmission-line resonators. Shows transistor voltage and current waveforms for three "prior-art" circuits, but not for the circuit covered by this patent. Detailed explanation of how to synthesize load network to produce desired input-port impedance vs. frequency.]
[32] M. Kessous and J.-F. Zürcher, "Amplificateur VHF en classe E utilisant un transistor à effet de champ (FET) VMOS de puissance" (VHF Class E amplifier using VMOS power FET), AGEN-Mitteilungen (Switzerland), no. 30, pp. 45-49, Oct. 1980. [2.58 W output at 145 MHz at 96.5% drain efficiency, 81.3% total using Siliconix VMP-4 MOSFET]
[33] N. O. Sokal, "Design of a Class E RF power amplifier for operation at 2.45 GHz, and tests on a scaled-frequency model at 122.5 MHz" [1/20 frequency], Oct. 1979, unpublished report of Design Automation, Inc. Project 4198. [Used Raytheon RPC3315 GaAs MESFET intended to be used at 2.45 GHz. Initial test with frequency scaled-down by factor of 20, all inductors and capacitors (including transistor capacitances and expected wiring parasitic inductances) scaled-up by factor of 20, and all resistances, voltages, and currents at intended final values. 210 mW output, 77% drain efficiency, 24 mW input drive, 9.4 dB power gain, 71% overall efficiency, 68% PAE.]
[34] D. W. Cripe, "Improving the efficiency and reliability of AM broadcast transmitters through Class-E power," National Association of Broadcasters annual convention, May 1992, 7 pp.
[35] S. Hinchliffe and L. Hobson, "High power Class-E amplifier for high-frequency induction heating applications," Electronics Letters, vol. 24, no. 14, pp. 886-888, July 7, 1988. [>550 W at 3-4 MHz at >92% efficiency across the band, 450 W at 3.3 MHz at 96% efficiency from 104 Vdc, IRF450 MOSFET.]
[36] R. Redl and N. O. Sokal, "A 14-MHz 100-watt Class E resonant [dc/dc] converter: principles, design considerations and measured performance," Power Electronics Conf., San Jose, CA, Oct. 1986. [Class E dc/dc converter had 87% drain efficiency at 100 W dc output. IRF540 RF power stage supplied estimated 105 W at 91.4% efficiency because of estimated 5 W loss in coupling transformer and rectifier associated with 100-W dc load.]
[37] N. O. Sokal and Ka-Lon Chu, "Class-E power amplifier delivers 24 W at 27 MHz, at 89-92% efficiency, using one transistor costing $0.85," Proc. RF Expo East, Tampa, FL, Oct. 1993, pp. 118-127, and presented at RF Expo West, San Jose, CA, March 1993 but not in Proc. [International Rectifier (89%) and Harris Semiconductor (92%) IRF510 SMPS MOSFET; Harris device slightly larger die, lower and higher efficiency. Silicon-gate (about 1-2 ohms, but never specified by vendor) was borderline-acceptable at 27.12 MHz for input-drive power. varies as it would have been quite acceptable at 13.56 MHz.]
[38] N. O. Sokal and I. Novak, "Tradeoffs in practical design of Class-E high-efficiency RF power amplifiers," Proc. RF Expo East, Tampa, FL, Oct. 1993, pp. 100-117, and presented at RF Expo West, San Jose, CA, March 1993, but not in Proc.
[39] P. J. Poggi, "Application of high efficiency techniques to the design of RF power amplifier and amplifier control circuits in tactical radio equipment," Proc. MILCOM'95, San Diego, CA, Nov. 5-8, 1995, pp. 743-747.
[40] S. C. Cripps, RF Power Amplifiers for Wireless Communications, Artech House, Norwood, MA, 1999, ISBN 0-89006-989-1, pp. 170-177. [Fig. 6.19 on p. 176: GaAs MESFET, 840 MHz, 79% efficiency at 1.24 W output, 15 dBm (31.6 mW) input, power gain = 1.24 W/0.0316 W = 39.2 = 15.9 dB.]
[41] M. D. Weiss, M. H. Crites, E. W. Bryerton, J. F. Whitaker, and Z. Popovic', "Time-domain optical sampling of switched-mode microwave amplifiers and multipliers," IEEE Trans. MTT, vol. 47, no. 12, pp. 2599-2604, Dec. 1999.
[42] M. D. Weiss and Z. Popovic', "A 10 GHz high-efficiency active antenna," 1999 IEEE MTT-S International Microwave Symposium Digest, June 13-19, 1999, Anaheim, CA, file TU4B_5.PDF on CD-ROM IEEE Catalog No. 99CH36282C.
[43] E. Lau (KE6VWU), K-W Chiu (KF6GHS), J. Qin (KF6GHY), J. Davis (KF6EDB), K. Potter (KC6OKH), and D. Rutledge (KN6EK), "High-efficiency Class-E power amplifiers – Part 1," QST, vol. 81, no. 5, pp. 39-42, May 1997, and "... Part 2," vol. 81, no. 6, pp. 39-42, June 1997.
See next page for Fig. 6.
LINEAR TRANSMITTER ARCHITECTURES Lars Sundström Ericsson Mobile Communications AB/Lund University Lund, Sweden
ABSTRACT The need for linear transmitter architectures in current and future wireless systems is briefly discussed. The principles and properties of various linear transmitter architectures based on power amplifier linearization and direct modulation are given with a focus on analog implementation and battery-operated user equipment.
1. INTRODUCTION
J. H. Huijsing et al. (eds.), Analog Circuit Design, 303-323. © 2002 Kluwer Academic Publishers. Printed in the Netherlands.
For more than a decade, wireless systems have been migrating from frequency and phase modulation towards more spectrally efficient modulation schemes. First, analog FM-based systems were replaced by digital standards, and now digital standards are being replaced or extended to include more efficient and flexible modulation schemes as we enter the era of third-generation mobile systems and a more diversified use of wireless communication in general. Common to these modulation techniques is that they require more or less linear transmitters to produce, at high power levels, a sufficiently accurate waveform that, in particular, preserves the desired spectral properties. Let us exemplify with some important standards:
D-AMPS (US) and PDC (Japan) - based on π/4-DQPSK narrowband modulation and root raised-cosine (RRC) filtering.
GSM - based on GMSK modulation (constant envelope, does not require a linear transmitter), but the extended standard (GSM Phase 2+) includes linear modulation (EDGE) that is spectrally compatible with the GMSK signal.
WCDMA - based on a QPSK-like wideband modulation scheme with RRC filtering. An extension of the standard is under development to provide higher data rates by the use of adaptive modulation and multicode schemes.
Bluetooth - based on FM today (constant envelope, does not require a linear transmitter), but an extension is currently being developed (Bluetooth 2) to improve the data rate substantially by the use of linear adaptive modulation that will require a linear transmitter.
Hiperlan II - one of the alternatives is based on OFDM (linear modulation).
For each standard there is a detailed specification for the transmitter that regulates output RF spectrum emissions (spectrum emission mask, adjacent channel leakage power ratio etc.) to ensure compatibility in the frequency domain in general, and the error vector magnitude (EVM) to ensure that the waveform is sufficiently accurate to avoid significant loss in the link budget. The linearity of the transmitter is very important as it affects both output RF spectrum emissions and EVM. The linearity of a transmitter or the power amplifier (usually the dominating nonlinearity in a transmitter) may be specified as amplitude-to-amplitude (AM-AM) and amplitude-to-phase (AM-PM) conversion, that is, the output amplitude and phase through the nonlinearity as a function of the input amplitude. Note that this models only intermodulation distortion (IMD) and not harmonic distortion. IMD is, however, the most serious distortion product as it coincides with the desired signal in frequency, and the techniques discussed below mainly reduce IMD. Besides linearity, the transmitter should of course have a high power efficiency, in particular if we consider battery-operated user equipment. Furthermore, efficiency will be of increasing importance as we go from voice transmission to data transmission. As more stringent linearity requirements are introduced it will be increasingly difficult to maintain a high efficiency, and this is where the linear transmitter architectures discussed below will be important.
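The AM-AM/AM-PM description above can be made concrete with a small numerical sketch. The PA model below is purely illustrative and not taken from the text: it assumes a Rapp-type soft limiter for AM-AM and a simple power-dependent phase shift for AM-PM, with hypothetical parameters `a_sat`, `p` and `phi_max_deg`. It applies the model to the complex envelope of a two-tone signal and reads off the third-order intermodulation product relative to one tone.

```python
import cmath
import math


def pa_output(x, a_sat=1.0, p=2.0, phi_max_deg=20.0):
    """Memoryless PA model acting on one complex-envelope sample x.

    AM-AM: Rapp-type soft limiter with unity small-signal gain (assumed model).
    AM-PM: phase shift that grows with input power up to phi_max_deg (assumed).
    """
    r = abs(x)
    if r == 0.0:
        return 0j
    am_am = r / (1.0 + (r / a_sat) ** (2 * p)) ** (1.0 / (2 * p))
    am_pm = math.radians(phi_max_deg) * r**2 / (r**2 + a_sat**2)
    return am_am * cmath.exp(1j * (cmath.phase(x) + am_pm))


def dft_bin(samples, k):
    """Single DFT bin (normalised by the number of samples)."""
    n = len(samples)
    return sum(s * cmath.exp(-2j * math.pi * k * i / n)
               for i, s in enumerate(samples)) / n


def two_tone_imd3_dbc(peak, n=256):
    """Third-order IMD relative to one tone, in dB, for a two-tone envelope.

    The envelope peak*cos(theta) has its tones in bins +1 and -1,
    so the third-order products fall in bins +3 and -3.
    """
    x = [complex(peak * math.cos(2 * math.pi * i / n)) for i in range(n)]
    y = [pa_output(v) for v in x]
    return 20 * math.log10(abs(dft_bin(y, 3)) / abs(dft_bin(y, 1)))


if __name__ == "__main__":
    for peak in (0.2, 0.5, 0.9):
        print(f"peak {peak:.1f}: IMD3 = {two_tone_imd3_dbc(peak):.1f} dBc")
```

Because the model acts on the complex envelope only, it produces intermodulation products around the carrier but no harmonics, which is the limitation of the AM-AM/AM-PM description noted in the text.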
Linear transmitter architectures can be divided into linearization architectures and direct modulation architectures. The former is defined as a technique where the nonlinear characteristic of the (final) power amplifier (PA) is exercised such that distortion is generated. A counteracting circuit (linearizer) is added to cancel or reduce the distortion. As opposed to this, direct modulation techniques avoid the nonlinear behaviour of the PAs altogether by applying modulation components directly to or at the PAs. Besides linearity, we may also gain in efficiency when using a linearizer, as the PA can be operated deeper into its nonlinear region where the efficiency is higher. However, the boost in efficiency is limited because the linearizer will basically rescale the signal amplitude range back to regions with less efficiency (although the peak amplitude will remain the same). Thus, if further improvement in efficiency is desired, direct modulation should be considered instead, as it can exploit high-efficiency switched amplifiers for all output amplitudes.
A large number of linearization architectures can be found in the literature. Most if not all are based on one or a combination of three basic techniques: predistortion, feedback and feedforward. The classical feedforward technique is used in multicarrier base-station amplifiers as it provides an unprecedented combination of linearity and bandwidth, but at the price of large volume and very low efficiency. The complexity of a practical feedforward amplifier is usually quite high, as accurate control of phase/delay and gain balance between parallel signal paths is required for proper operation. From a cost point of view this is usually not a problem, as the major cost is connected to the PA devices.
If we instead turn our focus towards linearization techniques that can be implemented with a high level of integration and are well suited for battery-operated and small user equipment, we find predistortion and feedback. As for direct modulation techniques, envelope elimination and restoration (EER) and linear amplification with nonlinear components (LINC) are the two most prominent ones in this respect. These four techniques will be discussed in more detail below.
2. PREDISTORTION
Adding a complementary nonlinearity in series with the PA is probably the most obvious technique for linearization. The basic principle of predistortion as well as predistortion at baseband, IF and RF are illustrated in figure 1. Note that these figures do not define the location of the interface between the digital and the analog domain. In practice, this interface may be located anywhere from the baseband signal generation up to and including an IF predistorter.
Predistortion is widespread, in some cases in the form of simple first-order compensation circuits [1][2] that cancel or reduce, say, the third-order term of the PA nonlinearity, and in other cases as a complex architecture where the actual predistortion takes place in the digital domain [3]. Predistortion can also be used together with other linearization schemes such as feedforward to further improve the linearity in a base-station amplifier. In cases where a high degree of linearization is required, say 20dB and higher, continuous adaptation of the predistorter is a must to track the varying characteristics of the PA, caused by varying temperature, load, transmit power and aging. Thus, the complexity of predistortion tends to become rather high, as some means of monitoring the PA output signal is required to guide the adaptation process.
Another important issue is memory effects. For narrowband signals it can be assumed that the PA is memoryless; consequently the predistorter can be memoryless too. This assumption holds to some extent, but for wideband base-station amplifiers with a stringent spectrum mask this is not the case.
Digital baseband and IF predistortion techniques receive a significant amount of attention nowadays, as they are believed to have the potential of giving more cost-effective, smaller and more flexible base-station amplifiers compared with a feedforward implementation. While it is easier to represent an arbitrary nonlinearity in the digital domain and also to do the necessary adaptation processing etc., the interface between the analog and digital domain will be moved closer to the PA, where the requirements on resolution and sampling rate will be substantially higher [11]. This is of course even more true if the digital part directly generates an IF signal instead of baseband components.
More advanced analog predistortion circuits do also exist. In theory there are many options for generating a predistorter function in the analog domain, such as the multi-tanh and translinear principles. However, a programmable predistorter should have properties that allow for adaptation towards a global minimum of distortion at the output of the PA. Thus it helps if the predistorter synthesizes a simple analytical function. In addition to this, both AM-AM and AM-PM distortion must be corrected for, and this increases the complexity. All this suggests that an analog predistorter should be implemented as a complex-valued polynomial gain function of low order [4][5]. Recently, integrated circuits in both BiCMOS and CMOS technology have been presented [6][7]. Both of them implement a fifth-order complex-valued polynomial
$$v_o = v_i \left(1 + a_3 |v_i|^2 + a_5 |v_i|^4\right) \qquad (1)$$
where $a_3$ and $a_5$ are adjustable coefficients. The circuits can be configured to operate with baseband signals, in which case the signals $v_i$ and $v_o$ are complex-valued, see figure 2, or with IF/RF bandpass signals, see figure 3.
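A minimal sketch of such a polynomial gain function follows, assuming the form $v(1 + a_3|v|^2 + a_5|v|^4)$ with complex coefficients. The PA model `toy_pa` and its coefficient `b3` are hypothetical and chosen only for illustration. The sketch shows how setting the third-order coefficient to the negative of the PA's third-order term, and then choosing the fifth-order coefficient to cancel the cross term that the cascade creates, reduces the deviation of the overall complex gain from unity.

```python
def predistort(v, a3, a5):
    """Fifth-order complex polynomial gain: v * (1 + a3*|v|^2 + a5*|v|^4)."""
    m2 = abs(v) ** 2
    return v * (1 + a3 * m2 + a5 * m2 * m2)


def toy_pa(v, b3=-0.15 + 0.05j):
    """Hypothetical weakly nonlinear PA with a complex third-order term only."""
    return v * (1 + b3 * abs(v) ** 2)


def worst_gain_error(a3, a5, amplitudes):
    """Largest deviation of the cascaded complex gain from 1 over the amplitudes."""
    return max(abs(toy_pa(predistort(complex(r), a3, a5)) / r - 1)
               for r in amplitudes)


if __name__ == "__main__":
    amplitudes = [0.1 * i for i in range(1, 9)]
    b3 = -0.15 + 0.05j
    # Expanding the cascade to fifth order gives the residual coefficient
    # a5 + a3*b3 + 2*b3*Re(a3); with a3 = -b3 it vanishes for this a5.
    a5 = b3 * b3 + 2 * b3 * b3.real
    print("no predistortion :", worst_gain_error(0, 0, amplitudes))
    print("third order only :", worst_gain_error(-b3, 0, amplitudes))
    print("third and fifth  :", worst_gain_error(-b3, a5, amplitudes))
```

The complex coefficients correct AM-AM (real part) and AM-PM (imaginary part) at the same time, which is why a complex-valued polynomial is needed rather than a real one.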
Experimental results with both these circuits have shown that the third-order intermodulation products could be reduced by more than 30dB. The amplifiers that were used were of class A type driven deep into saturation. However, as discussed in [9], the attainable improvement quickly drops as more nonlinear power amplifiers are used, i.e. class B and C amplifiers, when driven by a two-tone signal. When using signals with limited modulation depth the results will improve though.
With wideband signals such as WCDMA the memory effects in the power amplifier and in the surrounding circuitry such as matching networks and filters may have a detrimental effect on distortion cancellation. As discussed in [8] it is possible to introduce memory effects in the predistorter as well by rather simple means and obtain significant improvement in linearity. Adaptation of the coefficients can be made quite easily. The adjacent channel power can be detected to guide an optimization algorithm as in [5]. The disadvantage with such a solution is slow adaptation that may not be able to track variations in the PA characteristics. Variations due to change in load and device temperature (due to change in output power) may call for faster adaptation schemes, and this requires demodulation of e.g. the I and Q components of the output signal. Together with the reference signal these can be used to calculate new coefficients for the polynomial. One such scheme with a rather high complexity is described in [10].
3. FEEDBACK
Negative feedback is a well-established technique to build accurate and linear amplifiers. With RF power amplifiers, however, it is difficult if not impossible to obtain a reasonable amount of loopgain and at the same time preserve stability (although the evolution of process technology might change this). Therefore, at RF, feedback usually means modulation feedback and not feedback of the entire broadband RF signal with all its harmonics. That is, the modulation components of the transmit signal are detected and compared with the corresponding components of the reference signal. This can be done using Cartesian or polar components at baseband or IF/RF. The techniques are illustrated in figures 4 and 5. All of these techniques have been studied in detail for many years. In particular, Cartesian feedback has been put into products and proven to work well with narrowband signals. Furthermore, the technique is considered to be appropriate for TETRA¹ equipment, where the spectrum mask is one of the most stringent.
1. TErrestrial Trunked RAdio, ETSI standard for digital land mobile radio.
For wideband signals, however, modulation feedback appears to be less promising. The reason is the loop delay and how it affects stability. The stability can be studied by means of the phase margin. If the loop is characterised by a DC loop gain of $A_0$ and a single pole cut-off frequency $f_c$, the phase margin can be calculated as
$$\phi_m = 90^\circ - 360^\circ \cdot A_0 f_c \tau \qquad (2)$$
where $\tau$ is the loop delay. Here, it is assumed that the pole contributes with 90°. Typically, the phase margin should be 60° or more to avoid noise peaking far from the carrier [12]. At the same time the loopgain-bandwidth product must be sufficiently large to suppress distortion products to levels below the spectrum mask. We may assume that the baseband part of the loop may be given any transfer function we desire. The delay in the RF part of the loop, however, appears to be rather fixed. Matching networks in the PA and, if present, filters, couplers etc. contribute to a group delay that can be quite significant. The design of the matching networks is dictated by technology, device size, acceptable loss in the matching networks etc. Thus, the matching networks are usually of low order, and a first-order network may be studied by means of its equivalent RLC circuit, which exhibits a group delay of
$$\tau_g = \frac{2Q}{\omega_0} = \frac{Q}{\pi f_0} \qquad (3)$$
where $Q$ is the quality factor and $f_0 = \omega_0 / 2\pi$ is the centre frequency of the network.
Thus, a matching network tuned for 1GHz and with a Q of 3 gives 1ns of group delay. The total delay of a power amplifier is typically several ns, and if we consider an example with 5ns of delay and a DC loopgain of 100 (40dB), then a 60° phase margin is obtained for a loop bandwidth (single-pole cut-off frequency) of about 167kHz. This bandwidth is typically chosen to be the same as the signal bandwidth or larger. Note that the signal bandwidth here refers to the bandwidth of the modulation components. That is, while the Cartesian components have the same spectral properties as the undistorted RF bandpass signal, polar components exhibit a much larger bandwidth. In practice, the picture becomes more complicated, as several additional (parasitic) poles present in the loop will contribute more phase shift, thus leaving less headroom for the delay. To some extent small delays, e.g. in the PA, can be compensated for though. It is worth noting that the fundamental limitation imposed by the delay completely disqualifies the use of most higher-order filters in the loop, e.g. SAW filters, as they exhibit very large group delays (several hundred nanoseconds and higher). Nevertheless, in [13] a Cartesian feedback system was reported to give a 20dB reduction of third-order intermodulation distortion at 1.5MHz away from the carrier using a 500kHz two-tone test.
There is an additional aspect of the stability of Cartesian feedback that makes it more complicated than regular feedback [12]. The synchronism of the reference and output modulation components can only be maintained by a control loop that ensures the proper phase shift along the RF path [14][15]. Any deviation in phase shift from the optimal value will directly degrade the phase margin by the same amount. Note that this includes the varying AM-PM conversion in the power amplifier, which can easily extend to 20° over the entire amplitude range.
If the required improvement in linearity is modest, a more self-sufficient and less complex feedback technique can be exploited, such as envelope feedback and power feedback [16][17]. These only correct for AM-AM distortion, but in many cases this is sufficient. The basic technique is illustrated in figure 6.
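As a numerical cross-check of the loop-delay discussion for Cartesian feedback above, the following sketch evaluates the group delay of a second-order RLC network, $2Q/\omega_0$, and the delay-limited loop bandwidth. It rests on two stated assumptions: the DC loop gain is large, so the unity-gain frequency is approximately the DC loop gain times the single-pole cut-off frequency, and the pole contributes a full 90° at that frequency. The function names are illustrative only.

```python
import math


def rlc_group_delay(f0_hz, q):
    """Group delay at resonance of a second-order RLC network: 2Q/w0 = Q/(pi*f0)."""
    return q / (math.pi * f0_hz)


def phase_margin_deg(dc_loop_gain, f_c_hz, delay_s):
    """Phase margin of a single-pole loop with a pure delay.

    Assumes a large DC loop gain, so that the unity-gain frequency is about
    dc_loop_gain * f_c_hz and the pole contributes a full 90 degrees there.
    """
    f_unity = dc_loop_gain * f_c_hz
    return 90.0 - 360.0 * f_unity * delay_s


def max_cutoff_hz(dc_loop_gain, delay_s, margin_deg=60.0):
    """Largest single-pole cut-off frequency that still meets the phase margin."""
    return (90.0 - margin_deg) / (360.0 * dc_loop_gain * delay_s)


if __name__ == "__main__":
    print("group delay, 1 GHz, Q = 3:", rlc_group_delay(1e9, 3), "s")
    f_c = max_cutoff_hz(dc_loop_gain=100, delay_s=5e-9)
    print("cut-off for 60 deg margin:", f_c, "Hz")
    print("resulting phase margin   :", phase_margin_deg(100, f_c, 5e-9), "deg")
```

Additional parasitic poles are ignored here, so the result is an upper bound on the usable bandwidth rather than a design value.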
To counteract the AM-AM distortion generated in the PA, a variable gain amplifier (VGA) with an appropriate gain control precedes the PA. The gain control is obtained from the error signal given by the difference of the detected RF input and attenuated output signals. The error signal is amplified and filtered, and finally an optional offset is applied. The input and attenuated output signals are detected individually using exactly the same operation D, in practice an envelope detector or power detector. In a practical realization the VGA function might be a separate entity as in [17], but it could also be incorporated into the PA to reduce complexity and delay. In any case, the VGA function must not introduce any AM-PM distortion, as there is no compensation for phase variations. This is usually not a problem though, as the required gain control range of the VGA is rather limited. A closer look at this scheme reveals that it is not as intuitive as regular feedback; the input-output amplitude relationship is given by
for envelope feedback and
for power feedback where and are the input and output amplitudes, respectively. In both cases we can define the loopgain as Note, however, that this is not entirely correct. If the gain variations in e.g. the PA is large we should instead consider the differential loopgain. In both cases it is, however, readily seen that the expression may be simplified to
if the loopgain is sufficiently large. But as the loopgain is dependent on the input signal amplitude there will be no control of the loop for small input amplitudes and this is where the optional coefficient will be effective. Assuming that the PA is linear with a gain for the range where the loopgain is insufficient, we may set
to obtain the desired gain for small input amplitudes as well. If the offset is not properly set, the architecture will actually become nonlinear even when a linear PA is used. To further illustrate the behaviour of this scheme, the amplitude gain derived from (4) and (5) is illustrated in figure 7 with various sets of parameters and a weakly nonlinear PA. From figure 7 it can be seen that a proper setting of the offset is just as important as having a sufficient loopgain. As a matter of fact, there is no reason to increase the loopgain if it is not followed by a corresponding increase in the accuracy of the offset.
Judging from figure 7 alone, there is no reason to use a power detector, as the loopgain appears to drop much more rapidly with decreasing input signal than when using envelope detectors. But an envelope detector is a very nonlinear block. As such it is difficult to obtain matched detectors for the input and output signals, respectively. Mismatch between the detectors will degrade the linearization performance. Furthermore, a signal that runs close to the origin will result in large spectral expansion, which will increase the loop bandwidth requirements. A power detector or squaring function, on the contrary, provides both good matching and limited bandwidth expansion.
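The role of the offset in envelope and power feedback can be illustrated with a static sketch. The notation and model here are assumptions made for illustration and are not taken from the text: the VGA gain is taken to be equal to the control signal `a * (D(r_in) - D(r_out / k)) + c`, the PA is modelled as a constant gain `pa_gain`, the loop filter is ignored, and `k` is the desired overall gain (the inverse of the output attenuation). Under these assumptions the loop equation can be solved in closed form for both detector types.

```python
import math


def envelope_feedback_output(r_in, pa_gain, k, a, c):
    """Output amplitude with envelope detectors, D(x) = x (assumed static model).

    Solves r_out = pa_gain * r_in * (a * (r_in - r_out / k) + c) for r_out.
    """
    return pa_gain * r_in * (a * r_in + c) / (1.0 + pa_gain * a * r_in / k)


def power_feedback_output(r_in, pa_gain, k, a, c):
    """Output amplitude with power detectors, D(x) = x**2 (assumed static model).

    Solves r_out = pa_gain * r_in * (a * (r_in**2 - (r_out / k)**2) + c),
    a quadratic in r_out, and returns the positive root.
    """
    if r_in == 0.0:
        return 0.0
    q = pa_gain * a * r_in / k**2              # coefficient of r_out**2
    rhs = pa_gain * r_in * (a * r_in**2 + c)   # constant term
    if q == 0.0:
        return rhs
    return (-1.0 + math.sqrt(1.0 + 4.0 * q * rhs)) / (2.0 * q)


if __name__ == "__main__":
    pa_gain, k, a = 10.0, 8.0, 50.0
    for c in (k / pa_gain, 0.0):
        print(f"offset c = {c}")
        for r_in in (0.001, 0.01, 0.1, 1.0):
            g_env = envelope_feedback_output(r_in, pa_gain, k, a, c) / r_in
            g_pow = power_feedback_output(r_in, pa_gain, k, a, c) / r_in
            print(f"  r_in={r_in:6.3f}  envelope gain={g_env:6.3f}  power gain={g_pow:6.3f}")
```

In this model the loopgain is proportional to the input amplitude, so with a zero offset the gain collapses for small inputs. With `c = k / pa_gain` the overall gain equals `k` at every amplitude, which mirrors the observation that the offset must be set accurately.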
Practical results are promising: with a fairly linear PA as starting point, a 10dB reduction of distortion was obtained in [17] with a narrowband signal. Provided that the offset is properly set, power and envelope feedback are both self-sufficient schemes.
4. EER - Envelope Elimination and Restoration
Linearization of RF power amplifiers will improve efficiency as it will be possible to operate the PA deeper into the saturation region. However, as the output signal should be a replica of the reference signal, the output signal must have the same modulation depth as the reference signal. Thus, substantially less efficient regions of the PA will still be exercised and the boost in efficiency will be limited. To improve efficiency further the PA must always be operated in saturation or as a switched device. This is the main driver for direct modulation techniques. Envelope elimination and restoration (EER) is one such technique [18] and there are many similar architectures. The basic idea is illustrated in figure 8.
The PA is fed with a constant envelope signal that only contains the phase component of the reference signal and the amplitude component is applied by means of supply modulation. These components may be obtained using a limiter and an envelope detector, respectively, as shown in figure 8. Another option is to generate these components directly from the digital baseband circuitry. As the supply modulation path is far from linear in practice, such an open-loop approach is only feasible for some standards with relaxed requirements on spectral emission and EVM. The linearity can be improved with feedback though in similar fashion to the polar feedback
technique illustrated in figure 5 [19][20] at the expense of reduced bandwidth. Even without a feedback loop the bandwidth of this scheme is rather limited in practice. One reason is found in the spectral expansion of the reference signal as it is decomposed into its polar components. This is exemplified in figure 9 with a WCDMA signal¹. Thus the bandwidth of the amplitude and phase paths of the EER architecture must be very much larger than the bandwidth of the reference signal; at baseband it can correspond to 3-4 times the symbol rate or more, depending on the spectral emission and EVM requirements. Furthermore, as the bandwidth of the various signal paths expands, the system becomes more sensitive, as interfering signals and noise will more easily couple into the circuitry and result in unwanted modulation of the output signal. In addition, as the signal bandwidth is increased the spectral density of the desired components is reduced (assuming that the power of the signals remains the same).
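The spectral expansion of the polar components can be seen even with a very simple test signal. The sketch below uses a two-tone complex envelope, cos(θ), rather than a WCDMA signal, which is an assumption made to keep the example short. The Cartesian signal occupies only DFT bins ±1, whereas its amplitude component |cos(θ)| has slowly decaying even harmonics, in agreement with the Fourier series of a rectified cosine.

```python
import cmath
import math


def dft_magnitude(samples, k):
    """Magnitude of a single DFT bin, normalised by the number of samples."""
    n = len(samples)
    return abs(sum(s * cmath.exp(-2j * math.pi * k * i / n)
                   for i, s in enumerate(samples))) / n


def amplitude_path_spectrum(n=512, k_max=8):
    """DFT magnitudes of |cos|, the amplitude component of a two-tone envelope."""
    amplitude = [abs(math.cos(2 * math.pi * i / n)) for i in range(n)]
    return {k: dft_magnitude(amplitude, k) for k in range(k_max + 1)}


if __name__ == "__main__":
    for k, mag in amplitude_path_spectrum().items():
        print(f"bin {k}: {mag:.5f}")
```

The harmonics of the amplitude component fall off only as the inverse square of the bin number, so a band-limited amplitude path distorts the signal mainly where the envelope passes close to zero.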
1. The modulation format for WCDMA is complex and varies with data rate and spreading factor. For simplicity all examples are based on a regular QPSK signal with filtering according to standard specifications. This corresponds to a special case when the same spreading factor is used for data and control but it does not include complex scrambling (HPSK spreading).
Another bandlimiting factor is the varying supply. For optimal efficiency it is tempting to use a switched DC/DC converter, but the control signal bandwidth of such a converter is rather limited as it must be much lower than the switching frequency. Note also that it is the amplitude component that controls the DC/DC converter and, as shown earlier, the associated bandwidth is much larger than the bandwidth of the reference signal.
The signal paths for the amplitude and phase components are fundamentally different in nature. This means that delay mismatch will be a potential problem, as the two paths will not automatically track very well. Furthermore, as opposed to a delay mismatch between the I and Q components, a delay mismatch between the amplitude and phase components may result in severe spectral expansion at the output of the PA. The sensitivity to delay mismatch varies from one modulation scheme to another. The effect is illustrated for a WCDMA signal in figure 10, where the spectra are shown for delay mismatches of 1%, 2%, 5% and 10% of the chip period (260ns = 1/3.84Mcps). For modulation schemes such as the ones used in WCDMA and EDGE the maximum tolerable delay mismatch is typically a few percent or less of the chip and symbol period, respectively, depending on the design margins.
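To give a feel for the numbers, the sketch below converts a mismatch given in percent of the WCDMA chip period into seconds, and estimates the error caused by a delay between the amplitude and phase paths. The error estimate is a toy model and not a WCDMA simulation: it assumes a two-tone envelope, with the mismatch expressed as a fraction of the envelope period instead of the chip period, and an otherwise ideal EER transmitter.

```python
import math

CHIP_RATE = 3.84e6  # WCDMA chip rate in chips per second


def mismatch_seconds(percent_of_chip):
    """Delay mismatch in seconds for a given percentage of the chip period."""
    return percent_of_chip / 100.0 / CHIP_RATE


def eer_error_db(mismatch_fraction, n=4096):
    """Error power relative to signal power, in dB, for a two-tone envelope.

    The amplitude path lags the phase path by mismatch_fraction of the
    envelope period; everything else is ideal (assumed toy model).
    """
    shift = round(mismatch_fraction * n)
    err = sig = 0.0
    for i in range(n):
        ref = math.cos(2 * math.pi * i / n)
        amp = abs(math.cos(2 * math.pi * (i - shift) / n))  # delayed amplitude path
        sign = 1.0 if ref >= 0 else -1.0                    # phase path: 0 or 180 degrees
        out = amp * sign
        err += (out - ref) ** 2
        sig += ref ** 2
    return 10 * math.log10(err / sig) if err > 0 else float("-inf")


if __name__ == "__main__":
    for pct in (1, 2, 5, 10):
        print(f"{pct:2d}% of chip period = {mismatch_seconds(pct) * 1e9:.1f} ns")
    for frac in (0.01, 0.02, 0.05):
        print(f"mismatch {frac:.2f} of envelope period: {eer_error_db(frac):.1f} dB")
```

The first loop shows that a few percent of the chip period corresponds to only a few nanoseconds, which indicates how tightly the two paths must be matched.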
5. LINC - LInear amplification with Nonlinear Components
LINC [21][22], like EER, can be considered a direct modulation technique. The input signal is divided into two constant envelope phasors that are separately amplified. The RF output signal is obtained by combining these two phasors after the PAs, see figure 11. The first proposed architectures were based on analog circuit techniques [22], but later on the use of digital techniques has been assumed to be the best option [23].
As is the case with EER, the advantage of LINC is its high efficiency potential, as switched amplifiers can be used to amplify the two phasors. LINC also shares the disadvantage of EER in that the nonlinear functions involved result in substantial frequency expansion of the internal signals compared with the reference signal. The phasors in the frequency domain may be represented by the sum of two signals, s(f) ± e(f), where the reference signal s(f) is a narrowband signal and e(f) is a wideband signal. The spectra of these two signals are shown in figure 12 for a WCDMA signal; e(f) has about the same power spectral density as s(f) within the signal bandwidth and decays slowly with increasing frequency. Thus, the linearity of LINC relies on an accurate subtraction of two large quantities, and the effect of a small phase or gain imbalance between the two branches can be detrimental. As the power spectral density of e(f) is as high as -10 to -20dBc in adjacent channels and the requirements on spectral emission for this region could be as low as -60dBc, the accuracy of the subtraction should result in a residue equal to e(f) suppressed by some 40 to 50dB. Thus, this residue is obtained by scaling e(f) with
$$\left| 1 - (1 + \Delta G)\, e^{j \Delta\phi} \right|$$
where $\Delta G$ and $\Delta\phi$ are the relative gain imbalance and phase imbalance, respectively. For example, a 40dB reduction of e(f) is obtained with 0.1dB gain imbalance or 0.5° phase imbalance. If the phasors are generated at baseband and separately upconverted in quadrature modulators, we must also consider the imbalances and offsets within these building blocks [24]. To avoid this problem, the phasors can be generated at IF [25][26] (or even at RF). This would then, on the other hand, prevent us from using accurate digital techniques for the nonlinear functions involved unless a low digital IF is used [27]. With an analog (IF/RF) solution the bandwidth of the nonlinear function is one of the main obstacles for wideband signal generation, but experimental results show that it is at least feasible up to 1MHz of reference signal bandwidth [26].
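The signal separation and the sensitivity to imbalance can be sketched in a few lines. The separation below uses the commonly described form s1 = s/2 + e and s2 = s/2 - e, with e in quadrature to s and scaled so that both phasors have the constant magnitude `r_max / 2`. The scaling by one half and the function names are assumptions for illustration. The second function evaluates how far the e component is suppressed when the second branch has a relative gain and phase error.

```python
import cmath
import math


def separate(s, r_max):
    """Split s into two constant-envelope phasors of magnitude r_max / 2.

    s1 = s/2 + e and s2 = s/2 - e, with e in quadrature to s, so s1 + s2 = s.
    """
    r = abs(s)
    if r == 0.0:
        return complex(r_max / 2), complex(-r_max / 2)
    e = 1j * (s / 2) * math.sqrt(max((r_max / r) ** 2 - 1.0, 0.0))
    return s / 2 + e, s / 2 - e


def residue_suppression_db(gain_imbalance_db=0.0, phase_imbalance_deg=0.0):
    """Level of the uncancelled e component relative to e itself, in dB.

    The second branch is assumed to have the given relative gain and phase error.
    """
    g = 10 ** (gain_imbalance_db / 20)
    factor = abs(1 - g * cmath.exp(1j * math.radians(phase_imbalance_deg)))
    return 20 * math.log10(factor) if factor > 0 else float("-inf")


if __name__ == "__main__":
    s1, s2 = separate(0.3 - 0.4j, r_max=1.0)
    print("phasor magnitudes:", abs(s1), abs(s2), " sum:", s1 + s2)
    print("0.1 dB gain imbalance  :", residue_suppression_db(gain_imbalance_db=0.1), "dB")
    print("0.5 deg phase imbalance:", residue_suppression_db(phase_imbalance_deg=0.5), "dB")
```

Both imbalance cases land close to the 40dB figure quoted above, which illustrates how tight the matching between the two branches has to be.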
While several authors have reiterated the 100% efficiency potential of LINC, very few have addressed this particular topic in more detail. Instead, prototype circuits have consistently been based on the use of traditional power combiners for the recombination process. Thereby, sufficient isolation between the two branches is obtained to avoid cross-products due to a nonlinear interaction of the two PAs. The penalty is loss. It can be shown that the “efficiency” of the power combiner alone is given by $\eta = 1/\mathrm{PAR}$, where PAR is the peak-to-average-power-ratio of the modulated signal, which typically gives $\eta$ below 50%. Despite its obvious shortcomings the LINC technique has been successfully implemented in multicarrier base-station amplifiers that cover the whole CDMA band of 60MHz.
It is also worth mentioning that several techniques have been proposed that are based on LINC with global feedback (encompassing the PAs) to generate the phasors through individually modulated VCOs in each branch [28][29]. By doing so, less linear but power-efficient recombination techniques can be used, as the loop will correct for the errors, but as with any feedback system the attainable bandwidth is limited.
The Neoteric signal concept [30] also derives from the LINC technique. As described above, in LINC the e(t) signal in the two phasors is cancelled by subtraction at the output. This is the only way to cancel e(t), as this signal has the same center frequency as the desired signal. In the Neoteric signal concept, however, the signal that is added to the reference signal has a completely different center frequency but still has the properties that ensure a constant envelope phasor. Now, as the undesired signal is located around another frequency it may be removed by filtering and, furthermore, only one amplifier is necessary. The major disadvantage with this technique is the wideband properties of the Neoteric signal, which must be preserved with high accuracy through the whole transmitter chain.
6. DISCUSSION AND SUMMARY
Linearity is dictated by the standards, efficiency is not. Sufficiently high linearity can always be obtained by means of a properly backed off PA operating in class A/AB at the expense of poor efficiency.
This leaves us with a large potential for efficiency improvement, which can be translated to increased talk time, reduced energy per transmitted bit or whatever measure is the most applicable. This efficiency potential can only be exploited by means of more advanced transmitter architectures. For systems with moderate linearity requirements, direct modulation techniques will be the desired choice as they will provide higher efficiency compared with linearized PAs. Here the starting point is a very nonlinear but high-efficiency switched PA. Similarly, when the linearity specification is stringent, linearized PAs will be the preferred or only solution, as the starting point is a moderately nonlinear PA with moderate efficiency.
There are, however, a number of important issues that make the adoption of linear transmitter architectures non-trivial. For example, the available headroom in power consumption is limited if efficiency is still to be improved, especially for systems with a large power control range and those with small output power levels in general. In addition to this, the large bandwidth associated with many of the new systems cannot easily be handled by most of the techniques discussed. Production cost is another important factor. For example, techniques that are based on IF signals may require additional SAW filters, which are generally avoided whenever possible.
For the various techniques that have been described we may summarize some of the most important disadvantages that should be addressed for a successful adoption of each technique.
Predistortion can be very effective in improving linearity for wideband signals. One major disadvantage with this technique is the need for adaptation of the predistorter nonlinearity, which requires a feedback path from the PA output and most likely some digital signal processing as well.
Feedback techniques are attractive as they, in principle, can be made self-sufficient. Here, however, more focus must be put on lowering the delay in the PA to allow for larger signal bandwidths.
EER and its derivatives are very attractive from an efficiency point of view. The large bandwidths associated with the signal components make this technique sensitive to interfering signals and spectral shaping. Also, the need for an accurate and fast DC/DC converter for supply modulation is a critical issue.
LINC does not require supply modulation, but like EER the technique operates with internal signals that have large bandwidths. Otherwise, the main problem with LINC is the recombination of the PA output signals. No one so far has identified a sufficiently linear technique that can preserve the high efficiency of a switched PA while operating with frequencies in the GHz range.
GaAs Microwave SSPA’s: Design and characteristics
A.P. de Hek and F.E. van Vliet
TNO Physics and Electronics Laboratory
P.O. Box 96864, 2509 JG The Hague, The Netherlands

ABSTRACT
The performance of GaAs SSPA’s is crucial to a rapidly increasing number of systems. This tutorial aims at clarifying the design choices and trade-offs, and at warning the new designer about pitfalls and unexpected problems. After a brief introduction, the tutorial starts with a survey of the relevant GaAs technologies. It then follows the steps of a normal broadband microwave GaAs SSPA design: the transistor unit cell and then the operating point are chosen, and the trade-off between power and gain for the load impedance is made. The topology for the total amplifier is determined, based on paralleling sufficient transistors for the required output power and adding stages to achieve the required gain. Finally the matching is performed, starting at the output and working back to the input. Throughout, the stability of the transistors and of the total amplifier is of concern. Oscillations pose the biggest threat to SSPA designers. The design steps are illustrated with a recent example of a 5-7 Watt, 30 dB gain HFET amplifier. At the end of the tutorial, a relatively long list of references is included, especially to assist the new designer in finding his way in the literature.

J. H. Huijsing et al. (eds.), Analog Circuit Design, 325-345. © 2002 Kluwer Academic Publishers. Printed in the Netherlands.
1. INTRODUCTION
The design of broadband microwave GaAs SSPA’s is a difficult job. The performance of such an SSPA depends strongly on its topology, on the (non-linear) models of the active and passive components used during the design, on the choice of the transistor sizes and shapes, and on the technology chosen at the start. Furthermore, the design is often started with incomplete information, and the parameters mentioned above cannot be changed independently of one another. The performance of the amplifier will therefore be moderate at best unless highly accurate modelling is used for everything in the amplifier. In addition, as if all that were not enough, the nature of common SSPA’s is such that a variety of typical oscillations can occur. However, for a number of systems the performance of the SSPA is of crucial importance for the performance of the system, giving a legitimate reason to spend sufficient time on its design. Historically, this is definitely the case for expensive military phased-array radar systems. Here a sufficient bandwidth is essential for guaranteed radar performance in the presence of a jamming signal, while the maximum output power secures the detection range of the radar and the power-added efficiency requirements limit the amount of DC power needed to operate the radar. Modern low-cost handheld telephones, to take the opposite application, depend on the SSPA performance in a similar way. Talk time is directly related to the efficiency of the SSPA, and stringent linearity and power control requirements give good reasons to invest in design time. There is much more to SSPA’s that falls outside the scope of this tutorial, which is limited to the design. The characterisation methods
of non-linear devices are not discussed, although this should be done properly, especially for power transistors with their characteristically low output impedance. The thermal aspects of SSPA’s are definitely important for reliability reasons and have a direct impact on the packages as well. Nevertheless, thermal issues and packages, compact and physical models, and pulsed vs. continuous operation are not discussed here.
2. TECHNOLOGY
2.1. Introduction
As the title makes clear, this tutorial focuses on GaAs. The reason for this is the frequency range considered. At microwave frequencies above a few GHz, GaAs devices dominate the market; between one and a few GHz, GaAs coexists with Si devices. This is due to the high cut-off frequency of GaAs transistors, which is in turn caused by the high electron mobility and saturation velocity.
The majority of the technologies are either MESFET, HFET or HBT. Until recently, the reliability of HBT transistors was questionable, and we therefore focus on MESFET and HFET technologies. Many of the techniques are, however, directly applicable to HBT designs.
2.2. Transistor modelling
First and most important is the modelling of the power transistor. SSPA’s place very specific requirements on the exact transistor used. Standard foundry models are therefore seldom satisfactory. In order to have maximum flexibility, a matrix of transistors with variations in all design parameters must be available and characterised.
First, S-parameters and IV-curves must be measured. From these, an equivalent circuit is deduced. Standard simulation software has a large variety of models available. Experience so far has shown that the EEFET3 and EEHEMT1 models were satisfactory for our MESFET and HFET designs, respectively. As an illustration, the equivalent circuit of an HFET power transistor is shown in figure 2.1.
For the chosen power transistor, the optimum load impedance must then be determined, based on measurements. This is not an easy measurement, due to the very low output impedance of the transistor. With a measurement system based on passive tuners it is hard to measure at the right output impedance, because the losses in probes and cables prevent the low impedance from being presented to the transistor. An active load-pull set-up [1] is preferable; our own set-up is shown in the figure below for illustration purposes [2].
2.3. Modelling of passive components
The only components left to be modelled are the passive components. In SSPA’s, the passive components considered are usually transmission line structures, capacitors, resistors and via holes. Again, the foundry-supplied models are seldom of sufficient accuracy and should therefore not be trusted. Measurements rule, but fortunately the availability of EM simulators, such as Agilent’s Momentum and Sonnet Software’s EM, has made the modelling task much easier. The use of EM simulations together with proper verification structures has turned out to be the right combination.
The importance of these simulations is illustrated with two cases: The first compares for a microstrip 45° bend the standard equivalent circuit models with EM simulations, and the second compares the measured values of a parallel capacitor embedded in microstrip lines with its EM counterpart. Note that with the meshing of capacitors, it is crucial that the meshes of top- and bottom-plate match.
With the design increasing in complexity, bigger parts of, for example, the matching networks can be simulated as a whole in the EM simulator. Be prepared to perform a lot of these simulations! For some structures the metal thickness plays a role. This is generally difficult to take into account with planar EM simulators, and solving the problem may require a full three-dimensional simulator such as Agilent’s HFSS.
3. UNIT TRANSISTOR CELL
3.1. Introduction
The first step in the high-power amplifier design is the determination of the amplifier topology, see chapter 4, and the selection of the unit transistor cell. In this chapter, section 3.2 discusses the selection of the unit transistor cell and the factors that influence the performance of the transistor. Section 3.3 discusses the effect of the operating class on both output power and power-added efficiency. Finally, section 3.4 discusses the stability of the unit transistor cell.
3.2. Selection unit transistor cell
The size of the unit transistor cell is determined by the required output power and the number of transistors used in parallel. These numbers
result in a required output power per transistor. Consequently, the total gate width needed is known for a given transistor technology. The degree of freedom left to the designer for realising this gate width is the selection of the number of gate fingers and the unit gate width of these fingers. Other factors that influence the performance of the unit transistor cell are the gate-to-gate spacing and the layout of the transistor. The effect of these items depends strongly on the frequency band of interest. In general, the higher the frequency, the more important these factors become. Examples of the two most commonly encountered transistor layouts are shown in figure 3.1. At higher frequencies (f > 5 GHz) transistors based on a fishbone layout will perform better because the unit gate width is smaller; this is discussed in more detail in the remainder of this section. The use of fishbone transistors can also result in amplifiers that occupy a smaller chip area for a given output power, because fewer transistors have to be used in parallel to realise the required output power.
The gain of a transistor decreases as the unit gate width of the transistor increases. This gain degradation is frequency dependent [3]: the higher the frequency, the more the gain decreases. An example of the gain reduction as a function of the unit gate width is shown in figure 3.2.
The results show that the output power is more or less constant and the gain decreases with increasing unit gate width. Therefore, there is a maximum to the unit gate width. When this limit is reached the only way to increase the output power any further is using more gate fingers in parallel. This is only to a certain extent possible. When more gate fingers in parallel are used the phase difference between the inner and outer fingers starts to increase. Consequently, the gain of the transistor will be reduced. Practical limits at 10 GHz are a maximum unit gate width of and a maximum number of gate fingers in parallel of 50.
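The sizing reasoning of this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the power density of 0.5 W/mm and the unit gate width of 100 µm are hypothetical values chosen for the example, not data from this tutorial. Only the limit of 50 fingers is taken from the text above.

```python
import math


def size_unit_cell(p_out_total_w, n_parallel, power_density_w_per_mm,
                   unit_gate_width_um, max_fingers=50):
    """Estimate the gate width and finger count of one unit transistor cell.

    Losses in the output matching network are ignored, so the result is a
    lower bound on the gate width that is really needed.
    """
    # Output power that each of the paralleled transistors has to deliver.
    p_per_transistor_w = p_out_total_w / n_parallel
    # Total gate width per transistor for the assumed power density.
    gate_width_mm = p_per_transistor_w / power_density_w_per_mm
    # Number of fingers needed at the chosen unit gate width (rounded up).
    n_fingers = math.ceil(gate_width_mm * 1000.0 / unit_gate_width_um)
    if n_fingers > max_fingers:
        raise ValueError(
            f"{n_fingers} fingers exceed the limit of {max_fingers}; "
            "use more transistors in parallel"
        )
    return p_per_transistor_w, gate_width_mm, n_fingers


# Hypothetical numbers: 8 W from 8 transistors, 0.5 W/mm, 100 um fingers.
print(size_unit_cell(8.0, 8, 0.5, 100.0))  # (1.0, 2.0, 20)
```

If the finger count exceeds the limit, the sketch raises an error rather than silently widening the fingers, which mirrors the argument above that both the unit gate width and the number of fingers are bounded.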
Another important issue that can be influenced by the designer is the gate-to-gate spacing of the transistor. Reducing this spacing has the obvious advantage that the size of the transistor diminishes and therefore the amplifier size is reduced. A disadvantage is that the junction temperature rises, which decreases the mean time to failure [4]. Figure 3.3 shows an example of the calculated junction temperature as a function of the gate-to-gate spacing. These calculations were performed with the help of Hotpac [5]. A maximum junction temperature of 125 °C is a generally accepted upper limit. From the results depicted in figure 3.3 it can be concluded that the minimum gate-to-gate spacing is A larger spacing will not give a much lower temperature but will only result in a larger unit transistor cell.
3.3. Operating class and load impedance
After the dimensions and layout of the unit transistor cell are chosen, it is time to select the operating point of the transistor and determine the load impedance. The drain voltage is determined by the minimum
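voltage swing formed by the knee voltage of the transistor and the maximum voltage formed by the breakdown voltage. A drain voltage midway between these two extremes is a good choice. As the next step the operating class must be determined. This operating class is directly related to the Power Added Efficiency (PAE) of the transistor,

$$\mathrm{PAE} = \frac{P_{out} - P_{in}}{P_{DC}} = \frac{P_{out}}{P_{DC}}\left(1 - \frac{1}{G}\right) \qquad (3.1)$$

where $P_{out}$ and $P_{in}$ are the RF output and input power, $P_{DC}$ is the DC power and $G = P_{out}/P_{in}$ is the power gain.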
Equation 3.1 shows that not only a good ratio of output power to dissipated DC power is essential, but also that the power gain of the transistor should be as high as possible. In [6] an equation for the PAE is given in which the class A power gain is related to the operating class. The depicted results show that the operating point that gives maximum PAE moves from class A for a transistor with a low power gain to class B for one with a high power gain. In general, a class AB operating point will give the maximum PAE. The operating class also influences the output power. In [6] it is shown that the maximum obtainable output power is found for an operating point that lies between class A and B, and that it decreases towards class C. The choice of operating class is also limited by the application in which the amplifier will be used. Class C, for instance, cannot be used in amplifiers that have high demands with respect to linearity. The discussed result is valid for amplifiers that are operated in their ‘linear’ region. Other ways to improve the PAE are driving the transistor into compression [7] and/or applying harmonic terminations [8,9]. Whether harmonic tuning is applicable depends on the frequency band of interest. If the bandwidth of the application is large, it becomes difficult if not impossible [3] to realise the required load impedances. At higher frequencies, the influence of the capacitor between drain and source increases. This capacitor starts to act as a harmonic termination (short) [10]. Consequently, the effect of harmonic terminations is reduced.
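The dependence of the PAE on the power gain can be made tangible with a small numerical sketch. It uses the standard definition PAE = (Pout - Pin)/PDC; the function names and the numbers (a drain efficiency of 50% and gains of 6, 10 and 20 dB) are hypothetical and serve only as an illustration.

```python
def pae(p_out_w, p_in_w, p_dc_w):
    """Power-added efficiency from RF output, RF input and DC power (in W)."""
    return (p_out_w - p_in_w) / p_dc_w


def pae_from_gain(drain_efficiency, gain_db):
    """PAE written as (Pout / Pdc) * (1 - 1 / G), with G the linear power gain."""
    gain = 10.0 ** (gain_db / 10.0)
    return drain_efficiency * (1.0 - 1.0 / gain)


# Hypothetical transistor with a drain efficiency (Pout / Pdc) of 50 %.
for gain_db in (6.0, 10.0, 20.0):
    print(f"gain = {gain_db:4.1f} dB -> PAE = {pae_from_gain(0.5, gain_db):.3f}")
```

At 10 dB gain the PAE is already within 10% of the drain efficiency, whereas at 6 dB about a quarter of it is lost, which is why a low-gain device favours an operating point closer to class A.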
The next step is the determination of the optimum load impedance. Commonly encountered methods are:
1. The Cripps method [10,11].
2. Load-pull simulation with a large-signal transistor model.
3. Load-pull measurements.
The last method gives the most accurate results but involves the use of costly measurement equipment, which is not always available. In that case, method 1 or 2 can give a reasonable estimate of the load impedance.
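As a rough illustration of method 1, the sketch below computes only the resistive load-line starting point on which the Cripps approach builds: a class A bias midway between knee and breakdown voltage, with a current swing of half the maximum drain current. The drain-source capacitance and all other parasitics are ignored here, and the function name and device values (1 V knee, 17 V breakdown, 0.5 A maximum current) are hypothetical.

```python
def load_line_estimate(v_knee, v_breakdown, i_max):
    """First-order class A load-line estimate for a FET.

    Returns the drain bias voltage (V), the optimum load resistance (ohm)
    and the corresponding linear output power (W).
    """
    # Bias midway between the knee voltage and the breakdown voltage.
    v_drain = 0.5 * (v_knee + v_breakdown)
    # Peak amplitudes of the sinusoidal drain voltage and current.
    v_amp = 0.5 * (v_breakdown - v_knee)
    i_amp = 0.5 * i_max
    # The load line that uses the full voltage and current swing.
    r_opt = v_amp / i_amp
    p_out = 0.5 * v_amp * i_amp
    return v_drain, r_opt, p_out


# Hypothetical device: 1 V knee, 17 V breakdown, 0.5 A maximum drain current.
print(load_line_estimate(1.0, 17.0, 0.5))  # (9.0, 32.0, 1.0)
```

The resistance found this way applies at the intrinsic current source of the transistor. The impedance to be presented at the drain terminal also has to absorb the output capacitance, which is where load-pull simulations or measurements become more accurate.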
3.4. Stability unit transistor cell
The final step that needs to be taken before the overall amplifier design can start is the analysis and, if necessary, improvement of the stability of the unit transistor cell. At microwave frequencies, the stability analysis is commonly performed with the help of the K-factor [12]. Note that the K-factor is not sufficient for the analysis of the
complete amplifier, see also chapter 4. Examples of other methods that can be used to analyse the stability of the transistor can be found in [13,14]. In general, the transistors are not unconditionally stable over the entire frequency band of interest. Therefore, networks that improve the stability of the transistor should be applied. The most effective way is to apply a series RC network at the input of the transistor, see figure 3.5.
Stability improvement with the help of parallel or series feedback is not applicable in the case of a microwave power amplifier because too much gain and output power would be lost. In addition, a network in series with the output of the transistor reduces both the output power and the PAE. The aforementioned RC network also helps to suppress parametric oscillations [15].
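The K-factor test used in this section can be sketched as follows. The formula is the usual Rollett stability factor computed from the two-port S-parameters at a single frequency; the function names and the S-parameter values in the example are hypothetical. In practice the test is repeated at every frequency point, including frequencies far outside the band of interest.

```python
def rollett_k(s11, s12, s21, s22):
    """Return Rollett's stability factor K and |Delta| of a two-port."""
    delta = s11 * s22 - s12 * s21
    k = (1.0 - abs(s11) ** 2 - abs(s22) ** 2 + abs(delta) ** 2) / (
        2.0 * abs(s12 * s21)
    )
    return k, abs(delta)


def is_unconditionally_stable(s11, s12, s21, s22):
    """Unconditional stability requires K > 1 together with |Delta| < 1."""
    k, delta_mag = rollett_k(s11, s12, s21, s22)
    return k > 1.0 and delta_mag < 1.0


# Hypothetical S-parameters at one frequency (complex values are also accepted).
k, delta_mag = rollett_k(0.5, 0.1, 2.0, 0.5)
print(f"K = {k:.5f}, |Delta| = {delta_mag:.3f}")
print(is_unconditionally_stable(0.5, 0.1, 2.0, 0.5))
```

A series RC network at the input lowers the input reflection and the gain at low frequencies, which is how it raises K in the part of the spectrum where the bare transistor is only conditionally stable.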
4. HIGH-POWER AMPLIFIER DESIGN
4.1. Introduction
The previous chapter discussed the selection of the unit transistor cell and of the operating point. In this chapter, the amplifier topology is discussed in section 4.2 and the design of the matching networks in section 4.3. Finally, the chapter concludes in section 4.4 with some remarks regarding the stability analysis of the complete power amplifier.
4.2. Amplifier topology
The previous chapter showed that the performance of a transistor is limited by a maximum number of gate fingers and a maximum unit gate width. Therefore, a number of transistors have to be used in parallel to realise the required output power level, see figure 4.1.
The overall PAE of the complete two-stage amplifier can be calculated as a function of the losses of the matching networks and the transistor parameters.
From this equation it can be concluded that for a high overall PAE of the amplifier:
1. The loss of the output matching network should be as low as possible. Of course, this demand is also essential for the highest possible output power.
2. The gain and the PAE of the last stage transistors should be as high as possible.
3. The PAE and gain of the first stage transistors and the losses of the input and interstage matching networks cannot be neglected.
4.3. Design matching networks
The design of the matching networks starts at the output of the amplifier. After this network is realised, the interstage matching network(s) are designed. Finally, the input matching network is designed. The matching networks must perform the following functions:
1. Present the required source and load impedances to the input and output of both the transistors and the complete power amplifier.
2. Divide/combine power to and from the transistors.
3. Supply the bias voltages to the transistors.
4. Enhance the stability of the transistors.
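The conclusions of section 4.2 about the overall PAE can be checked with a simple power budget. The sketch below is not the equation of this tutorial; it is a minimal bookkeeping model under the assumption that each matching network is a pure attenuator and that each stage is described only by its power gain and its PAE. All names and numbers are hypothetical.

```python
def two_stage_pae(p_out_w, loss_in_db, loss_is_db, loss_out_db,
                  g1_db, g2_db, pae1, pae2):
    """Overall PAE of a two-stage amplifier from a backward power budget."""
    def to_linear(db):
        return 10.0 ** (db / 10.0)

    # Walk back from the amplifier output to the amplifier input.
    p2_out = p_out_w * to_linear(loss_out_db)   # at the last stage drains
    p2_in = p2_out / to_linear(g2_db)           # at the last stage gates
    p1_out = p2_in * to_linear(loss_is_db)      # at the first stage drains
    p1_in = p1_out / to_linear(g1_db)           # at the first stage gates
    p_in = p1_in * to_linear(loss_in_db)        # at the amplifier input
    # DC power of each stage follows from the definition of its PAE.
    p_dc = (p2_out - p2_in) / pae2 + (p1_out - p1_in) / pae1
    return (p_out_w - p_in) / p_dc


# Hypothetical stages: 10 dB gain and 50 % PAE each, 10 W output power.
print(two_stage_pae(10.0, 0.0, 0.0, 0.0, 10.0, 10.0, 0.5, 0.5))
print(two_stage_pae(10.0, 0.0, 0.0, 1.0, 10.0, 10.0, 0.5, 0.5))
```

With these numbers, a loss of 1 dB in the output matching network lowers the overall PAE from 50% to just under 40%, which illustrates why the output network loss comes first in the list of section 4.2.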
As the first step in the design of the matching networks, a model of the source and load impedances of the matching networks is determined. For frequencies up to at least 12 GHz, the equivalent schematics for the source and load impedances depicted in figure 4.2 are valid for relative bandwidths up to 40%.
When the component values are known, the maximum obtainable bandwidth and/or matching ratio can be analysed with the help of the Bode-Fano limit [16,17]. For the source and load impedances
depicted in figure 4.2 the matching network can be synthesised with e.g. the theory described in [17]. Another approach first described in [18] and known as the real frequency matching technique does not require any knowledge regarding the model of the load and source impedance. The design of the matching networks is demonstrated here for the output matching network. The first thing that is observed for this network is the need to combine transistors in parallel. In other words, there must be some kind of physical connection between the transistors. Therefore, the use of a low-pass matching network seems to be the correct choice. Another aspect that must be taken into account from the start is the way the bias voltages must be applied to the transistors. Wherever possible this is done with the help of a parallel inductor. This inductor should be applied at the point that has the lowest impedance level and has therefore the least influence on the overall performance of the matching network. In general, this point is located directly at the output of a transistor. The resulting equivalent schematic of the output matching network is depicted in figure 4.3.
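To give a feeling for the Bode-Fano limit mentioned above, the following sketch evaluates it for one specific case. It assumes that the impedance to be matched can be modelled as a parallel RC combination, and that the reflection coefficient is constant inside the band and equal to one outside it. The function name and the values (50 Ω, 2 pF, 8 to 12 GHz) are hypothetical and were not taken from figure 4.2.

```python
import math


def bode_fano_min_gamma(r_ohm, c_farad, f_low_hz, f_high_hz):
    """Lowest in-band |Gamma| that a lossless network can give a parallel RC.

    Follows from the integral limit
        integral of ln(1 / |Gamma|) d(omega) <= pi / (R * C)
    with |Gamma| constant in the band and total reflection outside it.
    """
    delta_omega = 2.0 * math.pi * (f_high_hz - f_low_hz)
    return math.exp(-math.pi / (delta_omega * r_ohm * c_farad))


gamma = bode_fano_min_gamma(50.0, 2e-12, 8e9, 12e9)
print(f"|Gamma| >= {gamma:.3f}, return loss <= {-20 * math.log10(gamma):.1f} dB")
```

The bound tightens quickly when either the bandwidth or the RC product grows, which is one reason why large transistors with a high output capacitance are hard to match over a wide band.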
At the output of the matching network a DC blocking capacitor is added. The analytical techniques from [17, 18] are not directly applicable for the determination of the component values of the matching network due to the presence of the bias inductor. To circumvent this problem, the following procedure is used. As a first step, the source impedance is made real at the centre frequency with the help of the bias inductor. The influence of the blocking capacitor is considered negligible. As a second step, the component values
according to a Tchebychev approximation are calculated with the equations given in [19]. In the final step, the overall performance is optimised as function of frequency. With the help of this approach, excellent results are obtained. At this point, it is time to convert the ideal component values into their layout equivalent. The inductors are realised with the help of microstrip lines. This is necessary due to the high drain currents that will flow through the lines, which prevent the use of integrated inductors. The capacitors are realised with the help of MIM capacitors. An example of the layout of an output matching network is shown in figure 4.4.
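The first step of the procedure described above, making the source impedance real at the centre frequency with the bias inductor, can be sketched as follows. The sketch assumes that the transistor output is modelled as a parallel RC combination and that the bias inductor is ideal and connected in parallel with it. The names and the values (20 Ω, 1 pF, 10 GHz) are hypothetical.

```python
import math


def bias_inductor_for_resonance(c_out_farad, f0_hz):
    """Parallel inductance that cancels the output capacitance at f0."""
    omega0 = 2.0 * math.pi * f0_hz
    return 1.0 / (omega0 ** 2 * c_out_farad)


def output_admittance(r_ohm, c_farad, l_henry, f_hz):
    """Admittance of the parallel R, C and bias inductor combination."""
    omega = 2.0 * math.pi * f_hz
    return complex(1.0 / r_ohm, omega * c_farad - 1.0 / (omega * l_henry))


l_bias = bias_inductor_for_resonance(1e-12, 10e9)
print(f"L = {l_bias * 1e9:.3f} nH")
# The susceptance vanishes at the centre frequency and grows off centre.
for f_hz in (8e9, 10e9, 12e9):
    print(f_hz / 1e9, "GHz:", output_admittance(20.0, 1e-12, l_bias, f_hz))
```

Away from the centre frequency a residual susceptance remains, and this is what the final optimisation step over frequency has to absorb.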
Figure 4.4 shows that there is coupling between the various parts of the layout. Therefore, the use of an electromagnetic field simulator for a final optimisation of the layout is mandatory. In the approach described above, the effect of the losses has been accounted for with the help of the optimiser. Examples of approaches in which the component losses are taken into account from the start of the design can be found in [21,22]. The design of the interstage and input matching networks can be performed in a similar way. The only exception might be the need for a frequency dependent loss to compensate for the frequency dependent gain roll-off of the transistors. The way to realise such a frequency dependent loss is not discussed here. Information regarding this subject can be found for instance in [23].
4.4. Stability analysis
After the design of the matching networks is completed, it is time to analyse the stability of the amplifier under as many conceivable operating conditions as possible. The transistor cells have been stabilised for different load impedances. Unfortunately, it is not possible to realise sufficient on-chip decoupling below 1 GHz. Therefore, off-chip decoupling must be applied to guarantee stability. The amplifiers can also become unstable due to the existence of on- and off-chip feedback loops. Methods to analyse this kind of instability are described in [24,25]. Inequalities in the transistors or matching networks can give rise to odd-mode oscillation [24].
This type of oscillation can be prevented by the use of odd-mode suppression resistors in between the transistors, see figure 4.5. The final type of oscillation that needs attention is the so-called subharmonic or parametric oscillation [15]. An analysis method for, and insight into, this type of oscillation are given in [26,27]. The stabilisation RC network discussed in the previous chapter is also very useful in the prevention of subharmonic oscillations [15].
5. DESIGN EXAMPLE SOLID STATE POWER AMPLIFIER
To conclude the described design procedure, we present a solid-state power amplifier that was designed with the help of the techniques described in the previous chapters. The design aimed at an output power of 5-7 Watt with a gain of 30 dB at X-band and maximum PAE. The required output power is realised by placing eight transistors in parallel, see figure 5.1.
At the input of the transistors RC stabilisation networks are placed. The amplifier is developed with the help of the HFET technology of the Fraunhofer Institute for Applied Solid State Physics (FhG-IAF) [28]. The results of this amplifier are depicted in figure 5.2.
The results show that the target goals mentioned at the beginning of this chapter have been reached. More information regarding amplifiers designed by TNO-FEL with the methods described in this paper can be found in [29-34].
6. REFERENCES
[1] Y. Takayama, “A New Load-pull Characterization Method for Microwave Power Transistors”, 1976 IEEE MTT-S Symposium Digest, pp. 218-220, June 1976.
[2] A.P. de Hek, “A Novel Fast Search Algorithm for an Active Load-pull measurement system”, GAAS98 Symposium Digest, pp. 268-275, October 1998.
[3] J.L.B. Walker, “High-Power GaAs FET Amplifiers”, Artech House, 1993.
[4] W.J. Roesch, “Thermo-Reliability Relationships of GaAs ICs”, GaAs IC Symposium Digest, pp. 61-64, 1988.
[5] J.A. Albers, “HOTPAC: Programs for Thermal Analysis Including Version 3.0 of the TXYZ Program, TXYZ30, and the Thermal Multilayer Program, TML”, NIST special publication 400-96, August 1995.
[6] Y. Takayama, “Considerations for High-Efficiency Operation of Microwave Transistor Power Amplifiers”, IEICE Trans. Electron., vol. E80-C, pp. 726-732, June 1997.
[7] D.M. Snider, “A theoretical analysis and experimental confirmation of the optimally loaded and over-driven RF power amplifier”, IEEE Trans. Electron Devices, vol. ED-14, pp. 851-857, June 1967.
[8] H.L. Kraus, C.W. Bostian and F.H. Raab, “Solid State Radio Engineering”, chapters 12-14, John Wiley & Sons, 1981.
[9] F.H. Raab, “Class-F Power Amplifiers with Maximally Flat Waveforms”, IEEE Trans. Microwave Theory Tech., vol. MTT-45, pp. 2007-2012, November 1997.
[10] S.C. Cripps, “RF Power Amplifiers for Wireless Communications”, Artech House, 1999.
[11] S.C. Cripps, “A Theory for the Prediction of GaAs FET Load-pull Power Contours”, IEEE MTT-S Symposium Digest, pp. 221-223, 1983.
[12] J. Rollett, “Stability and Power Gain Invariants of Linear Two Ports”, IRE Trans. on Circuit Theory, vol. 9, pp. 29-32, March 1962.
[13] A. Platzker, W. Struble and K.T. Hetzler, “Instabilities Diagnosis and the Role of K in microwave Circuits”, IEEE MTT-S Symposium Digest, pp. 1185-1188, June 1993.
[14] W. Struble and A. Platzker, “A Rigorous Yet Simple Method for Determining Stability of linear N-port Networks”, GaAs IC Symposium Digest, pp. 251-254, 1993.
[15] D. Teeter, A. Platzker and R. Bourque, “A Compact Network for Eliminating Parametric Oscillations in High Power MMIC Amplifiers”, IEEE MTT-S Symposium Digest, pp. 967-970, June 1999.
[16] H.W. Bode, “Network Analysis and Feedback Amplifier Design”, D. van Nostrand Company Inc., 1945.
[17] R.M. Fano, “Theoretical Limitations on the Broadband Matching of Arbitrary Impedances”, Journal of the Franklin Institute, vol. 249, pp. 57-83 and 139-154, January/February 1950.
[18] R.M. Cottee and W.T. Joines, “Synthesis of Lumped and Distributed Networks for Impedance Matching of Complex Loads”, IEEE Trans. Circuits Syst., vol. CAS-26, pp. 316-329, May 1979.
[19] H.J. Carlin, “A New Approach to Gain-Bandwidth Problems”, IEEE Trans. Circuits Syst., vol. CAS-24, pp. 170-175, April 1977.
[20] Y.S. Zhu and W.K. Chen, “Low-pass impedance transformation networks”, IEE Proc.-Circuits Devices Syst., vol. 144, pp. 284-288, October 1997.
[21] L.C.T. Liu and W.H. Ku, “Computer-Aided Synthesis of Lumped Lossy Matching Networks for Monolithic Microwave Integrated Circuits (MMIC’s)”, IEEE Trans. Microwave Theory Tech., vol. MTT-32, pp. 282-290, March 1984.
[22] L. Zhu, “A Novel Approach to the Synthesis of Mixed and Distributed Lossy Networks”, IEEE MTT-S Symposium Digest, pp. 1355-1358, June 1992.
[23] W.H. Ku and W.C. Peterson, “Optimum Gain-bandwidth Limitations of Transistor Amplifiers as Reactively Constrained Two-port Networks”, IEEE Trans. Circuits Syst., vol. CAS-22, pp. 523-533, June 1975.
[24] R.G. Freitag, “A Unified Analysis of MMIC Power Amplifier Stability”, IEEE MTT-S Symposium Digest, pp. 297-300, May 1992.
[25] M. Ohtomo, “Stability Analysis and Numerical Simulation of Multidevice Amplifiers”, IEEE Trans. Microwave Theory Tech., vol. MTT-41, pp. 983-991, June/July 1993.
[26] T. Takagi, M. Mochizuki, Y. Tarui, Y. Itoh, S. Tsuji and Y. Mitsui, “Analysis of High Power Amplifier Instability due to Loop Oscillations”, IEICE Trans. Electron., vol. E78-C, pp. 936-943, August 1995.
[27] J. Imbornone, M. Murphy, R.S. Donahue and E. Heaney, “New Insight Into Subharmonic Oscillation Mode of GaAs Power Amplifiers Under Severe Output Mismatch Condition”, IEEE Journal of Solid-State Circuits, vol. 32, pp. 1319-1325, September 1997.
[28] W. Marsetz, A. Hülsmann, K. Köhler, M. Demmler and M. Schlechtweg, “GaAs PHEMT with 1.6W/mm output power density”, Electronics Letters, vol. 35, pp. 748-749, April 1999.
[29] F.L.M. van den Bogaart, A.P. de Hek and A. de Boer, “MESFET High-power High-Efficiency Amplifiers at X-band with 30% bandwidth”, GAAS’96 proceedings, pp. 3A2-1 - 3A2-4, June 1996.
[30] A.P. de Hek, F.L.M. van den Bogaart, “Broadband High Efficient X-band MMIC Power Amplifiers for Future Radar Systems”, Wocsdice 97, Workshop on Compound Semiconductor Devices and Integrated Circuits proceedings, pp. 63-64, May 1997.
[31] F.L.M. van den Bogaart, A.P. de Hek, “First-pass Design Strategy for High-Power Amplifiers at X-band”, IEE Tutorial Colloquium on “Design of RFICs and MMICs” digest, pp. 8/1 - 8/6, November 1997.
[32] A.P. de Hek, F.L.M. van den Bogaart, “Optimisation of High-Power Amplifiers using non linear models”, IEEE European Workshop on Non-Linear Device Characterisation and Use in RFIC and MMIC Power Amplifier Design, July 1999.
[33] A.P. de Hek, P.A.H. Hunneman, M. Demmler, A. Hülsmann, “A Compact Broadband High Efficient X-band 9-Watt PHEMT MMIC High Power Amplifier for Phased Array Radar Applications”, GAAS’99 Symposium Digest, pp. 276-280, October 1999.
[34] A.P. de Hek, P.A.H. Hunneman, “Small sized high-gain power amplifiers for X-band applications”, GAAS’00 Symposium Digest, pp. 221-223, October 2000.
MONOLITHIC TRANSFORMER-COUPLED RF POWER AMPLIFIERS IN SI-BIPOLAR
W. Simbürger, D. Kehrer¹, A. Heinz, H.D. Wohlmuth, M. Rest, K. Aufinger, A.L. Scholtz¹
INFINEON Technologies AG, Corporate Research, High Frequency Circuits, Otto-Hahn-Ring 6, D-81739 Munich, Germany
¹ Technical University of Vienna, Institute of Communications and Radio-Frequency Engineering, Gusshausstrasse 25/389, A-1040 Vienna, Austria
ABSTRACT
Monolithic integrated lumped planar transformers were introduced more than ten years ago. We present a comprehensive review of their electrical characteristics, which results in an accurate lumped low-order equivalent model. Amplifiers, mixers and Meissner-type voltage controlled oscillators using monolithic transformers were published a few years ago. For the first time, integrated transformer-coupled power amplifiers with high performance up to 2 GHz are demonstrated. This presentation gives an introduction to monolithic transformers and to the circuit design of push-pull type power amplifiers. Two designs were realized:
1. A monolithic 2.5 V, 1 W Si-bipolar power amplifier with 55% power-added efficiency at 1.9 GHz.
2. A monolithic 2.8 V, 3.2 W Si-bipolar power amplifier with 54% power-added efficiency at 900 MHz.
1 INTRODUCTION
J. H. Huijsing et al. (eds.), Analog Circuit Design, 347-371. © 2002 Kluwer Academic Publishers. Printed in the Netherlands.

Transformers have been used in radio frequency (rf) circuits since the early days of telegraphy. Normally transformers are relatively large and expensive components in a circuit or system. But there are several outstanding advantages to using transformers in circuit design: direct current (dc) isolation between primary and secondary winding, balanced-unbalanced (balun) function, impedance transformation and no power consumption. Today's telecommunication systems require a high degree of monolithic integration. It is now possible to integrate lumped planar transformers in silicon-based integrated circuit (IC) technologies which have excellent performance characteristics in the 1-20 GHz frequency range. The outer dimensions are in the range of about down to diameter depending on the frequency of operation and the IC technology.

Monolithic integrated lumped planar transformers were introduced in, e.g., [1]. A review of the electrical performance of passive planar transformers in IC technology was presented in [2]. Amplifiers and mixers using monolithic transformers are presented in [3, 4]. A monolithic 2 GHz Meissner-type voltage controlled oscillator is realized in, e.g., [5].

The transformer-coupled push-pull type rf power amplifier was invented in the early days of vacuum tubes and has survived into the semiconductor era with its benefits. In an equal-power comparison, a push-pull combining scheme offers a 4:1 load-line impedance benefit over a simple parallel device connection [6]. In general, impedance mismatch losses at the output of the power amplifier, caused by the decrease of the required impedance at low supply voltages, limit the output power and the power-added efficiency (PAE). Due to the balanced amplifier design each output transistor contributes only half the total output power. The emitter bondwire inductance is not so critical because of the differential output stage. However, this approach requires a balun at the output of the power amplifier. For the first time, monolithic integration in Si-based technologies has become successful [7, 8, 9].
But up to now there was no way to obtain an accurate prediction and model of the electrical characteristics of on-chip transformers. Section 2 presents a new winding scheme, modelling and model verification of an integrated lumped planar transformer in silicon which has excellent performance characteristics. A lumped low-order model consisting of 24 elements gives an accurate prediction of the electrical behaviour and, because of its low complexity, ensures a fast transient analysis. The method of parameter extraction for the equivalent circuit is based on a tool developed by the authors which uses a new expression for the substrate loss and two finite element method (FEM) cores called FastHenry [10] and FastCap [11].

In Section 3 a monolithic rf power amplifier for 1.8–2 GHz is presented which has been realized in a Si bipolar technology. The chip operates at supply voltages as low as 1.2 V. The balanced 2-stage power amplifier uses two on-chip transformers as input-balun and for interstage matching, with a high coupling coefficient of At 1.2 V, 2.5 V, and 3 V supply voltage an output power of 0.22 W, 1 W and 1.4 W is achieved, at a PAE of 47%, 55% and 55%, respectively, at 1.9 GHz. The small-signal gain is 28 dB.

Section 4 shows a power amplifier design optimized for high output power at 900 MHz and low supply voltages. The chip operates from 2.8 V to 4.5 V. At 2.8 V the output power is 3.2 W with a PAE of 54%. The maximum output power of 7.7 W with an efficiency of 57% is achieved at 4.5 V supply voltage. The small-signal gain is 38 dB.

In Section 5 a lumped LC-balun as output matching network is reviewed and extended to a dual-band balun.
2 MONOLITHIC TRANSFORMER DESIGN
Monolithic transformers have been presented in various geometric designs and many different kinds have been realized. A special planar winding scheme for monolithic transformers which results in a very high coupling coefficient is discussed in this section. To realize turn ratios other than N=1:1, different numbers of primary and secondary turns are used. This implies that some adjacent conductors belong to the same winding, which results in a lower coupling coefficient. A solution to this problem is to use an interlaced winding scheme. One winding (e.g. the secondary) is sectioned into a number of individual turns connected in parallel rather than one continuous winding. Each segment of the secondary winding is interlaced with a primary turn. The line width of each segment is designed to carry the same current to obtain a homogeneous magnetic field distribution. The monolithic transformer shown in Fig. 1 consists of six primary turns P1-P6 and two secondary turns S1-S2. The turn ratio is N=6:2. The center taps PCT and SCT are available.
Fig. 2 shows a three-dimensional top view of the transformer. The primary ports, P+, PCT and P-, are located on the left side. The secondary ports, S+, SCT and S-, are located on the right side. The transformer design is nearly symmetric about a line. The outer diameter is and the inner diameter is The lateral spacing between the turns is about and has different values for each metal layer because of different design rules (Fig. 4). The conductor width on the primary side is about The conductor width on the secondary side about and different for each winding to get the same series resistance of each segment of the secondary turns.

Fig. 3 shows a cross-section of this transformer. The primary winding consists of metal 3 and metal 2 connected in parallel and is separated from the substrate by The secondary winding consists of metal 1-3 connected in parallel to decrease ohmic loss. The substrate distance is A cross section of the substrate and metal layer stack is shown in Fig. 4.
2.1 Lumped Low-Order Equivalent Model
An electrical model of a transformer can be recognized from the physical layout. The circuit devices in Fig. 3 are the basic elements of the equivalent circuit shown in Fig. 5 and can be identified as: multiple coupled inductors to ohmic loss in the conductor material to parasitic capacitive coupling between the windings to and into the substrate to and finally substrate losses to With these basic elements a lumped low-order equivalent model was constructed.
Limits of the Transformer Model
In general the transformer model is valid down to dc. The upper frequency limit of the model depends on the transformer geometry. For valid simulation results the maximum outer dimensions of the transformer must be the guided wavelength. In most cases the upper limit of the proposed model is about 2/3 times the self resonant frequency of the transformer.
2.2 Parameter Extraction
The lumped low-order equivalent model (Fig. 5) describes the electrical behaviour of the monolithic integrated lumped transformer. This section gives the background details on the extraction of all elements used in the equivalent circuit.

Inductance and Series Resistance
Transformers composed of straight conductors can be treated by summing the self- and mutual inductances of all individual conductor elements. The whole transformer geometry, built up of straight conductors, is the input to the FEM-core FastHenry [10]. The exact modeling of the planar construction is an important task for an accurate inductance extraction. The exact modeling of the layer construction is less important for the inductance calculation. Each inductance is coupled mutually with every other inductance, denoted by the coupling coefficients where is the extracted mutual inductance. Ohmic losses in the conductor material due to skin effect, current crowding and finite conductivity are modeled by the series resistances at the frequency of operation of the transformer.

Capacitance Extraction
Capacitances are difficult to determine accurately and capacitive effects are best investigated in mesh point analysis. The exact modeling of the layer construction is important to get accurate results. In order to reach short processing times only a small part of the transformer’s cross section is the input to the FEM-core FastCap [11]. The static specific capacitances from primary to secondary, primary and secondary to substrate are extracted. The capacitances of the transformer are and where and are the mean perimeters of the included transformer turns. In the case of a circular transformer as shown in Fig. 3 they are calculated as

to of Fig. 5 are determined as and The sum of the capacitances for each winding is the static capacitance and to the substrate. The parasitic capacitive coupling between primary and secondary winding are determined as

Substrate Loss
Fig. 6 shows a conductor (i.e. a turn of a transformer) suspended in a dielectric. Capacitive coupling causes a current flow down to the ground plane, shown in Fig. 6 as lines of constant current density.
From Fig. 6 it is clear that the current-feed-in area at the substrate edge has a greater width than the physical width W. We define an effective feed-in width Weff depending on the distance and conductor height T using the approximation:
The specific resistance in from a single conductor to ground, as shown in Fig. 6, can be written as
The error of (4) is always smaller than 3 % in the range of of a complete transformer winding is based on (3) and (4) where W is the complete width of the primary or secondary winding as shown in Fig. 3. for the primary winding can be written as
and similarly for the secondary winding. to of Fig. 5 are determined as and More detailed information on modelling and parameter extraction of monolithic transformers can be found in [12].
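The coupling coefficients mentioned under "Inductance and Series Resistance" can be illustrated with a minimal sketch. It assumes the textbook definition k_ij = M_ij / sqrt(L_i * L_j), applied to a symmetric inductance matrix such as an inductance solver might return. The function name and the numerical values are hypothetical and are not taken from the chapter or from the authors' extraction tool.

```python
import math


def coupling_coefficients(inductance_matrix):
    """Return k[i][j] = M_ij / sqrt(L_i * L_j) for a symmetric inductance matrix.

    Diagonal entries are self-inductances L_i, off-diagonal entries are
    mutual inductances M_ij (all in henries).
    """
    n = len(inductance_matrix)
    k = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            l_i = inductance_matrix[i][i]
            l_j = inductance_matrix[j][j]
            k[i][j] = inductance_matrix[i][j] / math.sqrt(l_i * l_j)
    return k


# Hypothetical two-winding example (illustrative values, not from the chapter)
example_matrix = [
    [4.0e-9, 1.8e-9],
    [1.8e-9, 1.0e-9],
]
k_matrix = coupling_coefficients(example_matrix)
print(round(k_matrix[0][1], 3))
```

The diagonal of the result is 1 by construction, so only the off-diagonal entries carry information about the magnetic coupling between windings.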
2.3 Transformer Model Verification
The transformer is placed on silicon using two test structures, including de-embedding structures, to measure the scattering parameters. One test structure is used to evaluate the primary-to-secondary transmission coefficient, where one input terminal each of the primary and the secondary winding is grounded. The second test structure is used to characterize the primary winding and secondary winding separately, where the opposite winding is left open in each case. The center taps are always left open. In general, a 4-port measurement setup would give slightly better accuracy, but would require considerably more measurement effort. The equivalent circuit of the high coupling performance transformer is shown in Fig. 7. All parameter values are extracted by using the method described in Section 2.2. Node 1 is connected to the substrate. The values of the primary and secondary self inductance are
The strength of magnetic coupling between primary and secondary side denoted by the k-factor is
The series resistance of the conductors on the primary side is and on the secondary side Due to the greater distance to the substrate the parasitic capacitance of the primary winding is less than the capacitance of the secondary winding The substrate resistances of both windings are in the same range of about Fig. 8 shows the measured and simulated reflection coefficients S11 and S22 of the high coupling performance transformer. Measurement and model show excellent agreement up to 5 GHz. Fig. 9 shows S21. The insertion loss is about 9 dB at 1.9 GHz. The difference between measurement and model is negligible. The S-parameters describe the electrical behavior of a monolithic transformer completely. However, the scattering parameters are not the only quantities worth observing: the Z-parameters, Y-parameters and Q-factor, derived directly from the S-parameters, give a fundamental insight into the transformer’s characteristics.
Fig. 10 shows primary inductance and secondary inductance as a function of frequency. The self inductances are analyzed using
The simulated and measured self resonance is at 4 GHz. Analyzing the coupling coefficient as a function of frequency the relations
are useful. Then the coupling coefficient can be written as
Fig. 11 shows the coupling coefficient versus frequency. A coupling coefficient of 0.9 at 1.9 GHz is a very high value for monolithic lumped planar transformers. Especially which represents the input impedance of the secondary short-circuit transformer, is of significant importance because of the low input impedance of the driver stage and output stage of the power amplifier. Fig. 12 shows the measured and simulated real part of and Fig. 13 shows the imaginary part. Simulation and measurement agree very well up to 3 GHz. The quality factor of the transformer with the secondary winding open circuit and short circuit can be analyzed using the following expressions

and of this transformer at 2 GHz. In most cases the upper limit of the model is about 2/3 times the self resonant frequency of the transformer.
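As an illustration of how self inductance, coupling coefficient and quality factor can be read from two-port Z-parameters, the following sketch uses common textbook relations: L = Im(Z)/ω, k = Im(Z12)/sqrt(Im(Z11)·Im(Z22)), Q = Im(Z)/Re(Z), and Z11 − Z12·Z21/Z22 for the primary input impedance with the secondary shorted. These are assumptions that hold for a pair of lossy coupled inductors well below self resonance. They are not necessarily the expressions used in the chapter, and all element values are hypothetical.

```python
import math


def analyze_two_port(z11, z12, z21, z22, freq_hz):
    """Derive inductances, coupling and Q from complex Z-parameters."""
    omega = 2.0 * math.pi * freq_hz
    # Primary input impedance with the secondary winding short-circuited
    z_short = z11 - z12 * z21 / z22
    return {
        "L_primary": z11.imag / omega,
        "L_secondary": z22.imag / omega,
        "k": z12.imag / math.sqrt(z11.imag * z22.imag),
        "Q_open": z11.imag / z11.real,
        "Q_short": z_short.imag / z_short.real,
        "Z_short": z_short,
    }


# Hypothetical lossy coupled-inductor pair (illustrative values only)
freq = 1.9e9
omega = 2.0 * math.pi * freq
l_p, l_s, k_assumed = 4.0e-9, 1.0e-9, 0.9
r_p, r_s = 4.0, 1.0
m = k_assumed * math.sqrt(l_p * l_s)

z11 = complex(r_p, omega * l_p)
z22 = complex(r_s, omega * l_s)
z12 = z21 = complex(0.0, omega * m)

result = analyze_two_port(z11, z12, z21, z22, freq)
print(result["k"], result["Q_open"], result["Q_short"])
```

With tight coupling the shorted secondary reflects its loss into the primary while cancelling most of the reactance, so in this sketch the short-circuit Q comes out much lower than the open-circuit Q.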
2.4 Transformer Tuning
In many applications, e.g. input matching and interstage matching of a power amplifier, a high current transfer ratio of the on-chip transformer is desired. In contrast to an ideal transformer, the current transfer ratio of a lossy transformer is not equal to the turn ratio. Fig. 14 shows a secondary short-circuit transformer. It consists of a primary winding and a secondary winding and are mutually coupled, denoted by the In most cases the input impedance of the driver stage and the output stage is very low. Therefore, without loss of generality, the secondary winding of the transformer in Fig. 14 is short-circuited. The ohmic loss of the primary winding the ohmic loss of the secondary winding and the input impedance of the transistors (assumed real valued) are accounted for by the admittance G. The transformer is connected as a parallel resonant device using the capacitor C.
Then the resonant frequency of the tuned transformer can be derived as
The quality factor Q of the resonant circuit is
The inner current transfer ratio of the ideal transformer is
Now the total current transfer ratio of the parallel resonant transformer can be expressed by
This relation shows, that in contrast to the untuned transformer, the total current transfer ratio can be increased by a quality factor of Q > 1.
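The tuning argument can be sketched numerically under explicit assumptions. The windings are treated as lossless coupled inductors with all loss lumped into the parallel admittance G. With the secondary shorted the primary then presents the leakage inductance L_P(1 − k²), and the textbook parallel-resonance relations ω0 = 1/sqrt(L·C) and Q = ω0·C/G apply. The short-circuit current ratio of the coupled inductors is k·sqrt(L_P/L_S). These relations and all numerical values are illustrative assumptions and may differ from the chapter's own expressions.

```python
import math


def tuned_transformer(l_primary, l_secondary, k, capacitance, conductance):
    """Parallel-tuned transformer with short-circuited secondary (sketch).

    Returns (resonant frequency in Hz, Q, inner current ratio, total ratio).
    """
    # Inductance seen at the primary when the secondary is shorted
    l_leak = l_primary * (1.0 - k * k)
    omega0 = 1.0 / math.sqrt(l_leak * capacitance)
    # Quality factor of the parallel G-L-C circuit
    q = omega0 * capacitance / conductance
    # Secondary-to-primary current ratio of ideal coupled inductors, M / L_S
    inner_ratio = k * math.sqrt(l_primary / l_secondary)
    # At resonance the inductor current is Q times the source current
    total_ratio = q * inner_ratio
    return omega0 / (2.0 * math.pi), q, inner_ratio, total_ratio


# Hypothetical values chosen so that resonance falls near 1.9 GHz
f0, q, inner, total = tuned_transformer(
    l_primary=4.0e-9,
    l_secondary=1.0e-9,
    k=0.9,
    capacitance=9.2e-12,
    conductance=0.05,
)
print(f0, q, inner, total)
```

In this sketch the total ratio exceeds the inner ratio only when Q is greater than 1, which is the condition stated in the text.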
3 A MONOLITHIC 2.5 V, 1 W SI-BIPOLAR POWER AMPLIFIER WITH 55 % PAE AT 1.9 GHZ
This section presents a circuit design using the transformer described in Section 2. Fig. 15 shows the schematic diagram of the power amplifier for 1.9 GHz. The circuit consists of a transformer X1 as input-balun, a driver stage T1 and T2, a transformer X2 as interstage matching network and a power output stage T3 and T4. The transformers X1 and X2 are of the same kind (Sect. 2). X1 is connected as a parallel resonant device using the MOS capacitor (Sect. 2.4). The transformer acts as balun as well as input matching network. The interstage power transformer X2 is connected as a parallel resonant device using and are realized using two MOS capacitors connected in antiseries, respectively. The effective emitter area of the driver stage is two times The emitter area of the output stage is two times The bias operating point of the driver stage and the output stage is adjusted using the current mirrors R1, D1 and R2, D2 respectively, connected via the center taps of the transformers.

Fig. 16 shows the die photograph of the amplifier. The chip size is The power amplifier has been fabricated in an advanced near-production silicon bipolar technology [13]. The transistors have a double-polysilicon self-aligned emitter-base configuration similar to many current production technologies of various companies. As only standard process tools are used, the technology is highly manufacturable at low cost. The minimum lithographic feature size is The doping of the speed-limiting base profile is done by low-energy ion implantation and subsequent diffusion using rapid thermal processing. This enables a final base width of only 50 nm at an intrinsic base sheet resistance of The devices have transit frequencies and maximum oscillation frequencies (extracted from the maximum available gain) of 50 GHz and provide an ECL gate delay of 16 ps. The collector-base breakdown voltage is and the collector-emitter breakdown voltage is A supply voltage of more than is possible, if low impedance driving conditions are present [14].

The power amplifier was tested at to 2 GHz using chip-on-board packaging on a two-sided Rogers RO4003 test board. Conductive epoxy is used for the die attach. The input of the amplifier chip is connected via a micro-strip line to the input signal. The supply-voltage line of the output stage consists of two lines, translating a low impedance at to the output transistors.
The optimum load impedance at is translated by a balanced odd-mode micro-strip line. Two lumped capacitors are used to set the real part and the imaginary part of the optimum load impedance at A compensated semi-rigid line acts as balun.

Fig. 17 shows the measured output power and efficiency as a function of rf input power at 1.9 GHz, and as a function of power supply voltage. The matching network is unchanged for all supply voltages and the complete frequency range. The power amplifier is operated in a pulsed mode with a duty cycle of 12.5 %. The pulse width is 0.577 ms. The bias operating current, without rf excitation, is two times 20 mA at the driver stage and two times 75 mA at the output stage. The bias operating currents are adjusted to these values at each level of supply voltage. When operating from a 1.2 V supply, the amplifier has a maximum output power of 0.22 W (23.4 dBm) and a power-added efficiency of 47 % at 1.9 GHz. At 3 V supply voltage, the output power is 1.4 W (31.5 dBm) at a power-added efficiency of 55 %. Fig. 18 shows the output power and PAE versus frequency from 1.8 GHz to 2 GHz.

Fig. 19 shows the two-tone intermodulation performance of the power amplifier at 1.9 GHz and 2.5 V supply voltage. The 3rd-order output intercept point is +30 dBm. The 3rd-order signal-to-intermodulation ratio, extracted from the measurements in Fig. 19, is shown in Fig. 20. The ratio is 8.5 dB in the fully saturated region. Table 1 summarizes the measurement results of the power amplifier.
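The power figures quoted in this section can be cross-checked with a short sketch. It converts the reported output powers from watts to dBm and states the usual definition of power-added efficiency, PAE = (P_out − P_in)/P_dc. The helper names are hypothetical, and the input and dc power used in the PAE example are illustrative assumptions, not measured values from the chapter.

```python
import math


def watt_to_dbm(power_w):
    """Convert power in watts to dBm (decibels relative to 1 mW)."""
    return 10.0 * math.log10(power_w / 1e-3)


def dbm_to_watt(power_dbm):
    """Convert power in dBm back to watts."""
    return 1e-3 * 10.0 ** (power_dbm / 10.0)


def power_added_efficiency(p_out_w, p_in_w, p_dc_w):
    """PAE = (P_out - P_in) / P_dc, returned as a fraction."""
    return (p_out_w - p_in_w) / p_dc_w


# Output powers reported in the text at 1.2 V and 3 V supply
dbm_low = watt_to_dbm(0.22)
dbm_high = watt_to_dbm(1.4)
print(round(dbm_low, 1), round(dbm_high, 1))

# Hypothetical operating point: 1 W out, 10 mW in, 1.8 W dc (assumed values)
pae_example = power_added_efficiency(1.0, 0.010, 1.8)
print(round(pae_example, 2))
```

Because the gain of the amplifier is high, the input power term changes the result only slightly, so PAE and overall efficiency are close for this kind of design.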
4 A MONOLITHIC 3.2 W SI-BIPOLAR POWER AMPLIFIER WITH 54% PAE AT 0.9 GHZ AND 2.8V
In this section a circuit design for 900 MHz, optimized for high output power at low supply voltages around 3 V, is presented. It uses the transformer described in Section 2, except that the transformer is enlarged by a factor of about two. The outer diameter of the transformer is now. Fig. 21 shows the model of the enlarged transformer. The series resistance of the conductors on the primary side is and on the secondary side The primary inductance is 7 nH, the secondary inductance is 1 nH. The coupling factor is The self resonant frequency is 1.8 GHz. The frequency of operation of this transformer should be less than 1.2 GHz for good circuit performance. Fig. 22 shows the simplified schematic diagram of the balanced 2-stage power amplifier. The rf-part of the power amplifier consists of an on-chip transformer X1 as input-balun, a driver stage T1, T2, two transformers X2, X3 as interstage matching network and a power output stage T3, T4.
The effective emitter area of the output stage is two times The input-transformer is connected as a parallel resonant device using two MOS capacitors connected in antiseries. The transformer acts as balun as well as input matching network. The interstage matching network of the power amplifier consists of two transformers X2 and X3 connected in parallel, to get a high current transfer ratio at a low signal voltage swing. To diminish breakdown effects at high supply voltages a closed-loop bias operating point circuit is implemented. The maximum usable output voltage of the driver and the power stage depends on the driving conditions [14]. Thus, the source impedance of the bias driver should be as low as possible. The bias current of the driver stage is set by an operational amplifier U1 and T7, T8 via the secondary center tap of X1. T5 acts as current sensing device. The collector current of T5 is compared with the bias operating point reference current This closed loop ensures a low
impedance driving condition and a constant collector bias current over a wide range of supply voltage, for the driver stage T1, T2. R1 matches the output characteristic (breakdown) of the sensing device T5 to the driver stage transistors T1, T2. The bias circuit of the power stage T3, T4 is of the same kind.

Fig. 23 shows a die photograph of the power amplifier. The chip measures The chip is fabricated in a standard 3-layer-interconnect silicon bipolar production technology of Infineon, B6HF [15]. The collector-base breakdown voltage is and the collector-emitter breakdown voltage is For measurements the chip is bonded on an FR4 test board (see Fig. 24, Fig. 25 and Fig. 26). The input of the amplifier chip is connected via a micro-strip line to the input signal of The supply-voltage line of the output stage consists of two lines translating a low impedance at to the output transistors. The optimum load impedance at is translated by a balanced micro-strip line. The real and imaginary parts of the load impedance are set nearly orthogonally by two capacitors. A compensated semi-rigid line acts as balun. This balun-line can be replaced by a lumped LC-balun with slight loss of performance. A more detailed description and evaluation of the performance of this matching network compared to a lumped LC balun is presented in [7].

Fig. 27 shows the output power and PAE versus input power as a function of supply voltage at 900 MHz. The matching network is unchanged for all supply voltages. At 2.8 V supply voltage an output power of 3.2 W with 54% PAE is achieved. The small-signal gain is 38 dB. The maximum output power at 4.5 V supply voltage is 7.7 W at a PAE of 57%. The collector efficiency of the output stage is 68 % in this case. But at this high level of output power, load impedance mismatch can result in damage to the output stage. However, at an output VSWR=10 the maximum usable supply voltage is 3.5 V. Output power and PAE versus frequency are shown in Fig. 28. The 3rd-order output intercept point is +41.3 dBm at 900 MHz and 3 V supply voltage. Table 2 gives a summary of the power amplifier performance.
5 A LUMPED LC-BALUN AS OUTPUT MATCHING NETWORK
Fig. 29 shows a lumped LC balun, which was originally used as an antenna balun [6, 16, 17]. This circuit can be used as a simple output matching network for push-pull type power amplifiers. However, the PAE performance of the power amplifier is decreased due to inappropriate impedances at the harmonic frequencies. The performance of a lumped LC balun at 900 MHz and 4 W output power is evaluated in [7]. The bridge-type circuit (Fig. 29) consists of two inductors and two capacitors An rf-choke coil and a dc-block capacitor are used to feed the supply voltage. is the balanced input impedance of the bridge. Each collector is loaded by is the load resistor, usually. L and C can be calculated by

where is the characteristic impedance of the bridge-type circuit. is the frequency of operation. and are assumed to be real valued. If should be complex valued, matching is possible, but
then the bridge becomes more or less imbalanced. Better performance and less sensitivity to changes in component values can be achieved if the imaginary part of the optimum load impedance is matched separately using a simple additional transformation network (L, C or LC) connected in series or in parallel to the output of the power amplifier.

If the inductors are replaced by a parallel resonant circuit and the capacitors are replaced by a series resonant circuit in Fig. 29, then a lumped dual-band LC balun, shown in Fig. 30, is obtained. The circuit provides a balanced input impedance at and at Independent matching and balun conversion at two different frequencies can be done. and can be calculated by

where and are the characteristic impedances of the bridge at and and are assumed to be real valued. Note that is a must, using the design equations above.
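For the single-band bridge of Fig. 29, the element values can be sketched with the commonly used lattice-balun relations: characteristic impedance Z = sqrt(R_in · R_L), L = Z/ω and C = 1/(ω·Z), with both resistances real valued. This is a textbook formulation offered as an assumption; the chapter's own design equations are not reproduced here, and the resistance values below are hypothetical. The dual-band variant is not covered by this sketch.

```python
import math


def lattice_balun(r_balanced, r_load, freq_hz):
    """Element values of a single-band lumped LC (lattice) balun.

    r_balanced: real balanced input resistance of the bridge, in ohms
    r_load:     real single-ended load resistance, in ohms
    Returns (inductance in H, capacitance in F, characteristic impedance in ohms).
    """
    omega = 2.0 * math.pi * freq_hz
    z_char = math.sqrt(r_balanced * r_load)
    inductance = z_char / omega
    capacitance = 1.0 / (omega * z_char)
    return inductance, capacitance, z_char


# Hypothetical example: 8 ohm balanced input, 50 ohm load, 900 MHz
l_balun, c_balun, z_balun = lattice_balun(8.0, 50.0, 900e6)
print(l_balun, c_balun, z_balun)
```

Both reactances equal the characteristic impedance at the design frequency, so L and C resonate there. This is one reason the balance degrades away from the design frequency and when the terminations are not purely real.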
6 CONCLUSION
A study of the electrical characteristics of lumped planar transformers is presented. A precise lumped low-order equivalent model is derived from the physical layout. Measurement and model show excellent agreement. For the first time, transformer-coupled push-pull type power amplifiers with high performance are integrated in Si-bipolar at 900 MHz and 2 GHz.
REFERENCES
[1] G. Rabjohn, Balanced Planar Transformers. United States Patent No. 4,816,784, 1989.
[2] J. Long, “Monolithic Transformers for Silicon RF IC Design,” IEEE Journal of Solid-State Circuits, vol. 35, pp. 1368–1382, September 2000.
[3] J. McRory et al., “Transformer Coupled Stacked FET Power Amplifiers,” IEEE Journal of Solid-State Circuits, vol. 34, pp. 157–161, February 1999.
[4] J. Long et al., “A 5.1-5.8 GHz Low-Power Image-Reject Downconverter in SiGe Technology,” in Proceedings of the 1999 Bipolar/BiCMOS Circuits and Technology Meeting, (Minneapolis), pp. 67–70, September 1999.
[5] H. Wohlmuth et al., “2 GHz Meissner VCO in Si Bipolar Technology,” in 29th European Microwave Conf., Conf. Proc., Vol. 1, (Munich), pp. 190–193, October 1999.
[6] S. Cripps, RF Power Amplifiers for Wireless Communications. Norwood, MA 02062: Artech House, first ed., 1999.
[7] W. Simbürger et al., “A Monolithic Transformer Coupled 5 W Silicon Power Amplifier with 59 % PAE at 0.9 GHz,” IEEE Journal of Solid-State Circuits, vol. 34, pp. 1881–1892, December 1999.
[8] W. Simbürger et al., “A Monolithic 2.5V, 1W Silicon Bipolar Power Amplifier with 55% PAE at 1.9GHz,” in IEEE MTT-S International Microwave Symposium Digest, (Boston), pp. 853–856, IEEE, June 2000.
[9] A. Heinz et al., “A Monolithic 2.8V, 3.2W Silicon Bipolar Power Amplifier with 54% PAE at 900MHz,” in IEEE Radio Frequency Integrated Circuits (RFIC) Symposium Digest of Papers, (Boston), pp. 117–120, IEEE, June 2000.
[10] MIT, FastHenry USER’s GUIDE, Version 3.0. Massachusetts Institute of Technology, 1996.
[11] MIT, FastCap USER’s GUIDE. Massachusetts Institute of Technology, 1992.
[12] D. Kehrer, “Design of Monolithic Integrated Lumped Transformers in Silicon-based Technologies up to 20 GHz,” Master’s thesis, Technical University of Vienna, Institute of Communications and Radio-Frequency Engineering, Gusshausstrasse 25/389, A-1040 Vienna, Austria, December 2000.
[13] J. Böck et al., “A 50GHz Implanted Base Silicon Bipolar Technology with 35 GHz Static Frequency Divider,” in Symposium on VLSI Technology, Digest of Technical Papers, pp. 108–109, 1996.
[14] M. Rickelt and H.-M. Rein, “Impact-Ionization Induced Instabilities in High-Speed Bipolar Transistors and their Influence on the Maximum Usable Output Voltage,” in Bipolar/BiCMOS Circuits and Technology Meeting, (Minneapolis), pp. 54–57, IEEE, September 26-28 1999.
[15] H. Klose et al., “B6HF: A 0.8 Micron 25GHz/25ps Bipolar Technology for ‘Mobile Radio’ and ‘Ultra Fast Data Link’ IC Products,” in IEEE Bipolar Circuits and Technology Meeting, pp. 125–127, IEEE, 1993.
[16] A. Krischke, Rothammels Antennenbuch. Stuttgart: Franck-Kosmos, 11th ed., 1995.
[17] P. Vizmuller, RF Design Guide - Systems, Circuits, and Equations. Norwood, MA 02062: Artech House, first ed., 1995.
Low Voltage PA design in standard CMOS
Koen Mertens, Michiel Steyaert
K.U.Leuven, ESAT-MICAS
Kasteelpark Arenberg 10, B-3001 Heverlee, Belgium
Abstract
In recent years there has been a trend toward low-voltage, single-supply amplifiers. Going to lower supply voltages causes a strong decrease in output power and efficiency, and this holds for all classic power devices such as MESFETs, PHEMTs, HBTs, .... At these low supply voltages standard CMOS technology can be competitive with other technologies. The prospect of having one technology for all the RF and digital building blocks is very attractive. Only one technology has to be supported, which results in lower production cost. With the aid of some selected papers we discuss several aspects of CMOS PA design. In the last chapter a comparison with the classical devices concerning power and efficiency is made.
J. H. Huijsing et al (eds.), Analog Circuit Design, 373-394. © 2002 Kluwer Academic Publishers. Printed in the Netherlands.

1. Introduction
The problem of delivering output power in CMOS has led to a lot of research on CMOS power amplifier design. First attempts were presented in [1]. That paper revived the use of Sokal’s class E switching power amplifier [2], which was presented in 1975. Only recently has the number of papers about CMOS power amplifiers increased drastically. More than 90 percent of the recently published papers still use the class E amplifier as the basic topology. There are three reasons for this.
The first reason is that this type of amplifier guarantees the best efficiency when a minimum of lumped elements is used in the output matching network. Bearing in mind that high efficiency and a low supply voltage are not easily combined, the most efficient type of amplifier must be selected. The efficiency of the power amplifier typically dominates the power consumption in portable radio devices, which is directly related to the battery lifetime. The second reason is that the drain-source capacitance of the device can be quite high compared to non-CMOS power amplifiers. The total drain-source capacitance is the sum of the drain-source junction capacitance of the device and the capacitance of the metal traces. The metal capacitance on its own can easily be a few pF. This is easily understood, since high output power demands a high RMS current. The output interconnect conducting this RMS current has to comply with electromigration rules, and for this reason wide traces are used. As will be seen later, the total drain-source capacitance is a crucial design parameter. Finally, the third reason is that hot-electron effects and time-dependent dielectric breakdown can be avoided, because in the ideal case current and voltage are not present at the same time. This guarantees that there is no performance degradation during the lifetime of the device. For safe operation, the maximum allowable voltage stress over the switch transistor is the only remaining specification of interest.
2. Driving the NMOS switch transistor
The switch transistor must be driven by a driving circuit. Making an efficient driving stage to drive the capacitance of the NMOS switch is a challenge. Using digital buffers to switch the power NMOS transistor is not a good idea. The total power consumed by the buffers depends on the dynamic and the short-circuit dissipation. For a typically designed buffer, the short-circuit dissipation can become larger than the dynamic dissipation. As shown in [3], a tapering factor different from the normal ‘e’ factor, which is derived from optimization of the propagation delay, has to be chosen. Even when we use the recommended tapering factor of 11.5 the power consumption will still be too high. A better method is to tune out the gate capacitance of the switch transistor, giving up the broadband character of the driver. The power consumption will be lower, so the efficiency of the driving stage is increased. The driving signal is now a sine-wave of two times the supply voltage. Driving the NMOS switch with this large signal is an additional benefit over a digital buffer design. For a class E amplifier driven by a sine-wave the class E conditions (Vds=0 and dVds/dt=0) at the moment of closing the NMOS switch stay valid. This means that the switching losses from the off to the on state can be kept minimal, with only a minor performance degradation of two percent. In a fabricated power amplifier this is not the only source of power loss. For that reason the different losses are investigated in the next section.
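The dynamic part of the buffer dissipation can be estimated with a rough sketch. It assumes an ideal tapered inverter chain in which each stage presents 1/t of the capacitance of the stage it drives, ignores self-loading and short-circuit dissipation, and uses the standard dynamic power estimate P = f · Vdd² · ΣC. All numerical values (load capacitance, input capacitance, supply voltage, frequency) are hypothetical and not taken from the chapter or from [3].

```python
import math


def buffer_chain_dynamic_power(c_load, c_in, taper, v_dd, freq_hz):
    """Dynamic power of an ideal tapered buffer chain (sketch).

    Returns (number of stages, dynamic power in watts).
    """
    # Number of stages needed to scale from c_in up to c_load
    n_stages = max(1, math.ceil(math.log(c_load / c_in) / math.log(taper)))
    # Capacitance driven by each stage, from the last stage back to the first
    node_caps = [c_load / taper ** i for i in range(n_stages)]
    return n_stages, freq_hz * v_dd ** 2 * sum(node_caps)


# Hypothetical switch gate load of 10 pF, 50 fF input, 2 V supply, 1 GHz
stages_e, power_e = buffer_chain_dynamic_power(10e-12, 50e-15, math.e, 2.0, 1e9)
stages_t, power_t = buffer_chain_dynamic_power(10e-12, 50e-15, 11.5, 2.0, 1e9)
print(stages_e, power_e)
print(stages_t, power_t)
```

Under these assumptions the larger tapering factor reduces the number of stages and the total switched capacitance, but the final node alone still accounts for f · Vdd² · C_load. This is consistent with the observation that the consumption remains high even with a tapering factor of 11.5.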
3. Causes of power loss
In figure 1, a simplified schematic of a class E amplifier with losses is given.
The following values for Cshunt and Lx should be used:
to satisfy the well-known class E conditions for a certain target load resistance R. For the calculated values the amplifier operates at peak power output capability. Furthermore, negative voltage and current waveforms are eliminated. The target load resistance R is quite small, and an upward LC impedance transformation network is employed to transform it to the external load impedance. By lumping the components, we eventually arrive at a single inductor and two capacitors. The global drain efficiency of the schematic can be expressed in terms of individual efficiencies, which are considered to be independent of each other. This means that for each individual loss the class E stage is supposed to be working in perfect class E conditions. This implies that the drop in efficiency can be assigned to the loss under consideration, giving rise to an intermediate efficiency. For the five losses represented in figure 1, the intermediate efficiencies are discussed in the following paragraphs.
3.1. Ron loss and the loss in the driving stage
Due to the ‘on’-resistance, power is consumed in the NMOS transistor. The output power in the target resistor R is therefore not equal to:
but lowered to:
The variable g used in the above formulas is called the DC-current to RF-voltage transfer constant. The value of g is usually 1.862, but in general it can be calculated by solving a set of differential equations [4]. The intermediate efficiency is given by [5]:
Summing the target load resistance R and the parasitic resistance of the excess inductor gives the actual load resistance Ra, used in expression (5). For large output powers the actual load resistance must be low. To guarantee a high intermediate efficiency, an NMOS transistor with a large width has to be selected. The drawback is that a large gate capacitance is formed. A large gate capacitance lowers the equivalent parallel resistance seen by the driver tank, increasing the power consumption of the driver circuit. In publications [1,6-8], the gate capacitance is tuned out with the aid of a bond wire, hence the equivalent parallel resistance becomes:
The factor used in the above expression stands for the parasitic resistance of the bond wire. When a class C amplifier (with a typical efficiency of around ten percent) is used for the driver stage, the intermediate efficiency can be written as follows:
From the explanation above we can conclude that, if the design targets maximum output power, both intermediate efficiencies will be low, resulting in a low overall efficiency. To increase the efficiency, the specification of maximum output power must be relaxed. This means that the target resistance must be made larger to increase the intermediate efficiency. A PA with optimal efficiency is obtained when both intermediate efficiencies are designed to be equal. To do so, scaling the NMOS transistor is necessary. The conclusion is that for a PA in CMOS two different approaches can be followed. It can be designed for maximum output power [1,6], or it can be designed for maximum power added efficiency [7]. The power added efficiency is defined as the output power minus the input power, divided by the supply power. These approaches are substantially different and must be well understood.
3.2. Inductor Losses
For each of the inductors given in figure 1, a parasitic resistance can be specified. Over these resistances a voltage drop can be measured. In the case of the DC feed inductor, this voltage drop lowers the supply voltage to:
As the output power is proportional to the square of the DC voltage seen by the drain, the above expression can be transformed into:
This leads to an intermediate efficiency for the DC feed inductor, which is defined as:
For the value calculated with (1), the class E conditions no longer apply. To fulfill the class E conditions for the actual load resistance, an adjusted excess inductance can be calculated. The expression for the excess inductance is given by:
A fraction of the output power is absorbed in the parasitic resistance of the excess inductance. The degraded output power in the target resistor R is given by converting formula (4) to:
The extra term in the above formula takes the influence of the excess inductance into account. In figure 2 the efficiency for the excess inductance is plotted.
The picture clearly shows that the parasitic resistance of the inductor has to be small for high efficiencies. On-chip inductors on today’s silicon substrates typically have a value that is too high for this. This implies that, if standard CMOS is aimed for, only bonding wire inductors can be selected for the excess inductance and the DC feed inductance. Only when going to Silicon-On-Insulator structures or GaAs substrates can these inductors be made with low enough resistance. For these technologies, fully integrated power amplifiers, as demonstrated in [10] and [11], can be made. However, due to the ever increasing number of metal layers [12] and Cu metallization in deep-submicron CMOS, integration of the inductors can become an option in the near future.
3.3 Dirac Losses due to imperfections
A practically built class E amplifier will have tolerances on its components. These non-idealities produce a Dirac impulse in the current waveform when the switch closes, see fig. 3.
The Dirac pulse is smeared out in the time domain, because the current through the NMOS transistor has a finite response time. The result is that voltage and current overlap during a short time. This loss can be estimated by varying the component values and measuring the power loss. An intermediate efficiency of 95% due to the Dirac impulse is a realistic assumption.
3.4 Combining the efficiencies
The maximum overall efficiency of the amplifier can be found by combining the intermediate efficiencies of the previous sections. This results in the expression:
The efficiencies that are related to the power dissipated in the transistor are summed; all the others are multiplied. Once again we emphasize that for a maximum efficiency design the intermediate efficiencies have to be equal.
4. Impact of the transistor junction capacitance
The junction capacitance of the transistor cannot be ignored in the design process. This non-linear capacitor can be modeled by using the model:

$$C_j(V) = \frac{C_{j0}}{\left(1 + V/V_{bi}\right)^{m}}$$

where V is the reverse voltage over the junction, $V_{bi}$ is the built-in voltage of the junction, $C_{j0}$ is the zero-bias capacitance and m is the grading coefficient. The tolerated current per contact area does not scale proportionally with the technology. This keeps the drain area relatively constant. The resulting zero-bias capacitance is therefore practically the same when scaling down in technology, so a single reference value can be used. The grading coefficient typically ranges from 0.55 to 0.9 for sub-micron processes. In [13] it was shown that the peak drain voltage becomes higher for a class E amplifier with a non-linear shunt capacitance:
The current waveform, the output voltage and the load network component values are unaffected. The above expression can be normalized with respect to the supply voltage. This normalized maximum drain voltage is plotted in figure 4 as a function of Vdd/Vbi.
From the expression of the peak drain voltage above, a correction factor can be introduced. The factor 3.56 is the normalized maximum drain voltage for an ideal class E amplifier with infinite DC feed. E.g., for a Vdd/Vbi of 4, the correction factor takes a value between 1.15 and 1.30 (see Fig. 4). This required correction has serious consequences for the design. The amplifier has to remain under the junction and oxide breakdown voltage of the device to guarantee the reliability of the amplifier. This means that the supply voltage has to be lowered, leading to less output power and efficiency. The unavoidable wiring capacitance will help to linearize the total shunt capacitance. Making the wiring capacitance large with respect to the non-linear junction capacitance of the NMOS switch leads to an optimal performance. So, once again, sizing of the switch is required. To maintain a low Ron resistance, methods have to be used for increasing the shunt capacitance. This is certainly true when going to higher frequencies. For higher frequencies the shunt capacitance from (2) becomes smaller, and as a result the non-linear shunt capacitance plays a more dominant role. Two possible ways of increasing the shunt capacitance will be discussed in the following section.
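The effect of the non-linear junction capacitance on the permitted supply voltage can be sketched numerically. The code below is a minimal illustration, assuming the standard reverse-biased junction capacitance model C_j0 / (1 + V/V_bi)^m and assuming that the correction factor simply multiplies the ideal normalized peak drain voltage of 3.56. The built-in voltage is a hypothetical value; the peak voltage limit of 7.5 V is the figure quoted later in this chapter for the technology of [7].

```python
IDEAL_PEAK_FACTOR = 3.56  # normalized peak drain voltage of the ideal class E stage


def junction_capacitance(v_reverse, c_j0, v_bi, m):
    """Reverse-biased junction capacitance, C_j0 / (1 + V/V_bi)**m (standard model)."""
    return c_j0 / (1.0 + v_reverse / v_bi) ** m


def max_supply_voltage(v_peak_max, correction):
    """Largest supply voltage that keeps the peak drain voltage below v_peak_max.

    Assumes peak drain voltage = IDEAL_PEAK_FACTOR * correction * supply voltage.
    """
    return v_peak_max / (IDEAL_PEAK_FACTOR * correction)


# Hypothetical junction parameters (assumptions for illustration only).
V_BI = 0.8    # built-in voltage in volt
C_J0 = 1.0    # zero-bias capacitance, normalized to 1
for m in (0.55, 0.9):
    c_rel = junction_capacitance(4.0 * V_BI, C_J0, V_BI, m)
    print(f"m = {m}: capacitance at V = 4*Vbi is {c_rel:.2f} of its zero-bias value")

# Supply voltage limits for the correction factors quoted in the text.
V_PEAK_MAX = 7.5
for k in (1.0, 1.15, 1.30):
    print(f"correction {k:.2f}: Vdd,max = {max_supply_voltage(V_PEAK_MAX, k):.2f} V")
```

The second loop shows how a correction factor between 1.15 and 1.30 pushes the usable supply voltage well below the value allowed for a purely linear shunt capacitance.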
4.1. Increasing the shunt capacitance
4.2. Method one
The first method is used in paper [7]. In this design the robustness of the class E topology to component variations is exploited [14]. Using a higher shunt capacitance than calculated with expression (2) lowers the peak drain voltage while improving the power added efficiency. The normalized maximum drain voltage is additionally lowered, because the ratio between the linear capacitance and the non-linear capacitance increases. Raising the supply voltage, which the decreased peak drain voltage permits, can compensate for the drop in output power. Maintaining the same output power as before is therefore not a problem. Figure 5 illustrates the effects of variation of the shunt susceptance for a class E amplifier with infinite DC choke and constant supply voltage. We observe that the output power is lowered for increasing shunt susceptance, without a substantial loss in efficiency. This can only be true when the DC resistance seen by the power supply is increased. Consequently, the parameter g, which links the load resistance with the DC resistance, must be raised as well. The resulting improvement is largely canceled, leaving the corresponding term in expression (13) almost unaffected. The improvement of the overall efficiency therefore stems from the elevated intermediate efficiency (10). So, this design method maximizes the output power, the shunt capacitance and the power added efficiency.
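The link between g, the DC resistance and the output power described above can be illustrated with a short calculation. This is a sketch under assumptions: the stage is taken to be lossless, and g is interpreted as the ratio of the RF output voltage amplitude to the product of DC current and load resistance, so that the DC resistance seen by the supply equals g²R/2. The load resistance is normalized to 1 Ω; the value g = 2.2 is the one quoted later in the chapter for the design of [7].

```python
def dc_resistance(g, r_load):
    """DC resistance seen by the supply of a lossless stage: Rdc = g^2 * R / 2."""
    return g ** 2 * r_load / 2.0


def output_power(v_dd, g, r_load):
    """Output power of a lossless stage: all supply power Vdd^2 / Rdc reaches the load."""
    return v_dd ** 2 / dc_resistance(g, r_load)


def supply_scaling_for_equal_power(g_new, g_ref):
    """Factor by which Vdd must rise to keep the output power when g goes from g_ref to g_new."""
    return g_new / g_ref


G_NOMINAL = 1.862  # nominal class E value quoted in the text
G_RAISED = 2.2     # value used for the design of [7] later in the chapter

for g in (G_NOMINAL, G_RAISED):
    print(f"g = {g}: Rdc = {dc_resistance(g, 1.0):.3f} * R, "
          f"Pout = {output_power(1.0, g, 1.0):.3f} * Vdd^2 / R")

print(f"supply increase needed: {supply_scaling_for_equal_power(G_RAISED, G_NOMINAL):.3f}x")
```

With g = 1.862 the familiar nominal coefficient of about 0.577 Vdd²/R appears. Raising g to 2.2 lowers the output power at constant supply voltage, which a supply increase of roughly 18% would restore under these assumptions.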
4.3. Method two
The second method is reported in [6]. In this design a more advanced CMOS technology is used instead of the technology used in the previous example. When scaling down, the oxide capacitance increases, because the oxide thickness is lowered.
As a consequence, a lower supply voltage and gate voltage have to be used to satisfy the breakdown requirements. The current flowing through the switch transistor, operating in its linear region, is proportional to the oxide capacitance and inversely proportional to the channel length. This means that for small transistor lengths more current is sourced. To exploit the better current capability of the NMOS transistor, the class E amplifier must be moved to an operating region that favors the current capability of the switch. This can be done by reducing the DC feed inductor. From [15] we know that the output power, RMS current and shunt capacitance then increase. We can clearly see in fig. 6 that the shunt capacitance is increased compared with the value calculated using formula (2). The total shunt capacitance of 37pF is large enough to have a good ratio between the wiring and junction capacitance. The achieved efficiency of 42 percent, with 0.9 Watt of output power, points to a design that is optimized for maximum output power. The design achieves the same maximum output power and efficiency as with the older CMOS technology, while the supply voltage is lowered to 1.8 volts.
5. Voltage compensation for Ron
In case the linear bulk capacitance dominates over the nonlinear bulk capacitance, the maximum peak drain voltage is determined by an effective voltage instead of the DC drain voltage:
The effective voltage used in expression (20) is the DC drain voltage minus the voltage drop over the on-resistance when the switch is closed. Rewriting the effective voltage results in:
From (21) we know that the effective voltage is always lower than Vdrain,dc. The result is that the supply voltage can be taken higher for the same voltage stress of the device. For the CMOS technology used in [7], the maximum allowable peak drain voltage is situated around 7.5 V. For the parameter values of this design, the maximum permitted supply voltage is set to 2.3 V. The following values were used to calculate the output power and efficiency for one side of the differential structure of figure 8: Vdd=2.3V; g=2.2; (L=1.8nH). The calculated output power equals 0.52 Watt with an efficiency of 66%. Due to the differential structure the total output power is multiplied by two, which gives a total output power of 1.04 Watt.
Consequently, the output power and efficiency calculated with formulas (12) and (13) closely approach the measured peak output power and PAE of figure 7. Notice that in the measurement the supply voltage is plotted up to the maximum allowable supply voltage of 2.3 V.
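The numbers quoted in this section can be cross-checked with a few lines of arithmetic. The sketch below assumes that expression (20) has the form 3.56 times the effective voltage, which is how the surrounding text describes it but cannot be confirmed from this copy. It also ignores the correction factor for the non-linear capacitance. The voltage drop it reports is merely the value implied by these assumptions, not a figure from the chapter.

```python
IDEAL_PEAK_FACTOR = 3.56  # normalized peak drain voltage of the ideal class E stage


def effective_voltage_limit(v_peak_max):
    """Largest effective voltage allowed, assuming Vpeak = 3.56 * Veff."""
    return v_peak_max / IDEAL_PEAK_FACTOR


def max_supply_with_ron(v_peak_max, v_drop_on):
    """Supply limit when the on-resistance drop is added back to the effective voltage."""
    return effective_voltage_limit(v_peak_max) + v_drop_on


V_PEAK_MAX = 7.5   # maximum allowable peak drain voltage quoted for [7], in volt
V_DD = 2.3         # maximum permitted supply voltage quoted in the text, in volt
P_SINGLE = 0.52    # calculated output power of one side, in watt
ETA = 0.66         # calculated efficiency of one side

v_eff = effective_voltage_limit(V_PEAK_MAX)
implied_drop = V_DD - v_eff          # drop needed to reach 2.3 V under the assumption
p_total = 2.0 * P_SINGLE             # differential structure doubles the output power
p_supply_single = P_SINGLE / ETA     # supply power drawn by one side

print(f"effective voltage limit : {v_eff:.3f} V")
print(f"implied Ron voltage drop: {implied_drop:.3f} V")
print(f"total output power      : {p_total:.2f} W")
print(f"supply power per side   : {p_supply_single:.3f} W")
```

Under these assumptions roughly 0.2 V of headroom is gained from the on-resistance drop, which is consistent in magnitude with raising the supply from about 2.1 V to the quoted 2.3 V.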
6. Cross-coupled output stage
An additional trick for raising the total efficiency is the use of a differential cross-coupled pair as output stage. The benefit of such a structure is clearly demonstrated in the 1.9GHz design published in [8]. The inner transistors are driven with a gate voltage of 3.6 times the supply voltage. This means that the inner transistors are roughly three times better than the outer transistors. The on-resistance of the transistor is therefore lowered, while the gate capacitance and non-linear shunt capacitance are kept as low as possible. The differential output is not a handicap, because in portable applications the amplifier can be placed close to the antenna.
7. Frequencies above 1GHz
In spite of some differences between the 700MHz design [7] and the 1.9GHz design [8], the following observation can be made. In the 700MHz design a 1.8nH inductor is used for the driver stage. When moving this design to a frequency of 1.9GHz, the new inductor value for the driver stage becomes equal to 0.25nH. Such an inductance value is difficult to manufacture. In the 1.9GHz design, see fig. 9, an inductor value of 0.37nH is used instead. Hence, to operate at the same resonance frequency, a smaller switch transistor must be selected. Comparing the outer transistor sizes used in the two designs reveals that this action has been taken. The transistor width is reduced from 6.5 mm to 3.6 mm. This reduction has a large effect on the overall efficiency, because the product of the intermediate efficiencies decreases. The power added efficiency for the 1.9GHz design is lowered to 48%, while an output power of 1.1 Watt is achieved.
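The 0.25nH figure follows from keeping the tuned capacitance constant while raising the frequency, since the resonance condition gives L = 1/(ω²C). The sketch below reproduces this scaling and also computes the capacitance each inductor resonates with. It assumes a simple LC resonance in which all capacitance at the gate node is lumped into one value; the function names are illustrative.

```python
import math


def scaled_inductance(l_ref, f_ref, f_new):
    """Inductance that keeps the same capacitance tuned when moving from f_ref to f_new."""
    return l_ref * (f_ref / f_new) ** 2


def tuned_capacitance(inductance, freq):
    """Capacitance that resonates with the given inductance at freq: C = 1 / (omega^2 * L)."""
    omega = 2.0 * math.pi * freq
    return 1.0 / (omega ** 2 * inductance)


L_700 = 1.8e-9     # driver inductor of the 700MHz design [7]
L_1900 = 0.37e-9   # driver inductor actually used in the 1.9GHz design [8]

l_scaled = scaled_inductance(L_700, 700e6, 1.9e9)
c_700 = tuned_capacitance(L_700, 700e6)
c_1900 = tuned_capacitance(L_1900, 1.9e9)

print(f"scaled inductor at 1.9GHz : {l_scaled * 1e9:.3f} nH")
print(f"capacitance tuned at 700MHz by 1.8nH : {c_700 * 1e12:.1f} pF")
print(f"capacitance tuned at 1.9GHz by 0.37nH: {c_1900 * 1e12:.1f} pF")
```

The scaled value of roughly 0.24nH matches the rounded 0.25nH in the text. The larger 0.37nH inductor can only tune a smaller capacitance, which is why a narrower switch transistor is needed.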
The 0.37nH inductors, in figure 9, are made by placing bonding wires in parallel. The inductance value for parallel bonding wires is given by:
The mutual term represents the coupling between two inductors. When the current through two neighboring bonding wires flows in opposite directions, a negative sign for this term is obtained. The result is a lower inductance and a higher parasitic resistance for the inductor. The maximum number of bonding wires that can be placed is limited by the parasitic capacitance of the bonding pads. Achieving very small inductance values therefore requires additional measures. In the 1.9GHz design the die was thinned to lower the distance between the die and the substrate.
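How mutual coupling limits the benefit of paralleling bonding wires can be shown with a simplified model. The sketch assumes n identical wires that all carry the same current and share one mutual inductance M between every pair, which gives (L + (n − 1)·M)/n. This is a textbook simplification and not necessarily the expression used in this chapter; the wire values are hypothetical.

```python
def parallel_bondwire_inductance(l_wire, m_mutual, n_wires):
    """Inductance of n identical parallel wires with equal pairwise mutual inductance.

    Each wire carries 1/n of the current, so it sees its own inductance plus the
    mutual inductance of the (n - 1) other wires: L_eq = (L + (n - 1) * M) / n.
    """
    if n_wires < 1:
        raise ValueError("at least one wire is required")
    return (l_wire + (n_wires - 1) * m_mutual) / n_wires


# Hypothetical values (assumptions): 1 nH per wire, 0.3 nH coupling between wires.
L_WIRE = 1.0e-9
M_MUTUAL = 0.3e-9

for n in (1, 2, 4, 8):
    uncoupled = parallel_bondwire_inductance(L_WIRE, 0.0, n)
    coupled = parallel_bondwire_inductance(L_WIRE, M_MUTUAL, n)
    print(f"n = {n}: uncoupled {uncoupled * 1e9:.3f} nH, coupled {coupled * 1e9:.3f} nH")
```

In this model the inductance never drops below M however many wires are added, which illustrates why very small values need additional measures beyond simply adding wires.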
8. CMOS versus other technologies
Since high-efficiency CMOS PAs are non-linear class E topologies, they can only be compared with other non-linear PAs. Table 1 gives a summary for a few leading class E amplifiers. The table reveals that the output power of a fully Integrated Power Amplifier (IPA) is about 6 to 9 dB lower than that of a discrete power amplifier.
Also, the consumed chip area is a factor of 1.5 to 3 larger than that of a discrete power amplifier. The CMOS PA designed for maximum efficiency [7] can be regarded as equivalent to the IPAs given in table 1. After all, if the on-chip inductors were replaced by off-chip inductors, the output power and efficiency would increase to equivalent values. The benefit offered by technologies such as GaAs and SOI would of course vanish. Hence, for higher output powers it would always be better, for a given supply voltage, to design with off-chip inductors. For low voltage applications that require large output powers, CMOS is the best choice.
9. Linearization
Due to its high non-linearity, the operation of the CMOS PA is restricted to constant-envelope modulation schemes, such as GMSK. Two linearization schemes can be used to solve this problem.
The first method is the use of the LINC topology. In this method two matched power amplifiers are combined. Each of the nonlinear amplifiers produces a sine wave with a certain phase. When a coupler sums the sine waves, the phase relation between the two signals defines the amplitude of the resulting output signal. The problem is to isolate the two outputs of the amplifiers sufficiently. Only when a micro-strip coupler is used can enough isolation be achieved. The size of the micro-strip coupler is determined by the operating frequency; for frequencies lower than 2GHz this micro-strip structure is too big to be practical. Another disadvantage is that the resulting efficiency is reduced to half of the efficiency of a single power amplifier [17].
The second method seems to be more elegant. In the Envelope Elimination and Restoration (EE&R) technique, see figure 10, the power supply is modulated to produce different output powers (amplitudes) [18]. We notice from figure 7 that the relation between output power and supply voltage for a class E amplifier is not linear. A fraction of the output power is therefore fed back to the input of a differential amplifier, where it is compared with the wanted output signal. The resulting error signal is used to adjust the supply voltage of the nonlinear class E amplifier. A class S switching power supply generates the supply voltage for the class E PA. This low-frequency power supply modulator can easily be made in the same standard CMOS technology as the class E amplifier.
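Returning to the LINC scheme described earlier in this section, the way a phase difference sets the output amplitude follows from a trigonometric identity: two equal-amplitude sine waves with phases +φ and −φ add up to a single wave of amplitude 2A·cos φ. The sketch below assumes an ideal, lossless summing coupler and perfectly matched amplifiers; the function names are illustrative.

```python
import math


def linc_output_amplitude(a, phi):
    """Amplitude of a*cos(wt + phi) + a*cos(wt - phi), which equals 2*a*cos(phi)."""
    return 2.0 * a * math.cos(phi)


def linc_phase_for_amplitude(target, a):
    """Outphasing angle phi (radians) that yields the target amplitude, 0 <= target <= 2a."""
    if not 0.0 <= target <= 2.0 * a:
        raise ValueError("target amplitude must lie between 0 and 2*a")
    return math.acos(target / (2.0 * a))


def summed_sample(a, phi, wt):
    """Direct sum of the two constant-envelope signals at phase angle wt."""
    return a * math.cos(wt + phi) + a * math.cos(wt - phi)


A = 1.0  # amplitude of each constant-envelope amplifier (normalized)
for target in (2.0, 1.0, 0.5):
    phi = linc_phase_for_amplitude(target, A)
    print(f"target amplitude {target:.1f}: phi = {math.degrees(phi):.1f} deg, "
          f"check = {linc_output_amplitude(A, phi):.3f}")
```

Both amplifiers keep running at full, constant amplitude whatever the output level, which is why the combined efficiency suffers when the output is backed off.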
The class E amplifier always operates under the recommended nominal power supply. This gives the class S amplifier enough headroom to produce the maximum output voltage for the class E amplifier. The efficiency of the class S amplifier depends on the bandwidth, output swing and allowable distortion. For a small bandwidth and low distortion, good efficiencies can be achieved. In paper [18] an efficiency of 80% was achieved for a bandwidth of 20 kHz, a peak sinusoidal signal of 0.8V and a distortion of –55dBc. The resulting efficiency for the combined system of the class S and class E amplifier is the product of both efficiencies. When an efficiency of 42% (maximum output power design) is assumed for the class E amplifier, a resulting efficiency of about 33.6% can be achieved. This efficiency is far better than that of amplifiers that use Output power Back-Off (OBO) to linearize their output power. For linear amplifiers using orthogonal frequency division multiplexing (OFDM), an OBO of 6 dB must be used to accommodate the strongly fluctuating envelope of the OFDM signal. The linear amplifier will accordingly have an efficiency below 13%. Hence, a CMOS power amplifier with EE&R is a strong candidate for this type of application [19]. A supplemental advantage of using EE&R is that the amplifier and the power supply modulator can be designed and used separately.
Modulations other than OFDM can also be used. The only restriction is that transitions through the origin in the constellation diagram must be avoided, because the swing of the class S amplifier is limited. The following modulations can be used: π/4-DQPSK (NADC/IS-54), OQPSK (CDMA) and 3π/8-offset 8-PSK (GSM EDGE). In the example of [18] the maximum swing is 1.6V, which sets the maximum supply voltage to 2.3V. The resulting maximum output power of 28dBm is required to comply with the North American Digital Cellular standard. The corresponding output power for the minimum supply voltage of 0.7V is 6dBm. This means that signals with amplitude variations of 22dB, which is more than actually needed, can be transmitted. Figure 11 presents two constellation diagrams for the example of [18]. A large improvement in performance, due to the EE&R, can be observed.
[Figure 11: QPSK constellation of the CMOS PA]
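The efficiency comparison and the dynamic range quoted in this section can be reproduced with a few lines of arithmetic. The sketch assumes that the backed-off linear amplifier is an ideal class A stage with 50% peak efficiency whose efficiency falls in proportion to the output power. That model is an assumption chosen because it reproduces the "below 13%" figure; it is not stated in the chapter.

```python
def combined_efficiency(eta_modulator, eta_pa):
    """Efficiency of a supply modulator feeding a PA: the product of both efficiencies."""
    return eta_modulator * eta_pa


def class_a_backoff_efficiency(obo_db, eta_peak=0.5):
    """Efficiency of an ideal class A stage at obo_db of output back-off (assumed model)."""
    return eta_peak * 10.0 ** (-obo_db / 10.0)


def dbm_to_watt(p_dbm):
    """Convert a power level in dBm to watt."""
    return 1e-3 * 10.0 ** (p_dbm / 10.0)


def dynamic_range_db(p_max_dbm, p_min_dbm):
    """Difference between two power levels in dBm; the result is in dB."""
    return p_max_dbm - p_min_dbm


eta_eer = combined_efficiency(0.80, 0.42)     # class S modulator times class E PA
eta_linear = class_a_backoff_efficiency(6.0)  # linear PA with 6 dB OBO

print(f"EE&R system efficiency : {eta_eer * 100:.1f} %")
print(f"linear PA at 6 dB OBO  : {eta_linear * 100:.1f} %")
print(f"28 dBm corresponds to  : {dbm_to_watt(28.0):.2f} W")
print(f"dynamic range          : {dynamic_range_db(28.0, 6.0):.0f} dB")
```

Note that the difference between two levels expressed in dBm is a ratio and therefore carries the unit dB.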
10. Conclusion
As explained, CMOS PAs can be designed for maximum output power or for maximum efficiency. Independent of the choice of operating point, CMOS PAs can compete with low voltage designs in other technologies. When moving to output powers higher than 25dBm, CMOS becomes the best candidate. Using more expensive technologies then loses its attractiveness, because the inductors for the IPA designs cannot be integrated.
References
[1] D. Su and W. McFarland, “A 2.5V, 1W Monolithic CMOS RF Power Amplifier”, Proceedings of the Custom Integrated Circuits Conference, IEEE, May 1997, pp. 189-192.
[2] N. Sokal and A. Sokal, “Class E – A new class of high-efficiency tuned single-ended switching power amplifiers”, IEEE JSSC, Vol. 10, No. 3, June 1975, pp. 168-176.
[3] H. J. M. Veendrick, “Short circuit dissipation of static CMOS circuitry and its impact on the design of buffer circuits”, IEEE JSSC, Vol. sc-19, No. 4, August 1984, pp. 468-473.
[4] Chen Wen, H. Floberg and Qiu Shui-Sheng, “A New Analytical method for analysis and design of class E power amplifiers taking into account the switching device on resistance”, International Journal of Circuit Theory and Applications, 27, 1999, pp. 421-436.
[5] F. H. Raab and N. Sokal, “Transistor Power Losses in the class E tuned power amplifier”, IEEE JSSC, Vol. sc-13, No. 6, December 1978, pp. 912-914.
[6] C. Yoo and Q. Huang, “A Common-Gate switched, 0.9W Class E power amplifier with 41% PAE in CMOS”, VLSI Circuits Symposium, June 2000, pp. 56-57.
[7] K. Mertens, M. Steyaert and B. Nauwelaers, “A 700MHz, 1W fully differential Class E power amplifier in CMOS”, ESSCIRC, September 2000, pp. 104-107.
[8] K. C. Tsai and P. R. Gray, “A 1.9GHz 1W CMOS Class E power amplifier for wireless communications”, IEEE JSSC, Vol. 34, No. 7, July 1999, pp. 962-970.
[9] Thomas H. Lee, The Design of CMOS Radio-Frequency Integrated Circuits, ISBN 0521639220, Cambridge University Press, 1998, p. 52.
[10] T. Sowlati, C. Salama, J. Sitch, G. Rabjohn and D. Smith, “Low voltage, High efficiency GaAs class E power amplifiers for wireless transmitters”, IEEE JSSC, Vol. 30, No. 10, October 1995, pp. 1074-1079.
[11] Y. Tan, M. Kumar, J. Sin, L. Shi and J. Lau, “A 900 MHz fully integrated SOI power amplifier for single-chip wireless transceiver applications”, IEEE JSSC, Vol. 35, No. 10, October 2000, pp. 1481-1486.
[12] A. Jain, W. Anderson, et al., “A 1.2GHz Alpha Microprocessor with 44.8GB/s Chip Pin Bandwidth”, Digest of Technical Papers ISSCC, 6 February 2001, pp. 240-241.
[13] P. Alinikula, K. Choi and S. I. Long, “Design of Class E power amplifier with nonlinear parasitic output capacitance”, IEEE Transactions on Circuits and Systems - II: Analog and Digital Signal Processing, Vol. 46, No. 2, February 1999, pp. 114-119.
[14] F. H. Raab, “Effects of circuit variations of the Class E tuned power amplifier”, IEEE JSSC, Vol. sc-13, No. 2, April 1978, pp. 239-247.
[15] R. E. Zulinski, “Class E power amplifiers and frequency multipliers with finite DC-Feed inductance”, IEEE Transactions on Circuits and Systems, Vol. cas-34, No. 9, September 1987, pp. 1074-1087.
[16] S. L. Wong, H. Bhimnathwala, S. Luo, B. Halali and S. Navid, “A 1W 830MHz Monolithic BiCMOS power amplifier”, Digest of Technical Papers ISSCC 1996, pp. 52-53.
[17] S. Tomisato, K. Chiba and K. Murota, “Phase error free LINC modulator”, Electronics Letters, Vol. 25, No. 9, April 1989, pp. 576-577.
[18] D. K. Su and W. J. McFarland, “An IC for linearizing RF Power amplifiers using envelope elimination and restoration”, IEEE JSSC, Vol. 33, No. 12, December 1998, pp. 2252-2258.
[19] W. Liu, J. Lau and R. S. Cheng, “Considerations on applying OFDM in a highly efficient power amplifier”, IEEE Transactions on Circuits and Systems - II: Analog and Digital Signal Processing, Vol. 46, No. 11, November 1999, pp. 1329-1336.