HANDBOOK OF RECORDING ENGINEERING FOURTH EDITION
by
John Eargle
JME Consulting Corporation
Los Angeles, CA, USA

Springer
Eargle, John.
Handbook of recording engineering / by John Eargle. 4th ed.
p. cm.
Includes bibliographical references and index.
ISBN 1-4020-7230-9 (alk. paper)
1. Sound - Recording and reproducing. I. Title.
TK7881.4.E16 2002
621.389'3-dc21
2002032065
ISBN 0-387-28470-2 (SC) ISBN 978-0387-28470-5 ISBN 1-4020-7230-9 (HC)
e-ISBN 0-387-28471-0
Printed on acid-free paper.
First softcover printing 2006 © 2003 Springer Science+Business Media, Inc. (hardcover edition) All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer Science+Business Media, Inc., 233 Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden. The use in this publication of trade names, trademarks, service marks and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights. Printed in the United States of America. 9 8 7 6 5 4 3 2 1 springeronline.com
SPIN 11545002
CONTENTS

Preface vii

SECTION 1. FOUNDATIONS IN ACOUSTICS
Chapter 1. Acoustics in the Modern Studio 1
Chapter 2. Psychoacoustics: How We Hear 28

SECTION 2. MICROPHONES
Chapter 3. Microphones: Basic Principles 46
Chapter 4. Microphones: The Basic Pickup Patterns 53
Chapter 5. Environmental Effects and Departures from Ideal Performance 65
Chapter 6. Microphones: Electronic Performance and the Electrical Interface 74
Chapter 7. Microphone Accessories 84

SECTION 3. RECORDING SYSTEMS: ANALYSIS, ARCHITECTURE, AND MONITORING
Chapter 8. Basic Audio Signal Analysis 94
Chapter 9. Recording Consoles, Metering, and Audio Transmission Systems 107
Chapter 10. Monitor Loudspeakers 139

SECTION 4. RECORDING TECHNOLOGY
Chapter 11. Analog Magnetic Recording and Time Code 154
Chapter 12. Digital Recording 184
Chapter 13. The Digital Postproduction Environment 201

SECTION 5. SIGNAL PROCESSING
Chapter 14. Equalizers and Equalization 213
Chapter 15. Dynamics Control 222
Chapter 16. Reverberation and Signal Delay 232
Chapter 17. Special Techniques in Signal Processing 242

SECTION 6. RECORDING OPERATIONS
Chapter 18. Fundamentals of Stereo Recording 254
Chapter 19. Studio Recording and Production Techniques 267
Chapter 20. Classical Recording and Production Techniques 290
Chapter 21. Surround Sound Recording Techniques 311

SECTION 7. PRODUCTION SUPPORT FUNCTIONS
Chapter 22. Mixing and Mastering Procedures 326
Chapter 23. Music Editing and Assembly 338

SECTION 8. CONSUMER MEDIA
Chapter 24. Recorded Tape Products for the Consumer 352
Chapter 25. Optical Media for the Consumer 359
Chapter 26. The Stereo Long-Playing (LP) Record 371

SECTION 9. STUDIO DESIGN FUNDAMENTALS
Chapter 27. Recording Studio Design Fundamentals 394

Bibliography 409
Index 424
PREFACE

The fourth edition of the Handbook of Recording Engineering follows the same broad subject outline as the third edition and includes new data on the many developments that have taken place in digital technology and surround sound recording techniques during the last six years. The emphasis of the book has shifted slightly toward needs voiced by teachers of recording technology, and students will find this edition easier to read and study than the earlier editions. Sidebars have been introduced in many of the chapters for detailed technical follow-up, leaving the body of the text free for general commentary.

The book is divided broadly into nine sections, described below:

1. Foundations in Acoustics. The studio itself becomes the laboratory for our discussion of both acoustics and psychoacoustics.
2. Microphones. The microphone is indeed the central creative tool of our industry, and the subject is given five chapters of its own.
3. Recording Systems: Analysis, Architecture, and Monitoring. A new chapter on audio signal analysis pulls together into a single chapter many concepts previously covered in multiple chapters. The modern in-line console is explained in greater depth than in previous editions.
4. Recording Technology. While analog recording retains its pre-eminence in basic tracking activities, the disc-based digital workstation has become the primary digital tool for both multichannel recording and postproduction work.
5. Signal Processing. Major developments in this area include plug-in "modules" for digital workstations that literally duplicate the highly esteemed equalizers and compressors of the past, and new sampling-type reverberation systems that duplicate the acoustics of famous performance venues around the world.
6. Recording Operations. In the last six or so years, surround sound has attained maturity and is now given parity with stereo techniques.
7. Production Support Functions. The techniques of mixing, music editing, and assembly remain much as before and are essential activities in the real world of audio.
8. Consumer Media. Along with high-performance media such as DVD audio and the SACD, the half-century-old stereo LP retains its position as the medium of choice for DJ-driven dance music, and as such it deserves its own chapter.
9. Studio Design Fundamentals. While greater numbers of pop and classical music releases are postproduced in the home project environment, the professional studio remains the center of tracking activities for music of all kinds.
John Eargle Los Angeles, 2002
Chapter 1
ACOUSTICS IN THE MODERN STUDIO
INTRODUCTION

A basic knowledge of acoustics is essential for all recording engineers, and there is no better place to start than in the studio itself. In this chapter, we will cover the development of both simple and complex waves. We then move on to sound behavior in rooms, developing the concepts of sound transmission, absorption, reflection, and reverberation. Directionality of typical sound sources in the studio will be briefly discussed, and we end the chapter with a discussion of sound behavior in small spaces, such as isolation booths and reverberation chambers.
THE BASICS OF SOUND

Sound waves are produced by variations in air pressure above and below its normal static value. For musical signals the time repetition interval for the variation is called its period, which is composed of one cycle of the wave. The number of cycles per second is called the frequency of sound, normally denoted in hertz (Hz). The magnitude of the signal is known as its amplitude, and the time relation between two signals of the same frequency is specified as the phase relationship. For convenience, we state that there are 360 degrees in a single cycle of a sound wave, and relative phase relationships are normally stated in degrees. Figure 1-1 shows these basic relationships.

For young persons with normal hearing, audible sound covers the frequency range from about 20 Hz up to about 20,000 Hz. The abbreviation kHz stands for 1,000 Hz (k is the abbreviation of the Greek kilo, meaning "one thousand"). We can then relabel 20,000 Hz as 20 kHz.

The range of loudness of sound is fairly wide and is shown in Figure 1-2. The solid curve indicates the audible frequency and loudness ranges over which we normally perceive sound. You can identify the frequency scale along the bottom of the graph. The portion of the solid curve at the bottom of the graph is known as the threshold of hearing, or minimum audible field (MAF). Any sounds below this range are not normally heard.
Figure 1-1. A sine wave showing period, frequency, phase, and amplitude.
The part of the curve at the top is known as the threshold of feeling; any sound in this range or higher is likely to cause a tingling sensation in the ear, or even be painful to the listener. The vertical scale at the left of the graph is stated in decibels (dB). This term is described in Sidebar 1.1, but for now just remember that each 20 dB on the vertical scale represents a 10-to-1 sound pressure difference. The total 120 dB range represented on the vertical scale corresponds to an overall pressure difference of a million-to-one between the loudest and softest sounds we can hear. You can also see as you examine Figure 1-2 that the ear is much more sensitive to low-level sounds in the range between 1 kHz and 5 kHz than it is at higher and lower frequencies. This and other hearing phenomena will be discussed in detail in Chapter 2.
Figure 1-2. Total range of hearing (solid curve); normal ranges of music (dashed curve) and speech (dotted curve).

Sidebar 1.1: Introduction to the decibel

The decibel (dB) is a convenient way to express the ratio of two powers, and that ratio is always expressed by the term level. The bel is defined as:

Level = log (W1/W0) bel

where log indicates the logarithm to the base 10. More conveniently, we use the decibel, which is one-tenth bel:

Level = 10 log (W1/W0) decibel

Let our reference power, W0, be 1 watt; then 2 watts represents a level of 3 dB, relative to 1 watt:

Level = 10 log (2/1) = 10(0.3) = 3 dB

Extending the ratio, 4 watts represents a level of 6 dB, relative to 1 watt:

Level = 10 log (4/1) = 10(0.6) = 6 dB

In a similar manner, 10 watts represents a level of 10 dB, relative to 1 watt:

Level = 10 log (10/1) = 10(1) = 10 dB
Figure 1-3 presents a useful nomograph for determining by inspection the level in dB of various power ratios in watts. Simply locate the two power values along the nomograph and read the level difference in dB between them.

Figure 1-3. Nomograph for reading power ratios in watts directly in dB.
EXAMPLE: Find the level difference in dB between the maximum output capability of a 20-watt amplifier and a 500-watt amplifier. Above 20 watts read 13; above 500 watts read 27. Then:

Level difference = 27 - 13 = 14 dB

You can see that the relative levels between 100 and 10 watts, 60 and 6 watts, 4 and 0.4 watt are all the same: 10 dB. Obviously, the relative level of any 10-to-1 power ratio is 10 dB. Likewise, the relative level of any 2-to-1 power ratio is 3 dB.

Sound power is proportional to the square of sound pressure, and from this we get the relationship:

Sound pressure level (SPL) = 10 log (P1/P0)^2 = 20 log (P1/P0)

The reference pressure for zero dB SPL is given as the very small value of 20 micropascals. We will not deal directly with pascals, but only with the pressure levels they produce. Here is an example: what is the SPL corresponding to a pressure of one pascal (10^6 micropascals)?

SPL = 20 log (10^6/20) = 20 log (50,000) = 94 dB
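The level and SPL calculations in this sidebar are easy to script. Here is a minimal Python sketch (the function names are our own, not from the text) that reproduces the worked examples above:

```python
import math

def level_db(power_watts, ref_watts=1.0):
    """Level in dB of a power ratio (10 log rule)."""
    return 10 * math.log10(power_watts / ref_watts)

def spl_db(pressure_pa, ref_pa=20e-6):
    """Sound pressure level in dB re 20 micropascals (20 log rule)."""
    return 20 * math.log10(pressure_pa / ref_pa)

print(level_db(2))                    # 2 W re 1 W   -> ~3 dB
print(level_db(10))                   # 10 W re 1 W  -> 10 dB
print(level_db(500) - level_db(20))   # 500 W vs 20 W -> ~14 dB
print(spl_db(1.0))                    # 1 pascal     -> ~94 dB SPL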
The dynamic ranges typically occupied by music (dashed curve) and speech (dotted curve) are also shown in Figure 1-2. Music in a concert hall is normally perceived over a dynamic range that doesn't exceed about 80 dB, and speech is normally perceived over an even narrower range of about 40 dB.

In most aspects of audio engineering, the horizontal scale on a frequency response graph is skewed to make musical intervals such as the octave appear equally spaced. The logarithmic (log) frequency scale preserves this relationship and is used in Figure 1-2. See Sidebar 1.2 for more discussion of the log scale.
Sidebar 1.2: The log frequency scale

Figure 1-4A shows a typical grid for presenting audio frequency response data. 500 Hz and its succeeding octave values are shown at black markers, and you will note that these values are all equally spaced. By comparison, Figure 1-4B shows a grid with a linear frequency scale. As before, the black markers show 500 Hz and its octaves. As we go up in frequency the markers become more widely spaced, and this is counter to the way we actually perceive the octave intervals.

Figure 1-4. Typical log frequency scale, with the response of a dynamic microphone (A); linear frequency scale (B).
SIMPLE SOUND WAVEFORMS

Sound travels approximately 1130 feet per second (344 meters per second) at normal temperature. The pressure variation of a low frequency sound of 100 Hz is shown in Figure 1-5A. Since the wave is in motion, the base line of the graph can be measured in time. At 100 Hz, the period, or time taken for the wave to repeat itself, is equal to 1 divided by the frequency: 1/100 = 0.01 seconds. The waveform shown here is that of a pure tone, known as a sine wave.

We can also relate the period of the wave to the actual length of the wave as it propagates through air. At a speed of 1130 feet per second, each cycle of the 100-Hz signal will have a wavelength of 1130/100, or 11.3 feet, as shown in Figure 1-5A. The same information is shown for a midrange frequency of 1 kHz in Figure 1-5B. Here, the wavelength of the signal is 1130/1000 = 1.13 feet, or about 13.5 inches.

Figure 1-5. Sample waveforms for 100 Hz (A); 1 kHz (B); and 10 kHz (C).
At a frequency of 10 kHz the wavelength in air will be 1130/10,000 = 0.113 feet, or about 1.35 inches (Figure 1-5C). At 20 kHz, the normal upper limit of audible sound, the wavelength is just a little over half an inch. By comparison, the wavelength at 20 Hz is 1130/20, or 56.5 feet, so the entire frequency range of audible sound covers a 1000-to-1 ratio of wavelengths. You will often see wavelength expressed as λ (Greek letter lambda). The relationships among frequency (f), wavelength (λ) and speed of sound (v) are:

f = v/λ
λ = v/f

Figure 1-6 shows the frequency ranges of various musical sources as they relate to the keyboard of a piano.
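As a quick check of these relationships, the following short Python sketch (our own illustration, not from the text) tabulates period and wavelength for several frequencies using the book's value of 1130 feet per second:

```python
SPEED_OF_SOUND_FT_S = 1130.0  # approximate speed of sound in air

def wavelength_ft(freq_hz):
    """Wavelength in feet: lambda = v / f."""
    return SPEED_OF_SOUND_FT_S / freq_hz

def period_s(freq_hz):
    """Period in seconds: T = 1 / f."""
    return 1.0 / freq_hz

for f in (20, 100, 1000, 10000, 20000):
    print(f"{f:>6} Hz: period {period_s(f):.5f} s, "
          f"wavelength {wavelength_ft(f):7.3f} ft")
# 100 Hz -> 11.3 ft; 1 kHz -> 1.13 ft; 10 kHz -> 0.113 ft (about 1.35 in)
```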
COMPLEX WAVEFORMS

Music and speech are largely composed of periodic complex waveforms that repeat at regular intervals. They consist of harmonics of a fundamental sine wave similar to that shown in Figure 1-7. The lowest frequency present, f0, is called the fundamental or first harmonic. A frequency of 2 times f0 represents the second harmonic, and so forth. The four harmonics in this example combine as shown to produce a complex waveform. Harmonics as high as the tenth or twelfth are common in brass instrument waveforms when those instruments are played loudly.

Many commonplace sounds exist for a very short time, and it is difficult to isolate any periodic behavior in the waveform. Figure 1-8A shows the recorded waveform of a hand clap. The total time occupied by the signal is only about half a second, and most of that consists of the acoustical "ringout" of room reflections after the hands have contacted each other. Figure 1-8B shows the waveform of the continuous sound "ah." We can see the periodic nature of the waveform, and we can also see slight cycle-to-cycle variations within the waveform. The spoken word "yes" is shown in Figure 1-8C. Note the clear delineation of the waveform for each component of the word. The note "A" below middle C played on a trombone is shown in Figure 1-8D. The harmonic structure of this steady-state waveform is fairly detailed, and the peak value of the waveform is much greater than its average value. (More on this in Chapter 8.)
Figure 1-6. Frequency ranges of instruments and voices compared to piano keyboard.
Figure 1-7. Combining sine waves, first four harmonic waveforms (A); harmonics represented on a frequency scale (B); summed waveform (C); the four contributing frequencies (D).
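The harmonic summation of Figure 1-7 can be sketched numerically. The Python fragment below sums the first four harmonics of a fundamental sample by sample; the relative harmonic amplitudes chosen here are arbitrary placeholders for illustration, not the values used in the figure:

```python
import math

f0 = 100.0    # fundamental frequency (Hz), arbitrary choice for the demo
fs = 8000.0   # sample rate (Hz)
amplitudes = [1.0, 0.5, 0.33, 0.25]  # assumed relative levels, harmonics 1-4

# Sum the first four harmonics over one period of the fundamental.
samples = []
for n in range(int(fs / f0)):
    t = n / fs
    s = sum(a * math.sin(2 * math.pi * (k + 1) * f0 * t)
            for k, a in enumerate(amplitudes))
    samples.append(s)

print(max(samples), min(samples))  # peak values of the summed complex wave
```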
SOUND BEHAVIOR IN A LARGE STUDIO

Figure 1-9. Perspective view of a large recording studio.

Figure 1-9 shows a perspective view of a modern studio used for film scoring as well as for laying down basic tracks for pop recording projects. Both walls and ceiling areas consist of individually adjustable sections so that the studio can be made "live" (reflective) or "dead" (absorptive), as required for the music being recorded. In pop recording, additional movable baffles (goboes)
would be used for further isolation of musicians, and sections of carpet would be laid down on the floor as needed.

Let's assume that a loud, impulsive sound is generated in the performing area of the studio. Sound radiates outward from the source in many directions; when it strikes a boundary some of the sound is absorbed and the remainder is reflected. In a relatively short time (less than about one second), the distribution of all reflections initiated by a single sound source will fill the room fairly densely. The absorption coefficient of the reflecting surface is given by α (Greek letter alpha), which determines the fraction of incident energy that is absorbed rather than reflected. For example, if a surface has an absorption coefficient of 0.5, then half the energy striking that surface will be absorbed, the other half reflected. The reflection/absorption process continues until the initial sound energy from the source has been dissipated. This process is shown in simplified form in Figure 1-10.
DEVELOPMENT OF REVERBERATION IN A STUDIO

Let's assume that the studio is empty and that wall and ceiling sections have been adjusted to make the room fairly live. Now, locate a loudspeaker in the middle of the studio and place a microphone about twenty feet away. If we "pulse" the loudspeaker with a short signal burst, the physical distribution of sound in the studio looks like that shown in Figure 1-11A, and the "time history" of the signal picked up at the microphone appears as shown in Figure 1-11B.
Figure 1-10. Sound striking a studio wall, showing effects of absorption and multiple reflections. The absorption coefficient at each surface is ᾱ, which is the average absorption coefficient in the space.
The first-arrival sound at the microphone comes directly from the loudspeaker, and this is followed by a series of early reflections reaching the microphone through reflections from nearby side walls, ceiling, and floor. Further reflections eventually arrive from all directions as the sound undergoes multiple reflections throughout the room. A listener positioned near the microphone would hear the following:

1. The initial, direct sound, which would be clearly identified as originating at the loudspeaker.
2. Early reflections (those in the range up to about 30 milliseconds), which would not be heard as such but which would enhance the perceived loudness of the direct sound.
3. Within a matter of half a second or so, the listener would be aware of a sense of space, or ambience, created by successive reflections coming from all directions in the studio.

These multiple reflections would produce a distinct "ringout," or decay of sound in the studio after the initial sound has stopped. The decay of sound is known as reverberation, and the reverberant signal would appear to arrive at the listener from all directions. In this setting, the studio would be appropriate for music performance and recording.
Figure 1-11. Development of reverberation in a live room. Physical distribution of sound reflecting in a room (A); time distribution of reflections at a microphone (B).
Let's now repeat this experiment with the studio's acoustics adjusted for maximum sound absorption. The physical distribution of sound in the studio looks as shown in Figure 1-12A, and the time history picked up at the microphone appears as shown in Figure 1-12B. In the dead studio, the initial reflections might be so low in level that further reflections appear to be nearly nonexistent. In this case a listener positioned at the microphone would hear the following:

1. The direct sound, clearly heard as originating at the loudspeaker.
2. Initial early reflections from walls and ceiling, which would only slightly enhance the loudness of the direct sound.
Figure 1-12. Sound in a dead room. Physical distribution (A); time distribution at a microphone (B).
3. Further sound reflections in the studio would very likely be too low in level to be detected by ear, and there would be little or no reverberant field as such. In this setting, the studio would be ideal for speech recording.

In Figures 1-11B and 1-12B, the dashed curves show the general outline of later reflections as they decay in the studio. If we replot these curves with the vertical scale in dB, as shown in Figure 1-13B, the decay contours will be straight lines, and from these we can calculate the reverberation time (T60) in the room. Acousticians have established T60 as the time, measured in seconds, for the reverberant sound level to diminish 60 dB after the sound source has shut off. Details of this are shown in Sidebar 1.3.
Figure 1-13. Sound decay plots. Pressure (A); sound pressure level (B).

Sidebar 1.3: Determining reverberation time in a room

Acousticians define the reverberation time of a space to be the time interval, measured in seconds, over which the level of a sound has decayed by 60 dB. In most rooms of regular shape with fairly live acoustics the reverberation time (T60) is given by the following equation:

T60 = 0.049 V / (S ᾱ)  (English units)

T60 = 0.16 V / (S ᾱ)  (metric units)

where V is the room volume, S is the room surface area, and ᾱ is the average absorption coefficient of all absorptive surfaces. The average absorption coefficient is given by:

ᾱ = (S1 α1 + S2 α2 + ... + Sn αn) / (S1 + S2 + ... + Sn)
where α1, α2 through αn are the absorption coefficients of the individual surface areas S1, S2, through Sn. English or metric units may be used.

Another useful equation gives the mean free path (MFP), which is the average distance sound travels between successive reflections as it decays in a reverberant space:

MFP = 4V/S

English or metric units may be used. Values of absorption coefficient are normally measured on octave centers, and some typical values are given in the following table:

Material                               125 Hz  250 Hz  500 Hz  1 kHz  2 kHz  4 kHz
Brick, unglazed                         0.03    0.03    0.03    0.04   0.05   0.07
Carpet, heavy, on concrete              0.02    0.06    0.14    0.37   0.60   0.65
Medium velour (draped to half area)     0.07    0.31    0.49    0.75   0.70   0.60
Concrete or terrazzo floor              0.01    0.01    0.015   0.02   0.02   0.02
Wood floor                              0.15    0.11    0.10    0.07   0.06   0.07
Plywood paneling, 3/8" thick            0.28    0.22    0.17    0.09   0.10   0.11
Here is an example of reverberation time calculation in a rectangular room at 500 Hz:

Room dimensions: length = 35 feet, width = 20 feet, height = 15 feet, volume = 10,500 cubic feet

Details of areas:

Material           α      Dimensions  Area (S)  Sα
Floor, wood        0.10   35 x 20     700       70
Ceiling, plywood   0.17   35 x 20     700       119
Wall, brick        0.03   20 x 15     300       9
Wall, brick        0.03   20 x 15     300       9
Wall, velour       0.49   35 x 15     525       257
Wall, velour       0.49   35 x 15     525       257

ᾱ = Total Sα / Total S = 721/3050 = 0.236

T60 = (0.049)(10,500) / [(3050)(0.236)] = 0.7 seconds

In this room the mean free path is: MFP = 4V/S = 13.8 feet

A typical measurement of reverberation time using a graphic level chart recorder is shown in Figure 1-14.
Calculation of reverberation time from the measured decay curve:
1. The sound source is stopped at t = 0.
2. The level falls 20 dB in one second.
3. Reverberation time = 1 x (60/20) = 3 seconds.

Figure 1-14. Calculation of reverberation time from a measured decay curve.
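The sidebar's worked example translates directly into code. Here is a minimal Python sketch (function names are our own) that reproduces the average absorption coefficient, T60, and mean free path computed above:

```python
def average_alpha(surfaces):
    """Area-weighted average absorption coefficient: sum(S*a) / sum(S)."""
    total_sa = sum(s * a for s, a in surfaces)
    total_s = sum(s for s, _ in surfaces)
    return total_sa / total_s, total_s

def t60_english(volume_ft3, total_s, alpha_bar):
    """Reverberation time, English units: 0.049 V / (S * alpha_bar)."""
    return 0.049 * volume_ft3 / (total_s * alpha_bar)

# Worked example from the sidebar: 35 x 20 x 15 ft room at 500 Hz.
surfaces = [
    (700, 0.10),  # floor, wood
    (700, 0.17),  # ceiling, plywood
    (300, 0.03),  # end wall, brick
    (300, 0.03),  # end wall, brick
    (525, 0.49),  # side wall, velour
    (525, 0.49),  # side wall, velour
]
alpha_bar, total_s = average_alpha(surfaces)
volume = 35 * 20 * 15                  # 10,500 cubic feet
print(round(alpha_bar, 3))             # ~0.236
print(round(t60_english(volume, total_s, alpha_bar), 2))  # ~0.7 s
print(round(4 * volume / total_s, 1))  # mean free path ~13.8 ft
```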
SOUND ATTENUATION WITH DISTANCE IN THE STUDIO; CRITICAL DISTANCE

In our next experiment let's restore the studio to its live configuration. If the studio you have access to cannot be adjusted for reverberant acoustics, you may find it more convenient to use a large gymnasium or other live space for these experiments. A person should be positioned at the middle of the studio. Ask that person to face in your direction and to read any text at a fairly constant level while you slowly back away from the talker. As you begin fairly close in to the talker, the sound field you hear will be dominated by the talker alone, and the sound pressure will fall by approximately one-half (6 dB) each time you double your distance from the talker. Stop moving away when you reach a point where the direct sound of the voice and reflected room sound both appear to be about equal. Mark that spot. Then, repeat the experiment, starting at a distant position and walking toward the reader until direct and reflected sound appear to be about equal. Again, mark the spot.
When you compare the two points you have marked, you may be surprised to find that they are roughly the same distance from the talker. This position where direct and reflected sound in a room appear to be about equal is known as critical distance (Dc). The entire procedure we have described here is shown in Sidebar 1.4.

Critical distance is an important factor in any recording activity. It depends on both an instrument's directionality and studio liveness. The more directional the instrument is, the greater critical distance will be along its main axis. If an instrument can "aim" or focus its output in a given direction, it will tend to swamp out or override room reverberation in that direction. If an instrument has low directivity, then reverberation will tend to dominate at shorter distances from the instrument. In a more absorptive room, critical distance will be greater because room reflections will be weaker.

In studio recording, microphones are normally placed well within critical distance, basically to ensure good isolation from other instruments. On stage in live sound reinforcement, floor microphones are normally placed in the transition region between direct and reverberant sound fields.

Sidebar 1.4: Attenuation of sound with distance; critical distance

The nomograph shown in Figure 1-15 shows the loss of direct sound level over distance. That loss follows the inverse square law; that is, there is a 4-to-1 reduction in acoustical power for each doubling of distance from the sound source. For example, if a source produces a level of 80 dB SPL at a distance of four feet, what will be the level at a distance of 16 feet? On the nomograph, locate the value of 4 at the bottom; above that read the level of 12 dB. Then, moving on to 16 feet on the bottom scale, read the level of 24 dB on the upper scale. Take the difference: 24 - 12 = 12 dB. Thus, the difference between the two readings is 12 dB, and the actual level at 16 feet will be 80 - 12, or 68 dB SPL.
Figure 1-15. Inverse square nomograph.

Critical distance is that distance away from a sound source where the direct sound field of the source and the reverberant field of the room are equal. It depends on the directivity of the sound source and the amount of sound absorption present in the room. For most rooms that are fairly live, it is given by the following equation:

Dc = 0.14 √(Q S ᾱ)
where Q is the directivity factor of the sound source, and S and ᾱ are as given in Sidebar 1.3. The directivity of the sound source is the ratio of sound emitted along a specific direction as compared to the sound that would be observed at the same position if the source were completely omnidirectional. Here are some typical values of Q:

Human voice (in the forward direction): 2
Trumpet (mid-range, along axis of bell): 5
High frequency 90° by 40° horn loudspeaker: 13
Figure 1-16 illustrates the merging of both direct and reverberant fields as we move away from a sound source in both live and dead rooms. Note that in the live room critical distance will be shorter than in the dead room.

Figure 1-16. Merging of direct and reverberant fields in a room.
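Both the inverse square example and the critical distance equation are simple to evaluate. The sketch below is our illustration; the Q = 2 voice figure comes from the table above, and the room data from Sidebar 1.3:

```python
import math

def distance_loss_db(d, d_ref):
    """Inverse-square loss in dB when moving from d_ref to d."""
    return 20 * math.log10(d / d_ref)

def critical_distance(q, total_s, alpha_bar):
    """Critical distance Dc = 0.14 * sqrt(Q * S * alpha_bar)."""
    return 0.14 * math.sqrt(q * total_s * alpha_bar)

# Sidebar example: 80 dB SPL at 4 ft; level at 16 ft?
print(80 - distance_loss_db(16, 4))   # 68 dB SPL

# Human voice (Q = 2) in the Sidebar 1.3 room (S = 3050 sq ft, alpha = 0.236):
print(round(critical_distance(2, 3050, 0.236), 1))  # Dc in feet, ~5.3
```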
Musical sources vary considerably in their directional properties, and we will discuss these in detail in later chapters. For the present we will consider only two representative sound sources, a trumpet and an 8-inch diameter loudspeaker mounted in a large wall. Both of these are simple sound radiators, as compared, for example, with a violin or a woodwind instrument. The directionality of the trumpet is shown in Figure 1-17A and that of the loudspeaker in Figure 1-17B. You can clearly see that both of these devices have fairly broad directivity at low frequencies, with rising directivity at higher frequencies.
Frequency (Hz): 500, 1 k, 2 k, 4 k; corresponding directivity factor Q: 2, 3, 3.5, 7.5 (data from the figure).

Figure 1-17. Directivity of a trumpet (A); directivity of an 8-inch diameter loudspeaker located in a large wall (B).
SOUND LOSSES DUE TO HUMIDITY

In a reflection-free environment, sound level falls off at 6 dB per doubling of distance from the source, and this is what we would see in a typical studio. However, in a very large space such as an auditorium or arena, the fall-off with distance at high frequencies would depend also on the humidity in the space. This effect is fairly small, and a typical example of it is shown in Figure 1-18. The loss is greatest when the humidity is lowest.

Figure 1-18. Effects of humidity on high frequency loss with distance (curves for 20% and 80% relative humidity at distances of 1.2 m/4 ft, 4 m/13 ft, 12 m/40 ft, and 40 m/130 ft).
MORE ON SOUND PROPAGATION

Sound bending around an obstacle

When there are no barriers or reflecting surfaces in its way, sound tends to travel in straight lines as it moves away from the source. However, we have all observed that sound goes around corners and bends around various obstacles in its path. This behavior is known as diffraction, and it depends on both the frequency of sound and the size of the objects it encounters. Diffraction is a complex phenomenon, and what we show here in Figure 1-19 is a fairly simplified description of it. At low frequencies (LF), sound simply wraps around the corner as shown at A. As frequency rises, sound continues to wrap around the corner, but to a lesser degree. At the highest frequencies (HF), the corner casts a "sound shadow" as shown at B. This is all in agreement with our everyday experience. We've all heard the muffled speech sound of someone about to come around a corner, and we aren't surprised to hear the high frequencies in the speech sound come to life when the person finally does round the corner.
Figure 1-19. Diffraction of sound around a corner. At low frequencies (A); at high frequencies (B).
Isolation in the studio

Baffles, often called "goboes," are used in the studio to provide added separation between instruments and microphones. Figure 1-20 shows how effective a large folding gobo can be in reducing sound leakage from a nearby unwanted source. You can see that there isn't much isolation at low frequencies, where the size of the gobo is small compared with the sound wavelength radiated from the source to be isolated. At frequencies in the range of about 80 Hz the gobo becomes effective, and as frequency rises the loss provided by the gobo attains values in the range of 10 to 12 dB. From the viewpoint of isolation, the larger the gobo the better. Many goboes are reflective on one side and absorptive on the other, and this adds to their usefulness in the studio.

Figure 1-20. Sound isolation properties of a large gobo. Top view (A); side view (B); isolation (dB) versus frequency at a fixed distance (C).
Sound propagation through a small opening

A small hole in a large wall provides a good path for sound transmission. If the hole is small, as shown in Figure 1-21, the reradiation of sound on the downstream side is omnidirectional, as shown.
Figure 1-21. Sound transmission through a small hole in a wall.
SOUND IN SMALL SPACES

Isolation rooms

Modern studios all have isolation areas that are essential parts of the studio complex, and the most common here is the so-called vocal booth. If this area is large enough it can also be used for isolating other instruments or groups, such as a drum kit. Anyone who has ever sung in the shower knows just how effectively the shower stall can amplify certain notes and minimize others. What we are hearing in this case are so-called room modes, those individual frequencies at which the small shower stall has a tendency to resonate. Sidebar 1.5 presents mathematical details on how room modes can be calculated.

Sidebar 1.5: Calculation of room modes in a rectangular space

The frequencies of the normal modes of a rectangular room are given by the following equation:

f = (c/2) √[(nl/l)² + (nw/w)² + (nh/h)²]

where c is the speed of sound; l, w, and h are, respectively, the length, width and height of the room; and nl, nw, and nh are, respectively, a set of integer values (0, 1, 2, 3, ...) taken in all possible combinations. The frequency values given by this equation represent those normal room
modes at which the room can produce standing waves, which reinforce sound levels in the room. In a small room these modes are fairly widely spaced and, if they are not carefully damped, give the room its own particular "sound." In a large studio the so-called modal density is very high, even down to fairly low frequencies, and we are normally not able to pick them out as such. It is only in fairly small spaces, such as vocal booths and acoustical reverberation chambers, where we have to be concerned. In a vocal booth we want to damp out all of the lower room modes; in a reverberation chamber we want to use the room modes to our advantage in producing a convincing reverberant effect.

Suppose we have a fairly live space with dimensions of length = 16 feet, width = 14 feet, and height = 8 feet. We can enter these values into the equation above and choose integer values of nl, nw, and nh in combination sets such as: (0, 0, 1), (0, 0, 2), (0, 0, 3), (0, 1, 0), (0, 1, 1), (0, 1, 2), (0, 1, 3), (0, 2, 0), (0, 2, 1), (0, 2, 2), (0, 2, 3), and (0, 3, 0). The modes corresponding to these three-digit groups of numbers are indicated in Figure 1-22; above the modal region the room response becomes fairly uniform.
Figure 1-22. Typical small-room mode structure at low frequencies.
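The mode equation in this sidebar can be evaluated by brute force over index combinations. Here is a minimal Python sketch for the 16 x 14 x 8 foot example (the index limit of 3 is an arbitrary choice for this illustration):

```python
import itertools

def room_modes(l, w, h, c=1130.0, n_max=3):
    """Normal-mode frequencies of a rectangular room (English units):
    f = (c/2) * sqrt((nl/l)^2 + (nw/w)^2 + (nh/h)^2)."""
    modes = []
    for nl, nw, nh in itertools.product(range(n_max + 1), repeat=3):
        if nl == nw == nh == 0:
            continue  # skip the trivial all-zero combination
        f = (c / 2) * ((nl / l) ** 2 + (nw / w) ** 2 + (nh / h) ** 2) ** 0.5
        modes.append((round(f, 1), (nl, nw, nh)))
    return sorted(modes)

# Sidebar example: 16 x 14 x 8 ft room; print the six lowest modes.
for f, indices in room_modes(16, 14, 8)[:6]:
    print(f, indices)   # e.g. the (1, 0, 0) mode falls at ~35.3 Hz
```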
In any vocal booth the main requirement is to damp out all room vibration modes so that they do not interfere with the microphone pickup. A sketch of a typical large isolation room is shown in Figure 1-23. A room of this type would have sufficient internal acoustical treatment to reduce the effects of room modes to near inaudibility. The material on the walls should be thick enough to ensure that sound absorption extends down to about 100 Hz. This is necessary to assure that the entire vocal range will be well damped. Another requirement of the vocal booth is that its walls be thick and heavy enough to prevent outside sound from the studio or control room from entering.
Figure 1-23. View of an isolation room adjacent to a studio.
Large studios will most likely have several isolation rooms to handle a number of recording requirements simultaneously. Those musicians in the isolation spaces will monitor the recording operations with stereo headphones.
Reverberation chambers

Another small space we want to discuss is the "echo," or reverberation, chamber. With today's excellent digital reverberation devices, acoustical reverberation chambers may not be the necessity that they once were. While many digital reverberation devices can do a very good job simulating acoustical spaces, many engineers and producers prefer the sound of a good acoustical chamber, especially when used with pop vocalists. Good chambers are expensive to build, and they often occupy more real estate than can be spared. They require solid construction based on concrete blocks, and all surfaces in the room must be finished with fine plaster and sealed. The signal to be reverberated is fed into the room via a loudspeaker, and it is picked up with two microphones which simulate a reasonably good stereo effect. The normal reverberation time of a chamber is in the range of 2 to 3 seconds in the midrange. At low frequencies the response will be highly colored by the specific mode structure of the room, as we showed in Figure 1-22. It is this unique
coloration that imparts an individual sound to some noted acoustical chambers. A typical physical and electrical setup of a reverberation chamber is shown in Figure 1-24.

Figure 1-24. Functional view of a reverberation chamber. (All surfaces are masonry with a plaster coat; irregular solids on the walls scatter high frequencies. Signal flow: reverb aux send to power amplifier and loudspeaker in the chamber; two microphones provide stereo returns.)
If you want to hear what good chambers sound like, listen to any of the old Columbia pop recordings from the sixties of Tony Bennett and other singers (the building stairwell was used as the reverberation chamber!), or any of the Capitol Records vocals recorded in the last 40 years. Those great chambers at Capitol in Hollywood are still in use.
Chapter 2

PSYCHOACOUSTICS: HOW WE HEAR
INTRODUCTION

Psychoacoustics deals with the psychology of hearing and includes subjects such as judgements of pitch and loudness of sounds, localization of sound sources and specific masking effects. While certain judgements with regard to music performance spaces, such as spaciousness, warmth, intimacy and the like, are based largely on conditioning and listeners' experiences, their discussion is often included under the broad umbrella of psychoacoustics. We will end with a discussion of hearing protection, a vital concern in today's musical environment.
LOUDNESS PHENOMENA

Equal loudness contours

While loudness and sound pressure level are roughly equivalent, they are not quite the same thing. Loudness is a subjective judgement that varies according to the amount of signal present and the frequency range of that signal. The most common way of showing this is by way of a set of equal loudness contours. Many measurements have been made in this area, including the original work of Fletcher and Munson. Today, we normally refer to the work of Robinson and Dadson (1956), as shown in Figure 2-1. A given curve on the graph is known as a phon, and the family of phon curves is plotted on a rectangular grid labeled in sound pressure level and frequency. At 1 kHz the phon values are equal to the corresponding values of SPL, and each phon curve plots the variation in SPL throughout the frequency range necessary to maintain the same sensation of loudness. For example, if we identify 1 kHz on the 50-phon curve, we can take 50 dB SPL as our reference level. Now, move downward in frequency on the 50-phon curve until you reach a frequency of 30 Hz. You will then see that the 50-phon curve now intersects the grid at a value of 82 dB SPL. This is a difference of about 32 dB, and it indicates that a 30-Hz signal must be about 32 dB higher in level than a 1-kHz signal at 50 dB SPL in order to sound as loud.
Figure 2-1. Robinson-Dadson equal loudness contours.
If we repeat this experiment using a 1-kHz reference signal of 90 dB SPL, we will see that the 30-Hz signal needs to be only about 21 dB higher in level in order to have the same apparent loudness as our 90-dB reference.

There are many implications of this, some of which you may already be familiar with. The main one is monitoring levels. If you make a recording or a mix at a high level and then play it back at a lower level, it will sound as though it doesn't have enough bass, and the reason should now be obvious. You can understand the importance of setting a proper monitoring level in the control room and leaving that level pretty much fixed during the course of the job you are working on. If you do change levels for one reason or another, be sure that you have recorded your settings so that you can go back to the original level as a reference.

You should remember that a subjective judgement of twice as loud or half as loud normally requires a level difference of about 10 dB! Many engineers are surprised at this, until they actually take part in tests themselves. Since a level change of 3 dB represents a doubling or halving of power, most of us suspect that loudness judgements would follow suit, but this is not the case.

A familiar application of the variation in loudness contours is the loudness control found on most consumer receivers. This control automatically adds bass as you turn the level down. In most receivers the absolute tracking
with the phon curves may not be exact, but it is in the right direction and will pretty much adjust the music spectrum so that balances seem natural, whatever the setting of the control.
Sound level meter

Acousticians use the sound level meter (SLM) for making field measurements of loudness. A professional model is shown in Figure 2-2. An instrument such as this costs several thousand dollars, but for routine measurement of monitor levels in the control room and at concerts a low-cost consumer model, such as one you might buy at your local electronics store for less than fifty dollars, will do almost as well. (It's not as precise and rugged as the professional model and lacks certain other features.)

SLMs normally have three important controls. The range control adjusts the internal gain of the meter so that you can conveniently read a wide range of levels on the scale. The response control allows you to read instantaneous peak values (fast setting) or an averaged value (slow setting). The weighting control takes into account the loudness contours (use the C-scale when making high level readings and the A-scale when making low level readings). Figure 2-3 shows these two weighting curves. You can see that the A-curve is roughly the inverse of the 40-phon equal loudness contour shown in Figure 2-1, and this modifies, or "weights," the meter readings so that they will tend to track our loudness judgements.
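For reference, the A-weighting curve mentioned above has a standard analytic form (IEC 61672); the formula in the sketch below is that standard approximation, supplied by us rather than taken from the text:

```python
import math

def a_weighting_db(f):
    """A-weighting in dB at frequency f (IEC 61672 analytic form).
    Standard published formula; an assumption here, not from this chapter."""
    f2 = f * f
    ra = (12194.0**2 * f2**2) / (
        (f2 + 20.6**2)
        * math.sqrt((f2 + 107.7**2) * (f2 + 737.9**2))
        * (f2 + 12194.0**2)
    )
    return 20 * math.log10(ra) + 2.00  # +2.00 dB normalizes to 0 dB at 1 kHz

for f in (50, 100, 1000, 10000):
    print(f, round(a_weighting_db(f), 1))  # roughly -30, -19, 0, -2.5 dB
```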
Loudness dependence on signal duration

The ear tends to integrate, or "spread out," a short, impulsive signal so that it may not sound as loud as a continuous signal of the same actual level. Figure 2-4 shows the results of an experiment in which the loudness of a 500-Hz pulsed tone is compared to the continuous tone. As you can see, when the tone is very short, in the range of 2 milliseconds in length, it will have to be raised about 18 dB in order to sound as loud as the continuous signal. It is only when the pulse becomes about 100 milliseconds (one-tenth of a second) long that it begins to sound almost as loud as the continuous signal. This integrating property of hearing has been incorporated into many console meters (most notably, the so-called VU meter), so that the values those meters read will indicate fairly accurately the apparent level of the signal rather than its instantaneous value.
Figure 2-2. Photo of a modern sound level meter (SLM). (Courtesy Bruel & Kjaer)
PITCH VERSUS FREQUENCY

While subjective pitch and signal frequency would seem to be basically the same, there are some subtle differences between them. Essentially, this relation is level-dependent and is shown in Figure 2-5. In this data, the change in
Figure 2-3. Weighting curves for loudness measurements.
apparent pitch is indicated in cents. The cent represents one-hundredth of a semitone and is a very small pitch interval. The data in this figure, compiled by Terhardt (1979), indicate that, above 2 kHz, tones tend to sound higher (sharper) in pitch with increasing level. Below 2 kHz, the opposite effect is noticed. In general, you will not be aware of these effects, which are on the order of a fraction of a semitone (there are 12 semitones per octave). I have noticed the effect only once in many years of recording, and that was in connection with an organ recording made in an extremely live room. As the reverberation of single high frequency notes faded into the background noise floor of the room, there was a slight lowering of its apparent pitch. For full-range signals, the reverberation exhibited no such tendency.
Figure 2-4. Dependence of loudness on tone duration.
Figure 2-5. Dependence of subjective pitch on sound pressure level of pure tones.
LOCALIZATION OF SOUND SOURCES

Our hearing mechanism is binaural in its operation; that is, it depends on the joint reception of sound by two ears. It is the brain's comparison of signals from the ears that determines the direction, or localization, we assign to a given sound source. There are five primary factors at work here:

1. Phase relationships at the ears. For steady-state signals, phase effects are important in the frequency range below about 700 Hz, where the wavelength of sound is long enough so that a clear and unambiguous phase lag or lead angle is present at the ears, as shown in Figure 2-6. At very low frequencies the phase angle difference is so small that localization of a steady-state source is difficult.
2. Head shadowing relationships at the ears. For steady-state signals, shadowing effects are important in the frequency range above about 2 kHz, as shown in Figure 2-6.
3. Delay relationships between the ears for transient (short-term) signals. At high frequencies, delay itself becomes a determining factor. This is often referred to as "the law of the first arrival," in which localization tends to take place in the direction of the earlier signal.
4. Pinna (outer-ear) transforms. Each of us has a unique set of folds and convolutions in the physical structure of the outer ear, and these cause fairly detailed frequency differences in both amplitude and phase
Figure 2-6. Sound source localization mechanisms. At low frequencies (left); at high frequencies (right).
between the ears. These enable us to make broad localization judgements of up-down and front-back events.
5. Reinforcement effects due to head motion. In all cases, we exercise the ability to move our heads, almost unconsciously, in an effort to refine and fine-tune our judgements of localization.

Using an artificial head (often referred to as a "dummy" head), we can record and reproduce the binaural experience as shown in Figure 2-7A. It is possible to place the head in one environment and listen to an accurate recreation of that binaural experience later in another space. In some advanced systems, head tracking is taken into consideration so that the listener's head can be moved side to side, with resulting listening cues modified accordingly, as shown in Figure 2-7B.
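The interaural delay cue of factor 3 can be estimated with Woodworth's classic spherical-head approximation; the formula and head radius in this sketch are textbook values we are supplying here, not figures taken from this chapter:

```python
import math

HEAD_RADIUS_M = 0.0875  # assumed average head radius, meters
C_M_S = 344.0           # speed of sound, m/s

def itd_seconds(azimuth_deg):
    """Woodworth's spherical-head estimate of interaural time difference:
    ITD = (a/c) * (theta + sin(theta)), theta measured from straight ahead."""
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS_M / C_M_S) * (theta + math.sin(theta))

for az in (0, 30, 60, 90):
    print(az, round(itd_seconds(az) * 1e6), "microseconds")
# a source at 90 degrees yields an ITD of roughly 650 microseconds
```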
ACUITY OF LOCALIZATION

Our ability to localize from side to side in the forward listening direction is on the order of one degree, while in the vertical plane the acuity is about three degrees. For sounds arriving from the sides, overhead or the back, listeners have a much reduced localization acuity, and in order to refine our judgements we often turn our heads in the direction of a sound source in order to accurately localize it.
Head rotational data used to modify signals fed to headphones Figure 2-7. Binaural sound transmission. Basic setup (A); arrangement with lateral head tracking (B).
STEREOPHONIC LOCALIZATION: REAL AND PHANTOM IMAGES

How is it possible for two loudspeakers operating in stereo to create a sensation of sound images that occupy the entire horizontal space between the loudspeakers? The answer is phantom images. Let's envision a simple stereo setup in which a guitar is positioned in the left channel, a bass is positioned in the right channel and a vocalist is positioned in the center. The guitar and bass arrive directly from their respective loudspeakers and as such are real images. They are literally as real as if players were standing in the position of the loudspeakers, and they are localized according to the processes we have just described. The vocalist on the other hand exists as a phantom image, one that appears where there is no physical sound source.

If a real sound source exists directly ahead of the listener, the sound pressures at the ears are as shown in the left portion of Figure 2-8. Because of symmetry, the sound pressures, or phasors, are equal at both ears, and the source is localized as straight ahead.
In stereo, we position a sound in the middle of the stereo stage by mixing it equally in both channels, as shown in the right portion of Figure 2-8. When this is done, we see that the left ear hears both loudspeakers, as does the right ear. There is a very slight arrival time difference for the sounds arriving at each ear, and these are indicated by the angular displacement of the two phasors at each ear. However, the pair of phasors at each ear will coalesce into a single net phasor at each ear. These simulated phasors are equal in all respects, and as such they will convey to the listener a clear impression of a sound source located straight ahead. By balancing the left-right values of a signal, a clear impression can be created of individual sound sources located anywhere along the line connecting the two loudspeakers. Because there are two cues at each ear, one from each loudspeaker, this process is sometimes referred to as summing localization.

Phantom signals are fairly accurate in their localization for a listener exactly in the center and looking straight ahead. If the listener's head is moved side to side, or if the listener is even slightly off the axis of symmetry, the phantom images will tend to become blurred or slightly fuzzy. In general, the localization of center-stage stereo sources will collapse toward the closer
Figure 2-8. Creation of phasors for forward localization.
of the two loudspeakers when we are seated off the center line. However, in most of our stereo listening, we seem to be relatively tolerant of these imperfections in the system. It is also important to remember that phantom images in stereo result only from creating signal phasors below about 700 Hz, with little or no effect on the other factors that influence normal localization of real images. As a result of this, phantom images in sound reproduction are effective in the forward listening direction and normally are limited to a primary frontal listening angle of about 60 degrees. Clear phantom images cannot be created at the sides of the listener or in any overhead position.
DIRECT CREATION OF PHANTOM IMAGES IN REMIX OPERATIONS: THE PANPOT

The panpot (panoramic potentiometer) is a device with one input and two outputs widely used for positioning individual microphone outputs on the stereo stage. A schematic diagram of a panpot is shown in Figure 2-9A. The input signal can be continuously distributed between left and right outputs. The potentiometers are normally selected so that in their center positions the level fed to each output is 0.7 of the input, or 3 dB down, as shown in Figure 2-9B. The potentiometer provides its action on the full-range signal, but it is only the low frequency signal components (those below about 700 Hz) that are effective in steering the signal uniformly across the stereo stage.
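The text specifies only the panpot's 3 dB-down center condition; one common way to satisfy it is a constant-power sine/cosine taper, sketched below as a plausible software implementation rather than the book's actual circuit:

```python
import math

def panpot(x, angle_deg):
    """Constant-power pan. angle_deg runs from -90 (full left) to +90
    (full right); at center (0 degrees) each output is 0.707 of the
    input, i.e. 3 dB down, matching the condition described above."""
    theta = math.radians((angle_deg + 90) / 2)  # map -90..+90 to 0..90 deg
    left = x * math.cos(theta)
    right = x * math.sin(theta)
    return left, right

left, right = panpot(1.0, 0)
print(round(left, 3), round(right, 3))       # 0.707 0.707 at center
print(round(20 * math.log10(left), 1), "dB") # -3.0 dB at center
```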
STEREO LOCALIZATION WITH COMBINATIONS OF AMPLITUDE AND DELAY BETWEEN CHANNELS

Both amplitude and delay cues between the stereo pair of loudspeakers can influence the position of phantom images (Franssen, 1964); however, these "combined" phantom images can sometimes be fairly vague and not clearly defined. These images most often occur in recordings made with spaced omnidirectional microphone pairs. We will study this technique in greater detail in later chapters, but for now we will analyze the simple microphone array and sound source shown in Figure 2-10.
Figure 2-9. The panoramic potentiometer. Circuit (A); output amplitude versus control position (B).
MODERN ADVANCES IN PHANTOM IMAGE CREATION AND MANIPULATION

Many small portable radio sets and loudspeaker setups for computer applications promise the user a "full, expanded sound field wrapping around the listener," or words to that effect. These systems make use of what is called crosstalk cancellation, a technique we will discuss in some detail in a later
Worked example from the figure, showing the tradeoff between amplitude and delay in determining approximate localization on the stereo soundstage: suppose the source leads in the right channel by 1 millisecond and is 5 dB louder in the right channel than in the left. Where will the stereo image appear? On the horizontal axis (time difference in milliseconds between loudspeakers, left earlier), go to -1 (the left channel lags); on the vertical axis, go to +5; the intersection is shown as a bold cross in the figure. The image will appear slightly inboard of the right loudspeaker.

Figure 2-10. Franssen's data for estimating stereo positioning versus relative amplitudes and delays at the two loudspeakers.
Essentially, these systems use a combination of delay and equalization to create, at the ears of a listener exactly positioned in front of the loudspeaker array, a set of localization phasors modeled after real sound sources in the targeted (desired) positions. These systems work correctly only for a listener in the "sweet spot"; a listener located just a few inches to one side will not hear the effect as intended.
A SUBJECTIVE VOCABULARY DEFINING PERFORMANCE ENVIRONMENTS
When we describe a concert performance, or a recording played at home, we use terms that come largely from the visual and tactile domains. We might describe a sound texture as bright or dull, or we might describe a particular performance venue as muddy, intimate or warm. In fact, relatively few of the words we use come directly from the hearing domain. Years of concert hall design have led to general acceptance of a number of design attributes that will pretty much ensure good acoustics. Why then do so many new concert halls invoke critical damnation? It is not that the acoustician didn't want to do his job properly, but rather that certain architectural and economic factors prevented him from doing it. The acoustician would prefer to follow in the path of those in his profession who have designed great sounding halls, and this normally means copying the proven work of others. But architects often want to make their own design statements, and orchestra management may want a hall that seats 3,500 patrons, all seats with good sight lines to the stage. There are a few subjective aspects of listening spaces that correlate well with measurements, and these are discussed below:
Spaciousness
Kuttruff (1979) states that spaciousness arises from the following conditions:
a. That reflected sounds be mutually incoherent (not arising from symmetrical pairs of reflections).
b. That their intensities be no lower than about 20 dB below the direct sound.
c. That their delays relative to the direct sound not be longer than about 100 milliseconds.
d. That they arrive primarily from lateral directions (from the sides of the room).
Intimacy Intimacy implies closeness to the performing group and the ability to hear details clearly. This normally requires that the first reflections reaching the listener arrive no later than about 15 to 20 milliseconds after the direct sound. In some cases, overhead suspended "clouds" may be used to achieve the delayed reflections in the audience area.
Liveness
Liveness implies a reverberation time in the range from 500 to 2,000 Hz that is longer than the reverberation time at lower frequencies. This is a normal characteristic of earlier, mid-sized halls constructed of wood paneling and with relatively little upholstery and drapery.
Warmth Warmth is associated with halls that have enhanced low frequency response and a relatively long reverberation time in that frequency range. It has long been associated with many of the great halls built about 100 years ago, such as Boston's Symphony Hall, New York's Carnegie Hall and Amsterdam's Concertgebouw.
ANALYSIS OF HALLS
Figure 2-11A shows a perspective view of the interior of a concert hall. Direct sound, early reflections, and reverberation are indicated for a typical listening position. Barron (1971) has experimentally analyzed the effects of sound arriving at the listener from a 45 degree angle relative to sound arriving from the stage. The data, shown in Figure 2-11B, indicate the combined effects of delay time and relative level of the delayed sound at the listener. The target for the acoustical designer is to maximize the seating area over which listeners can appreciate a "natural spatial impression." Obviously this cannot be done for all the seats in the hall, and it is a good reason why the best seats in the house are often the most expensive. Ironically, box seats on the side walls are acoustically among the worst in the house; they are also the most expensive, perhaps because they offer such clear views of the stage.
Target reverberation times of performance spaces
When we make rooms larger they naturally become more reverberant, primarily because of the increased delay times between successive reflections. The normal reverberation time for a large lecture hall might be in the range of one second; if we designed a small lecture room with that same reverberation time, it would in fact sound "too live," due to the pronounced room modes in the smaller space. There is also a desirable range of reverberation time depending on the room's function or the kind of music normally played in the room, as shown in Figure 2-12.
Figure 2-11. Direct sound, early reflections, and reverberation in an auditorium. View of the space, indicating direct sound (0 to 25 msec), early reflections (25 to 100 msec), and reverberation (>100 msec) for sound source S and listener L (A); the subjective effect of a single delay arriving 45 degrees off-axis, plotted as relative level versus delay time in milliseconds, with regions of image shift and disturbance indicated (B).
Variation of reverberation time with frequency The normal tendency in any kind of listening space is for the reverberation time to decrease at high frequencies and increase at low frequencies. This is a result of the normal increase in atmospheric sound absorption at high frequencies and decrease in absorption of most building materials at low frequencies. The average effect of this is shown in Figure 2-13. It is not unusual for a room with a midband reverberation time of 2 seconds to have a reverberation time of 3 seconds at 50 Hz, as indicated by this figure.
Figure 2-12. Target reverberation time versus room volume for various activities. The chart plots reverberation time (roughly 1.0 to 2.2 seconds) against room volume on a dual scale running from 5,000 cubic feet (45 cubic meters) to 1,000,000 cubic feet (2,840 cubic meters), with curves for speech and for various types of music rising toward the larger volumes.
HEARING PROTECTION
As the world around us becomes increasingly louder, and as reinforced music performance becomes the norm, recording engineers must be concerned with hearing protection. This is also true in the modern workplace, and both the Occupational Safety and Health Act (OSHA) and the Environmental Protection Agency (EPA) have laid down regulations regarding allowable noise exposure on the job. The OSHA criteria are given below:

Sound Pressure Level (A-weighted)    Daily Exposure (hours)
90                                   8
92                                   6
95                                   4
97                                   3
100                                  2
102                                  1.5
105                                  1
110                                  0.5
115                                  0.25
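The table embodies a 5-dB exchange rate: each 5-dB increase in level halves the permissible exposure time. A minimal sketch of that rule follows; the function name is illustrative, and the printed table rounds some values (for example, 6 rather than 6.06 hours at 92 dB):

    def permissible_hours(spl_a):
        # 8 hours at 90 dB(A), halved for every 5 dB above that
        return 8.0 / 2 ** ((spl_a - 90) / 5)

    for level in (90, 92, 95, 97, 100, 102, 105, 110, 115):
        print(f"{level} dB(A): {permissible_hours(level):.2f} h")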
Figure 2-13. Normal variation of low and high frequency reverberation time versus the midrange value, plotted over a frequency range of 25 Hz to 20 kHz.
If you have ever left a rock concert, recording session or remix session with even a slight ringing or tingling sensation in your ears, you are at risk for some degree of hearing loss over time.
Figure 2-14. Hearing protection. Sketch of an earplug suitable for high-level music listening, showing eartip, stem, and end cap (A); transmission losses for various methods of hearing protection versus frequency (B). (Data courtesy Etymotic Research)
Ear defenders range from simple foam plugs to elaborately molded, custom-fitted models which provide fairly uniform attenuation over the frequency range. For most applications here you will find that models designed to provide a loss of 15 dB uniformly over the frequency range will be quite satisfactory. However, if you are on a firing range, heavy-duty, externally worn ear defenders providing considerably more attenuation will be required for absolute safety. Figure 2-14 shows a sketch of a typical earplug suitable for those in the music business. Note that the transmission loss of this design is fairly uniform at about -20 dB over a large portion of the frequency range.
Chapter 3 MICROPHONES: BASIC PRINCIPLES
INTRODUCTION The microphone was introduced with the earliest telephone systems in the 1870s. Broadcasting and electrical recording came approximately a half century later. The earliest telephone transmission systems did not employ electrical amplification, and the signal from the microphone, or transmitter, was used directly to drive the telephone receiver. By the time electrical recording and broadcasting were introduced, amplification had become an integral part of audio signal transmission, and as a result microphones could be engineered for higher quality rather than for maximum power output. By the time "Hifi" arrived in the late 1940s microphones had reached a high level of performance—so much so that many of the old German and Austrian models of those days are still in use and may sell in the $5,000 range or greater. Today's microphones cost much less than the earlier models and generally have more uniform and extended frequency response as well as a lower noise floor.
PRINCIPLES OF TRANSDUCTION
A transducer is a device that converts energy from one domain to another. A microphone is such a device and converts acoustical pressure variations into a corresponding electrical signal. In this chapter we will analyze the primary methods of transduction that have been used in microphone design over the years.
OLDER DESIGN PRINCIPLES Sidebar 3.1 gives details on the operation of two older microphone designs: the carbon microphone and the crystal microphone. The carbon microphone is still used today in telephone applications, where its relative simplicity and ruggedness are its strong points. In operation, variations in sound pressure cause the diaphragm to move, creating variations
in the compression of the carbon granules and varying the net resistance in the electrical circuit. The current in the circuit is modulated according to the acoustical signal, causing an acoustical output from the receiver. The crystal microphone was used at one time for paging purposes and is still used today in some very high pressure applications as well as in underwater acoustics. Certain crystalline materials exhibit a piezoelectric (from the Greek piezein, "to press") effect; that is, when they are bent or flexed, a proportional voltage is produced across a pair of the crystal's facets. Neither the carbon nor the crystal microphone has performance characteristics suitable for recording; however, the piezoelectric effect has been used for contact pickups on guitars and piano sounding boards.

Sidebar 3.1: Carbon and crystal microphones
Figure 3-1A shows details of the carbon microphone as used in early telephone engineering. Sound impinges on a diaphragm which is connected to a carbon button, a cup containing granules of carbon and a movable electrode. When the diaphragm moves, the carbon granules are alternately compressed and uncompressed, causing a similar variation in the electrical resistance in the circuit. The bypass capacitor provides a low impedance path for the audio signal around the voltage source, and the varying AC current flowing through the receiver produces an output signal. Figure 3-1B shows details of a typical crystal microphone. In this example, two crystals are cemented together (known as a bimorph) in order to increase the output voltage. When flexed by the motion of the diaphragm, a signal voltage appears at the output.
As recording and broadcasting got underway, microphone design shifted to capacitor (condenser) and dynamic principles of transduction because of their inherently greater frequency bandwidth and lower noise floors. Companies such as Western Electric in the United States and Neumann in Germany were among the first to develop high-performance capacitor microphones.
CONDENSER MICROPHONES
Capacitor microphones are universally known as condenser microphones, and that is the term we will use throughout this book—even though capacitor is the preferred technical term. The condenser principle is based on the following equation:

Q = CE (3.1)
Figure 3-1. Details of the carbon microphone (A) and crystal microphone (B).
where Q is the electrical charge on the plates, C is the capacitance, and E is the applied voltage. In microphone application, one plate of the condenser, the backplate, is fixed, and the other plate, the diaphragm, is placed very close to it and is free to vibrate when sound strikes it. The combination of backplate and diaphragm in a single structure is generally referred to as a capsule. As the diaphragm moves in and out under the influence of sound waves the capacitance will also vary. As the diaphragm gets closer to the backplate the capacitance will increase, and vice versa. If the charge across the condenser is held constant, the changes in capacitance will result in corresponding changes in the voltage across the condenser. This voltage is the output of the microphone. Sidebar 3.2 gives details on the operation of the condenser microphone.
Sidebar 3.2: Operation of the condenser microphone
Figure 3-2A shows the basic relationship among fixed charge, capacitance and voltage across the plates of the condenser. The Greek delta (Δ) indicates a small change or variation in the quantity it is attached to. As we can see in the equation, a small variation in capacitance (ΔC) will result in a small variation in output voltage (ΔE). If Q remains constant, then voltage and capacitance will vary inversely; that is, when C increases, E will decrease. For Q to remain constant, a polarizing DC voltage is applied to the condenser externally through a very high resistance. For small values of delta the variation in output voltage will be a near replica of the variation in capacitance, and the output of the microphone will be linear.
Figure 3-2. Details of the condenser microphone. Effect of variations in capacitance on the voltage E across a variable capacitance C holding a fixed charge Q, where Q = CE and E = Q/C (A); section view of a typical condenser microphone capsule, showing diaphragm, backplate, and insulating ring (B); operation of the externally polarized condenser microphone (C); operation of the electret polarized condenser microphone, with an electret coated backplate and no external polarizing supply (D).

A cutaway view of a typical condenser microphone is shown in Figure 3-2B. Figure 3-2C shows the circuit for the standard externally polarized form of the microphone. Here, a battery (or other DC voltage source) is used
to establish the charge on the condenser. The resistor R is in the range of 10 megohms so that the charge on the condenser remains constant. The signal voltage is then amplified directly at the microphone capsule and reduced to a lower impedance so that the microphone can drive the signal over a sufficient distance via a microphone cable without loss. Many newer microphone designs make use of prepolarized condenser elements known as electrets. An electret is a material that maintains a fixed charge across its front and back surfaces. Such materials have been known for at least a century, but their application to microphone design dates only from the 1960s. The basic design shown at Figure 3-2C can be reconfigured as an electret design as shown at D. Here, there is no polarizing voltage supply and the overall design is simpler. As before, an amplifier must be provided directly at the microphone capsule to produce a low impedance signal output.
Early electret materials tended to be unstable over time, but those problems have been solved. While the electret has tended to dominate lower cost microphone design, it has also been used in some of the highest quality models of recent years. Most notably, the Brüel & Kjaer company makes use of electret technology in their superb series of studio microphones.
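Because C = ε0A/d for a parallel-plate capsule, holding Q constant makes the output voltage E = Q/C = Qd/(ε0A) directly proportional to the diaphragm-to-backplate spacing d, which is why the output is linear for small excursions. A numerical sketch follows; the capsule dimensions and polarizing voltage are assumed values chosen only for illustration, not taken from the text:

    import math

    EPSILON_0 = 8.854e-12  # permittivity of free space, F/m

    # Assumed capsule: 12.7 mm diaphragm radius, 25 micron rest
    # spacing, polarized to 60 V at rest (illustrative values)
    area = math.pi * 0.0127 ** 2
    d_rest = 25e-6
    q = (EPSILON_0 * area / d_rest) * 60.0   # fixed charge Q = C * E

    for d in (24e-6, 25e-6, 26e-6):          # diaphragm swings +/- 1 micron
        e = q * d / (EPSILON_0 * area)       # E = Q/C = Q*d/(eps0*A)
        print(f"spacing {d * 1e6:.0f} um -> {e:.1f} V")
    # 24 um -> 57.6 V, 25 um -> 60.0 V, 26 um -> 62.4 V: the output
    # voltage tracks the spacing linearly about the 60-V rest value.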
DYNAMIC MICROPHONES
Dynamic microphones make use of the principle of magnetic induction, in which a coil of wire produces a small output voltage as it moves through a magnetic field. This is the inverse of the traditional dynamic loudspeaker, which you are all familiar with. In order to cover the necessary audio frequency range, the voice coil, as it is called, is normally no larger in diameter than about one-half inch. It is attached to a very light diaphragm, normally made of plastic, or in some older designs, thin aluminum. A close relative is the ribbon microphone. Here, the voice coil has been replaced by a thin corrugated ribbon that is suspended in a magnetic field. The ribbon is open on both sides, and its directional response takes advantage of this to produce a "figure-eight" pickup pattern. Both of these designs are discussed in Sidebar 3.3.

Sidebar 3.3: Microphones based on magnetic induction
The basic principle of magnetic induction is shown in Figure 3-3A. If the flux (or flow) of a magnetic field is in the direction shown, and if a piece of wire is moving in the direction shown, a positive voltage will be
induced in the wire in the direction shown. A practical application is shown at B. Here, the wire is in the form of a coil and the magnet is shaped to produce a flux path that is circular (or annular) as well. A diaphragm is attached to the coil as shown and moves under the influence of sound pressure, causing a voltage output at the terminals.
Figure 3-3. Details of the dynamic microphone. Vector relationships in magnetic induction (A); section view of the magnet, voice coil, and diaphragm assembly of a moving coil dynamic microphone (B); perspective view of a ribbon microphone, with output step-up transformer (C); polar response pattern of the figure-eight ribbon microphone (D).
The structure of a typical ribbon microphone is shown in Figure 3-3C. Here, the coil has been replaced by a straight section of corrugated aluminum known as a ribbon. The magnetic field is likewise straight and cuts through the ribbon over its entire length. The signal output is taken from the ends of the ribbon and is normally stepped up through a small transformer located directly in the microphone case. The figure-eight directional, or polar, pattern is shown at D. It is clear that the output resulting from a sound source in the direction of zero or 180 degrees will be maximum. But for sounds in the directions of plus or minus 90 degrees the response will be zero, since those sound pressures will cancel at the ribbon, resulting in no net motion. Mathematically, the figure-eight shape can be described by the polar equation:

ρ = cos θ (3.2)

where ρ (Greek letter rho) is the output signal magnitude and θ (Greek letter theta) is the angle of sound incidence.
Chapter 4 MICROPHONES: THE BASIC PICKUP PATTERNS
INTRODUCTION
Recording engineers have at their disposal a variety of microphone pickup patterns. The two fundamental "building block" patterns are omnidirectional (omni) and bidirectional, or figure-8. The omni is basically uniform in all directions, although at very high frequencies it will show some directionality along its principal axis of pickup. These two patterns are shown in Figure 4-1A and B. By combining these two basic patterns we can produce a cardioid microphone, as shown in Figure 4-2. Today, however, virtually all cardioids are produced using a single diaphragm capsule. In this chapter we will discuss the derivation of the various patterns and introduce the reader to the basics of their usage.
Figure 4-1. The basic patterns. Omnidirectional (A); bidirectional, or figure-eight (B).
Figure 4-2. Producing a cardioid (unidirectional) pattern (ρ = .5 + .5 cos θ) by summing omnidirectional (ρ = 1) and bidirectional (ρ = cos θ) patterns.
PRODUCING A CARDIOID PATTERN
The polar equation for the standard cardioid pattern is:

ρ = 0.5(1 + cos θ) (4.1)
Producing this pattern by combining two separate elements was common at one time, but today we can produce the cardioid pattern more efficiently and accurately as shown in Figure 4-3. The response to a frontal signal (0 degrees) is shown at A. The delay path (Δt) allows the diaphragm to be actuated, since the signal reaching the back of the diaphragm is always delayed by a fixed amount relative to the signal at the front of the diaphragm.
Figure 4-3. A single diaphragm dynamic cardioid microphone. Response at 0 degrees (A); response at 180 degrees (B).
However, for a signal arriving from 180 degrees, the internal delay path and the path around the microphone to the front of the diaphragm are designed to be equal. In this case there will be a cancellation at the diaphragm over a fairly wide frequency range. This design principle applies to dynamic microphones, as shown here, as well as to condenser microphones, as shown in Figure 4-4.
Figure 4-4. A single diaphragm condenser cardioid microphone. Action at 0 degrees, 90 degrees, and 180 degrees is shown.
THE CARDIOID FAMILY There are four cardioid patterns you will normally encounter: subcardioid, cardioid, supercardioid, and hypercardioid, and we will discuss them below:
Subcardioid
This pattern is generally represented by the polar equation:

ρ = .7 + .3 cos θ (4.2)
The pattern is shown in Figure 4-5. The directional response is -3 dB at angles of ±90 degrees and -8 dB at 180 degrees. The subcardioid pattern is favored by many engineers who do large scale scoring work. It is sometimes referred to as a "forward-oriented omni."
Figure 4-5. The subcardioid pattern, shown in linear and log (dB) polar plots.
Cardioid
This is the standard cardioid we have already discussed and is represented by the polar equation:

ρ = .5 + .5 cos θ (4.3)
The pattern is shown in Figure 4-6. The directional response is -6 dB at angles of ±90 degrees and ideally zero output at 180 degrees. It is by far the most common pattern found in the recording studio. Its usefulness lies mainly in its high rejection of direct field sounds arriving at an angle of 180 degrees.
Figure 4-6. The cardioid pattern shown in linear and log (dB) polar plots.
Supercardioid
This pattern is represented by the polar equation:

ρ = .37 + .63 cos θ (4.4)
The pattern is shown in Figure 4-7. Directional response is -8.6 dB at ±90 degrees and -11.7 dB at 180 degrees. There are nulls in response at ±126 degrees. The supercardioid pattern exhibits the maximum ratio of frontal pickup to total pickup of the cardioid family, and as such can be used for pickup over a wide frontal angle.
Figure 4-7. The supercardioid pattern shown in linear and log (dB) polar plots.
Hypercardioid
This pattern is represented by the polar equation:

ρ = .25 + .75 cos θ (4.5)
The pattern is shown in Figure 4-8. Directional response is -12 dB at ±90 degrees and -6 dB at 180 degrees. There are nulls in response at ±110 degrees. The hypercardioid pattern exhibits the lowest random efficiency, and hence the greatest "reach," in the forward direction of the cardioid family. In the reverberant field, the hypercardioid pattern will provide the greatest rejection, relative to on-axis pickup, of randomly arriving reverberant sounds, and for these reasons it is normally the first choice for speech pickup in sound reinforcement systems.
Figure 4-8. The hypercardioid pattern shown in linear and log (dB) polar plots.
SUMMARY OF THE CARDIOID FAMILY
The cardioid family we have described here is often referred to as first-order cardioids. The term is a mathematical one and refers to the fact that the equations defining them contain a cosine term to the first power. The basic performance characteristics of this family are shown in Figure 4-9. Most of the terms used in the figure are self-explanatory; however, random efficiency (RE) and distance factor (DF) will need some explanation. RE is a measure of the on-axis directivity of the microphone, relative to its response to sounds originating from all directions. A value of RE of 1/3, for example, indicates that the microphone will respond to reverberant acoustical power, which arrives equally from all directions, with 1/3 the sensitivity of the same
SUMMARY OF FIRST-ORDER CARDIOID MICROPHONES

Characteristic                  Pressure   Gradient   Subcardioid     Cardioid        Supercardioid    Hypercardioid
Polar equation                  1          cos θ      .7 + .3 cos θ   .5 + .5 cos θ   .37 + .63 cos θ  .25 + .75 cos θ
Pickup arc, 3 dB down           360°       90°        180°            131°            115°             105°
Pickup arc, 6 dB down           360°       120°       264°            180°            156°             141°
Relative output at 90° (dB)     0          -∞         -3              -6              -8.6             -12
Relative output at 180° (dB)    0          0          -8              -∞              -11.7            -6
Angle at which output = zero    (none)     90°        (none)          180°            126°             110°
Random efficiency (RE)          1          .333       .55             .333            .268 (1)         .25 (2)
Directivity index (DI)          0 dB       4.8 dB     2.5 dB          4.8 dB          5.7 dB           6 dB
Distance factor (DF)            1          1.7        1.3             1.7             1.9              2

(1) Maximum front-to-total random efficiency for a first-order cardioid.
(2) Minimum random efficiency for a first-order cardioid.
(Data presentation after Shure Inc.)

Figure 4-9. Data on first-order cardioid patterns.
acoustical power arriving along the principal axis of the microphone. RE is related to the directivity index (DI), as discussed in Chapter 1, by the following equation:

DI = 10 log (1/RE) (4.6)
Distance factor (DF) is a measure of the "reach" of the microphone in a reverberant environment, relative to an omnidirectional microphone. For example, a microphone with a distance factor of 2 can be placed at twice the distance from a sound source in a reverberant field, relative to the position of an omnidirectional microphone, and exhibit the same ratio of direct-to-reverberant sound pickup as the omnidirectional microphone. DF is related to directivity index by the following equation:

DI = 20 log DF (4.7)
Stated differently, under reverberant conditions, the recording engineer does not think of directional microphones in terms of their rejection of sounds arriving from some back angle; instead, the microphone is judged in terms of its ability to reject all reverberant sound, relative to the on-axis response. Figure 4-10 shows graphically the on-axis pickup properties of microphones that have different values of DF.
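The relationships among RE, DI, and DF can be checked directly from the polar equations. For any first-order pattern ρ(θ) = A + B cos θ normalized to unity on axis (A + B = 1), averaging ρ² over the sphere gives RE = A² + B²/3, after which equations 4.6 and 4.7 yield DI and DF. A short sketch follows; note that the computed subcardioid RE (0.52) differs slightly from the 0.55 tabulated in Figure 4-9, which follows the Shure presentation:

    import math

    # First-order patterns rho(theta) = A + B cos(theta), equations 4.2-4.5
    patterns = {
        "omni":          (1.00, 0.00),
        "subcardioid":   (0.70, 0.30),
        "cardioid":      (0.50, 0.50),
        "supercardioid": (0.37, 0.63),
        "hypercardioid": (0.25, 0.75),
    }

    for name, (a, b) in patterns.items():
        re = a ** 2 + b ** 2 / 3          # spherical average of rho^2
        di = 10 * math.log10(1 / re)      # equation 4.6
        df = 10 ** (di / 20)              # equation 4.7, solved for DF
        print(f"{name:13s} RE = {re:.3f}  DI = {di:.1f} dB  DF = {df:.2f}")
    # cardioid:      RE = 0.333, DI = 4.8 dB, DF = 1.73
    # supercardioid: RE = 0.269, DI = 5.7 dB, DF = 1.93
    # hypercardioid: RE = 0.250, DI = 6.0 dB, DF = 2.00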
Figure 4-10. Distance factors (DF) for the various cardioid patterns, relative to an omnidirectional microphone: subcardioid 1.3, gradient 1.7, cardioid 1.7, supercardioid 1.9, hypercardioid 2.0.
VARIABLE PATTERN MICROPHONES
Outside of the early RCA 77-series ribbon microphones, virtually all of the variable pattern microphones you will encounter in the studio are large format condenser models. Figure 4-11 shows the basic operation of the classic Braunmühl-Weber dual diaphragm structure, and a view of a typical studio model is shown in Figure 4-12. Basically, the system acts as a pair of back-to-back cardioids which can be either added or subtracted in their effect by the voltage switching around them. Figure 4-13 shows the net result of the various switching combinations.
Figure 4-11. Circuit details for the Braunmühl-Weber dual diaphragm condenser design.
These microphones generally work best in their normal cardioid configuration, and their omni patterns are probably the least effective. If you need a good omni, we recommend that you choose a fixed pattern omni microphone.
Figure 4-12. Photo of a typical large format, variable pattern condenser microphone. (Courtesy AKG Acoustics)
HIGH DIRECTIONALITY MICROPHONES Engineers working in television and motion pictures very often have a need for a microphone that can be placed 8 to 10 feet away from an actor and still produce good dialog quality. There are a number of so-called "rifle" microphones that will provide the directivity necessary for this application. Figure 4-14 shows a view of such a model. The designation rifle comes from the appearance of the microphone.
Figure 4-13. Producing various patterns by adding and subtracting back-to-back cardioids.
In design, most of these microphones have a hypercardioid capsule as their basis, and an interference tube is placed in front of the capsule. The tube provides a clear path for sounds originating on-axis; for sounds off-axis the presence of the openings along the tube will produce an interference pattern which reduces the net signal reaching the hypercardioid capsule. The effect of this is dominant at high frequencies, and typical polar response of such a microphone is shown in Figure 4-15.
Figure 4-14. Photo of a rifle microphone about 18 inches in length. (Courtesy AKG Acoustics)
Figure 4-15. Polar response of the rifle microphone shown in the previous figure, plotted at frequencies from 125 Hz to 16 kHz. (Courtesy AKG Acoustics)
Chapter 5 ENVIRONMENTAL EFFECTS AND DEPARTURES FROM IDEAL PERFORMANCE
INTRODUCTION In this chapter we will discuss some of the aspects of microphone performance in their normal working environments. Topics such as proximity effect and interference effects due to reflections and combined multi-microphone outputs will be discussed. We will also discuss the normal variations in ideal response which all microphones exhibit to some degree.
PROXIMITY EFFECT
If you talk closely into any first-order directional microphone you will hear a rise in bass response. This arises from the fact that the directional microphone has both front and back paths to the diaphragm. For close-in sound sources, there will be a significant difference between the sound pressure levels at the two entry points, and this difference is dominant at low frequencies, causing a rise in response. Omnidirectional microphones do not exhibit this effect, since sound pickup takes place only at the front side of the diaphragm. Sidebar 5.1 presents a more detailed discussion of the proximity effect.

Sidebar 5.1: Proximity effect in first-order microphones
Figure 5-1A shows the basic cause of proximity effect in a figure-eight microphone. Here, S represents the sound source; D1 is the distance to the front of the microphone and D2 is the distance around the microphone to the back opening. The net force on the diaphragm is shown at B. We can see that there are actually two forces on the diaphragm, a gradient force which rises with frequency and an inverse square force which is constant with frequency. The equation that defines the amount of LF proximity rise is:

Boost (dB) = 20 log [√(1 + (kr)²) / kr] (5.1)

where k is the wave number (2πf/c) and r is the operating distance.
Figure 5-1. Proximity effect. A sound source close to a gradient microphone (A); net force on the microphone diaphragm, showing a frequency (phase) dependent force with a 6 dB/octave slope and an inverse square force that is constant with frequency (B); net output from the diaphragm (C).

The electrical output of the microphone is further modified by the velocity of the diaphragm's motion, which causes a 6-dB-per-octave rise at lower frequencies. The net output of the microphone is shown at C.
For a figure-eight microphone the proximity effect at several operating distances is shown in Figure 5-2. You can see that for very close operation the LF response rise due to proximity effect is slightly greater than 24 dB at 50 Hz. Even at an operating distance of about 21 inches the rise at 50 Hz is about 7 dB.
Figure 5-2. Proximity effect for various operating distances (2.1, 4.25, 10.6, and 21 inches) from a figure-eight microphone, showing the LF boost in dB over the range 12.5 Hz to 5 kHz.
For a cardioid microphone the degree of proximity effect is less than with a figure-eight, since there is a considerable omni component in the derivation of the cardioid pattern. Figure 5-3 shows the proximity rise for several operating distances from a cardioid microphone. Figure 5-4 shows the variation in proximity effect for a fixed operating distance (24 inches) and with varying angles about the microphone. You can see that at 90 degrees there is no proximity effect; this is because the gradient (cosine) component is zero at that angle. At 180 degrees the proximity effect rises very rapidly at low frequencies. Many so-called "vocal microphones" have a bass-cut switch that compensates for proximity effect, as shown in Figure 5-5. Other vocal microphones are purposely rolled off at low frequencies, with the knowledge that they are going to be used at close distances, as shown in Figure 5-6.
Figure 5-3. Proximity effect for various operating distances from a cardioid microphone.
Figure 5-4. Proximity effect for various operating angles (0, 30, 60, 90, and 180 degrees) at a distance of 24 inches (0.6 meter) from a cardioid microphone.
Figure 5-5. Effect of a bass rolloff switch on a vocal microphone. (Courtesy Neumann/USA)
Figure 5-6. Response of a vocal microphone at several operating distances: 3 mm (1/8 in), 25 mm (1 in), 51 mm (2 in), and 0.6 m (2 ft). (Courtesy Shure Inc.)
ON- AND OFF-AXIS MICROPHONE PERFORMANCE AT HIGH FREQUENCIES
The diffraction effects we studied in Chapter 1 have an important effect on the HF performance of microphones. The data shown in Figure 5-7 indicate the general trend. If a microphone is designed for flat on-axis response, its response to random signals will be as shown at A. The microphone can also be configured for flat response in the random field, and its on-axis response will rise, as shown at B. Each microphone type has its intended uses, and the engineer should always be aware of these HF on- and off-axis differences. In general usage, microphones that are flat on-axis are most useful when you are operating fairly close-in in the studio environment. If you are operating at a distance (for example, in a concert hall), it may be to your advantage to choose microphones that are flat in the random or diffuse field.
Figure 5-7. Response of a microphone designed for flat on-axis response (A) and flat random incidence response (B), plotted as relative level versus log frequency from 20 Hz to 40 kHz. (Data after Brüel & Kjaer)
INTERFERENCES DUE TO REFLECTION
A classic case of microphone interference is shown in Figure 5-8. Here, a microphone is placed at some distance from the sound source; floor reflections interfere with the direct sound from the source, resulting in uneven response as shown. As the microphone is moved closer to the reflecting surface, the disturbance is less. When the microphone is placed directly on the reflecting boundary the effect disappears. Boundary layer microphones are quite useful in picking up sound in the theater as well as on tables, podiums and the like. The microphone model shown at C has been designed for surface mounting and exhibits uniform response from all operating angles. In many studio applications, the severity of floor or other surface reflections can be minimized by using a directional microphone. In some cases, the null angle of the directional microphone can be aimed directly at the source of the reflection, reducing it to inaudibility.
Figure 5-8. Effect of floor reflections. Positions of talker and microphone (A); responses of microphone (B); photo of a boundary layer microphone (C). (Photo at C courtesy Crown International)
MULTI-MICROPHONE PICKUP PROBLEMS
We can get away with many things in stereo recording which may come back to haunt us when the recording is played back in mono. A typical situation is shown in Figure 5-9, where a piano is recorded in stereo with a pair of spaced microphones. Their distances from the instrument are D1 and D2. When played in stereo the recording may sound excellent, but if there is a requirement for good stereo-to-mono compatibility, the sum of the two microphones may present problems. Specifically, there will be reinforcements and cancellations in the combined response, as given by the following equation:
f = c / (D2 - D1) (5.2)
where D2 is the longer distance and c is the speed of sound. There will be signal cancellations at frequencies of f/2, 3f/2, 5f/2, 7f/2, and so forth, and reinforcements at frequencies intermediate between these values. As you can see, this problem is related to the one shown in Figure 5-8. There is no clear solution to this problem as such; if there is a requirement for good mono compatibility, the engineer and producer should make a mono summation and approve it before moving on with the recording project. This will usually entail moving the microphones somewhat closer together and making sure that their distances from the center of the instrument are minimized.
Figure 5-9. Interference effects with a single sound source and multiple microphones.
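Equation 5.2 makes it easy to see where the mono-sum notches will fall. A sketch with hypothetical microphone distances follows; the 4 ft and 6 ft spacings are assumed for illustration, not taken from the figure:

    C_FT = 1130.0  # speed of sound, feet per second

    def cancellation_frequencies(d1_ft, d2_ft, count=4):
        # f = c / (D2 - D1), equation 5.2; when the two outputs are
        # summed, cancellations fall at odd multiples of f/2
        f = C_FT / (d2_ft - d1_ft)
        return [(2 * n + 1) * f / 2 for n in range(count)]

    print([round(fc) for fc in cancellation_frequencies(4.0, 6.0)])
    # [282, 848, 1412, 1978] Hz for microphones 4 ft and 6 ft from the source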
VARIATIONS IN MICROPHONE DIRECTIONAL RESPONSE Do not make the mistake of assuming that a cardioid microphone has a uniform pickup pattern over the entire frequency range. What you are most likely to see is data such as is shown in Figure 5-10. A set of typical polar plots is shown at A and the corresponding axial measurements at 0, 90, and 180 degrees is shown at B. Study such data carefully if you want to know the frequency range over which the microphone actually has a recognizable cardioid pickup pattern.
Figure 5-10. Cardioid microphone directional response aberrations. Polar plots at frequencies from 125 Hz to 16 kHz (A); off-axis frequency response curves at 0, 90, and 180 degrees (B).
Chapter 6 MICROPHONES: ELECTRONIC PERFORMANCE AND THE ELECTRICAL INTERFACE
INTRODUCTION This chapter covers the basic electronic aspects of the microphone and its integration into the audio signal chain. We will discuss the microphone's performance in terms of its basic performance characteristics, such as: output sensitivity, self noise floor, distortion and electrical output impedance. Additional topics will cover powering of condenser microphones, losses in microphone cables and loading effects at the downstream console input. We will also touch briefly on the wireless microphone.
BASIC MICROPHONE ELECTRONIC PERFORMANCE
Output sensitivity
The output sensitivity of a microphone expresses its signal output for a specified acoustical input. Today we universally use a reference acoustical pressure input of one pascal, which is equivalent to a sound pressure level of 94 dB. The microphone's output when placed in the reference sound field is given in millivolts per pascal (mV/Pa) or as a voltage level per pascal (dBV/Pa). For example, a typical studio condenser microphone may have a rated sensitivity of 20 millivolts per pascal, indicating that the output voltage will be 0.020 volts (or 20 mV) when the microphone is placed in a sound field of 94 dB-SPL. Another way to express this is as a dB rating relative to one volt:

Sensitivity = 20 log (0.02) = -34 dB re 1 V/Pa

Typical sensitivity values for various studio microphones are shown in Table 6.1. You will note that the design sensitivity of the microphone is tailored to its application. Microphones designed for normal studio use represent an average of many models. Those microphones intended for close-in use on stage have lower sensitivities, and those that are intended for distant use on-
stage or for distant pickup in television or film recording will have higher sensitivity. The aim is to keep the basic microphone output signal fairly uniform, regardless of its primary application.

Table 6.1. Microphone Sensitivity Ranges by Use
Microphone Usage        Normal Sensitivity Range
Close-in, hand-held     2-8 mV/Pa
Normal studio use       7-20 mV/Pa
Distant pickup          10-30 mV/Pa
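Converting between the two sensitivity conventions is a one-line calculation. A small sketch covering the ranges in Table 6.1 (the function name is illustrative):

    import math

    def mv_per_pa_to_dbv(mv_per_pa):
        # dBV re 1 V/Pa for a given output in mV at 94 dB SPL (1 pascal)
        return 20 * math.log10(mv_per_pa / 1000.0)

    for mv in (2, 8, 20, 30):
        print(f"{mv} mV/Pa = {mv_per_pa_to_dbv(mv):.1f} dBV/Pa")
    # 2 mV/Pa = -54.0, 8 mV/Pa = -41.9, 20 mV/Pa = -34.0, 30 mV/Pa = -30.5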
Microphone noise floor
This rating states the electrical output noise level of a microphone relative to the actual environment in which you may be using that microphone. As an example, assume that you are recording in a very quiet concert hall with a noise rating of NC 10 (noise criterion 10). This means that the inherent noise floor in the hall falls below the 10-phon curve as shown in Figure 2-1. If the microphone's self-noise floor, as measured using the A-weighting curve (Figure 2-3), falls just within the 10-phon curve, we state that the microphone's self noise rating is 10 dB(A). You can think of it this way: a microphone with a self-noise rating of 10 dB(A) behaves as if it were an ideal, noiseless microphone in a performance environment with an acoustical noise rating of NC-10.
Distortion at high levels There is a limit to the sound pressure level that a microphone can handle before the onset of distortion in the microphone itself. For studio-grade condensers the reference value is 0.5 percent total harmonic distortion (THD). For dynamic microphones normally used on-stage the reference may be either 1 or 3 percent. The microphone's noise floor and its distortion rating define its useful dynamic range, as shown in Figure 6-1. Here we see the dynamic range for a typical studio microphone. Between the noise floor of 10 dB-A and the onset of distortion at 135 dB-SPL there is a useful operating range of 125 dB. This is slightly greater than the dynamic range of a digital recorder operating with 20-bit conversion. Most studio condenser microphones have a built-in switchable pad (output attenuator) that introduces 10 or possibly 12 dB reduction of output level. As you can see in Figure 6-1, the effect of the pad is to shift the entire operating range of the microphone upward, including the microphone's noise floor itself.
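The comparison with digital conversion can be checked with the usual ideal-converter figure of 6.02N + 1.76 dB for N bits. A brief sketch using the example values from the text (the function names are illustrative):

    def mic_dynamic_range_db(noise_floor_dba, max_spl_db):
        # useful range between self-noise and the 0.5% THD point
        return max_spl_db - noise_floor_dba

    def ideal_converter_range_db(bits):
        # theoretical dynamic range of an ideal N-bit converter
        return 6.02 * bits + 1.76

    print(mic_dynamic_range_db(10, 135))        # 125 dB, as in the text
    print(round(ideal_converter_range_db(20)))  # about 122 dB for 20 bits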
Figure 6-1. Operating level ranges of a studio condenser microphone with and without the integral 10-dB pad engaged; the pad shifts the entire operating range, including the noise floor, upward by 10 dB.
Microphone output impedance and recommended load impedance
Figure 6-2 shows a schematic diagram of a microphone looking into the input of a recording console. Professional microphones, whether condenser or dynamic, are all balanced; that is, the signal is developed between a pair of conductors placed within a shield, as can be seen in the figure. The standard
Figure 6-2. Illustration of source impedance, cable, and load impedance in a microphone transmission circuit. Signal is transmitted between pins 2 and 3; pin 2 is "hot" and pin 1 is the ground or shield. For condensers, ZS = 50-200 ohms; for dynamics, ZS = 200-600 ohms. For typical cable, R = 0.025 ohm/foot and C = 30 pF/foot. ZL = 3000-5000 ohms.
input/output hardware is the XLR receptacle, with male configuration for outputs and female configuration for inputs. Sidebar 6.1 analyzes in detail the complex relation between microphone output and console input sections.

Sidebar 6.1: Microphone signal flow
The microphone has an internal (or source) impedance that can vary from 50 to 200 ohms for condensers and from 200 to 600 ohms for dynamics. As the microphone output "looks" into the cable and the console input downstream, it "sees" a load impedance. In modern recording system design, the load impedance is at least five times that of the microphone's source impedance. In the example shown here, the microphone's source impedance is 200 ohms, and it looks at a load of 3000 ohms at the input of the console. The ratio of load-to-source impedance is 15-to-1. The microphone cable normally consists of two inner conductors surrounded by a shield. The length of the microphone cable may be anywhere between 10 feet (3 meters) and 660 feet (about 200 meters). Typical high quality microphone cable will have a resistance of about 0.025 ohms per foot and inter-conductor stray capacitance of about 30 picofarads per foot. In a typical studio setting, the microphone cable length will not exceed about 60 feet (20 meters), and cable losses will be negligible. However, for very long runs the stray capacitance may result in HF attenuation as shown in Figure 6-3. If you are using a dynamic microphone with an output impedance of 600 ohms, the loss will be greater than with the condenser microphone.
Figure 6-3. Microphone cable losses over distance as a function of source impedance, comparing a 33 ft (10 m) cable with a 200 ohm source, a 200 ft (60 m) cable with a 200 ohm source, and a 200 ft (60 m) cable with a 600 ohm source.
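A rough way to estimate the HF rolloff is to treat the source impedance and the cable's total stray capacitance as a single-pole RC low-pass filter. This simple model understates the losses plotted in Figure 6-3, which reflect the complete circuit, but it shows why the 600-ohm dynamic suffers more than the 200-ohm condenser on a long run. The function and constants below are illustrative:

    import math

    def cable_loss_db(freq, source_ohms, length_ft, c_per_ft=30e-12):
        # one-pole RC model: corner frequency set by source impedance
        # and total cable capacitance
        fc = 1 / (2 * math.pi * source_ohms * c_per_ft * length_ft)
        return -10 * math.log10(1 + (freq / fc) ** 2)

    for zs, feet in ((200, 33), (200, 200), (600, 200)):
        loss = cable_loss_db(20_000, zs, feet)
        print(f"{zs} ohm source, {feet} ft: {loss:.2f} dB at 20 kHz")
    # The 600-ohm source on 200 ft of cable loses several times more
    # level at 20 kHz than the 200-ohm source on the same run.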
The stand-alone microphone preamplifier
Many recording engineers routinely use dedicated, stand-alone preamplifiers for all recording activities as an alternative to the microphone input sections of recording consoles. While a good console has excellent microphone preamps, a separate preamp may have some very desirable performance features such as variable input impedance, step-type trim controls and higher output capability. Specifically, the variable input impedance allows a better match with the output impedance of the microphone, which may result in smoother frequency response, and the higher output capability may result in better performance with older tube-type studio condenser models and their typically higher output levels. Figure 6-4 shows a photo of a modern stand-alone microphone preamplifier.
Figure 6-4. Front and rear views of a stand-alone microphone preamplifier. (Data courtesy FM Acoustics)
Powering condenser microphones
Modern solid state condenser microphones make use of phantom powering (also known as simplex powering), in which microphone capsule polarization and signal amplification are powered by 48 volts DC applied across the signal leads and ground or shield through the microphone cable. The basic phantom powering
Figure 6-5. Details of 48-volt phantom powering. The 48-volt supply feeds each signal lead through a 6800-ohm resistor; signal is transmitted between pins 2 and 3, and DC is provided between pins 2-3 and pin 1.
circuit is shown in Figure 6-5. As you can see, the positive voltage is applied to each signal lead through a 6800-ohm resistor. The powering system is generally referred to as P48. While not widely used, there are two other phantom powering standards, P24 and P12. Table 6.2 details the three phantom powering standards:

Table 6.2. Phantom Voltages and Current Limits
                 P12          P24          P48
Supply voltage   12 ±1 V      24 ±4 V      48 ±4 V
Supply current   max. 15 mA   max. 10 mA   max. 10 mA
Feed resistors   680 ohms     1200 ohms    6800 ohms
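Because the two 6800-ohm feed resistors act in parallel (3400 ohms) in the DC path, the voltage actually reaching a P48 microphone sags as its current draw rises. A quick sketch; the function name is illustrative:

    def p48_voltage_at_mic(draw_amps, supply_v=48.0, feed_ohms=6800.0):
        # the two feed resistors act in parallel for the DC path
        return supply_v - draw_amps * (feed_ohms / 2)

    print(p48_voltage_at_mic(0.001))  # 1 mA draw  -> 44.6 V at the microphone
    print(p48_voltage_at_mic(0.010))  # 10 mA max  -> 14.0 V at the microphone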
T-powering is also used, but to a much more limited extent than phantom powering. Figure 6-6 gives circuit details of this powering system. Here, the dc power source is fed between the two signal leads, and any slight variation in the power supply will be reflected through as signal output from the microphone.
Figure 6-6. Details of 12-volt T-powering. The 12-volt DC source is fed through 180-ohm resistors, and both signal and power are transmitted between pins 2 and 3.
Battery powering Many condenser vocal microphones are powered with a single 9-volt battery placed within the microphone case so that they may be used with older mixing consoles that do not have integral phantom powering. Similar powering is used for wireless, hand-held microphones (See Sidebar 6.2).
WIRELESS MICROPHONES Today, wireless microphones are used throughout the entertainment industry for on-stage and other pickup. The recording engineer will not normally use them in the studio, but live recording will certainly include them. While wire-
less microphones have improved over the years, their performance is not as good as that of wired models. Specifically, wireless microphones make use of complementary compression and expansion to attain a workable dynamic range, and this action is sometimes audible. Also, even with the best of care in setup procedures, wireless microphones may present noise problems in dense urban areas where there are many radio frequency (RF) communications channels in operation. See Sidebar 6.2 for technical details concerning wireless microphones.

Sidebar 6.2: Details of wireless microphones
Wireless microphones operate with a radiated RF power of no more than 10 milliwatts, and their normal operational range can be as high as 300 to 500 feet if there are no obstacles. Each microphone must have its own dedicated receiver, although multiple receivers can operate via a common receiving antenna. Within the microphone's transmitter, the signal is compressed in dynamic range by 2 to 1, and the signal is given a HF pre-emphasis. At the receiver, a complementary expansion curve and an inverse HF de-emphasis are applied. These processes are shown in Figure 6-7.
Figure 6-7. Wireless microphone details. Companding action, in which the pre-emphasized signal is compressed 2 to 1 at the transmitter and complementarily expanded and de-emphasized at the receiver, keeping the program well above the transmission noise floor (A); pre-emphasis (+6 dB/octave) and de-emphasis (-6 dB/octave) curves (B); principle of diversity reception, in which two antennas spaced by about one-fourth to one wavelength pick up the signal, making it very unlikely that cancellation will take place at both antennas simultaneously (C).
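The companding in Figure 6-7A is easiest to see in the dB domain: the transmitter halves all level differences about a pivot, and the receiver doubles them again. A sketch, with the pivot level at 0 dB assumed only for illustration:

    def compress_db(level_db, pivot_db=0.0):
        # 2-to-1 compression applied at the wireless transmitter
        return pivot_db + (level_db - pivot_db) / 2

    def expand_db(level_db, pivot_db=0.0):
        # complementary 1-to-2 expansion applied at the receiver
        return pivot_db + (level_db - pivot_db) * 2

    for source in (0, -40, -80):
        on_air = compress_db(source)   # 80 dB of program rides in 40 dB
        restored = expand_db(on_air)   # receiver restores the full range
        print(f"{source} dB -> {on_air:.0f} dB on air -> {restored:.0f} dB out")
    # 0 -> 0 -> 0; -40 -> -20 -> -40; -80 -> -40 -> -80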
In the United States wireless microphones operate over channels ranging as follows:

VHF (very high frequency) range:
Low band: 49-108 MHz
High band: 169-216 MHz

UHF (ultra high frequency) range:
Low band: 450-806 MHz
High band: 900-952 MHz

The reception process is further aided by the so-called diversity process, in which there are two receiving antennas for each microphone placed about one-fourth wavelength apart at the transmitting frequency. The stronger of the two received signals is always used, thus ensuring adequate reception at all times. A photograph of wireless transmitters and receiver is shown in Figure 6-8.
Figure 6-8. Photo of wireless microphone, bodypack, and receiver. (Courtesy AKG Acoustics)
Chapter 7 MICROPHONE ACCESSORIES
INTRODUCTION Microphones are rarely used without accessories. Even a simple hand-held microphone will require a foam windscreen, and of course in any studio application, the microphone will have to be positioned by a stand, boom or other mounting method. Accessories fall basically into the following groups: mounting accessories (stands, booms, stand adapters, shock mounts and stereo mounts), environmental protection (wind screens of all types), and electrical accessories (in-line adapters and microphone splitters).
STANDS AND BOOMS Figure 7-1 shows a group of microphone stands and booms as used in studio recording. Stands range from fairly lightweight models that can reach a height of about 12 or 14 feet (3 to 4 m) to more robust models that can fly a large microphone array. A boom is a swivel attachment positioned at the top of a stand and allows the engineer to place a microphone over the heads of the studio performers or to reach inside a drum set. Large booms need to be counterweighted for stability, and some larger models used in scoring sessions may provide a reach into the orchestra of 8 to 10 feet (2.5 to 3 m). A family of hand-held booms is shown in Figure 7-2. These are used throughout the film and video industries for close miking of actors. In normal application the boom operator is required to keep the boom and microphone outside the film or video frame.
Figure 7-1. Typical studio microphone stands and booms. (Courtesy AKG Acoustics)
MICROPHONE MOUNTS
Every microphone is provided with its own clip, a small attachment that screws onto the top of a microphone stand and to which the microphone is snapped, or clipped, in place. These are adequate in many cases, but where there is any possibility of floor-transmitted vibrations a shock mount may be required. Figure 7-3 shows a typical large format variable pattern condenser
Figure 7-2. Hand-held microphone booms and typical usage; the boom operator keeps the boom and microphone outside the boundary of the video or film frame. (Courtesy K-tel)
microphone mounted on a stand with a shock mount. Mounts of the kind shown here are normally designed for a given model of microphone. For effective performance it is important that the mechanical resonance of the microphone-shock mount combination be well below the audible range.
Figure 7-3. Shock-mounted microphones in the studio. (Courtesy Neumann/USA)
STEREO MOUNTS For many studio applications a pair of microphones need to be precisely positioned relative to each other, and there are many stereo mounts to choose from. Figure 7-4 shows a simple stacked arrangement that allows a pair of small format microphones to be closely arrayed. The model shown in Figure 7-5 is more flexible and allows a pair of microphones to be spaced and angled relative to each other.
Figure 7-4. A simple stereo mount for small format condenser microphones. (Courtesy AKG Acoustics)
Figure 7-5. An articulated stereo mount. (Courtesy Audio Engineering Associates)
HANGING CABLE MOUNTS For many live concert recordings the engineer has to dispense with stands in favor of hanging microphones. A number of manufacturers provide a mount similar to the one shown in Figure 7-6. A single microphone can be swiveled and tilted as required.
Figure 7-6. A typical cable mount allowing adjustment of horizontal and vertical angles, with a swivel joint and tilting microphone clip.
WIND AND POP SCREENS
These accessories are used both outdoors under windy conditions and in the studio. For moderate wind problems, relatively small foam screens can be slipped over the microphone to reduce the effects of puffs of wind from the
performer's mouth. Such sounds as "p" and "b" are notorious for causing "pops." It is better to stop these at the source rather than try to reduce them by equalization during postproduction. A typical example is shown in Figure 7-7.
Figure 7-7. A typical foam windscreen. (Courtesy AKG Acoustics)
For studio vocal recording a Nylon screen is preferred because it is virtually transparent acoustically. A typical application is shown in Figure 7-8. For outdoor use in effects and news gathering, a shroud such as the one shown in Figure 7-9 may be necessary to provide substantial reduction of wind noise.
IN-LINE ELECTRICAL ACCESSORIES
In your work you will come across a variety of plug-in electrical accessories, including filters, polarity switchers, loss pads, matching transformers and the like. These have traditionally been intended for semi-professional public address activities and are not recommended for general use in sound recording. The only device that you may make extensive use of is a microphone splitter, and then only under controlled conditions. A microphone splitter is a device that allows a microphone's output to be fed to two, possibly three, downstream activities: primary recording, sound reinforcement and broadcast feeds. A schematic drawing is shown in Figure 7-10. The model shown here is passive and accommodates the output of a single microphone. Multiple splitters are also available that provide a direct feed for the primary activity and multiple amplified outputs for other activities.
Figure 7-8. A Nylon windscreen in typical studio use. (Courtesy Schoeps)

Figure 7-9. A shroud for high-wind environments. (Courtesy beyerdynamic)
Figure 7-10. Circuit details of a passive microphone splitter.
As you can see in the figure, the path from microphone to recording console or preamplifier is straight-through and is thus unaffected by the splitting process. The microphone is fed to a transformer that has two secondary windings, and each of these is used to feed other activities. Note particularly that phantom powering from your console will reach the microphone; the other console destinations are isolated by the transformer. A set of ground-lift switches can be used in case there are hum or buzz problems arising from improper system grounding. There are two important recommendations. If you are in charge of a recording, make sure that your activity is the one that receives the direct microphone output. Also, when everyone else has been connected to the system, make sure that there are no unusual noises or buzzes. You should be completely free and clear from any "hard" connection to the electronic systems the other activities are using. You should also ensure that the transformer secondary circuits look into standard microphone input impedances no lower than about 3,000 ohms.
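You can check the loading situation with a quick calculation: with 1:1 windings, and ignoring transformer losses, the microphone effectively sees all of the destination input impedances in parallel. The following is a minimal Python sketch of that idea; the 150-ohm source impedance and 3,000-ohm input impedances are assumed example values, not specifications from the figure.

    # Loading on a passive 1:1 microphone splitter (illustrative values only).
    def parallel(*impedances):
        """Combined value of impedances connected in parallel."""
        return 1.0 / sum(1.0 / z for z in impedances)

    mic_source = 150.0                 # assumed microphone source impedance, ohms
    loads = [3000.0, 3000.0, 3000.0]   # direct feed plus two split feeds, ohms

    total = parallel(*loads)
    print(f"combined load on the microphone: {total:.0f} ohms")
    # Three 3,000-ohm inputs combine to 1,000 ohms, still comfortably above the
    # assumed 150-ohm source; drop any one input much below 3,000 ohms and the
    # loading (and the level loss) quickly becomes significant.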
Figure 7-11 shows a group of in-line accessories. The "turn-arounds" shown at A can be used to straighten out certain miswirings; the polarity inverter shown at B is used to reverse the polarity of a miswired cable; a balanced loss pad is shown at C; a step-up transformer is shown at D; and a high-pass filter is shown at E. Items at A and B may be used with phantom powering, but the others cannot. As you can see, most of the problems that are solved with these in-line accessories are better solved through competent engineering.
Figure 7-11. In-line microphone electrical accessories.
Chapter 8 BASIC AUDIO SIGNAL ANALYSIS
INTRODUCTION
In this chapter we will take a close look at audio signals of all kinds. These may be speech or music programs in mono, stereo or multichannel, or they may be test signals which are used to diagnose various kinds of transmission problems. Many of the problems you will encounter in recording will be clearly audible, while others may require some kind of test instrumentation or test procedure to be identified precisely. Other problems in signal transmission have to do basically with subjective appraisals, such as loudness and spectral matching, as well as stereophonic judgements regarding spatiality, image specificity, and so forth.
CHARACTERISTICS OF A PROGRAM CHANNEL
The term channel is often used to describe an audio program path intended for final delivery to the consumer. It may be a radio signal, TV audio signal, or a signal intended for a home audio playback medium. We can also think of channel groups, such as stereo or surround sound, where a number of channels are intended for simultaneous playback. For now, we'll consider a single, or monophonic, channel. (The term mono is normally used instead of monophonic.) A brief "time history" of such a channel can be represented as shown in Figure 8-1. This figure shows the variation in signal level over some period of time. There is an average signal level, a maximum possible signal level and a system noise floor. The maximum level represents the upper limit of signal transmission. For example, in radio broadcasting the maximum level is defined as the degree of signal modulation that will fit into the "broadcast space" allowed by the Federal Communications Commission without interfering with a neighboring broadcast station. In an analog recording, it represents the upper signal limit of the recording medium itself before the onset of a stated degree of distortion. In a digital channel, the maximum level is defined as digital full scale, that value beyond which the signal cannot be represented by the digital code.
Figure 8-1. Illustration of a program envelope as it varies over time.
The noise floor of the channel is the residual level of the system when there is no signal applied to it. In analog electronic systems the noise results from thermal agitation arising at the molecular level. In a digital system the noise floor arises at the lowest levels of digital signal quantizing (more about this in a later chapter). Analog recording media have their own noise characteristics, and these usually result from granularity in the medium itself. Figure 8-1 also introduces the notions of headroom and signal-to-noise ratio. Headroom is the signal space between average modulation level and maximum possible level. Signal-to-noise ratio (S/N) represents the normal operating range of the transmission channel.
PROGRAM AND PLAYBACK REQUIREMENTS
If you are in the broadcast business you are aware of the competitive requirement of keeping your station's signal at the highest possible loudness level relative to other stations on the dial. This means that your average level must be high, and this requires that headroom be minimized. In other words, you will want to "contain," or limit, the signal so that its average level is no more than about 8 or 10 dB below the maximum level available. The same requirements apply to any kind of commercial recording. Record producers want their product to "sound loud" and catch the immediate attention of the listener, and both producers and artists are keenly aware of the wide variety of conditions under which their product will be heard. The automobile has a very narrow loudness range into which the signal has to fit. In spite of some of the loud rigs you may hear on the road, the average automobile listener plays music at levels no greater than perhaps 80 to 85 dB SPL. Since the road noise level in the average automobile is in the 55 to 60 dB range, this leaves only a useful 20-dB range for music presentation in that environment. In Figure 1-8, we showed the waveforms for several kinds of audio signals. If we take a continuous, flowing speech signal and compress its time scale into about 20 seconds, it will resemble that shown in Figure 8-2. As you can see, the signal has occasional high peaks, but the bulk of the signal lies at much lower values. This signal has a peak-to-average ratio of about 12 dB, and this means that the average level, which relates directly to program loudness, will be about 12 dB lower than peak levels. If the maximum allowable signal is ±1 volt, then the average signal will be about ±0.25 volts, as shown in this figure.

Figure 8-2. A speech waveform over a 20-second time period with a peak-to-average ratio of 12 dB.
A transmission channel carrying this signal will not be very efficient, since it is already "maxed out" with its normal levels at -12 dB. If we compress or limit the signal's amplitude, we can increase its average value while leaving the peak signal values the same as before. For radio use we could easily compress the signal so that the peak-to-average ratio was no more than about 8 dB, as shown in Figure 8-3. For music, we would normally not want to compress the program any more than is shown here.
Figure 8-3. A speech waveform over a 20-second time period with a peak-to-average ratio of 8 dB.
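The voltage figures in Figures 8-2 and 8-3 follow directly from the dB definition: a level n dB below peak corresponds to a voltage factor of 10^(-n/20). A minimal Python check, using the values from the text:

    def below_peak(peak_volts, ratio_db):
        """Average signal voltage for a given peak value and peak-to-average ratio in dB."""
        return peak_volts * 10 ** (-ratio_db / 20.0)

    peak = 1.0  # maximum allowable signal, +/-1 volt
    for ratio_db in (12, 8):
        avg = below_peak(peak, ratio_db)
        print(f"peak-to-average {ratio_db} dB -> average about +/-{avg:.2f} V")
    # 12 dB below 1 V is about 0.25 V (Figure 8-2); 8 dB below is about 0.40 V (Figure 8-3).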
SIGNAL FREQUENCY SPECTRA
The frequency spectrum of a signal is the envelope normally occupied by the signal averaged over some period of time. Some examples are shown in Figure 8-4. A flat spectrum, as the name indicates, is uniform across the audible frequency band. Certain test signals have a flat spectrum, and a great deal of instrumental rock music has a spectrum that is fairly flat out to about 8 kHz. A symphony orchestra has a spectrum that begins to roll off above 250 Hz, as shown. Male speech has a spectrum that peaks at about 250 Hz and rolls off above and below that frequency, as shown in Figure 8-5. Since most transmission channels have flat transmission capability over their frequency band, it is fairly obvious that a given speech signal could easily be boosted in the 2-kHz octave band by about 4 to 6 dB, resulting in increased intelligibility. We can do this without modifying the channel to any degree. While we would be affecting the quality of the speech signal by making it somewhat unnatural, there is no question that we would at the same time be increasing intelligibility. In noisy environments, such as transportation terminals, this is a common practice among sound reinforcement engineers.
Figure 8-4. Typical octave-band spectra for electronic music and symphonic music.

Figure 8-5. Long-term male speech octave-band spectrum.
SIGNAL POLARITY
Polarity (or phase) defines a signal in the "positive/negative" sense. Perhaps the best way to explain this is by way of Figure 8-6A. In this recording/playback system (or chain, as it is often called) each element, from microphone to loudspeaker, maintains the same output polarity as shown at the acoustical input of the microphone. The logical way to maintain this condition is to design each element in the entire chain to be non-inverting, that is, to preserve input polarity at the output of each device. Modern electronic components preserve identical input and output polarity as shown at C. By comparison, an inverting device will operate as shown at D. You can fall into a polarity trap if you aren't careful. If two of the devices in the audio chain are inverting, the output of the entire chain will still exhibit matched polarity between input and output. But, depending on which devices are involved, you could end up with a polarity problem if the system were to be reconfigured. Such problems happen more often than we'd like to think. In mono transmission there may be no dire consequences; however, if there is a mismatch in a stereo pair of channels you will be in trouble. Referring back to Figure 2-8, the creation of a clear phantom center image requires that the exact same signal be fed to both loudspeakers. If one of these signals is inverted, the phasor reconstruction corresponding to a frontal phantom image cannot take place and the resulting sound will be confusing and unnatural.
Figure 8-6. Signal polarity in a recording chain (A) and in a playback chain (B). Illustration of a noninverting processor (C) and an inverting processor (D).
Try the following experiment: Listen to a recording such as a pop vocal that has a prominent center phantom image. Then, switch the polarity of one channel by reversing the wires at the loudspeaker. Listen carefully and get a clear idea in your mind of how the vocal sounds when it is presented in anti-phase. You will have no problem determining which is correct. (And don't forget to reconnect the loudspeaker properly when you are through.) A note regarding terminology: the terms in-polarity and out-of-polarity are equivalent to in-phase and out-of-phase (or anti-phase). All of these terms are in common usage.
TEST SIGNALS AND MEASUREMENTS
Audio testing is a complex field, and we will only cover the basics in this chapter. In any studio, an audio oscillator is a very useful test instrument. The oscillator produces a sine wave output that can be conveniently swept over a wide band of frequencies. It can be used for tracking down such problems as rattles, buzzes, or distortion in a control room monitor system. It is also useful in spot checking the record/playback frequency response of an analog tape recorder or in measuring the response of an equalizer. It can also be used for identifying various resonance frequencies and standing waves in a control room when presented over a loudspeaker. Noise signals are also useful in measuring loudspeaker response. A typical noise generator has two kinds of outputs: white noise and pink noise. These terms are taken from the characteristics of light. White light contains all visible light wavelengths in equal amounts, and white noise contains all audible frequencies in equal amounts (equal acoustical power per cycle). Pink light exhibits a rolloff of shorter light wavelengths; correspondingly, pink noise has a similar rolloff (equal acoustical power per octave). Pink noise is most often used in checking the response of monitoring systems, as shown in Figure 8-7. The direct response of the pink noise generator as seen on the face of a 1/3-octave analyzer is shown at A. The flat response is an indication of the uniform power output per octave, which translates directly into equal power per 1/3 octave. When the pink noise signal is used to drive a monitor loudspeaker, a microphone placed at the engineer's listening position will register the combined loudspeaker/room response, as shown in Figure 8-7B.
Figure 8-7. A 1/3-octave real-time analyzer fed with a pink noise signal (A); typical measurement application in a control room.
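One way to see the white/pink distinction is to generate one from the other: scaling the amplitude spectrum of white noise by 1/√f converts equal power per cycle into equal power per octave. The following minimal numpy sketch uses FFT-based shaping, which is just one convenient software method (a hardware generator works differently); the sample rate and length are arbitrary.

    import numpy as np

    fs, n = 48000, 1 << 16
    rng = np.random.default_rng(0)

    # White noise: equal power per cycle (flat spectrum).
    white = rng.standard_normal(n)

    # Shape to pink: divide the amplitude spectrum by sqrt(f) so power falls 3 dB/octave.
    spectrum = np.fft.rfft(white)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    freqs[0] = freqs[1]                       # avoid dividing by zero at DC
    pink = np.fft.irfft(spectrum / np.sqrt(freqs), n)

    # Octave-band powers of the result should come out roughly equal.
    spec = np.abs(np.fft.rfft(pink)) ** 2
    f = np.fft.rfftfreq(n, d=1.0 / fs)
    for lo in (125, 250, 500, 1000, 2000, 4000):
        band = spec[(f >= lo) & (f < 2 * lo)].sum()
        print(f"{lo:>4}-{2 * lo:<5} Hz octave: {10 * np.log10(band):6.1f} dB (relative)")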
ELECTRICAL SIGNAL SUMMATION
When audio signals are added, either directly in the electrical domain or as separate loudspeaker outputs in the acoustical domain, the summations are not necessarily what you might expect. Let's take the electrical case first. As shown in Figure 8-8A, two identical signals will sum directly, producing a value that is twice either one, representing a level increase of 6 dB. At B the two signals have been summed anti-phase, and it is obvious that they will cancel. At C we show the effect of summing two sine waves of the same frequency and level, but differing in their relative phases. The 90-degree shift shown here will result in a net output that is 1.4 times the amplitude of either input signal and that has a relative phase of 45 degrees. At D we show the effect of summing two separate noise sources of the same level and spectral characteristics. The net output has increased by a factor of 1.4, representing a level increase of 3 dB.
Figure 8-8. Electrical summation of signals. Two equal signals of same polarity (A); two equal signals of reversed polarity (B); two sine waves of equal amplitude at a 90-degree phase angle (C); two independent noise sources of the same level (D).
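The four cases of Figure 8-8 are easy to verify numerically. A minimal numpy sketch follows; the frequency, sample rate, and lengths are arbitrary.

    import numpy as np

    fs, f0, n = 48000, 1000.0, 1 << 16
    t = np.arange(n) / fs
    rng = np.random.default_rng(1)

    def rms(x):
        return np.sqrt(np.mean(x ** 2))

    cases = [
        ("equal, in-phase (A)",   np.sin(2*np.pi*f0*t),  np.sin(2*np.pi*f0*t)),
        ("equal, anti-phase (B)", np.sin(2*np.pi*f0*t), -np.sin(2*np.pi*f0*t)),
        ("90-degree shift (C)",   np.sin(2*np.pi*f0*t),  np.sin(2*np.pi*f0*t - np.pi/2)),
        ("independent noise (D)", rng.standard_normal(n), rng.standard_normal(n)),
    ]
    for name, a, b in cases:
        total = a + b
        if rms(total) < 1e-9:
            print(f"{name:<24} complete cancellation")
        else:
            # level of the sum relative to either signal alone
            print(f"{name:<24} {20 * np.log10(rms(total) / rms(a)):+5.2f} dB")
    # Prints about +6 dB, cancellation, +3 dB, and +3 dB respectively.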
ACOUSTICAL SIGNAL SUMMATION
Let's now present each of these signal pairs in stereo. In the case shown in Figure 8-9A, two identical signals will appear as a phantom center image in stereo, and the level in the listening room will be approximately 3 dB greater than either channel alone. When the anti-phase pair is presented in stereo (B), there is no clear localization of the signal, but the level in the listening room will be approximately 3 dB greater than either channel alone.
Figure 8-9. Stereo image presentation and level of various signals. Identical signals (A); identical signals in anti-phase (B); equal signals shifted 90 degrees (C); independent signals at same level (D).
When the phase-shifted pair is presented in stereo (C), you will hear a "wide" center image, and the level in the listening room will be approximately 3 dB greater than either channel alone. When the two separate noise signals are presented in stereo (D), the effect is a very broad sound front extending over the entire stereo stage. As with the other cases, the level in the listening room will be 3 dB greater than either channel alone. In all of these cases the stereo level in the listening room was about 3 dB greater than either channel alone. The reason for this is simply that acoustical loudness results primarily from the collection, or ensemble, of reflections in the listening room. In each of these four cases, the contributions from each channel were identical in amplitude, differing only in time domain characteristics between the elements of each signal pair. Since we are summing two equal signal levels, the acoustical power in the room will be doubled, resulting in a 3-dB increase in perceived level. Note that I have used the term "about 3 dB." Why not exactly 3 dB? The reason has to do with the spatial distribution of sound energy density in the listening room, which tends toward an average value independent of the instantaneous polarity of the signals. This correlates well with what we hear in the room, and it is what we will read on a sound level meter averaged over the listening area.
STEREO SIGNAL CORRELATION
The correlation between a stereo pair of signals is a measure of the commonality of the signals. As an example, two identical (or mono) signals will have a correlation coefficient of unity, or 1. If one of these signals is anti-phase, the resulting correlation will be -1. If the two signals have no commonality whatever, then their correlation coefficient will be zero. We normally observe these relationships with an oscilloscope, as shown in Figure 8-10.

Figure 8-10. Oscilloscope patterns. Normal application (A); Lissajous figures for various signal combinations (B to I).
The basic use of the oscilloscope is shown at A. Here, a sine wave is introduced as the vertical signal, and an internal sweep circuit provides the horizontal signal. The resulting display shows the sine wave as a function of time, just as we observed it in Chapter 1. The polarity of the oscilloscope is vertical positive upward and horizontal positive to the right. In observing stereo signals on the scope we introduce the left signal at the vertical input and the right signal at the horizontal input. For a left-only signal the display is as shown at B, and a right-only signal is shown at C. Identical signals (left = right) will produce the display shown at D, and the same pair of inputs with an anti-phase relationship is shown at E. If the stereo signals are identical with a 90-degree phase shift between them, the display is as shown at F. These displays are known as Lissajous figures. A highly uncorrelated stereo signal will appear as shown at G. Such a signal might be generated with a single pair of widely spaced microphones. A normal stereo signal with LF information panned to the center (in-phase) will appear as shown at H, and the same signal with anti-phase LF information is shown at I. Oscilloscope displays are often tricky to read, but they contain a great deal of useful information. For most stereo recording or remix activities a correlation meter, as shown in Figure 8-11, will be easier to use. The meter operates by shaping both input signals, multiplying them, and then displaying the product as a fairly slow average value. The signal integration time is in the range of a second or so, and the value indicated by the meter represents the short-term average program correlation. A normal stereo program will tend to hover around a zero value with occasional "excursions" into the positive area. Any signal that exhibits a high degree of negative correlation should be carefully analyzed for a possible anti-phase condition. Such a condition may normally be fixed by simply flipping the polarity of one channel.

Figure 8-11. Details of a correlation meter for stereo.
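Offline, the quantity such a meter approximates is the normalized zero-lag cross-correlation of the two channels. Here is a minimal numpy sketch; the wave-shaping and slow integration stages of a real meter are simplified to a plain average over the whole signal.

    import numpy as np

    def correlation(left, right):
        """Zero-lag correlation coefficient: +1 for mono, -1 for anti-phase, near 0 for uncorrelated."""
        return np.mean(left * right) / np.sqrt(np.mean(left**2) * np.mean(right**2))

    n = 1 << 16
    t = np.arange(n) / 48000.0
    sig = np.sin(2 * np.pi * 440.0 * t)
    rng = np.random.default_rng(7)
    a, b = rng.standard_normal(n), rng.standard_normal(n)

    print(f"identical (mono): {correlation(sig, sig):+5.2f}")   # +1.00
    print(f"anti-phase:       {correlation(sig, -sig):+5.2f}")  # -1.00
    print(f"uncorrelated:     {correlation(a, b):+5.2f}")       # near 0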
ADDING ACOUSTICAL SIGNAL LEVELS
If we have two acoustical levels, each of 93 dB, and we add them, the sum will be 96 dB. This is simply because the levels are equal and their sum will by definition be 3 dB greater than either one alone. When the values are different we can calculate their sum from the nomograph given in Figure 8-12.

Figure 8-12. Nomograph for adding signal levels in dB.
Take any two levels, such as 60 dB SPL and 65 dB SPL. Their sum can be determined by taking their difference, 5 dB, and locating that value on the line indicated D in the nomograph. Reading directly below that value you get about 1.2. Then, add 1.2 to the higher of the two original values: 65 + 1.2 = 66.2 dB SPL. Try summing the values of 50 dB SPL and 60 dB SPL. Your answer should be about 60.4 dB SPL. Remember that when the difference between the levels to be summed is about 10 dB or greater, the resulting sum is very nearly equal to the higher value. If you are adding a number of individual levels, take them two at a time and sum each pair; continue until you have summed them all.
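The nomograph is a graphical shortcut for power summation: convert each level to power, add, and convert the total back to dB. A minimal Python sketch reproducing the examples above:

    import math

    def sum_levels(*levels_db):
        """Power-sum any number of levels expressed in dB SPL."""
        total_power = sum(10 ** (level / 10.0) for level in levels_db)
        return 10 * math.log10(total_power)

    print(f"93 + 93 dB -> {sum_levels(93, 93):.1f} dB SPL")  # 96.0
    print(f"60 + 65 dB -> {sum_levels(60, 65):.1f} dB SPL")  # about 66.2
    print(f"50 + 60 dB -> {sum_levels(50, 60):.1f} dB SPL")  # about 60.4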
Chapter 9 RECORDING CONSOLES, METERING, AND AUDIO TRANSMISSION SYSTEMS
INTRODUCTION
The console is the control center of any recording activity. It receives all program inputs from the studio, routes them through signal processing devices and assigns them to the various outputs. It provides for monitoring of its own output signals as well as the outputs of recording devices. The majority of consoles you will encounter are of traditional analog design, and they will be the focus of this chapter. Digital consoles will be discussed in Chapter 13. The earliest consoles were not much more than basic summing networks for a group of microphones which were fed to a single output channel. Eventually, engineers needed greater flexibility, and equalizers were added to each input channel. With the advent of tape recording it became necessary to include monitor switching so that the engineer and producer could audition playback from tape in addition to monitoring the bus output of the console. Later, stereo recording demanded additional output channels, and engineers also required auxiliary outputs for sending signals to external devices such as reverberation (echo) chambers. It is at this point that our discussion in this chapter begins. First, we will cover some important fundamental concepts in signal transmission.
BASIC CONCEPTS

Equivalent input noise (EIN)
The noise floor of any console or audio transmission system is normally established at the "front-end" of the system and is the result of self-noise in the input resistance of the system. The noise arises from thermal agitation at the molecular level and is fundamental to all audio systems. A detailed discussion is given in Sidebar 9.1.
Sidebar 9.1: Input noise depends on the input resistance of the circuit, the ambient temperature and the measurement frequency bandwidth. The rms voltage is given by the following equation:

    e_rms = √(4kRTΔf)        (9.1)
where k is Boltzmann's constant (1.38 × 10^-23 joules per kelvin), R is the input resistance (ohms), T is the absolute temperature (kelvins) and Δf is the audio bandwidth. Typical values are T = 300 kelvins (80 degrees F), R = 200 ohms (typical of current studio quality condenser microphones) and Δf = 20,000 Hz. These values give a thermal noise of about 0.26 microvolts rms, which is equivalent to -129.6 dBu. Today, studio quality input preamplifiers come within about 2 dB of this theoretical limit. If you look ahead to Figure 9-10 you will note that the microphone noise floor is just at this limiting value. A short-circuited input to the console would result in a noise floor about 10 dB lower, so the dominant noise floor in the audio chain is normally that of the microphone. The measurement of EIN is also shown in Figure 9-1. As we have seen, the noise floor of the microphone is normally expressed in terms of an equivalent acoustical noise level stated in dB(A).
Figure 9-1. Origin and measurement of equivalent input noise.
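Equation 9.1 is easy to evaluate directly. The following is a minimal Python sketch that reproduces the sidebar's numbers; the 0.775-volt dBu reference is the standard definition, and the input values are those given above.

    import math

    k = 1.38e-23  # Boltzmann's constant, joules per kelvin

    def thermal_noise(R, T=300.0, bandwidth=20000.0):
        """Thermal (Johnson) noise voltage of a resistance, per Equation 9.1."""
        e_rms = math.sqrt(4 * k * R * T * bandwidth)
        dbu = 20 * math.log10(e_rms / 0.775)  # 0 dBu = 0.775 V rms
        return e_rms, dbu

    e, level = thermal_noise(R=200.0)  # 200 ohms, 300 K, 20 kHz bandwidth
    print(f"{e * 1e6:.2f} microvolts rms = {level:.1f} dBu")
    # Prints about 0.26 uV rms, or roughly -129.6 dBu, as stated in Sidebar 9.1.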
Special circuitry used in console design
Figure 9-2 shows details of the operational amplifier (opamp). The opamp is universally used in audio distribution systems because of its flexibility in performing functions of signal addition, subtraction, and combining. The basic amplifier is shown symbolically at A. It has both inverting and noninverting inputs and has a signal amplification of about 100 dB; its input impedance is very high and the output impedance is very low. When combined with external feedback resistors as shown at B and C, the audio bandwidth gain of the amplifier becomes a function of the resistance ratios. The circuit at B is inverting while that shown at C is noninverting. The arrangement shown at D acts as a combining amplifier, and is used when multiple microphones or other signals are combined into a single output bus. (The term bus is used throughout audio engineering to indicate the various output circuits of a console that are used to distribute signals to their various destinations in the control room, or throughout a broadcast or recording facility.)
Figure 9-2. Details of the operational amplifier. Basic element (A); inverting amplifier (B); noninverting amplifier (C); combining amplifier (D); balanced input amplifier (E).
The circuit at E has a balanced input and is often used directly as a microphone preamplifier.
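The gain relationships just described reduce to simple resistor ratios in the ideal case. Here is a minimal Python sketch; ideal opamp behavior is assumed, and the resistor values are arbitrary examples rather than figures from the diagram.

    def inverting_gain(rf, ri):
        """Ideal inverting amplifier (Figure 9-2B): gain = -Rf/Ri."""
        return -rf / ri

    def noninverting_gain(rf, ri):
        """Ideal noninverting amplifier (Figure 9-2C): gain = 1 + Rf/Ri."""
        return 1 + rf / ri

    def combining_output(input_volts, rf, ri):
        """Ideal combining amplifier (Figure 9-2D): eo = -(Rf/Ri)(e1 + e2 + ... + en)."""
        return -(rf / ri) * sum(input_volts)

    print(inverting_gain(10000, 1000))                        # -10.0 (20 dB, inverted)
    print(noninverting_gain(10000, 1000))                     # 11.0
    print(combining_output([0.1, 0.2, -0.05], 10000, 10000))  # -0.25 V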
Symbols and conventions in signal flow diagrams
As you read various system schematic diagrams you will see elements such as those shown in Figure 9-3. These are defined as follows:
A. Line amplifier (gain often stated in dB)
B. Variable gain line amplifier
C and D. In-line faders (volume controls)
E. Ganged faders
F. One-in/two-out panpot (panoramic potentiometer)
G. Signal processing module (function normally stated)
H. In-line termination resistor
I. Line-crossing/intersecting conventions
J. Line termination conventions
K. In-line transformer
L. Meter conventions
Figure 9-3. Conventions used in audio signal transmission diagrams.
Not all manufacturers use the same conventions, but "translations" among them are generally easy to make.
Patch bay conventions
The patch bay section of a large console makes it possible to introduce external recorders or processors into an audio chain with a minimum of clutter and also to reassign console elements for greater user convenience. The word "jack" indicates either the plug at one end of a patch cord or the receptacle into which it is inserted. A jack is shown at Figure 9-4A, as both circuit and symbol. A line termination is shown at B; when a jack is inserted (as shown at C) the termination is lifted and the signal is sent onward. A normaled jack pair is shown at D. In this configuration there is continuity between the two jacks. An input and output of an external device can be inserted into the two jacks by lifting the normaled connection between them. The configuration shown at E is often called a half normal. A patch cord inserted into the upper jack will lift it, while a patch cord inserted into the lower jack will not. A portion of a typical patch bay is shown at F. The jacks are normally used in pairs, and signal flow is generally from top to bottom. For example, studio microphone receptacles appear at the top and are normaled into the console preamps. If you need to switch microphone positions on the console, it is often easier to re-patch in the control room than to reposition cables in the studio. In the middle set of jacks, all of the console's insert points are shown. Any piece of line-level outboard gear can be easily patched into the system, as earlier indicated at D. (Line level refers to signals whose normal operating levels are in the range of 0 dBu. By comparison, microphone level signals are in the operating range of -40 dBu.) The bottom row of jack pairs connects multitrack outputs to their normal console inputs. On many occasions you will want to lay out these signal returns in a different order. You will usually find full-size patch bays located in a rack adjacent to the console. Mini-patch bays are normally located on the console's working surface itself.
Figure 9-4. Patch bay conventions. Basic forms (A through E); section of a patch bay (F).
THE SPLIT CONFIGURATION CONSOLE
Recording demands at the beginning of the rock era grew rather quickly in a progression from 2, 4, 8, 16 and finally to 24 tracks. The following requirements also had to be met:
1. The need for greater signal processing capability in each module.
2. Pair-wise (odd-even) bus selection, with panning between the pair. This allows output busses to be treated as stereo pairs, if that is needed.
3. Simplified console routing and switching for the requirements of overdubbing and tracking at a later date.
4. The need for many AUX (auxiliary) send busses for headphone monitoring and feeding external processors.
5. Use of the studio recording console for remix activities as well.
During the early years of the rock era split configuration consoles were common, and today most small consoles you will encounter are of this type. A generic layout for a split configuration console is shown in Figure 9-5, and a photo of a small split configuration console is shown in Figure 9-6. We will now discuss the performance of a typical model.
Figure 9-5. The split configuration console, conceptual view.
Figure 9-6. Photograph of a small split configuration console. (Courtesy Soundcraft)
INPUT MODULE
Figure 9-7 shows views of an input module and the corresponding signal flow diagram for that module. Sections indicated in bold letters in the module views match those in the flow diagram:
A. Phantom powering on/off
B. Input gain (trim) control
C. Mic/line level input switch
D. Module polarity reversal (not shown in diagram)
E. Low cut in/out
F. Equalizer in/out
G. Peak indicator (indicates at 6 dB below clipping)
H. PFL (pre-fader listening); routes module input signal to monitor for isolating problems (note: PFL does not affect signals sent to tape)
I. Module ON switch
J. SIP (solo in place); routes module input signal to monitors panned as that signal appears in stereo (note: SIP does affect signals sent to MIX busses)
K. Mute function for module
L. Module fader
M. Internal jumpers for shifting AUX sends from post-fader to pre-EQ
N. AUX send level controls
O. Active combining networks for output busses (1 through 24) and stereo mix (L and R)
P. Module odd/even bus pan control

Figure 9-7. Input module, split configuration. Views (upper); signal flow (lower). (Courtesy Soundcraft)
OUTPUT MODULE
Figure 9-8 shows views of the output module and its corresponding signal flow diagram. In this console there are 24 output busses, or groups. The output module controls the signals sent to each of 24 inputs on the multitrack recorder and the playback signals returning from the recorder.
Q. Group output bus fader
R. Bus output (to tape recorder)
S. Tape playback return
T. Bus/tape return switch
U. Group equalizer
V. Group on/off switch
W. Group return level
Y. AUX 1 and 2 send level controls
Z. SUB routes group output to L/R mix busses
AA. PFL routes signal to monitors for diagnosing problems; does not affect signals sent to tape
KK. Group output signal metering point
Figure 9-8. Output module, split configuration. Views (upper); signal flow (lower). (Courtesy Soundcraft)
MONITOR/MASTER MODULE
Figure 9-9 shows views of the monitor/master module and its corresponding signal flow diagram. Console functions such as monitor selection, AUX master send levels, L/R mix levels, and the talkback function are controlled at this module.
BB. AUX send levels (1 through 6)
CC. L/R MIX output levels to tape
DD. Control room monitor select (MIX busses and outputs from 3 recorders)
EE. Switching logic for PFL functions
FF. Control room monitoring level
GG. Dim control for monitors (reduces monitor level by a fixed amount)
HH. Mono switch for monitors (sums left and right channels for mono compatibility checking)
II. Studio monitoring switch (used in playback mode only)
JJ. Oscillator and talkback controls (oscillator is used to set levels; talkback enables engineer and producer to slate information to output and MIX busses)
LL. MIX metering point
MM. SIP (solo in place) switch
NN. Master mute (mutes all input modules whose mute bus is engaged)
Figure 9-9. Monitor/master module, split configuration. Views (upper); signal flow (lower). (Courtesy Soundcraft)
LEVEL DIAGRAMS
Most console technical literature includes a signal level diagram. Essentially, this shows the normal operating level and noise floor as these values vary from input to output of the console. In the example shown, you can see that the noise floor is set at the input by the microphone preamplifier's noise floor. The microphone signal is amplified by nearly 70 dB at the earliest stages. It is reduced a nominal 10 dB by the input fader in order to give the engineer needed operating range. After that 10 dB has been restored in the following stage, the level varies only slightly from that point onward to the output of the console. Today, virtually all consoles follow the general plan shown here, but it is still possible to get into trouble: for example, if you find yourself operating the input fader very low because the input trim has been adjusted too high, you are likely to encounter overload at the microphone preamplifier stage. Thus the rule: try to keep the trims in the "comfort zone" between 10 o'clock and 2 o'clock.
Figure 9-10. Level diagram, split configuration console. (Data courtesy Soundcraft).
SETTING PROPER GAIN STRUCTURE
There are only three points in the input-to-output chain of the split configuration console where gain can be adjusted: microphone input trim, input fader, and group or bus output fader. We recommend the following procedure for establishing levels through the console:
1. Adjust each microphone input separately, with all others turned off.
2. Set the input fader to its nominal -10-dB position (this is sometimes labeled 0 dB on the fader scale).
3. Set the group output fader to its full-on position (normally labeled 0 dB).
4. With normal program input at the microphone, adjust the input trim control so that the level at the group output indicates normal modulation on the bus output meter. When you do this, you will find that the marker on the trim control will usually fall within the range from 10 o'clock to 2 o'clock.
5. If this is not the case, then check the feed from the studio. The musician may be playing very loudly, in which case you may need to switch in the microphone's output pad. Or, in extreme cases, you may want to use a less sensitive microphone.
6. If more microphone inputs are added to a given group output, you will have to reduce each one by trimming downward slightly. A good rule is to reduce all levels by 3 dB for each doubling of inputs fed to the same output group.
7. Once you have established a basic mix, feel free to make further gain modifications at the faders.
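Because everything in the level diagram is expressed in dB, you can check a gain structure with simple addition. The sketch below walks a microphone signal through the stages described above; the -66 dBu starting level is an assumed illustrative value, while the roughly 70 dB of early gain, the nominal -10 dB fader position, and the +27 dBu maximum output are those of the Figure 9-10 level diagram.

    # Trace an assumed signal through the console gain structure, in dBu.
    level = -66.0  # assumed microphone output level, dBu (illustrative)
    print(f"{'microphone output':<26}{level:+7.1f} dBu")

    stages = [
        ("preamp gain (trim)",    +70.0),  # "nearly 70 dB" of early gain
        ("input fader (nominal)", -10.0),  # nominal -10 dB operating position
        ("make-up stage",         +10.0),  # the 10 dB restored downstream
        ("group fader (full on)",   0.0),
    ]
    for name, gain_db in stages:
        level += gain_db
        print(f"{name:<26}{level:+7.1f} dBu")

    max_output = 27.0  # +27 dBu maximum output, per the level diagram
    print(f"headroom above operating level: {max_output - level:.0f} dB")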
THE IN-LINE CONFIGURATION CONSOLE
Today the in-line console is widely used in multitrack recording. In terms of layout, it does away with the output section of the console, integrating that function with the input in what is called the I/O (input/output) module. As a result, the overall console size can be reduced, but the console will be much more complex. The basic idea behind the in-line console is that, in the ultimate case, there will be one microphone or direct pickup for each track on the multitrack recorder. Therefore, why not provide a means of getting from the microphone to the recorder as simply and directly as possible, saving for postproduction all decisions regarding equalization, dynamics control, reverberation, and all other aspects of signal processing? During the tracking process, this is exactly what can be done. At the same time it is possible to monitor the recording with all the desired signal processing and make a rough two-track mix. Figure 9-11 shows a generic view and basic functions of an in-line console. Outside of the master section, the entire operating surface consists of I/O modules. The master section itself remains very much the same as for the split configuration design.
Figure 9-11. The in-line configuration console, a conceptual view.
A LOOK AT THE I/O MODULE
A simplified signal flow diagram for an I/O module is shown in Figure 9-12. For clarity we have omitted a number of functions, primarily the auxiliary send busses, since they are virtually the same here as in a split configuration console. There are two principal paths through the I/O module: the channel path and the monitor path, and they are clearly indicated in Figure 9-12. The module is described in detail below:
Figure 9-12. I/O module, basic flow diagram. (Data after Soundcraft)
Channel path
A. Microphone or line inputs from the studio
B. Transfer switch shown in normal position
C. Signal processing functions; can be switched (via "swap mode") between the channel path and the monitor path
D. Transfer switch shown in normal position
E. Fader for setting level
F. Transfer switch shown in normal position
G. Transfer switch shown in normal position
H. Output of channel path; normally sent to multichannel recorder
Monitor path
J. Return from multichannel recorder; input level adjusted as required
K. Transfer switch shown in normal position
L. Fader for setting level (Note: AUX sends would normally be located adjacent to this fader; omitted for clarity)
M. Panning of signal into stereo (bold lines in diagram indicate a stereo signal pair)
N. Routing of panned signal to either stereo (MIX) or to surround (group busses)
O. Output of monitor path; normally sent to stereo recorder and studio monitors
P. Alternate path to stereo mix (recorder and monitors)
Q. Output to surround (or other) 8-channel recorder
R. Return path from group to multitrack
S. Return path from multitrack tape send to monitor return

While the foregoing description may make the I/O module seem needlessly complex, remember that paths R and S and transfer switches (B, D, F, G, K, and N) are normally operated in the positions shown. These "extra" controls allow the engineer to reroute signals for a wide variety of applications, including "bouncing" tracks on the multichannel recorder or overdubbing vocalists or instrumentalists. (Bouncing refers to an operation in which two or more tracks are combined and reassigned to a new track.)
APPLICATIONS

The tracking session
The best way to gain an appreciation of the in-line console is to observe it in operation. Figure 9-13 shows a single I/O module as it would be set up for a tracking session. In a typical application, there would be as many I/O paths as there were music sources in the studio.

Figure 9-13. I/O module set up for tracking session.

There is only one adjustable element in the channel path, and that is the channel fader, which is used to set the proper level going to tape. As many tracks as you care to use at this point in your project are available within the limits of your multitrack capability. If for any reason you preferred to equalize a track (for instance, to remove a low frequency hum from a guitar amp) prior to going to the multitrack recorder, you could swap the assignable filter set to the channel path and correct the problem at this point. In the control room, you are primarily auditioning the outputs of all of the monitor channels. As you can see, the various signal processing modules have been switched to the monitor path, and you can make a "wet" monitor mix, complete with reverberation, equalization and limiting. None of this will be reflected in the multichannel feed, only in the stereo monitor mix in the control room. If you wish, you can record this mix for immediate playback and future reference. (Note: A "wet" mix is one that includes reverberation, and by extension various equalization adjustments, as opposed to a "dry" mix, which is composed of basic tracks.) In a normal production environment, your experimental stereo mix would be reviewed by artists and producer, and any decisions for overdubbing or adding new tracks would probably be made at this point. Using the switching capability of the I/O modules and the added tracks available on the multichannel recorder, these changes could be made, along with any new tracks the producer or artist may desire. When all of this has been done the project is ready for a final mix session.
The mixing session
The console setup is shown in Figure 9-14. The multichannel recorder outputs are all assigned to the monitor path, and all signal processing modules are likewise assigned to that path. In addition to gain control and panning, equalization and dynamics control can be carried out at this point with those modules assigned to the monitor path of each I/O module. Reverberation sends via the auxiliary send busses can be returned to open faders on the console and can likewise be assigned through the channel return path. With this preparation, a final two-channel mix can be made. Alternatively, a surround sound mix can be made via the group busses.
Figure 9-14. I/O module set up for mixing session.
In-line consoles vary slightly in nomenclature, layout, and specific functions, and each new design you encounter will take some getting used to. Figure 9-15 shows views of a typical I/O module. Here, the upper, middle, and bottom sections are shown side by side. Numerical markers in the figure refer to the legends given below:
1. LINE/MIC INPUT SWITCH
2. PHASE REVERSAL
3. INPUT SENSITIVITY
4. LF CUT
5. TAPE RETURN TRIM
6. LINE/MIC AND TAPE RETURN SWITCH BETWEEN CHAN AND MON PATHS
7. HF/LF EQ CONTROLS
8. SWAPS EQ TO MON PATH

Figure 9-15. Views of an I/O module. (Courtesy Soundcraft)
9-11. PARAMETRIC EQ (ALWAYS IN CHAN PATH)
12. PARAMETRIC EQ IN/OUT
13. AUX 1/2 SENDS (CHAN PATH ONLY)
14. AUX 1/2 SENDS (CHAN PATH ONLY)
15. AUX 3/4 SENDS (AVAILABLE IN MON OR CHAN PATH, BUT NOT IN BOTH AT ONCE)
16. AUX 3/4 SENDS TO MON PATH
17. USES AUX 3/4 KNOBS TO SEND SIGNALS TO AUX 5/6
18. AUX 7 (STEREO) IN CHAN PATH
19. AUX 8 (STEREO) IN MON PATH
20. MIX B (SOURCE FOR MON PATH: UP, FROM TAPE; DOWN, CHAN)
21. MIX B: ROTARY FADER FOR MON PATH
22. CUT: MUTES MON PATH
23. PAN CONTROL FOR MON PATH MIX OUTPUT
24. PFL (PRE-FADE LISTEN) FOR MON SIGNAL
25. MAIN FADER FOR CHAN PATH
26. CHAN SIGNAL MUTE
27. PAN CONTROL FOR CHAN PATH
28. OUTPUT ASSIGN MATRICES
29. SOLO FUNCTION FOR CHAN PATH
30-31. CHAN SIGNAL INDICATORS

Figure 9-16 shows a view of an in-line console with 32 I/O modules and a 24-input expander section. Note that this console also provides group feeds to both an 8-channel and a 2-channel recorder.
Figure 9-16. Photo of an in-line console. (Courtesy Soundcraft)
CONSOLE AUTOMATION
Today, console automation is digitally controlled, and it functions pretty much in a fool-proof manner. If the console is analog, then automation may be fairly limited in the scope of its operations; perhaps only the input faders and subgrouping of output faders will be controlled. For many applications this will be sufficient. In normal operations, console automation is used primarily in mixing sessions where fader positions may be stored. On replay, the faders move under servo control, and the previous mix session is duplicated with the engineer never touching a fader. In the UPDATE mode, the engineer can switch a fader to UPDATE and change its position for any length of time. This may be done to correct a mistake the first time around, or it might represent a change that artist or producer wishes to experiment with. After the change has been made, the UPDATE function is disengaged and the previously encoded fader position information is restored. By contrast, a digital console can be automated so thoroughly that virtually all functions can be stored, recalled, and updated. It is this flexibility during the mix session that may be one of the most compelling reasons to invest in a digital console. Details of automation of an analog console are shown in Figure 9-17. In this system only the fader positions are controlled. A view of the console fader is shown at A, and a simplified control diagram is shown at B. Note the three buttons at the lower right of the fader. When the OFF button is depressed, automation is disengaged for that fader. When the fader's WRITE button is engaged, the fader positional data will be encoded. On replay, the READ function is engaged, and the original fader positions will be "re-enacted" under the control of the automation system.
Figure 9-17. Console automation. View of fader (A); signal flow diagram (B).
In some touch-sensitive automation systems, the UPDATE function can be engaged merely by touching the fader. This will be sensed through a capacitance change at the fader, which puts the system into the UPDATE mode. When the engineer releases the fader, the UPDATE mode is also disengaged and the original encoded data takes over. Looking at Figure 9-17B, you will see that the fader position (6) is linked to the audio potentiometer (1) and to a DC voltage potentiometer (2). Both of these are controlled by a linear motor (5) whose drive is determined by the output of a comparator. The comparator's control signal attempts to reduce the error between the actual fader position and the digitally encoded position. Stated differently, the motor-driven fader position will track the intended fader position within the operating accuracy of the system. A summary of these basic functions is shown in Figure 9-18. In many systems, multiple passes can be individually encoded and stored so that playback comparisons can be easily made. A final composite of multiple mixes can then be made.
Figure 9-18. Console automation. Descriptions of write (A), read (B), and update modes (C).
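The comparator-and-motor arrangement of Figure 9-17B behaves, in essence, as a proportional servo: the motor drive is proportional to the error between the encoded position and the actual fader position. The following toy Python simulation illustrates the idea; the loop gain, time step, and positions are arbitrary illustrative values, not data from any real system.

    # Toy simulation of a motorized fader chasing an encoded position.
    target = 0.80         # encoded fader position from the CPU (fraction of travel)
    position = 0.20       # current physical fader position
    gain, dt = 8.0, 0.01  # proportional loop gain and time step (illustrative)

    for step in range(60):                 # simulate 0.6 second
        error = target - position          # comparator output
        position += gain * error * dt      # linear motor drives fader toward target

    print(f"fader position after 0.6 s: {position:.3f} (target {target})")
    # The error shrinks geometrically each step, just as the comparator's control
    # signal "attempts to reduce the error" in the text.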
Systems of the type discussed here rely on time code data stored on the multitrack master tape. The same time code data is stored in the automation system and can be written to a floppy disc file, which may then be stored with the master tape itself for later sessions.
METERING
Traditionally in the United States, the VU (volume unit) meter has been the most widely used device for measuring operating levels in audio transmission systems. The face of the meter is shown in Figure 9-19A. Two scales are shown: one in decibels and the other in percentage. The upper scale is normally found on consoles intended for recording, while the lower scale is often found on broadcast consoles, where percentage values are related to signal modulation capability. In Europe, various kinds of PPM (peak program meter) devices are widely used, and the model shown in Figure 9-19B is the IEC Type IIa, or so-called BBC, version. The various PPMs are electronic devices, while the VU meter is a passive device.
Figure 9-19. Metering. View of VU meter (A); view of peak program meter (B); meter ballistics (C).
The term meter ballistics refers to the dynamic response of the meter under program conditions, and there are two important aspects of this:
Meter rise time: the time required for the meter to reach 63 percent of its maximum deflection when a full-level steady-state signal is presented at zero (100 percent) level.
Meter fall-back time: the time required for the meter to fall back 63 percent toward its rest point after the signal is discontinued.
These values for both VU and peak reading meters are given below:

                Rise time    Fall-back time
    VU meter    0.3 sec      0.3 sec
    PPM meter   10 msec      4 sec
On dynamic program material the PPM will read somewhat higher than the VU device, since it responds more quickly. It also holds the peak value longer than the VU meter does, and this makes it easier to read. When both meters are calibrated so that "zero" on the VU corresponds to marker number 4 on the PPM for steady-state tones, normal speech program peaks on the PPM will tend to read an average of 6 dB higher than on the VU meter. This difference is commonly referred to as lead factor. The scale on the VU meter covers a range of about 23 dB, but the bottom portion of the scale is difficult to read and is not very useful. By comparison, the scale on the PPM has a range of 24 dB (4 dB between each numbered marker), and that scale is evenly distributed throughout the physical range of the meter. An engineer using the PPM can thus determine operating levels more accurately than can the user of a VU meter. In actual practice a good engineer may use either type of meter with equally good results.
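The 63-percent figures above correspond to one time constant of a first-order system, so the behavior of the two meters on a tone burst can be roughed out with simple exponentials. The sketch below is only an approximation; a real PPM uses separate peak-detecting circuitry rather than a single one-pole response.

    import math

    def meter_deflection(t_on, rise_tau, fall_tau, t=0.3, dt=0.001):
        """First-order meter deflection at time t for a full-level burst of t_on seconds."""
        d = 0.0
        for i in range(int(t / dt)):
            target = 1.0 if i * dt < t_on else 0.0
            tau = rise_tau if target > d else fall_tau
            d += (target - d) * (1.0 - math.exp(-dt / tau))
        return d

    # VU: 0.3 s rise and fall-back; PPM: 10 ms rise, 4 s fall-back.
    vu = meter_deflection(0.3, 0.3, 0.3)
    ppm = meter_deflection(0.3, 0.010, 4.0)
    print(f"after a 0.3 s burst: VU reads {vu:.2f}, PPM reads {ppm:.2f}")
    # The PPM reaches essentially full deflection while the VU reads only about
    # 0.63, which is why the PPM reads higher on short program peaks.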
Normal calibration values
The VU meter will read zero when a steady-state signal of 0.775 volts rms (0 dBu) is applied to it. In most applications, a 4-dB attenuator is placed ahead of the meter so that a level of +4 dBu will read zero. By convention, the same steady-state +4 dBu signal will be calibrated to read on the PPM devices as shown in Figure 9-20.

Figure 9-20. Metering. Comparison of types.

The meter types we have discussed so far are normally associated with consoles. Digital recording devices may also have built-in metering enabling the engineer to set levels properly. The design shown in Figure 9-21 is typical of stand-alone digital meters. Such meters can be configured to read the instantaneous value of a digital signal, and the exact level of a single digital sample can be observed. The normal steady-state calibration point for digital metering is -20 dBFS (20 dB below full-scale modulation). In summary, when the console outputs are calibrated at values of +4 dBu, the digital recorders will be set at -20 dBFS, and this condition requires that the console output sections have 20 dB of headroom over the +4 dBu reference point, or a +24 dBu output capability.
Figure 9-21. View of a two-channel digital meter. (Courtesy Dorrough Electronics)
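These calibration relationships reduce to a couple of simple conversions. A minimal Python sketch using the figures given above (0 dBu = 0.775 V rms by definition):

    def dbu_to_volts(dbu):
        """Convert a level in dBu to volts rms (0 dBu = 0.775 V rms)."""
        return 0.775 * 10 ** (dbu / 20.0)

    console_ref = 4.0    # console alignment level, +4 dBu
    digital_ref = -20.0  # corresponding digital meter reading, dBFS

    print(f"+4 dBu = {dbu_to_volts(console_ref):.2f} V rms")
    # If +4 dBu corresponds to -20 dBFS, digital full scale falls at:
    full_scale = console_ref - digital_ref
    print(f"0 dBFS = +{full_scale:.0f} dBu = {dbu_to_volts(full_scale):.1f} V rms")
    # Hence the console needs roughly a +24 dBu output capability.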
THE CONSOLE AND ITS INTERCONNECTION TO PERIPHERAL GEAR
The modern control room is very likely to contain some video equipment, and it is important to ensure that the audio signals to and from the console will be virtually immune to any electrical noises that video equipment can generate. Specifically, the sync signals used in video must not find their way into low-level audio lines.
Figure 9-22A shows the principle of a balanced audio line. An outer shield contains a pair of conductors, and the signal from each balanced output is fed to the balanced input of the next item in the audio chain.
Figure 9-22. Signal transmission. Balanced line (A); unbalanced line (B); noise induction, unbalanced line (C); noise induction, balanced line (D). In the unbalanced line, ground-loop current travels through the shield and returns through the inner conductor, causing noise; in the balanced line, it returns through both inner conductors and cancels at the following input.
Most home high fidelity gear is unbalanced, as shown at B, but every now and then a piece of home-type audio gear finds its way into the control room. Usually, a producer or artist will bring in a CD player or cassette recorder, and there may be noise or other interface problems. Any kind of annoying "hum" is usually the result of a ground loop. This occurs when a chain of electronic devices exists in the vicinity of some device whose power supply produces an external alternating magnetic field.
The field can induce minute currents into the various cabling between electronic elements, and induced noise may result from what is called a ground loop. A balanced system is far less susceptible to the effects of the ground loop than an unbalanced system. As shown at C, the unbalanced line, with its single inner conductor, may pick up the interfering signal. By comparison, the balanced line will pick up the signal equally in each inner conductor. Since the program signal is transmitted "push-pull" (positive in one line and negative in the other), the induced signals, which are "push-push," will be cancelled, as shown at D. The types of cables shown in Figure 9-23 are normally used in the studio. Microphone cables are shown at A, and the standard XLR pin configurations are indicated.
Figure panels: (A) microphone signal transmission, XLR-F to XLR-M, wired pin 1 to shield, pin 2 to signal +, pin 3 to signal -; (B) line-level signal transmission, same wiring with the shield lifted at the sending end; (C) 1/4-inch tip-ring-sleeve patch cord, wired tip to tip, ring to ring, sleeve to sleeve; (D, E) unbalanced transmission from consumer-type gear (RCA or 1/4-inch plug to XLR-M), wired tip to pin 2, sleeve to pin 3, pin 1 (shield) unconnected.
Figure 9-23. Control room interconnecting cables.
For line level signals originating at the console outputs, the same cable type can be used. But should there be any problems, the wiring configuration shown at B may be necessary. Here, the shield connection has been lifted at the sending end. This will not interrupt the signal, since it is balanced and is being fed through the inner conductor pair. It will, however, break the continuity path of any induced noises in the shield. The so-called ring-tip-sleeve patch cord (C) is also balanced and widely used in the control room for making various changes in console routing. The cables shown at D and E may be used to interface consumer gear to the console inputs at line level, but watch out for various hum problems. Today, many consoles are outfitted with Phoenix or Euroblock type receptacles for multichannel tape sends and returns. These connections are very positive, but they are small and difficult to work with. These installations should be made only by experienced technicians.
Chapter 10 MONITOR LOUDSPEAKERS
INTRODUCTION
What is a monitor loudspeaker? If you tour a modern recording facility you will see a wide variety of loudspeaker types used in the judging of recorded product. You will see popular high fidelity models along with relatively unknown smaller models. But the chances are very good that every loudspeaker you see will reflect conventional technology in cone and high frequency dome drivers and, in the larger systems, horn/compression driver units. You will also notice that the majority of these loudspeakers make use of ported low frequency sections. Very rarely will you find an electrostatic loudspeaker or any of the dynamic planar models. Engineers and producers use all of these products as they make judgements of timbre and balance at every stage of postproduction, and—given the wide variety of monitoring conditions—it is remarkable that the end product is as consistent as it is. A relatively small number of loudspeaker manufacturers have developed reputations in the field, and their products tend to dominate it. It is their attention to details such as consistency from unit to unit, overall quality control, and ruggedness that sets these manufacturers apart from the others. There are three distinct categories of loudspeakers used in recording operations today:
Large format, flush-mounted models in control rooms
These systems are very often custom designed for their applications and normally consist of a pair of 15-inch (380 mm) diameter low frequency (LF) drivers and one or more high frequency (HF) compression driver/horn elements. They are normally two-way or three-way in design. A typical installation is shown in Figure 10-1.
Mid-size bookshelf models
These loudspeakers are portable and may be moved from one environment to another within a given recording facility. They are usually three- or four-way in design, and the components are normally cone drivers for LF and mid-frequency (MF) drivers and dome drivers for HF. The LF driver diameter is generally 10 inches (250 mm) or 12 inches (300 mm). Many of the more recent
Figure 10-1. Typical view of custom monitor loudspeakers flush-mounted in a control room.
designs in this category are self-powered; that is, they have built-in amplifiers and electronic dividing networks. A typical non-powered model is shown in Figure 10-2.
Figure 10-2. A 3-way cone/dome monitor system. (Courtesy JBL Professional)
Small, so-called "near-field" designs
These are intended primarily for close-in program audition at normal home or automobile listening levels. The intention is to give the producer and mastering engineer an idea of how a given mix will sound when played back over a system representing the lowest cost category. The system shown in Figure 10-3 is typical.
Figure 10-3. A small monitor system. (Courtesy JBL Professional)
The models shown in Figures 10-1 through 10-3 are ported; that is, there is an opening in the enclosure that allows the LF driver to vent, or port, to the outside. Through careful porting and enclosure resonance (tuning) adjustment, the output capability of LF drivers can be increased in the range of the
tuning frequency and immediately above that frequency. This tuning frequency determines the lowest usable frequency range of the system, and it is customary to filter the drive signal below enclosure tuning in order to avoid over-excursion of the LF driver. Figures 10-4 through 10-6 show electrical details of the three loudspeaker types mentioned above.
Figure 10-4. Electrical diagram, control room monitor (HF horn and driver; two LF drivers in parallel; separate amplifiers).
Figure 10-5. Electrical flow diagram, 3-way system.
Figure 10-6. Electrical flow diagram, small monitor (highpass section feeding the HF driver, lowpass section feeding the LF driver).
RELEVANT SPECIFICATIONS FOR MONITOR LOUDSPEAKERS
We now present typical specifications for the three classes of monitor systems discussed above:

                              Large studio system      3-way bookshelf   Small monitor
Sensitivity (1 W @ 1 m):      LF: 96 dB; HF: 105 dB    92 dB             87 dB
Impedance (nominal):          LF: 4 ohms; HF: 8 ohms   4 ohms            8 ohms
Power rating:                 LF: 800 W; HF: 50 W      300 watts         50 watts
Peak output @ 1 m:            125 dB SPL               116 dB SPL        104 dB SPL
Nominal dispersion (H by V):  90° by 40°               90° by 40°        100° by 100°
For accurate monitoring we require loudspeakers that excel in the following performance categories.
FLAT FREQUENCY RESPONSE
There are two aspects here, axial response and angular coverage. Axial response is measured in a direct free field (one having no reflections), normally along the principal axis of the loudspeaker. It is shown as the output amplitude versus frequency measured at a fixed distance along the primary listening axis of the loudspeaker. The angular response shows the coverage of the system in both horizontal and vertical planes. Figure 10-7 shows the on-axis response of the three-way bookshelf model shown in Figure 10-2. Also shown in the figure are the individual contributions of the three elements. The on-axis response is uniform over a fairly large frequency range. This is essential for accurate monitoring, since primary judgements of sound quality are made on the basis of the first-arrival sound at the listener's ears.
Figure 10-7. On-axis response, with individual component contributions, for a 3-way monitor.
Figure 10-8 shows the beamwidth measurements of the three-way system. Beamwidth is defined as the included angle over which the loudspeaker's response does not vary by more than 6 dB. We refer to this as the -6-dB beamwidth response. We can see that the horizontal beamwidth is fairly uniform throughout the MF and HF range, with a general narrowing trend above about 5 kHz. At LF the horizontal coverage angle widens considerably. The vertical beamwidth for the system shows significant narrowing of the coverage angle at each of the crossover frequencies between adjacent components. This cannot be avoided in system design, but the deviations here are minor enough not to be problematic in most professional listening spaces.
However, if you are listening to the loudspeaker system in a fairly
Figure 10-8. -6-dB beam width and directivity index for a 3-way monitor.
reflective space, the reflected sound field you hear will not be as smooth as the first-arriving direct sound field. The bottom portion of Figure 10-8 shows the directivity index (DI) of the system. You can see that the two "dips" in the vertical coverage angle only cause a shift of about 2 or 3 dB in the DI, which is negligible. Overall, this system maintains a uniform DI (±1.5 dB) from 500 Hz to 10 kHz, which is considered excellent. You can maintain a very flat, uniform direct sound field over a reasonably wide seating area. If the listening environment has been designed according to modern trends in studio design, the reflected sound field variations will be minor enough that they can be ignored. Many engineers tend to get into trouble when they set up a studio in a living room environment where room reflections have not been controlled. They quickly find out that there is more to putting a studio together than merely assembling recording gear. These effects are shown in Figure 10-9.
Figure 10-9. Interaction of room and loudspeaker. A live room (A); a "dead" room (B); observed variations in LF response.
TIME DOMAIN ACCURACY
The primary study of the audibility of time domain errors in loudspeaker response was made by Blauert and Laws (1978). These studies developed the data in Figure 10-10, which show the audibility of group delay deviation in loudspeaker delay response. An ideal loudspeaker will have no group delay over its normal frequency passband. All loudspeakers however will have some degree of group delay deviation, and in modern designs the deviation is
Figure 10-10. Blauert & Laws criteria for audibility of group delay in loudspeaker systems.
relatively small. In the bookshelf models composed of cone and dome drivers, the group delay is below the detection threshold, and we can generally consider it to be a design goal already met. Actually, most problems involving delay factors in monitoring show up in stereo or multichannel listening, where distances from loudspeakers to listeners tend to vary and be unequal. We'll discuss this in a later chapter.
DISTORTION IN MONITOR LOUDSPEAKERS
At low operating levels a good monitor loudspeaker may have distortion components measured in the 1-percent range or lower. As drive power is increased the distortion will gradually rise. It is customary to show second and third harmonic components of distortion at a fixed output. Figure 10-11 shows typical measurements for a three-way cone/dome system. In this graph, the distortion components have been raised 20 dB for ease of reading. Over the range from about 100 Hz to 5 kHz the distortion is more than 40 dB below the fundamental and is thus in the 1-percent range. At about 8 kHz there is a peak in distortion that reaches about 3 percent.
Figure 10-11. Harmonic distortion in a 3-way bookshelf monitor (distortion raised 20 dB). (Courtesy JBL Professional)
As a practical observation, a level of 100 dB SPL at one meter corresponds to a listening level of 94 dB at a distance of 2 meters (about 6.5 feet), which is a normal listening distance. A pair of monitors operating in stereo would raise this level by about 3 dB, so you can appreciate the fact that distortion is quite low at relatively loud playback levels.
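Those figures follow from the inverse square law, which under free-field assumptions loses 6 dB per doubling of distance; the stereo-pair figure adds 3 dB for a second, uncorrelated source. The sketch below reproduces that arithmetic (free field only; a real room adds reverberant support).

    import math

    def spl_at_distance(spl_ref_db, ref_m, dist_m):
        # Inverse square law: -20 log10 of the distance ratio.
        return spl_ref_db - 20.0 * math.log10(dist_m / ref_m)

    one = spl_at_distance(100.0, 1.0, 2.0)   # about 94 dB SPL at 2 m
    pair = one + 10.0 * math.log10(2.0)      # two uncorrelated monitors: +3 dB
    print(f"one monitor: {one:.1f} dB SPL; stereo pair: {pair:.1f} dB SPL")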
POWER COMPRESSION
Power compression results from excessive heating of the drivers in a system and is noticed primarily in the LF portion of the system. When the driver temperature rises over a long period of heavy use, the increase in voice coil resistance becomes significant and actually changes the response of the loudspeaker. This can be clearly seen in Figure 10-12. The response of the LF driver at normal ambient temperature is shown in the upper graph and the response of the heated driver is shown below. Here, the temperature has risen from 80 to 300 degrees Fahrenheit. There has been a slight reduction in mid-band sensitivity, and the LF response has risen in relation to the mid-band. The effect, which is referred to as a low frequency alignment shift, is quite audible.
Figure 10-12. Alignment shift in LF driver. Response at 80 degrees Fahrenheit (upper curve); response at 300 degrees Fahrenheit (lower curve).
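The sensitivity loss from voice coil heating can be estimated from the temperature coefficient of the coil conductor. The sketch below assumes a copper coil (resistance tempco of roughly 0.0039 per degree Celsius) and a constant-voltage amplifier, and it ignores crossover and motor effects, so treat the result as a rough estimate only.

    import math

    ALPHA_CU = 0.0039  # approximate resistance tempco of copper, per deg C

    def f_to_c(deg_f):
        return (deg_f - 32.0) * 5.0 / 9.0

    def compression_db(t_cold_f, t_hot_f):
        # Hot-coil resistance rises roughly linearly with temperature; with
        # constant-voltage drive, delivered power falls by the same ratio.
        delta_c = f_to_c(t_hot_f) - f_to_c(t_cold_f)
        r_ratio = 1.0 + ALPHA_CU * delta_c
        return 10.0 * math.log10(r_ratio)

    print(f"{compression_db(80.0, 300.0):.1f} dB")  # roughly 1.7 dB of compression

For the 80-to-300-degree rise described above, this simple model predicts a mid-band sensitivity loss on the order of 1.5 to 2 dB, consistent with the "slight reduction" visible in the figure.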
ELECTRICAL CONSIDERATIONS
Impedance
You should always follow the manufacturer's recommendations in choosing a power amplifier for a given monitor model. The loudspeaker system itself has a stated power rating, and this is usually an excellent guide to amplifier selection. The loudspeaker also has a rated impedance value, which is an average of its load impedance magnitude over the operating frequency range. If a manufacturer specifies a given monitor as having an impedance of 8 ohms and a power rating of 250 watts, then you should choose a power amplifier that has an output capability of 250 watts into a load of 8 ohms. Be careful; you may find an amplifier that will deliver 250 watts into a 4-ohm load—but that will not be sufficient for your application. Typical impedance response for an 8-ohm system is shown in Figure 10-13. Note that the impedance varies with frequency and that the rated value of 8 ohms is an average value. Normally, the loudspeakers you will encounter in recording applications will have nominal impedance values of either 8 or 4 ohms.
Figure 10-13. Typical impedance magnitude for a 3-way bookshelf monitor.
Wiring and interconnections
Good engineering practice calls for placing the power amplifier as close to the loudspeakers as is practical in order to reduce power loss in the interconnecting cables. One of your goals is to keep the damping factor of the amplifier, as "seen" by the loudspeaker, as large as possible with the intended load. Damping factor is defined as:

Damping factor = R_L / R_S     (10.1)

where R_L is the nominal value of the loudspeaker load and R_S is the effective source, or generator, resistance of the amplifier. Any added resistance in the interconnecting wires adds to the generator resistance, thus lowering the damping factor.
See Figure 10-14 for a graphical definition of damping factor. Damping factors in the range of 200 to 250 are common. As a general rule, we recommend that you use wiring equivalent to AWG (American Wire Gauge) #12 or lower, keeping the runs between amplifier and loudspeaker as short as possible.
Figure 10-14. Damping factor.
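Equation 10.1 can be extended to include the cable, whose resistance appears twice (out and back) in series with the amplifier's source resistance. The values below are hypothetical choices for illustration: an 8-ohm monitor, an amplifier source resistance of 0.02 ohm, and roughly 5.2 milliohms per meter per conductor for AWG #12 copper.

    def damping_factor(r_load, r_amp, r_per_meter, run_meters):
        # Cable resistance counts twice: once out, once back.
        r_source = r_amp + 2.0 * r_per_meter * run_meters
        return r_load / r_source

    for run in (2.0, 5.0, 15.0):
        df = damping_factor(8.0, 0.02, 0.0052, run)
        print(f"{run:4.0f} m run -> damping factor {df:.0f}")

Even a 15-meter run drags the effective damping factor well below the 200-to-250 range quoted above, which is the reason for keeping the amplifier close to the loudspeaker.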
Multi-amplification
Many two-way studio monitors are biamplified; that is, the two sections of the system are separately powered. Some large systems may be triamplified. The benefits of this division of powering can be seen in Figure 10-15, where two amplifiers rated at 100 watts output may be equivalent to a single amplifier rated at 400 watts output. The primary advantage of multiamplification is the reduction of overall distortion at high signal output levels.
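A rough way to see the two-100-watt/one-400-watt equivalence of Figure 10-15: if the LF and HF waveform peaks happen to coincide, a single amplifier must swing the sum of the two band voltages. The sketch below makes that worst-case assumption and ignores crest factor and load differences between the bands.

    import math

    def equivalent_single_amp_watts(w_lf, w_hf, r_ohms=8.0):
        # Worst case: band voltage peaks add, so a single amplifier
        # must deliver (V_lf + V_hf)^2 / R into the same load.
        v_sum = math.sqrt(w_lf * r_ohms) + math.sqrt(w_hf * r_ohms)
        return v_sum * v_sum / r_ohms

    print(equivalent_single_amp_watts(100.0, 100.0))  # 400.0 watts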
SYSTEM EQUALIZATION
Most flush-mounted large control room monitor systems are electrically equalized to conform to a preferred equalization curve. This is usually done by inserting a set of one-third octave equalizers just before the electronic frequency dividing network that feeds the power amplifiers. In many cases a device known as a loudspeaker controller can be used to provide all necessary elements of loudspeaker frequency division, delay compensation, and equalization. The process of carrying out such equalization is fairly complicated and is best left to professionals who have a reputation for "fine tuning" monitor systems.
Figure 10-15. Loudspeaker driven by a single amplifier (A); loudspeaker biamplified (B).
Because of room boundary conditions in the vicinity of the loudspeakers, there may be variations in LF response that can be smoothed out by equalization (the less equalization, the better), and there is usually a slight HF rolloff that most engineers, artists and producers prefer at very high frequencies. Some of the acoustical "room curves" that may be used are shown in Figure 10-16.
Figure 10-16. Suggested room equalization curves for various listening applications (one-third octave center frequencies).
The "X-curve" is used in film and video postproduction and matches the playback curve in motion picture theaters, with a rolloff of 3 dB per octave above 2 kHz. It is a uniform performance standard worldwide. The middle curve has been suggested as a good compromise for home playback of both classical and popular program material. It is flat to 5 kHz and rolls off about 2 dB per octave above that point. The bottom curve is suggested for control room equalization; it is flat to 8 kHz, rolling off about 3 dB per octave above that point. These various suggested playback curves are not rigorously defined below about 50 Hz, although most users choose to maintain response curves flat to about 30 Hz.
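The three curves can be approximated as flat-to-a-corner targets with a constant dB-per-octave rolloff above the corner. The helper below is an idealization of the published curves for illustration only; it ignores the region below 50 Hz, which, as noted above, is not rigorously defined.

    import math

    def target_db(f_hz, corner_hz, slope_db_per_octave):
        # Flat below the corner; straight-line rolloff above it.
        if f_hz <= corner_hz:
            return 0.0
        return -slope_db_per_octave * math.log2(f_hz / corner_hz)

    print(f"X-curve at 8 kHz:       {target_db(8000, 2000, 3.0):5.1f} dB")   # -6.0
    print(f"home curve at 10 kHz:   {target_db(10000, 5000, 2.0):5.1f} dB")  # -2.0
    print(f"control room at 16 kHz: {target_db(16000, 8000, 3.0):5.1f} dB")  # -3.0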
SUBWOOFERS
Many control rooms are provided with dedicated subwoofer systems. These are LF systems designed to provide response down to the 20 to 25 Hz range. As such, they require space that is not readily available in the wall or soffit locations used for the principal monitors. They may be conveniently located at floor level in a corner. There will usually be a pair of such systems, and the crossover point with the main monitors is normally in the range of 50 to 60 Hz. Multiple 18-inch (460 mm) diameter drivers are often used.
COAXIAL MONITOR LOUDSPEAKERS
Many engineers express a strong preference for coaxial monitor loudspeakers. The most highly regarded of these models are more than 50 years old in basic concept. As a group, they have the advantage of producing sound from a virtual point source. The British Tannoy Dual-Concentric model is shown here in section view.
Figure 10-17. Section view of Tannoy Dual Concentric loudspeaker. (Courtesy Tannoy)
Chapter 11 ANALOG MAGNETIC RECORDING AND TIME CODE
INTRODUCTION
Magnetic recording has shaped the evolution of contemporary recording in such a fundamental way that it is difficult to think of music creation without the benefits of multitrack capability, editing, and overdubbing. The principle of magnetic recording has been known since the late 19th century, but it wasn't until the late 1920s that the benefits of high frequency (HF) biasing were discovered. Further developments were made in the 1930s, and after World War II refinements came quickly. By the 1950s tape had supplanted the 16-inch acetate disc as the primary storage medium for both the broadcasting and recording industries. As recorders and tape quality continued to improve, multitrack capability grew quickly to 16 and 24 tracks during the 1970s, profoundly shaping a new generation of pop and rock music. Even in the present age of advanced digital recording technology, analog tape recording has continued to reach new levels of performance, and there are still many engineers, young and old, who prefer to lay down their basic tracks in the analog format.
THE BASICS OF MAGNETIC RECORDING
The basic elements of a tape recorder are shown in Figure 11-1. As shown, the system is in playback mode, and the erase and record functions are disabled. When the machine is put into record mode, both erase and record switches are engaged. The erase head then removes any remnant signal on the tape, and a new signal is laid down by the record head. The AC bias oscillator feeding the erase and record heads produces a signal in the range of 100 kHz or higher. A relatively small amount of bias current flows through the record head along with the audio signal to be recorded. Figure 11-2 shows the combination of bias and audio signal which is applied to the record head. Note that the bias component is much greater than the audio component.
Figure 11-1. The basic elements of a tape recorder. Machine is shown in playback mode.
Figure 11-2. The input audio signal plus HF record bias.
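The signal of Figure 11-2 is simply the linear sum of the audio and a much larger HF bias component. The sketch below generates such a composite; the 4:1 amplitude ratio and 100 kHz bias frequency are illustrative choices, not a recording standard.

    import numpy as np

    fs = 1_000_000                         # 1 MHz sample rate for the sketch
    t = np.arange(0, 0.002, 1.0 / fs)      # 2 ms of signal
    audio = np.sin(2 * np.pi * 1000 * t)            # 1 kHz program signal
    bias = 4.0 * np.sin(2 * np.pi * 100_000 * t)    # HF bias, much larger
    record_current = audio + bias          # composite fed to the record head
    print(f"peak composite amplitude: {record_current.max():.2f}")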
The rapidly changing bias signal in the erase head demagnetizes the tape before it reaches the record head. In the record head the effect of the bias is to linearize the recording process as shown in Figure 11-3. If there were no bias present (indicated by the "zero" curve in the figure), then the process would be nonlinear; only a distorted signal would be recovered from the tape. When the AC bias is increased (in this example, to a value of 350 oersteds) an input signal will be accurately recorded over the linear range shown. The linearizing function of the AC bias signal depends on its symmetry and its magnitude; if there is too little bias, the signal will be distorted—if bias level is too high, then HF audio signals may be partially erased. If the bias signal is not symmetrical, it will introduce a DC component, which increases the noise level of the recorded signal.
Figure 11-3. Transfer characteristic of tape with zero HF bias and with optimum HF bias.
HEAD CONSTRUCTION
Figure 11-4A shows a perspective view of a record head. The structure of erase, record and playback heads is essentially the same; they differ mainly in their gap dimensions, as shown below:
Figure 11-4. Head-to-tape relationships. Perspective view of a 1/4-inch record head (A); azimuth error (B); wavelengths along the tape (C); output from playback head showing response nulls at multiples of gap length equal to recorded wavelength (D). (A courtesy of Ampex Corporation)
                 Gap length (μm):
Erase head       25-125
Record head      2.4-12
Playback head    1.5-6
The heads are constructed of thin laminations of magnetically soft material, and the purpose of the laminations is to minimize eddy current (magnetic induction) losses in the material. Gap length (measured in the direction of tape motion) and its uniformity are essential to the proper performance of the heads. The record head's gap is roughly twice as large as that of the playback head, and the recording action takes place at the trailing edge of the gap. In the playback head, the entire gap length determines the reproduced signal level, and the gap must be small in order to "read" the HF signals on the tape. Both record and playback gaps must be absolutely perpendicular to tape motion. Any error here, in either record or playback, is known as azimuth error. As shown in Figure 11-4B, if the skew of the gap is equal to one-half a recorded wavelength, then the signal will be canceled. For wide recorded tracks the requirement is for very stringent azimuth alignment, while for narrower tracks the requirement is less critical. Aside from azimuth errors, the playback head gap length determines the HF response limits of the playback function. As shown in Figure 11-4C, if the recorded wavelength is equal to the playback head gap length, then the output will be zero. In recording system design, the frequency of the first cancellation is well beyond the 30 kHz range and is of little concern when recording at 15 ips. As shown in Figure 11-4D, it is important to design playback head gap lengths so that system response limits and tape recording speeds are all compatible.
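The first playback null occurs where the recorded wavelength (tape speed divided by frequency) equals the gap length, so the null frequency is simply speed over gap. A quick check with a 6-micrometer playback gap, the upper end of the range tabulated above:

    def gap_null_hz(speed_ips, gap_um):
        # Recorded wavelength equals the gap length at the first null.
        speed_um_per_s = speed_ips * 25400.0   # 1 inch = 25,400 micrometers
        return speed_um_per_s / gap_um

    print(f"{gap_null_hz(15.0, 6.0):,.0f} Hz")  # about 63,500 Hz at 15 ips

This confirms that the first cancellation sits far above the audio band at 15 ips.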
THE TAPE MEDIUM
Manufacturing
In the late 1930s a uniform formulation of plastic and gamma ferric oxide (Fe2O3) had been developed in which the oxide was interspersed throughout the plastic. This was replaced by a dense coating of ferric oxide on a backing of plastic in the 1940s, and typical dimensions for modern professional tape are shown in Figure 11-5. In manufacturing, a wide web of plastic base material (normally polyester) is fed into a mill in which a slurry of oxide, plasticizer, wetting agents, and lubricant is evenly deposited onto the web. The slurry is ideally a uniform mixture in which the size of the oxide particles has been carefully controlled. The oxide particles are acicular, or needle-shaped,
Figure 11-5. Typical magnetic tape composition: oxide layer (17 μm), base layer (33 μm), back coating (2 μm).
and the aim in milling is to ensure that the particles fall into a relatively narrow size range. If there are too many smaller particles the tape will be subject to print-through (transfer of signal between tape layers), which may cause slight pre- and post-echo of the signal when the tape is played. Directly following the oxide deposition stage, the coated web is exposed to a strong magnetic field oriented in the direction of motion so that the wet oxide particles will orient themselves along the direction of tape travel. When the particles are oriented longitudinally they can be more easily magnetized, and the tape will have higher recording sensitivity. In the final steps of manufacture, the tape is transported through a drying oven and then cured for a period of time. The final processes are surface polishing and slitting of the web to the desired tape width. Early problems in uniformity of slitting and coating irregularity caused "drop-outs." These problems have largely been solved. Tapes dating from the 1950s and 1960s were usually made with acetate base material, which was unstable and absorbed moisture over time. The general conversion to polyester base material has resulted in a much more stable product. Even then, there are rare batches of manufactured tape stock that may have one problem or another. Over the years, tape has improved in both HF sensitivity at 20 kHz and dynamic range (A-weighted). The following table outlines these improvements, taking 3M Scotch 111 tape as the reference formulation:

Formulation:   HF sensitivity:   Dynamic range:   Comment:
Scotch 111     0 dB              68 dB            Reference formulation
Scotch 202     +1 dB             72 dB            "Low-noise"
Scotch 206     +3.5 dB           75 dB            "Low-noise"
Scotch 250     +4 dB             79 dB            "Low-noise/high output"
These improvements have resulted from better process control:
1. Low noise and reduced print-through (results from better control of oxide particle size distribution)
2. Improved HF sensitivity (results from finer surface polishing and denser packing in the oxide layer)
3. Higher output (results from magnetic orientation of oxide particles and higher density in the oxide layer)
4. Improved high-speed mechanical handling (results from matte back coating)
Sensitivity to bias setting
An essential part of tape recorder alignment is setting optimum bias levels. Using modern tape formulations, bias is adjusted (for a tape speed of 15 ips) by maximizing the output at 10 kHz, and then increasing the bias level so that the output drops by 3 dB. This degree of overbiasing at 10 kHz results in a minimum in third harmonic distortion, as shown in Figure 11-6. Tape manufacturers provide detailed alignment recommendations for their products, and these should be carefully followed.
Figure 11-6. Biasing high quality tape for minimum distortion. (Data courtesy 3M)
Care and storage of tape
The following recommendations are presented for long-term care of tape:
1. Tapes should be stored tail-out, at proper winding tension.
2. Ensure that the work area is clean and free of small particles of debris.
3. Tape reels should be stored vertically at 40-90 degrees Fahrenheit and relative humidity of 20-60 percent.
4. All valuable archival tapes should be stored on flanged reels, not hubs.
5. Archival tapes should be inspected periodically and rewound to alleviate any tendency for the tape to adhere to itself. Periodic inspection (every two years or so) may also identify any deterioration that might call for immediate retransfer of the recorded program material.
Tapes from the 1970s and 1980s have shown a tendency to develop mechanical "squeals" on playback due to delamination and pile-up of oxide on the heads. A common cure for this is to "bake" the tape for several hours at about 130 degrees Fahrenheit in a convection oven, and then allow the tape to return gradually to room temperature. The treatment is said to be good for a couple of weeks, during which time an archival retransfer should be made.
THE TAPE TRANSPORT
Figure 11-7 shows the layout of a typical tape transport. In normal tape motion, back torque is applied to the feed reel at the left, and forward torque is applied to the take-up reel at the right. Following the path from left to right, there is an inertia idler whose purpose is to stabilize the tape and reduce any irregularities in its motion. Following this is a tension sensor which maintains uniform tape tension from the beginning of a tape reel to the end, thus ensuring uniform tape speed throughout the reel. As the tape passes through the head stack, it encounters a set of idler/guides which maintain proper alignment with the heads and damp out any slight motion irregularities. Then follows the capstan/pressure roller, which actually drives the tape through the mechanical path. On the take-up side, there is a tension arm whose primary purpose is to detect tape breakage. Should this happen, the transport will shut down in order to avoid tape spillage. The take-up torque is applied to the degree necessary to maintain a smooth wind on the take-up reel.
Figure 11-7. Functional view of a typical tape transport.
For fast rewind or take-up, the capstan is released from the pressure roller and a set of tape lifters moves the tape away from the heads. The transition from one mechanical mode to another ensures gentle handling of tape. For instance, if you want to go directly from fast-forward mode to record mode, the tape motion will slow down, coming to a stop when the tape has reached a suitably slow speed, and then go directly into record mode. The above detailed requirements may seem simple, but only a fairly complex transport can provide them all. Figure 11-8 shows a photo of a modern 2-inch 24-track multichannel recorder, and Figure 11-9 shows a photo of a modern 2-track recorder.
Mechanical problems
Modern tape transports are virtually problem-free, but that wasn't always the case. The major problems in early transports were:
Flutter. Flutter, as the word implies, is caused by a fairly rapid change in tape speed in the range of 10 to 30 Hz. It is usually caused by an eccentricity or once-around dragging in a rotating idler or drive capstan. If there is insufficient rotational inertia in the feed idler, then tape motion irregularities caused by the feed reel may not be adequately damped.
Wow. Again, as the name implies, wow is a very slow variation in pitch with a period of about one second or longer. It can occur when there is any binding of tape against the feed reel flange.
Figure 11-8. Photograph of a modern multitrack recorder. (Photo courtesy Studer)
Scrape flutter. This problem can happen even on the finest transports and results when there is a very slight "violin bow" action as highly polished tape moves against a highly polished head surface. It is very high in frequency and cannot be heard as such. It often shows up as a kind of modulation noise when pure tones are recorded and played back. It can be minimized by placing small idlers with precision bearings between the heads.
Figure 11-9. Photograph of a modern two-track recorder. (Photo courtesy Studer)
PUTTING THE SIGNAL ON TAPE
The magnetic playback process depends on magnetic induction of the recorded signal into the playback head. The induced output voltage from the playback head is sensitive to the rate of change of the magnetic flux on the tape, and this causes a uniformly recorded signal to increase 6 dB per octave with rising frequency. In order to attain a flat signal output, the playback signal must be rolled off 6 dB per octave over most of its frequency range. This trend can be seen in the playback voltage equalization curves shown in Figure 11-10. A detailed discussion of these processes is presented in Sidebar 11.1.
Sidebar 11.1: EQ curves and fluxivity standards
Figure 11-10 shows several standard playback curves in professional use. The IEC curves for 7.5 and 15 ips and the AES curve for 30 ips are shown at A. As the tape speed increases from 7.5 to 30 ips there is less need for the HF response to be raised. This is due to the greater HF sensitivity of tape at higher tape speeds. In the United States, the older NAB (National Association of Broadcasters) curve (shown at B) is used for both 7.5 and 15 ips, while the AES curve for 30 ips is used internationally. Engineers often identify the inflection points in playback curves by stating the time constant of the transition frequency:

T = 1/(2πf)     (11.1)
The terminology has to do with circuit design and the selection of resistance and capacitance values. The following table details the various inflection points used in the curves of Figure 11-10:
Figure 11-10. Playback curves for professional tape formats. At A, IEC 7.5 ips (1), IEC 15 ips (2), AES 30 ips (3); at B, NAB 7.5 and 15 ips.
Standard:        LF time constant (frequency):   HF time constant (frequency):
NAB              3180 μsec (50 Hz)               50 μsec (3180 Hz)
IEC (7.5 ips)    No transition                   70 μsec (2275 Hz)
IEC (15 ips)     No transition                   35 μsec (4550 Hz)
AES (30 ips)     No transition                   17.5 μsec (9100 Hz)
Tape operating levels are defined by their surface fluxivity value in nanowebers per meter (nWb/m). The original Ampex standard level was measured to be 185 nWb/m and is taken as the base reference in the following table:

Fluxivity (nWb/m):   Description:                      Level:    Measurement frequency:
185                  Old Ampex reference level         0 dB      700 Hz
200                  "Rationalized reference level"    +0.7 dB   1000 Hz
250                  New US elevated level             +2.6 dB   1000 Hz
360                  Old European (DIN) level          +5.8 dB   1000 Hz
510                  New European (DIN) level          +8.8 dB   1000 Hz
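The Level column is just the fluxivity ratio expressed in decibels against the 185 nWb/m reference:

    import math

    def level_db(flux_nwb_per_m, ref_nwb_per_m=185.0):
        # 20 log10 of the flux ratio, relative to the old Ampex reference.
        return 20.0 * math.log10(flux_nwb_per_m / ref_nwb_per_m)

    for flux in (185, 200, 250, 360, 510):
        print(f"{flux:3d} nWb/m -> {level_db(flux):+5.1f} dB")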
You can see that the European reference levels, old and new, are about 6 dB higher than the U.S. standard. This is purely the result of metering practice. While U.S. engineers used the sluggish VU meter for level indication, the Europeans used peak-reading meters. In the discussion of metering in Chapter 9, you will recall that peak meters have an approximate lead factor of 6 dB relative to the VU meter. When European recording engineers align a recorder, the tape reference level is about 6 dB greater than in typical American alignment. As European engineers monitor recording level they will, because of the meter ballistics, tend to keep the indicated level approximately 6 dB below their American counterparts. But since they have calibrated the system 6 dB higher, the net recording level on the tape will be just about the same on both sides of the Atlantic.
Alignment tapes: Standard alignment tapes are routinely used throughout the industry to adjust tape recorders. These tapes are carefully made and are expensive. Never make a copy of an alignment tape to use for routine adjustment; use the original—but take very good care of it. A more detailed description of alignment tapes and their use is given in Sidebar 11.2.
THE TAPE RECORDER IN APPLICATION
Overdubbing and Sel-Sync
One of the most useful functions of a multitrack recorder is laying in another signal on a spare track in synchronism (sync) with previously recorded tracks.
Ampex introduced the process in their three- and four-track machines during the 1950s, and the process, along with a number of enhancements, is very much in use today. The basic procedure is shown in Figure 11-11. In this case, tracks 1, 2 and 3 have already been recorded with instrumental tracks, and you want to add, or overdub, a vocal on track 4. To do this, the vocalist will need to hear the instrumental tracks over headphones in sync with the new track. Using the record heads for tracks 1, 2 and 3 as temporary playback heads, a monitor signal is fed to the vocalist.
Figure 11-11. Diagram showing operation of Sel-Sync.
Only track 4 is put into record mode, and a new vocal signal is added at the point indicated in the figure. Since the new track is added at the precise point of the playback of tracks 1 through 3, absolute synchronism is ensured. The ability to add tracks in this manner has both advantages and disadvantages. A singer who is ill at the time of a recording session can add tracks later, and this is to the benefit of the recording company and everyone else. On the other hand, a vocalist who isn't really prepared for a grueling studio session can fall back on the technique at a later date as a hedge against calling overtime for a studio full of musicians. You may have a question regarding the use of record heads as playback heads. They are not ideal, inasmuch as their gaps are too large for flat HF
response; but they are good enough to give the vocalist a clear signal picture of what is on tracks 1 through 3.
Punching in
"Punching in" is a version of overdubbing in which only a small section of a vocal track, perhaps only a word or two, needs to be changed. It is done by actuating the erase and record heads just for a short time. To do this smoothly the recorder must be capable of going in and out of record mode without the usual "thump" caused by the sudden turn-on and turn-off of HF bias. There is the further problem of avoiding erasing any part of the previously recorded vocal track that you want to keep. As shown in Figure 11-12, it is important that the erase and record heads, because of their spacing, are actuated separately. This ensures that only the necessary portion of tape will be re-recorded. The ramp-up and ramp-down of the bias signal are required to eliminate any LF thump in the process.
ΔT = (distance between erase and record heads, inches) / (tape speed, ips)
Figure 11-12. Gradual turn-on and turn-off of erase and record head bias.
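The relation in Figure 11-12 is just distance over speed: the record-head action must be delayed by the time a point on the tape takes to travel from the erase gap to the record gap. The 1.5-inch head spacing below is a hypothetical value for illustration.

    def head_delay_ms(spacing_inches, speed_ips):
        # Time for a tape point to travel from erase head to record head.
        return 1000.0 * spacing_inches / speed_ips

    print(f"{head_delay_ms(1.5, 15.0):.0f} ms")   # 100 ms at 15 ips
    print(f"{head_delay_ms(1.5, 30.0):.0f} ms")   #  50 ms at 30 ips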
Automatic indexing and rehearsal mode
In the process of making an insert into a previously recorded track, it is necessary for the performer to "rehearse" with the tracks already recorded on the master tape. Shuttling the tape back and forth between starting and stopping points is facilitated by automatic indexing. In this procedure, the engineer enters beginning and ending times on the tape counter. When the recorder reaches the end timing, it will recycle to the beginning and stand by for the next pass. The function is controlled by a SMPTE or MIDI time code track that has been recorded on the master tape.
In the rehearsal mode the recorder will not go into record mode when the engineer punches in; instead, the monitor selector on the track to be modified will switch to input so that the producer and engineer can hear the new material to be added. When both are confident that the procedure has been rehearsed well enough to work, the engineer will go out of rehearse mode. When this is done, punching in will activate the record mode until the engineer punches out of that mode. While doing this, the engineer will be working at both the console and the recorder's remote control unit. The remote control unit provides a duplicate set of recorder mechanical, indexing and monitor functions which can be more conveniently positioned near the console. The remote control unit can be clearly seen in Figure 11-8.
OTHER MECHANICAL FEATURES
Some tape recorders have features that are useful in editing:
Shuttle function
The shuttle function allows the engineer to shuttle the tape back and forth at a chosen speed so that a given program segment can be found quickly.
Jog mode
Jog mode allows tape motion to follow exactly the position of the shuttle knob. The engineer can thus use it for identifying the exact place on the tape where a splice is to be made.
Edit-play function
In many editing jobs, it is desirable to remove long segments of unwanted tape. The easiest way to do this is simply to play the segments through the machine, allowing the tape to fall onto the floor or into a waste can. In this mode, the take-up reel torque is disabled.
NOISE REDUCTION
As multichannel recording became the standard during the 1960s, many engineers and producers were concerned about the build-up of cumulative tape noise as tracks were "bounced" and combined in the process of carrying out a complex production. The final mastering to a two-track stereo format always represented another generation of tape transfer. While there had been several attempts over the years to use complementary compression and expansion in an effort to "fit" a wide-range signal into a smaller-range envelope, most of these systems were quite audible in their action. The cure was often worse than the problem. In the mid-1960s the Dolby Type-A noise reduction system was successfully introduced to the recording industry, and it became a worldwide standard for master tape copy exchange between licensors and licensees as well as a studio standard for multitrack recording and stereo mixdown. (During the 1970s, the dbx noise reduction system was widely used; however, it gradually fell out of favor from the mid-1980s onward. Telcom-4 was another successful system popular during the 1970s that is rarely seen today.) Figure 11-13 shows the layout of a compression/expansion noise reduction system. The basic signal flow diagram is shown at A. As the input signal drops in level it is boosted (compressed in dynamic range) before going to tape and thus is recorded well above the inherent tape noise level, as shown at B. During playback, the signal is expanded downward according to a complementary curve, again as shown at B. The final input/output process is shown at C, where the output expander restores the original signal, but with a varying noise floor below it. The problem traditionally has been the audibility of the noise floor shifts. The Dolby system operates on the principle of "least action": for normal full-level recording the system has no effect on the recording process. Only when signals are of low level (40 dB or more below reference) is the compression/expansion process brought into action. In addition, the process splits the audio range into four bands which are acted upon individually, further masking the audibility of the code/decode process. Overall, the Dolby-A process increases the dynamic range by 10 dB at LF and MF and by 15 dB at HF.
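The complementary principle can be sketched as a static level-domain compander: the encoder boosts signals below a threshold by up to a fixed amount, and the decoder numerically inverts the same curve. This is a single-band, static caricature for illustration only; the actual Dolby-A circuit is a dynamic four-band process. The -40 dB threshold and 10 dB maximum boost are taken from the text, while the 2:1 slope of the transition region is an arbitrary choice.

    def encode_db(level_db, threshold=-40.0, max_boost=10.0):
        # Static low-level boost: none above threshold, up to max_boost below.
        boost = min(max_boost, max(0.0, 0.5 * (threshold - level_db)))
        return level_db + boost

    def decode_db(encoded_db, threshold=-40.0, max_boost=10.0):
        # The encode curve is monotonic, so invert it by bisection.
        lo, hi = encoded_db - max_boost, encoded_db
        for _ in range(50):
            mid = 0.5 * (lo + hi)
            if encode_db(mid, threshold, max_boost) < encoded_db:
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    for x in (0.0, -40.0, -50.0, -70.0):
        y = encode_db(x)
        print(f"in {x:6.1f} dB -> tape {y:6.1f} dB -> out {decode_db(y):6.1f} dB")

Full-level signals pass through untouched, while a -70 dB signal rides 10 dB above the tape noise and is restored on decode, which is the "least action" behavior described above.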
Figure 11-13. Basic principles of complementary compression and expansion: compressor (10 dB), tape recorder (60 dB dynamic range), expander (10 dB).
Figure 11-14 shows details of the Dolby-A system. The band splitting is shown at A. The recording compression and playback expansion actions are shown at B. Details of the band-splitting differential networks are shown at C, and the -40-dB compression curve is shown at D. The calibration tone at the head of each Dolby-encoded tape is used to match the playback function with levels established during recording.
Dolby SR
During the 1980s Dolby Laboratories introduced Spectral Recording (SR), a very elaborate record/playback process in which variable, rather than fixed, filter bands are used. The result is a greater overall improvement in dynamic range with virtually no audibility in the entire encode/decode process.
Figure 11-14. Details of Dolby Type-A noise reduction (band splitting at 80 Hz, 3 kHz, and 9 kHz).
How good is analog tape recording?
Adding all of the significant improvements in analog tape recording over the last half century, we have a remarkable medium for the creation of product in the studio. Figure 11-15 shows the signal space that can be accommodated by the best of contemporary tape stock and the best recorders available operating at 30 ips. The crosshatched area indicates the range between the level at which the system reaches 3 percent harmonic distortion and the noise floor measured on 1/3-octave centers. The midband dynamic range is about 90 dB.
Figure 11-15. Signal space for a tape recorder operating at 30 ips.
Using Dolby SR, as shown in Figure 11-16, the overall dynamic range is increased by about 20 dB, accounting for an operating dynamic range at MF of about 110 dB.
PLAYBACK TRACK WIDTH STANDARDS
Figure 11-17 shows profiles of all the professional playback track widths that you are likely to encounter. A and B show 1/4-inch mono and stereo formats. Today you will rarely, outside of broadcasting, encounter a mono recording. C, D, and E are 1/2-inch formats that were widely used during the 1960s for multichannel recording. G shows the 8-track format introduced in the mid-1960s. F shows a fairly recent addition to the list: two tracks on 1-inch tape. It is used in high-performance analog mixdown to the finished stereo product and is favored by those engineers who want to maintain a pure analog path from studio to finished master. H and I show the standard track widths for both 16- and 24-track recording on 2-inch tape. J shows a new format for high-performance analog mixdown to a surround master format. A time code track is provided at the bottom of the head stack for any sync purposes.
Figure 11-16. Signal space for a tape recorder operating with Dolby Spectral Recording (SR).

Sidebar 11.2. Tape recorder alignment procedures
Tape recorders differ somewhat in their specific alignment requirements because of varying details of mechanical design. However, the following sequence of procedures is fundamental to all machines, and every recording engineer should know how and why they are performed. The following procedure assumes a typical two-channel stereo recorder.
A. Mechanical alignment and check-out:
1. Before subjecting an expensive alignment tape to an unknown transport, new or old, the engineer must first make sure that the transport is in proper mechanical order. Holdback tension of the supply and take-up reels should be within the range specified by the manufacturer. These should be measured in the normal play mode as well as in the fast forward and rewind modes.
2. The capstan pressure roller thrust (in the play mode) should be within the range specified by the manufacturer. For these measurements, a simple 5-pound spring scale available at a hardware store will be useful.
Figure 11-17. Playback head track dimensions and spacing for professional formats: 1/4-inch half track (.075 tracks on .169 centers); 1/2-inch two track (.200 tracks on .260 centers); 1/2-inch three track (.100 tracks on .185 centers); 1/2-inch four track (.070 tracks on .130 centers); 1-inch two track (.470 tracks on .520 centers); 1-inch eight track (.070 tracks on .130 centers); 2-inch eight track (.180 tracks on .188 centers); 2-inch sixteen track (.070 tracks on .127 centers); 2-inch twenty-four track (.043 tracks on .084 centers).
3. Using a roll of expendable tape, the engineer should make sure that the machine operates properly in all of its transport modes and that the tape travels over the head stack with a minimum of up-and-down motion or skew.
4. The transport should be thoroughly cleaned with an appropriate solvent recommended by the manufacturer. Most common here is isopropyl alcohol. Heads should be cleaned only with a cleaning solution approved by the manufacturer.
B. Playback function check-out:
1. All tape guides, idlers, and heads in the tape path must first be thoroughly demagnetized (degaussed), using a head demagnetizer. Remnant magnetism of heads or guides may increase the noise level and result in some degree of HF signal erasure from any tape played on the machine. Be sure that the machine is turned off before demagnetizing the heads. Proper use of a head demagnetizer is best learned from an experienced engineer. The typical demagnetizer has a fairly small gap, located between two metal probes. Ensure that the gap region has been covered with a small piece of electrical tape, so that there is no chance of scratching the laminations in the heads. Demagnetization of the heads requires that the gap of the demagnetizer be placed directly at the head surface, straddling the head gap. The demagnetizer is then moved up and down several times, and then pulled away from the head with a smooth motion.
2. Azimuth is perhaps the most critical head adjustment in terms of audio performance. The alignment tape has a segment intended for azimuth adjustment. Usually, this is a long segment of a HF tone (normally 15 kHz). Adjust the playback head azimuth until the output is maximum. Be careful: if you are far off azimuth, you may observe one or more smaller peaks in response. Make sure that you are reading the highest value, which is the principal peak.
3. Playback equalization is then adjusted with the sequence of tones provided on the tape for that purpose. Typically, the tones might be at the following frequencies: 31.5, 40, 63, 125, 250, 500, 1000, 2000, 4000, 6300, 8000, 10,000, 12,500, 14,000, 16,000, 18,000, and 20,000 Hz. These tones may vary among manufacturers, but what is obvious is that they are closely spaced at the frequency extremes, while being at octave intervals in the mid-band. Since the critical frequency adjustments on the playback of tape recorders are above 5 kHz and below 100 Hz, the extra "density" of frequencies is useful in fine-tuning the playback response. Using the various controls provided for the purpose, the engineer should adjust the playback equalization for flattest response. If a full-track alignment tape is used for stereo equalization adjustment, there will be some LF errors, due to fringing of the long wavelengths around the head gaps. Most manufacturers of alignment tapes provide correction tables for this purpose so that correct frequency response adjustment can be made.
4. The final playback adjustment is setting the reference output level, and this is made with a portion of the tape specifically intended for that purpose. If a 185 nWb/m alignment tape is used, normal calibration (not elevated) requires that the gain be set so that this tone produces a meter reading of 0 dB. If the
engineer wishes to calibrate to elevated level, this tone should be set for a meter reading of -3 dB. If an elevated level test tape is used, and the desired calibration is for elevated level, then the gain should be set for a meter reading of 0 dB. Use only one standard tape operating level. A consistently applied standard is essential in any recording studio complex, and no engineer should be allowed to make an arbitrary change in tape operating level.
C. Record function check-out:
1. Using the playback head as a secondary standard, the azimuth of the record head is adjusted by recording a HF tone (normally 15 kHz) and adjusting the azimuth of the record heads until there is a clear maximum in response in the output of the playback heads. An alternative method is to put the machine in sync mode and adjust the output of each record head directly for maximum response, using the HF portion of the alignment tape intended for that purpose. This is the recommended procedure for late-model machines.
2. Setting record bias level is done by recording a HF signal and adjusting record bias level so that the playback output is overbiased by 3 dB. This is normally done by first achieving peak output at the high frequency, and then further increasing bias until the output level drops 3 dB. The engineer should check the specification sheet for the particular tape stock in use for any particular instructions here. Also, the user's manual for the recorder may recommend alternate procedures.
3. Some machines have additional adjustments for bias waveform purity. These should be carefully adjusted, following the manufacturer's instructions, inasmuch as this will affect the recorded noise level. If there is such an adjustment on the machine, it is wise to again demagnetize all heads after it has been adjusted.
4. Adjustment of record equalization is done by inserting frequencies with an oscillator at the input and adjusting the various record equalization controls until the flattest playback output is attained. There are no standard recording curves; the record equalization is adjusted as needed to ensure that the through-put signal from the recorder is as flat as possible.
5. Record level calibration is set so that when making an A-B check of machine input versus output, the two levels will be the same. Thus, the tape recorder will be a unity gain device under all operating conditions.
D. Adjustment of bias erase current: Manufacturer's instructions should be followed here. Older machines often exhibit interaction between record and erase bias requirements, resulting in fluctuations which depend on how many tracks are in record mode at a given time. This is not a problem in modern machines.
TIME CODE OPERATIONS

The need for synchronizing (or "sync-ing") audio tracks began in the late twenties, when talking motion pictures were developed. Film has always been fitted with equally spaced sprocket holes, and as such exhibits a linear velocity that can be locked to a fixed shaft rotational rate. Optical sound on film was recorded on the same medium, and the two could be locked in sync indefinitely, with single-frame accuracy.

By comparison, magnetic tape is a free-running medium. The actual speed of the tape through the recorder is only approximate, due to varying tension and slippage from the beginning to the end of a reel; and of course the tape itself can elongate slightly while under tension. Eventually, the motion picture and television industries merged their technologies of film, video, and studio recording, and it was necessary to develop a method of sync-ing all recording and video machines as well as cameras. During the sixties, the Society of Motion Picture and Television Engineers (SMPTE) and the European Broadcasting Union (EBU) collaborated on a recorded time code standard, known in the United States as SMPTE (pronounced "simpty") time code. If a time code signal is "striped" onto a roll of magnetic tape, the tape medium can be run in sync with video or film transports with absolute control. In this sense, the time code acts as "printed sprocket holes" on the tape.

Later developments include musical instrument digital interface (MIDI) time code. MIDI was developed to allow interconnecting and controlling electronic instruments and synthesizers, and it was only natural that its control code be adapted to SMPTE requirements for broader application in the recording studio.
SMPTE TIME CODE

Frame rate and structure

Normally, SMPTE time code runs at 30 frames per second (fps), since this is the basic black-and-white video transmission rate in the United States, and it is normally used in video-based digital audio recording. The frame itself consists of 80 bits numbered consecutively from 0 to 79, and the time code is expressed in hours, minutes, seconds, and frames. The structure of the 80-bit word is shown in Figure 11-18A, and a typical readout of the code appears as shown in Figure 11-18B. SMPTE code can normally define up to 24 hours of discrete time locations on the tape or film at frame intervals.

The code is written in binary coded decimal (BCD) form, as shown in Figure 11-19A, and the modulation method is bi-phase, as shown in Figure 11-19B. A "0" binary value is shown as a single transition between clock pulses, while a "1" binary value is shown as two transitions per clock period. The recorded frequency of "0" transitions is 2400 Hz, while the recorded frequency of "1" transitions is 4800 Hz. Each 80-bit word ends with a 16-bit sync word occupying bits 64 through 79, and a new word begins with bit number zero. In addition to the recorded time code data itself, each frame contains an additional 32 user bits. These bits may be used for general housekeeping, such as identifying reel numbers, dates, recording locations, and so forth.
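Because each frame carries an absolute address, converting a running frame count into the hours:minutes:seconds:frames display is simple integer arithmetic. The following Python sketch is our own illustration (the function name is arbitrary) and assumes the 30 fps non-drop rate described above:

    def frames_to_address(frame_count, fps=30):
        """Convert a running frame count to an hh:mm:ss:ff address (non-drop)."""
        ff = frame_count % fps
        ss = (frame_count // fps) % 60
        mm = (frame_count // (fps * 60)) % 60
        hh = (frame_count // (fps * 3600)) % 24   # SMPTE wraps at 24 hours
        return f"{hh:02d}:{mm:02d}:{ss:02d}:{ff:02d}"

    print(frames_to_address(108_000))   # one hour of 30 fps code -> 01:00:00:00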
Locations and recorded levels of the code

On a reel-to-reel multitrack analog recorder, time code is normally placed on the highest-numbered track (which is the outside track at the bottom of the tape). The recorded level of the time code signal is set between -5 and -10 dB on the VU meter, where 0 dB represents the normal reference fluxivity used in the particular studio. If complementary (encode-decode) noise reduction is used with the analog recorder, make sure that it has been defeated on the track that is carrying the time code signal. On U-matic (3/4-inch) video recorders, the signal is placed on the first (or left) analog audio track at a level between 0 and -5 dB on the VU meter.

Every effort should be made to keep the time code signal physically removed from any sensitive or low-level analog program tracks because of the tendency for time code, with its 2400- and 4800-Hz components, to crosstalk into the analog signal. Care must also be taken to route the audio signal and time code output cables carefully. As an added precaution against crosstalk from the time code track into the program, the track adjacent to the time code track on a multichannel recorder should, if possible, be left blank.
[Figure 11-18 appears here. Panel A diagrams the 80-bit word: BCD groups for frames, seconds, minutes, and hours (units and tens), eight 4-bit user groups, the drop-frame flag, fixed zeros, and the sync word in bits 64 through 79. Panel B shows a typical frame-address readout in hours, minutes, seconds, and frames.]
Figure 11-18. SMPTE time code. Basic 80-bit word structure (A); typical readout of a frame address (B).
In normal use, the code can be accurately detected on machines in slow winding mode or fast shuttle mode over ratios from 0.1-times speed to 100-times speed. It can be read in either forward or reverse direction as well as in normal or reverse polarity.
The BCD values illustrated in Figure 11-19A are:

    BCD form (8 4 2 1)    Decimal value
    0000                  0
    0001                  1
    0010                  2
    0011                  3
    0100                  4
    0101                  5
    0110                  6
    0111                  7
    1000                  8
    1001                  9

[Panel B of the figure illustrates the clock pulses and the bi-phase structure of an 8-bit word.]
Figure 11-19. SMPTE code. Illustration of binary-coded decimal values (A); structure of the bi-phase word (B).
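The bi-phase modulation of panel B is easy to model. In the illustrative sketch below (our own code, not part of any standard), each bit cell begins with a transition, and a "1" adds a second transition at mid-cell; this is why ones are recorded at twice the frequency of zeros (4800 Hz versus 2400 Hz at 30 fps):

    def biphase_mark(bits, level=1):
        """Render bits as two half-cell levels each (bi-phase mark coding)."""
        out = []
        for b in bits:
            level = -level          # transition at every cell boundary
            out.append(level)
            if b:                   # a "1" toggles again at mid-cell
                level = -level
            out.append(level)
        return out

    print(biphase_mark([0, 1, 0]))  # -> [-1, -1, 1, -1, 1, 1]

Because only the transitions carry information, the code can be read in normal or reverse polarity, as noted above.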
Drop frames

For color video transmission using the NTSC (U.S.) standard, there are 29.97 frames per second, rather than 30. When using time code with a color video signal, frame numbers are periodically dropped so that the long-term error is reduced to zero. A frame number is dropped about every 33 1/3 seconds (two numbers are skipped at the start of each minute, except for every tenth minute), making a total of 108 dropped frames per hour. When this is done, the elapsed time shown on the counter or the time code generator will match the clock on the wall.
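The counting rule can be made concrete with a short sketch. This is one common formulation (our own code, with arbitrary names): given a count of actual elapsed frames, it returns the drop-frame address by skipping frame numbers 00 and 01 at the start of every minute except each tenth minute.

    def drop_frame_address(frame_count):
        """Convert an elapsed frame count to a drop-frame address (29.97 fps)."""
        per_min = 60 * 30 - 2            # 1798 frame numbers per dropped minute
        per_10min = 10 * per_min + 2     # 17982: the tenth minute keeps all 1800
        d, m = divmod(frame_count, per_10min)
        skipped = 2 * 9 * d              # 18 numbers skipped per full ten minutes
        if m > 2:
            skipped += 2 * ((m - 2) // per_min)
        n = frame_count + skipped        # the renumbered frame
        return (f"{n // (30 * 3600) % 24:02d}:{n // (30 * 60) % 60:02d}:"
                f"{n // 30 % 60:02d};{n % 30:02d}")

    print(drop_frame_address(107_892))   # one real hour -> 01:00:00;00

Note the semicolon before the frames field, the customary way of flagging a drop-frame address.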
Use of time code in digital audio editing

Figure 11-20 shows a simplified signal flow diagram for a so-called assembly editing system using video tape transports. Here, the program source tapes (video or digital audio) are placed on transport 1 or 2. Through control by the editor, those segments of the source tapes desired for the final assembled master are sequentially recorded onto the master assembly transport, all under the control of time code. In an editing system such as this, crossfades may be executed between new and previously transferred program segments, and the level of incoming segments can be altered as well.
Figure 11-20. Use of SMPTE in video or digital audio assembly editing.
MIDI APPLICATIONS IN RECORDING

Overview

MIDI was introduced in 1983 and has undergone considerable modification since that time. In addition to electronic musical instruments, it is now used to control a variety of signal processing devices in recording and performance environments. Even extra-musical events, such as lighting, can be MIDI-controlled.

In its normal musical environment, MIDI is a serial digital interface that operates at a basic data rate of 31.25 kilobits per second ±1 percent. A "standard" MIDI message consists of one status byte followed by two data bytes. (A byte is a group of eight bits.) The status byte identifies a particular musical channel or instrument, while the data bytes determine the musical operations to be performed. As a typical example, the status byte may indicate the musical source; data byte number 1 then identifies the note to be played, while data byte number 2 indicates the attack velocity (or level) of that note. The MSB (most significant bit) of the status byte is always one, while the MSB of the data bytes is always zero. The MIDI user has access to tables indicating the specific musical functions in byte notation. As shown in Figure 11-21, the MIDI code can operate at a rate of 1,302 messages per second. This number of instructions per second is quite sufficient to quantify normal musical transient detail well within acceptable limits of timing perceptibility by careful listeners.
[Figure 11-21 appears here, diagramming a status byte followed by two data bytes; the complete three-byte message occupies about 0.768 msec.]
Figure 11-21. Structure of a typical MIDI message.
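The byte structure can be illustrated directly. The short sketch below (our own example) assembles a note-on message for MIDI channel 1; note that the status byte has its MSB set to one, while the two data bytes keep theirs at zero:

    def note_on(channel, note, velocity):
        """Build a three-byte MIDI note-on message (channel 1-16)."""
        assert 1 <= channel <= 16 and 0 <= note < 128 and 0 <= velocity < 128
        status = 0x90 | (channel - 1)           # MSB = 1 marks a status byte
        return bytes([status, note, velocity])  # data bytes have MSB = 0

    msg = note_on(1, 60, 100)   # middle C with a moderately hard attack
    print(msg.hex())            # -> 903c64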
Figure 11-22A presents an example of MIDI hook-up in a normal live performance situation. Here, instrument 1 acts as a master, feeding data downstream to a number of electronic instruments. In other applications, a MIDI sequencer may be used to store detailed musical instructions, as shown in Figure 11-22B. In this case, the sequencer acts as a MIDI recorder, storing information played into it from a MIDI-equipped instrument. This data can be stored on a conventional floppy disc, edited, and later used for MIDI playback on the original instrument or other MIDI-equipped instruments.
Figure 11-22. MIDI applications in music. First instrument is played and sends MIDI instructions to "slave" instruments downstream (A); MIDI sequencer stores messages in a predetermined order for replay to instruments downstream; all MIDI instruments have MIDI feed-through capability for chaining instruments together (B).
MIDI Time Code (MTC) MTC provides a practical means of distributing SMPTE time code in a studio environment, since the SMPTE data can be nested well within the relatively high data rate that the MIDI code allows. At the same time, the MIDI code can carry normal musical or signal processing instructions at its basic high rate.
A normal SMPTE time code frame contains more information than can be carried in a single MIDI instruction, so the data is normally distributed over eight MIDI bytes. To save transmission "overhead" in the MIDI system, time code frames may be updated every two frames. In all other respects, MIDI time code can be treated just as SMPTE-derived time code. Simple conversions between the two codes can be provided by a stand-alone MIDI/SMPTE converter.

We recommend that recording engineers familiarize themselves with all aspects of MIDI, since the system has established itself as a standard for musical instrument control, as well as a medium of control for recorders, signal processing gear, and console automation in the studio. Further discussion of it belongs in a book devoted to its musical applications.
Chapter 12 DIGITAL RECORDING
INTRODUCTION

Beginning with Nippon Columbia's 14-bit quad-video-based system in the early 1970s, commercial digital recording was developed further in the U.S. by Soundstream and 3M in reel-to-reel fixed-head formats. By the early 1980s, Sony and JVC had introduced formats based on video helical scanning techniques. Sony, Mitsubishi, Studer, and Otari developed large multichannel fixed-head recorders during the 1980s, and the 1990s saw the introduction of modular digital multitrack (MDM) recorders providing 8-channel capability on small-format video tape cartridges. Many current MDMs make use of hard disc storage as an alternative to tape cartridges. Computer-based systems are in the ascendancy today, and these use both magnetic and magneto-optical disc formats for data storage.

One advantage of digital recording is that it can provide a direct path from the studio to the home listener; in most cases, the original digital tracks are not subjected to any analog signal processing or further re-recording, and what ends up in the consumer's living room, via Compact Disc (and higher-density formats), is a program virtually identical to what the producer and engineer heard in their postproduction studio. Signals in the digital domain can be copied, or cloned, with absolute accuracy; that is, the copy is identical in content to the source that produced it. Since a digitally quantized signal is represented solely by a string of numbers, it is independent of the recording medium. As a result, further signal processing and transfer operations in the digital domain will show no deterioration in terms of distortion, time base instability, or increase in noise.

In this chapter we will discuss the basics of digital recording technology, beginning with the fundamentals of signal sampling, and moving on through the assorted hardware it takes to realize a practical recording system. We will then continue with discussions of industry standards and future directions for digital.
THE DIGITAL RECORDER—A QUICK WALK-THROUGH

Figure 12-1 shows a simplified block diagram of a digital record/playback system. The input signal is low-pass filtered and a low-level noise signal (dither) is added to it. The next step is to convert the signal from analog to the digital domain through a successive approximation process. This quantization process represents the input signal as a numerical value at a given instant, and at succeeding points in time. The following step formats the numerical values, adding sync and error-correction information, and the signal is then fed to the recorder. Note that the entire process is controlled by timing signals from the internal clock generator.
Figure 12-1. Simplified block diagrams for digital record and playback systems.
The playback process is basically the reverse of the record process. As the signal is played back, the data is stored in a buffer. The data is then "clocked out" of the buffer to a stage that separates the signal components and performs any error-correction functions that are necessary. The next stage converts the signal from numerical form to analog form, and the final stage provides reconstruction of the signal via filtering.
ANALYSIS OF THE SYSTEM

Quantization

The quantization process is shown in Figure 12-2. A "digital tree" extending from 2^0 to 2^5, encompassing 32 quantization states, is shown. This provides a shorthand method of identifying signal values as a set of binary (two-valued) digits consisting of zeros and ones. For example, consider the value shown at "a" at the end of the tree. We can call this value 10110. Reading the tree from the left, you can see that following the branches, with the ones and zeros as just indicated, you will end up at position "a." Likewise, "b" and "c" are indicated by the digit sequences 01011 and 00001, respectively. In this representation, the first bit is called the most significant bit (MSB), and the last bit in the tree is called the least significant bit (LSB).

Figure 12-2. A digital "tree" covering the range from 2^0 to 2^5.

Taking 2 to the fifth power only gives us 32 encoding levels, hardly enough for making a recording. If we extend the tree so that it has 16 sets of branches, we will have 2^16, or 65,536, quantization levels. This degree of quantization is used in the compact disc, and it gives an operating dynamic range of approximately 96 dB (16 times 6 dB) between the highest and lowest program levels that can be encoded. A series of binary numbers representing a signal sample is known as a digital word. Each number in the word is called a bit (from binary digit), indicating the fundamental unit of information.
Sampling rate

Sampling rate determines how often, or how many times per second, we carry out the process just discussed. If we want to encode an audio signal of 20 kHz we will need to sample the signal at a little more than twice that frequency. In the CD, the sampling rate is 44,100 samples per second. This value is also known as the Nyquist rate, and the Nyquist frequency is half that value, or 22,050 Hz. Normally, we want to keep a guard band between the actual HF signal input and the Nyquist frequency in order to avoid certain problems, so in a system with a Nyquist frequency of 22,050 Hz we would want to filter out signal components higher than about 20 kHz.

The combined processes of quantization and sampling are shown in Figure 12-3. The input audio signal is shown at A, and the quantization intervals are indicated at B. The filtered output is shown at C, and you can clearly observe that neither the quantizing levels nor the sampling intervals are dense enough to produce an accurate audio signal replica at the output. The representation shown in Figure 12-4 gives an idea of just how dense both of these processes are in the compact disc. A full-level signal at 1 kHz is shown encoded over 65,536 levels and sampled 44,100 times per second.

The process we have described thus far is generally known to digital engineers as pulse code modulation (PCM), and the quality attributes of PCM are normally stated as a simple pair of numbers. For example, the data rate for the CD may be stated as 16-44.1, meaning a 16-bit word sampled 44,100 times per second. The designation 24-96 indicates a 24-bit word sampled 96,000 times per second.

The input filter in the recording section is a low-pass design that rolls off the input signal's response sharply above the Nyquist frequency. If there were no filter, a sine wave higher than the Nyquist frequency would be sampled as shown in Figure 12-5. The sampling intervals are indicated at the bottom of the figure, and the points along the input signal where the sampling actually takes place are indicated by the small dots.
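Before examining the figures, it is worth pausing to verify the arithmetic of the last few paragraphs: the number of quantization levels, the resulting dynamic range (about 6.02 dB per bit, from 20 times the base-10 logarithm of 2), and the raw data rates implied by the PCM shorthand. A sketch of our own:

    import math

    def pcm_range(bits):
        """Quantization levels and approximate dynamic range (~6.02 dB/bit)."""
        levels = 2 ** bits
        return levels, 20 * math.log10(levels)

    def pcm_bit_rate(bits, fs, channels=2):
        """Raw PCM data rate in bits per second (before any format overhead)."""
        return bits * fs * channels

    print(pcm_range(16))              # (65536, ~96.3 dB), the CD word length
    print(pcm_bit_rate(16, 44_100))   # "16-44.1": 1,411,200 bits/s for stereo
    print(pcm_bit_rate(24, 96_000))   # "24-96": 4,608,000 bits/s for stereo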
Figure 12-3. Quantization. Input signal (A); quantization process (B); recovery of signal (C).
Figure 12-4. Representation of one cycle of a sine wave quantized with 16 bits at a sampling rate of 44,100 times per second.
Figure 12-5. Example of aliasing.
If we draw a curve through these sampled dots, we will see that they outline a new sine wave of much lower frequency. This process is called aliasing. If our sampling rate is 44,100 per second (as in the CD) and our input frequency is 23,050 Hz, then the aliased component will appear at the difference between the sampling rate and the input frequency, 44,100 - 23,050 = 21,050 Hz; that is, the signal folds back to a point 1000 Hz below the Nyquist frequency. Aliased frequencies are not harmonically related to the actual program content and show up as gross distortion of the signal.

We run into another problem with PCM at very low-level signal inputs. Remember that there are 65,536 possible signal recording levels in a 16-bit system. If we are operating the system very close to its lower limit, perhaps in the range of a handful of bits, the actual process will be as shown in Figure 12-6A. An input signal sine wave smaller than the least significant bit will actually "toggle" between two adjacent states, as shown, creating a square wave output. This problem is solved as shown at B. Referring back to Figure 12-1, you will note that the input filter has a dither input. Dither is a low-level random noise signal that is added to the input signal, and its purpose is to randomize the toggling action between adjacent states in the quantizing process, as shown in Figure 12-6B. The dither produces what is known as a duty cycle modulation between adjacent states, and that is shown at the output. When this signal is low-pass filtered, the output clearly resembles the input, but with the addition of a slight amount of noise.
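Returning to the aliasing example for a moment, the fold-back arithmetic can be stated in one line. The sketch below (our own illustration) covers inputs lying between the Nyquist frequency and the sampling rate, the first image band:

    def alias_frequency(f_in, fs=44_100):
        """Frequency at which an unfiltered input above Nyquist folds back."""
        assert fs / 2 < f_in < fs, "sketch covers the first image band only"
        return fs - f_in

    print(alias_frequency(23_050))   # -> 21050, i.e. 1000 Hz below Nyquist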
Error correction and data redundancy

After the signal has been digitized it is then formatted, combined with sync information, and provided with data redundancy for error correction during playback.
Figure 12-6. Effect of dither. Low-level signal without dither (A); with dither (B).
Both the recording and playback of digital data on magnetic media are difficult due to the very small physical space on the tape occupied by the data. Playback is subject to irregularities in the medium, including dropouts. In analog recording and playback, a slight dropout is virtually inaudible; however, in a digital recording and playback system, an error in reading a single bit can be disastrous.
As you can see in Figure 12-2, an error in reading the MSB will cause a far greater disturbance than an error in reading the LSB. For this reason it is essential in digital encoding to include several strategies that will enable the playback function to deliver an accurate output signal. The simplest of these is interleaving the recorded data, as shown in Figure 12-7.

Figure 12-7. Example of interleaving of data. Basic process (A); effect on burst error (B).
Most errors in playback are caused by so-called burst errors, dropouts covering several data blocks or words. At A, the data recorded over a given range is interleaved as shown. In playback (B), the data in the burst error are distributed so that they become individual errors instead of a consecutive group of errors.
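A minimal sketch of the idea (our own illustration; real systems use far deeper interleaves, with error-correcting codes woven in):

    def interleave(words, depth=4):
        """Reorder words so that adjacent recorded words come from widely
        separated source positions; a burst on the medium then de-interleaves
        into scattered single-word errors."""
        padded = words + [None] * ((-len(words)) % depth)
        return [w for i in range(depth) for w in padded[i::depth]]

    data = list(range(1, 21))
    print(interleave(data))
    # [1, 5, 9, 13, 17, 2, 6, 10, 14, 18, 3, 7, 11, 15, 19, 4, 8, 12, 16, 20]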
Individual errors within a word can be detected and corrected by fairly complex digital algorithms (programs), and Figure 12-8 gives an example of how this can be done.

[The figure shows the 16-bit word 1100 1011 0001 1010 split into four 4-bit sub-words, each of which generates one recorded parity bit.]

Figure 12-8. Identifying errors through use of parity bits and sub-words.
Assume that the ninth bit in a word has been played back in error. If the word is checked against parity bits, the word can be identified as having an error in it. There are four parity bits for each word, and these are generated as shown: if the number of zeros in the sub-word is odd, the parity bit is 1; if the number of zeros in the sub-word is even, the parity bit is 0. In a typical digital
system there will be several layers of error correction. By comparing each parity bit with its sub-word, an error in the entire word can be identified as such, and then corrected. During playback, data is stored in a buffer and is "clocked out" as the error detection stage needs it. Any errors that have not been corrected may then be subjected to the strategies shown in Figure 12-9. Assume that word 2 (shown in the circle) cannot be corrected. It can be roughly approximated either by interpolation between words 1 and 3, or by simply repeating the previous word value. In the overall picture of modern digital recording and playback, uncorrected errors are fairly rare.
Figure 12-9. Data concealment strategies.
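The parity scheme of Figure 12-8 can be expressed compactly. The sketch below (our own code) uses the rule just stated, generating one parity bit per 4-bit sub-word and comparing the recorded and played-back sets to locate the damaged sub-word:

    def parity_bits(word16):
        """One parity bit per 4-bit sub-word: 1 if the sub-word holds an odd
        number of zeros, 0 if even (the convention described in the text)."""
        bits = [(word16 >> (15 - i)) & 1 for i in range(16)]
        return [bits[i:i + 4].count(0) % 2 for i in range(0, 16, 4)]

    recorded = parity_bits(0b1100101100011010)
    played   = parity_bits(0b1100101110011010)   # bit 9 flipped on playback
    print([r != p for r, p in zip(recorded, played)])
    # -> [False, False, True, False]: the error lies in sub-word 3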
NOISE SHAPING—A BONUS

Noise shaping is a variation of dithering that results in an audible noise floor considerably below that expected for a given word length. For example, the CD is a 16-bit medium, and with normal dithering we would expect a wideband noise floor at perhaps -92 or -93 dBFS (dB relative to digital full scale). If we begin with a 20-bit recording, we can truncate that recording down to 16 bits using one of several noise-shaping algorithms, which redistribute the noise spectrum so that it is minimal in the 3 to 5 kHz range, where the ear is most sensitive, and increases at higher frequencies, where the ear is least sensitive. The effect is as shown in Figure 12-10. These curves indicate the normal flat dither noise floor of both 16- and 20-bit recording systems. Also shown is the effect of truncating the 20-bit system down to 16 bits. If you compare
the noise spectra with the equal-loudness contours discussed in Chapter 2, you'll note that there is an improvement in audible noise of about 18 to 20 dB, owing to the shape of those contours. The area of the noise-shaped curve below the 16-bit level is approximately equal to the area above the 16-bit level, demonstrating that, in absolute technical terms, we can't get something for nothing.

[Figure 12-10 appears here, plotting noise spectra on a linear frequency scale from 0 to 24 kHz: the flat 16-bit and 20-bit dither noise floors, the 16-bit curve with noise shaping, and a 1 kHz tone at -60 dBFS.]
Figure 12-10. Effect of noise-shaping on system noise floor. Data based on 2048 FFT point analysis. (Data after Prism Sound)
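The core mechanism, truncation with error feedback, fits in a few lines. The sketch below is our own, first-order only, with dither omitted for brevity; it pushes the truncation error toward high frequencies, whereas commercial curves such as the one in Figure 12-10 come from higher-order, psychoacoustically weighted filters:

    def truncate_noise_shaped(samples_20bit):
        """Truncate 20-bit samples to 16 bits with first-order error feedback."""
        out, error = [], 0
        for x in samples_20bit:
            shaped = x + error        # feed back the previous truncation error
            y = (shaped >> 4) << 4    # keep the top 16 of the 20 bits
            error = shaped - y        # the error spectrum is tilted upward
            out.append(y >> 4)        # emit as a 16-bit value
        return out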
FINE-TUNING DIGITAL RECORDING SYSTEMS

Over the years, critical listening has identified a number of technical problems in digital systems:
Jitter

Jitter results from irregularities in the clocking function that drives all digital systems. Ideally, the clocking function should occur at absolutely equal time intervals, and the program should be sampled and played back exactly at
those intervals. If excessive, jitter can cause random errors in the quantization process, and in some cases may result in audible distortion. Modern systems are relatively free of jitter problems.
Quantization linearity

Some early A/D and D/A converters exhibited significant departures from linearity. Any departure from linearity is associated with some degree of distortion; depending on converter design, it may occur in any portion of the transfer function of the system. Today, quantization linearity is of little concern, except in the lowest-cost systems.
Input filtering

Problems identified with input filtering have to do with time dispersion and ripple (small peaks and dips in response) in the pass-band of the filter. It took the industry some time to identify and correct these. The first judgments that design engineers made were that the aberrations were too small to be audible, or took place at frequencies too high for most people to hear. Time has sided with the critical listener, and the best of modern input and output filters have no ripple in the pass-band and minimum time dispersion over the audible range.
DIRECT STREAM DIGITAL (DSD)—AN ALTERNATIVE TO PCM

DSD was introduced by Sony and Philips in the late 1990s. It makes use of one-bit encoding at the extremely high sampling rate of 64 times 44,100, or 2,822,400 bits per second. Figure 12-11 shows one-half cycle of a 20-kHz sine wave.
Figure 12-11. Example of Direct Stream Digital (DSD) recording. (Data courtesy Sony)
The sine wave input signal has been superimposed in the background, and you can clearly see the similarity of the one-bit sampling pattern to the analog input signal. Among other things, it is this "analog resemblance" between DSD coding and the input signal itself that many recording engineers highly esteem. Further benefits of the system include minimal, if any, input filtering and near-perfect linearity. The system allows the user, at least in principle, to trade off bandwidth against HF noise. DSD is the basis of the Sony/Philips Super Audio Compact Disc (SACD), but it is also gaining favor for basic tracking activities in the studio. In spite of its intuitive simplicity, DSD is a complex recording process that has only attained technical maturity in recent years.
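The principle, though not the refinement, of one-bit conversion can be shown with a first-order delta-sigma loop. This is our own toy model (DSD itself uses a much higher-order modulator with careful noise shaping):

    def delta_sigma_1bit(samples, oversample=64):
        """First-order one-bit delta-sigma modulation of samples in -1.0..+1.0."""
        out, integ, fb = [], 0.0, 1
        for x in samples:
            for _ in range(oversample):        # 64x the base rate, as in DSD
                integ += x - fb                # integrate the error
                fb = 1 if integ >= 0 else -1   # one-bit quantizer
                out.append(1 if fb > 0 else 0)
        return out

    print(delta_sigma_1bit([0.5], oversample=8))   # -> [0, 1, 1, 1, 0, 1, 1, 1]

For a constant input of 0.5 the loop settles into a pattern whose density of ones is about 75 percent, tracking the input level; this is the "analog resemblance" visible in Figure 12-11.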
DIGITAL RECORDING MEDIA

Early digital recording systems used magnetic tape that was quite different from analog tape. In analog recording the requirements are for high dynamic range and low distortion of sine waves, and this dictates a fairly thick, dense oxide layer. For digital recording the requirements are for relatively low dynamic range and accurate reading of data changes (i.e., the boundary between the two states of zero and one). This dictates a fairly thin and accurately controlled high-output oxide layer. Video tape, as used in consumer and professional formats, is positioned mid-way between these two sets of requirements and was widely used for two-channel digital recording during the 1980s. In the early 1990s the needs for multichannel studio capability favored the reel-to-reel formats. At that time we began to see the rise of computer-based hard-disc systems, and today these systems are clearly in the ascendancy. Taking advantage of Moore's Law, the storage capability of computer (disc) systems continues to double every 18 months or so for each dollar you spend on hardware. Most of today's digital recording is now done on disc-based systems, but it must be emphasized that many engineers in the pop sector still prefer to use analog recording for basic tracking activities.
DIGITAL RECORDERS IN HISTORICAL REVIEW

Early experiments were carried out by Stockham (1969), eventually leading to the formation of Soundstream in the mid-1970s. Samuels Engineering (California) was another early experimenter in the art during the early 1970s.
Denon

Denon, a branch of Nippon Columbia, introduced a 14-bit digital system in the early 1970s that used a quad rotary-head video transport for storing digital data on 2-inch tape. The recorder was primarily an in-house tool used to record program material for release on stereo LPs for the Nippon Columbia record label. Frame-accurate (1/30 second) editing was possible using a mechanical cutting technique.
Soundstream

This U.S.-based company offered the first 8-track fixed-head recorder using 1-inch tape in 1976. At one time there were three editing centers worldwide providing sample-accurate editing capability. Most commercial Soundstream recordings were made in stereo for release on LPs; however, the system was capable of 8-track recording.
3M

This U.S. company, universally known for magnetic tape manufacture, offered 2-track and multitrack reel-to-reel recorders during the late 1970s and early 1980s. The recorders used half-inch tape.
Sony and JVC video-based systems

These recorders took advantage of the wide bandwidth of video recorders (U-Matic format) for storing two-track audio data. They offered sample-accurate editing and were important in the wholesale adoption of digital two-track recording during the early 1980s. The Sony DAE-1000 provided the bulk of program input for the adoption of the Compact Disc in 1983. Figure 12-12 shows a view of the Sony DAE-3000 editor and processor.
DASH and ProDigi (PD) fixed-head reel-to-reel multitrack recorders

These massive recorders were designed to look and perform much the same way as analog multitrack recorders. There are two formats: DASH (adopted by Sony and Studer) and ProDigi (adopted by Mitsubishi and Otari). Introduced in the 1980s, both systems support electronic as well as mechanical editing. Figure 12-13 shows a view of a Studer 48-track digital recorder using half-inch tape.
Modular Digital Multitrack (MDM) recorders

The early 1990s saw the introduction of modular multitrack digital recorders using video tape cartridges. The Alesis 8-track model used the VHS cartridge, while the Tascam DA-88 model used the Hi-8 cartridge. Overdubbing capability was provided, but required some data retransfer between tracks. These machines fed the need for digital recording at the grass-roots level in
Figure 12-12. Photograph of Sony DAE-3000 editor and processor. (Data courtesy Sony)
Figure 12-13. Photograph of Studer 48-track digital recorder. (Data courtesy Studer)
the marketplace and sustained the technology well until personal computer storage capability came on the scene. Figure 12-14 shows the Tascam model DA-98.
Figure 12-14. Photograph of Tascam DA-98 recorder. (Data courtesy Tascam)
Disc-based MDMs

By the mid-1990s, the development curve of personal computer storage technology intersected that of recording, and a new generation of MDM recorders based on magnetic or magneto-optical disc technology was developed. These machines offer 8- to 24-track recording capability in a variety of formats. Figure 12-15 shows the Akai model DR-16 16-track disc recorder.
Figure 12-15. Photograph of Akai DR-16 recorder. (Data courtesy Akai)
BIT-SPLITTING

The early MDMs offered 16-bit recording on eight channels. Bit-splitting, developed by a number of third-party suppliers, provides a method of recording at higher densities of 20 bits or 24 bits, at the user's choice, by spreading
the digital information over other channels. Typically, six channels can be recorded at 20 bits, with tracks 7 and 8 carrying the added information; or, four tracks can be recorded at 24 bits, with tracks 5 through 8 carrying the added information. Many of the newer MDMs are capable of recording directly at 24 bits.
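A toy illustration of the packing idea (our own sketch; actual MDM bit-splitting formats define their own arrangements):

    def split_24_over_16(samples):
        """Spread four 24-bit samples over 16-bit tracks: the top 16 bits go
        to tracks 1-4, and the leftover low bytes are packed onto the extra
        tracks that carry the added information."""
        assert len(samples) == 4 and all(0 <= s < 2 ** 24 for s in samples)
        main = [s >> 8 for s in samples]       # tracks 1-4: upper 16 bits
        lows = [s & 0xFF for s in samples]     # four leftover 8-bit remainders
        aux = [(lows[0] << 8) | lows[1],       # two remainders per 16-bit word
               (lows[2] << 8) | lows[3]]
        return main, aux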
A COMPARISON OF TAPE AND DISC MDMS

The chief advantages of tape-based MDMs are the relatively low cost of the recording medium and the ease with which it can be transported from one location to another. The disadvantages lie primarily in the time required to access a given portion of the tape and the inevitable wear and tear on the tape itself. Disc media, whether based on computer hard disc technology or on directly removable magneto-optical technology, provide very rapid random access to any part of the disc, and this is a great advantage in saving postproduction time. The ultimate longevity of the disc systems is also an advantage in that there is no direct contact with the medium. The primary disadvantage is media cost, whether in disc form or as a hard disc drive assembly.

The user has a bewildering array of models to choose from. There are many systems offering different features and with differing degrees of compatibility with existing recording/editing systems. Before you buy, check with professional users and scope out the idiosyncrasies of the ones you are considering. Be aware that each year brings new models, often greatly improved over earlier ones, and often at considerably lower cost.
STANDARDS AND INTERCHANGEABILITY

As in the computer industry, the various digital formats can "speak" to one another via several interface standards. In addition, these published standards also define matters of synchronization, user-definable bits, file formats, and interconnect requirements.
AES-EBU

The AES-EBU interface is widely used for two-channel programs. The interface uses a standard XLR three-conductor cable to transmit serial data between recorders and other digital processing devices and can accommodate up to 24-bit resolution at a number of sampling rates.
SDIF-2

The SDIF-2 (Sony Digital Interface) is used between pieces of Sony digital equipment. It requires parallel connections for each channel and an additional word sync connection between devices.

ProDigital

The ProDigital (PD) A/B/C-Dub is used with Mitsubishi and Otari recorders and can be used for data transfer between machines based on the PD format as well as with certain DASH configuration machines.

MADI

Multichannel Audio Digital Interface (MADI) was proposed by the AES as a method for interfacing multichannel digital devices. It can accommodate up to 56 channels at sampling rates of 32 kHz to 48 kHz with resolution up to 24 bits per channel. It is serial, requiring only a single connection at each end of the bus.
Consumer formats

The Electrical Industries Association of Japan (EIAJ) two-channel video format was introduced in the 1980s and employed in the Sony PCM-F1 format for consumer use with standard Beta and VHS video recorders. It was a surprise to everyone when the PCM-F1 attained high status with professional users during the early 1980s. Countless CDs were mastered on these low-cost systems. Interface gear for use with Sony digital editors was made by Harmonia-Mundi Acustica of Germany, but by the mid-1990s the F-1 had virtually disappeared.

The rotary digital audio tape (R-DAT) recording system, commonly called DAT, has attained full status as a two-channel professional recording medium. It was originally intended as a consumer medium, but it never succeeded as such. Like the video-based systems, it makes use of a helical scanning system, but it does not use the standard video frame structure. The DAT has improved markedly in overall reliability over just a few years and is a very convenient format for two-channel work. Most of the professional DAT machines can record directly at 44.1 kHz (the sampling rate of the CD) and have AES-EBU outputs for direct interface with two-channel digital editing systems.
Chapter 13 THE DIGITAL POSTPRODUCTION ENVIRONMENT
INTRODUCTION

Since the late 1980s, the digital audio workstation (DAW) has virtually revolutionized the postproduction activities of editing, signal processing, and assembly of digital program material. The DAW represents the coming together of digital recording and storage devices and the modern desktop personal computer. The user interface, as introduced on the Apple Macintosh in the early 1980s and followed later by Microsoft's Windows, allows a natural, interactive, and user-friendly relationship between the operator and the system.

Just as a word processing program allows the user to move blocks of text, correct mistakes, and format a document, the DAW lets the user load in multichannel segments of music, edit them, move them as an entity or as individual segments, perform signal processing of all kinds, and adjust their levels. Some systems give the user a broad array of tools for cleaning up the program, and many of the noise removal algorithms are simply mind-boggling. The basic operations are carried out through dialog boxes on the screen and the familiar "point and click" of the mouse. A virtual console work surface appears on-screen, and controls can be manipulated with the mouse and certain keyboard commands. Some systems support additional stand-alone mini-consoles with faders and other controls for more rapid, intuitive hands-on operation of the system.

It is possible to carry out a multitrack live recording session using a workstation by itself, but the mouse/screen interface may not be the most intuitive or speedy way to do this. This is where a digital console enters the picture, with its familiar mechanical fader and knob environment.

We will base our exposition here on the Sonic Solutions system, while stressing that there are many other systems that carry out similar functions. The prospective buyer should check out a number of systems, making a choice based on a combination of economic, operational, and usage considerations.
A BASIC TUTORIAL

We will illustrate how to load in program, play it back, segment and rearrange it, and perform elementary editing on it. The procedures shown here are typical of virtually all workstations.
Startup

Figure 13-1 shows the screen view for the beginning of this operation. Note the resemblance to the input and output sections of a small 8-by-2 split configuration console. The input level controls may be actuated by clicking on them and moving them up and down on-screen. The panpots are just above the faders, and the blocks above them can be assigned to various equalization and dynamics functions. Patching of signals is done at the top. The stereo output section, shown at the right, consists of a pair of faders and meters.
Figure 13-1. View of virtual mixing desk on computer screen (8-in/2-out). (Data courtesy Sonic Solutions)
Loading in

Figure 13-2 shows a view of the Edit Decision List (EDL) window. Program material is loaded into a panel in this window, and the program envelope appears as shown in Figure 13-3. As the program is played, a cursor moves slowly across it, indicating elapsed time.
Figure 13-2. Creating the Edit Decision List (EDL); view of editing panels. (Data courtesy Sonic Solutions)
Figure 13-3. View of editing panel with program; pressing the space bar stops playback and displays the recorded sound. (Data courtesy Sonic Solutions)
Segmenting and rearranging the program

As shown in Figure 13-4, markers known as "gates" can be used to divide the program into segments. Segments can then be labeled and moved as desired. An example is shown in Figure 13-5, where the middle and last segments have been switched. You can "zoom in" on the screen view for more detail.
1. Use the mouse to place left and right gates around a section of sound you would like to "clip" out as an audio segment.

2. From the View menu, select Zoom to Gates. Place the gates to more precisely define the segment you would like to create.

3. From the Edit menu, select Create Segments From Gates.

Figure 13-4. Segmenting the program in an editing panel. (Data courtesy Sonic Solutions)
Figure 13-5. Rearranging musical segments. (Data courtesy Sonic Solutions)
Fading in and out

Figure 13-6 shows how simple fades can be made. The fade can be auditioned before it is executed. In many systems the exact nature of the fade-out/fade-in gain structure can be selected, and a fast fade-out can be combined with a slow fade-in.
1. Switch back to the "Gates" tool ("G" at keyboard), then use Zoom to Gates to view a single segment more closely.

2. Return to the "Fade" tool ("F" at keyboard). Select the leading cut of the segment you are examining.

3. Drag the upper-right "grab-box" to lengthen the fade. Use the space bar to listen to the result.

Figure 13-6. Executing a musical fade. (Data courtesy Sonic Solutions)
Mixing with automation

Anything that has been recorded so far can be mixed into stereo using the automation program. Level changes made during this operation can be stored, auditioned, and updated as desired.
DETAILED EDITING

As opposed to pop/rock recording, where musical elements are generally added sequentially one on another and refined through punching in/out or through further overdubbing, classical music and film score editing consists of the assembly of many refinements, each taken from complete studio takes, or takes of sub-sections.
The workstation environment, with its quick random access to all recorded takes or segments, offers the editor a large array of options in refining edits:

1. Shape of incoming and outgoing fades.

2. Length of incoming and outgoing fades.

3. Relation of the start and end of fades to the nominal edit point.

The range of these variables is clear from the view of the Edit Fade Window, as shown in Figure 13-7.
Figure 13-7. View of the Edit Fade Window. (Data courtesy Sonic Solutions)
In classical recording, most edits fall into two categories:

1. Transition to another take: A "better" take of a long or short segment may be inserted as needed to correct a wrong note or any other musical detail the producer determines should be fixed.

2. Removal of very short segments in the recording, possibly a slight noise or a smeared attack.

There is a good deal of general "tightening up" that goes on in all editing sessions. These edits are all possible with the parameter controls shown here.
SIGNAL PROCESSING

Audio signals can be subjected to virtually any kind of digitally based signal processing in the frequency, dynamics, and time domains, with detailed setting of all parameters. Today, many plug-in options developed by third-party companies can be added to any DAW, extending its flexibility in ways not available in the analog domain.
NOISE REMOVAL

Many workstations have sophisticated algorithms for removing noise from old sources, as well as various ticks and pops resulting from electrical mishaps in the studio or in live recordings. Figure 13-8 shows an example of a Sonic Solutions program for interpolating waveforms. As shown in the upper panel of Figure 13-8, two clicks in the program have been identified, and markers are set so that the clicks fall between them.
Figure 13-8. Interpolation of program waveform. Identification of clicks in program (upper panel); removal of clicks (lower panel). (Data courtesy Sonic Solutions)
The lower panel of the figure shows the resulting waveform after interpolation. The program waveform has been analyzed before and after the period of disturbance, and a new waveform has been estimated. Such a technique is possible over fairly short musical segments, taking advantage of the short-term linear predictability of music.
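A crude stand-in for that estimation is simple linear interpolation across the damaged span; workstation declicking algorithms use linear-predictive modeling instead, but the repair pattern is the same. A sketch of our own:

    def repair_click(samples, start, end):
        """Replace samples[start:end] by interpolating between the neighbors."""
        left, right = samples[start - 1], samples[end]
        n = end - start
        for i in range(n):
            samples[start + i] = left + (right - left) * (i + 1) / (n + 1)
        return samples

    print(repair_click([0.0, 0.25, 9.9, -9.9, 1.0, 0.5], 2, 4))
    # -> [0.0, 0.25, 0.5, 0.75, 1.0, 0.5]: the click values are bridged over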
PROJECT MANAGEMENT

A very useful capability of workstations is their ability to handle matters of housekeeping. Tabulations of edits and other program changes are easily carried out and can be shown on-screen as in Figure 13-9. Here, the detailed timings and types of crossfades are stored in an edit decision list (EDL), making it easy to identify and alter specific edits. Another feature of modern workstations is the capability of editing and assembling masters for CD, DVD, and SACD manufacture, complete with documentation.
Figure 13-9. View of EDL listing window. (Data courtesy Sonic Solutions)
THE DIGITAL CONSOLE

At some point in its expansion of capabilities, the traditional computer user interface reaches a performance limit. (You might think of it as trying to operate a modern vehicle with nothing more than a monitor and a mouse.) There
are human skills that are most effective when a combination of tactile, visual, and auditory sensibilities are employed at the same time, and a digital console can provide an environment for this. Large digital consoles have been available since the early 1990s, and many of them have been little more than virtual replacements for modern in-line analog consoles. Their greatest application, however, has been in the area of postproduction, where they interface directly with existing computers and high-density disc drives.

Figure 13-10 shows the operating surface of a Yamaha DM 2000 digital console, which is typical of modern designs that can handle a variety of recording jobs from tracking to mixing. If you compare the console's work surface with a typical in-line analog console (see Chapter 9), you'll note that it is only about one-fourth the size, but it has most of the operational capability and flexibility of the larger console.
Figure 13-10. View of a modern digital console. (Photo courtesy Yamaha)
The capabilities of this console include:

1. Processing of digital signals at 24-bit/96-kHz sampling.

2. 24 line/microphone inputs, expandable through three additional layers to a total of 96 inputs.
3. Console architecture can be "designed" on-screen, as required for each job.

4. "Soft" (assignable) knobs for all signal processing functions and panning/assignment.

5. Absolute repeatability of all control settings.

6. Accommodation of third-party software plug-in modules.

7. Automation of all functions in remix mode.

8. Small monitor screen for showing patching routes and signal-processing settings.

9. Accommodation of large-screen detailed graphics via an auxiliary minicomputer program.

10. Storage of multiple scenes for later recall. (A scene here is defined as a global group of console settings and routings.)

11. Accommodation of a number of audio monitoring setups, including surround sound.
LAYERS AND ERGONOMICS

In a typical analog environment, a 96-input console would of course have 96 input strips, all equipped with the same signal processing. A complex mixing session involving this many inputs would have a number of pre-mixed inputs or effects, and the mixing engineer would not normally require immediate access to those faders or other input strip facilities. For effective mixing, the engineer would probably have no more than about 24 active controls in the immediate working area. Thus, the layering of additional inputs, accessible quickly when needed, makes good sense. But it requires a completely different mindset on the part of the mixing engineer. The move to such an operating environment will be a giant step for many persons to make, and the secret is to proceed cautiously with simple operations, moving on from there. One thing is clear: a large scoring session would be better done on a traditional in-line console.

Another clear difference between analog and digital consoles is the appearance of the input strip. If you refer back to Figure 9-15, you will notice immediately the rich detail in which all functions are shown.
Figure 13-11. Views of the DM 2000 analog input section (left panel) and channel strip (right panel); see text for explanation of legends. (Data after Yamaha)
By contrast, an input strip on the digital console, as shown in Figure 13-11, is virtually bare-bones. The analog input section (left panel) contains the following dedicated functions:

1. Phantom power on/off.

2. Input pad on/off.

3. Variable input gain set.

4. Peak signal indicator light.

5. Signal presence indicator light.

6. Insert point on/off.

The corresponding channel strip section is shown in the right panel and contains the following generalized functions:

1. ENCODER: Rotary control used to edit input and output channel parameters. Exact operation depends on the currently selected encoder mode and layer.

2. AUTO button: Used to set automix recording and playback for each channel. Exact operation depends on the currently selected layer.

3. SEL button: Used to select input and output channels for editing with the selected channel section. Exact operation depends on the currently selected layer.

4. SOLO button: Used to solo the channel.

5. ON button: Used to mute input and output channels.

6. Channel strip display: Graphic display of the value of the input or output channel parameter currently assigned to the encoder.

7. Channel faders: Touch-sensitive motorized 100-mm faders used to set the levels of input channels, output channels, aux sends, and matrix sends. Exact operation depends on the currently selected fader mode and layer.

You can clearly see that any engineer must attain a fairly high level of confidence in the use of a digital console before attempting even a simple mixing session, not to mention a tracking session.
Chapter 14 EQUALIZERS AND EQUALIZATION
INTRODUCTION

The term equalizer is taken from early telephone engineering, when HF losses over long distances had to be compensated to "equalize" the sound at the receiver so that it matched that at the transmitter. The name has since been attached to any procedure of altering or adjusting frequency response in an audio chain. You will also encounter the term filter. A filter is a specific type of equalizer that cuts or removes a portion of the audio program in an effort to fix a problem of some kind. The term program equalizer implies a device that is more flexible and that can be used to enhance a given audio program through the boosting or reducing of certain portions of the frequency range.

Equalizers may also be referred to by the nature of their action. For example, a graphic equalizer has vertical slider controls that can boost or cut specific frequencies, and when these controls are set in given positions the actual plotted frequency response curve will follow those positions. Shelving equalizers provide LF and HF boost or cut, which appears in the plotted frequency response as a shelf below or above the reference line. End-cut filters are used to provide steep cuts in LF and HF response, and a notch filter is used to remove a particular frequency, perhaps hum or HF leakage, from an audio program.
TYPICAL EQUALIZER FAMILIES OF CURVES

End-cut filters

Figure 14-1 shows a family of LF and HF response curves for a set of end-cut filters. The filter slopes are normally in the range of 18 dB/octave, which is generally steep enough to accomplish the removal of unwanted signals at the frequency extremes. The normal range of LF control may be from 40 Hz to perhaps as high as 160 Hz. The normal range of HF control may be from 5 kHz to 15 kHz. The frequency designation for the filter indicates the specific frequency at which the filter response is -3 dB. In modern console input
sections, you will normally find a single LF cut filter fixed at 100 Hz. These are very useful in tracking sessions for removing any room rumble, air conditioning noise, or other annoying thumps and the like.
Figure 14-1. Typical end-cut low- and high-pass filter response.
Notch filters

Figure 14-2 shows a typical notch filter response plot. A stand-alone set of notch filters may provide two sections, each individually adjustable over a wide range in both frequency and the degree of cut desired. A slight amount of 60-Hz hum may be removed by no more than 12 to 15 dB of cut, while an unwanted 1-kHz tone in an audio program could easily require upwards of 30 dB of cut in order to be made inaudible. The major problem with notch filters adjusted for high amounts of attenuation is that they tend to "ring" in the region of the cut frequency. This produces a degree of coloration in the overall sound that may be objectionable. Use no more cut than necessary, and remember to bypass the filter when it is not needed.
Figure 14-2. Typical notch filter response.
Shelving boost and cut equalizers

These functions are normally found in console input section equalizers at both LF and HF, and are useful in restoring the frequency extremes in audio programs. They are adjustable both in transition frequency and in amount of
215
Typical Equalizer Families of Curves
boost or cut available These equalizers are very effective in correcting for mild amounts of LF boost due to proximity effect with directional microphones and, at the other end of the spectrum, the differences between on- and off-axis microphone response. It is easy to over-use these equalizers, and you should be very careful making any adjustments when monitoring in a new environment or over an unknown set of monitor loudspeakers. Typical families of curves are shown in Figure 14-3.
Figure 14-3. LF and HF shelving boost and cut response.
Sweepable peak and dip equalizers
These functions are found in many console input sections. Typically, there are two such sections, and each may be continuously varied, or swept, over a fairly wide frequency range. They are useful in making balance adjustments in individual tracks, and the maximum level range of such equalizers is about ±15 dB, although such extremes are rarely necessary. These equalizers can be used for purely creative purposes, or for correcting a basic timbre (tone quality) problem, such as an overly bright or dull track. Typical families of curves are shown in Figure 14-4.
Parametric equalizers
The three independent parameters in the setting of an equalizer section are the choice of frequency, the degree of boost or cut, and the sharpness of boost or cut. We have already illustrated the first two of these parameters, but the third may be new. Figure 14-5 shows the effect of the sharpness, or Q, of the boost or cut. When the boost or cut is broad the term low-Q is used. Conversely, the sharper response of these equalizer sections is referred to as high-Q. A typical high-end console will have two or three sections of parametric equalizers in each input module, and the frequency ranges of adjacent sections will have considerable overlap. The combination of three such sections, along with LF and HF shelving sections, will give the engineer just about all that is needed in making normal timbral adjustments.
Figure 14-4. Typical sweepable boost and cut response (with fixed Q values).
Figure 14-5. Parametric equalizer section, effect of Q control with constant frequency.
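One common digital realization of a single peak/dip section is the widely used "audio EQ cookbook" peaking biquad, sketched below; it is not necessarily what any particular console uses, but its three inputs map directly onto the parameters just described: center frequency, boost or cut in dB, and Q.

    # Peaking-EQ biquad coefficients (RBJ "audio EQ cookbook" form).
    import numpy as np

    def peaking_coeffs(f0, gain_db, Q, fs=48000):
        A = 10 ** (gain_db / 40.0)       # amplitude factor from dB boost/cut
        w0 = 2 * np.pi * f0 / fs
        alpha = np.sin(w0) / (2 * Q)     # bandwidth term: Q sets sharpness
        b = np.array([1 + alpha * A, -2 * np.cos(w0), 1 - alpha * A])
        a = np.array([1 + alpha / A, -2 * np.cos(w0), 1 - alpha / A])
        return b / a[0], a / a[0]

    # +6 dB at 1 kHz: a broad (low-Q) and a sharp (high-Q) version
    b_lo, a_lo = peaking_coeffs(1000, 6.0, 0.7)
    b_hi, a_hi = peaking_coeffs(1000, 6.0, 8.0)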
COMPLEX EQUALIZER RESPONSE
Recording engineers are normally concerned with the amplitude response aspect of an equalizer inasmuch as this defines the primary audible effect of the equalizer. But associated with the amplitude response is a corresponding phase response. Normally, the engineer can ignore the phase aspect since the ear is relatively insensitive to it. Sidebar 14.1 discusses the phase and time response of equalizers in greater detail.
Sidebar 14.1: Most of the equalizers in use today are of the minimum phase type; that is, they introduce the minimum amount of phase shift associated with a given amplitude change. As such, both phase and amplitude are reciprocal, and the "undoing" of a given amount of boost by passing the signal through a complementary dip will "undo" the phase shift as well. This relationship is shown in Figure 14-6, where both amplitude and phase response are shown for a response peak (A) and a complementary dip (B).
Figure 14-6. Phase and amplitude response of both peak and dip sections.
The phase shift of the signal is related to its relative delay by the equation:

    Relative delay = -dφ/dω    (14.1)

Relative delay is expressed here as minus the rate of change of phase with respect to frequency; φ is the phase shift in radians and ω is the angular frequency, 2πf. In the example given here, the maximum amount of phase shift for the amplitude boost of 4 dB is 20 degrees. For a 1-kHz signal, 360 degrees represents one period, a time interval of 0.001 seconds. The effect of the phase shift would be to add (20/360)(0.001) seconds, or an additional delay of 5.5 × 10⁻⁵ seconds, to the 1-kHz signal.
CREATIVE USES OF EQUALIZERS AND FILTERS
Today, most applications of equalizers are likely to be creative rather than remedial, and the following list details some of these uses:
Fullness may be added by boosting frequencies in the 100-300 Hz range. This will be most effective on normally weak instruments, such as the acoustical guitar, celesta and harp. No more than about 4-6 dB should be necessary.
A recessive sound can be made to project more if a broad peak is added in the 800 Hz-2 kHz range. Again, 4-6 dB should be enough.
The articulation transients of many instruments may be highlighted by emphasizing the appropriate frequency range. For example, the acoustic bass has fundamental frequencies in the 40-200 Hz range, but its harmonics extend up to about 2 kHz. The sounds of the player's fingers on the strings are nonharmonically related to the fundamentals, but in jazz performances they are often very important in defining the musical line. Adding a broad peak in the 1-2 kHz range will emphasize them. Likewise, the same approach can be used with the acoustic guitar by emphasizing the 2-4 kHz range.
Crispness in percussion instruments can be emphasized by adding an HF shelving boost above 1 or 2 kHz. Bongo and snare drums may also need similar treatment.
Some cautions are in order:
1. Boosting and peaking should be done sparingly on metallic transients such as those produced by cymbals, tambourines, triangles and some Latin instruments. The HF output of these instruments is already strong, and adding more may cause problems in postproduction.
2. Never use equalization as a substitute for proper microphone placement. If a microphone needs to be changed or placed closer to an instrument, then by all means make that change.
3. Do not boost too many tracks in a multitrack recording in the same frequency range. Doing this will simply result in an unbalanced spectrum which is musically unsatisfactory. In pop and rock recording the goal should be to attain a fairly uniform overall spectrum from 50 Hz to about 8 kHz during full ensemble passages.
More than any other area of signal processing, the creative use of equalization is learned by observing experienced engineers and through old-fashioned apprenticeship. If you listen carefully you will soon learn that the difference between "just right" and "too much" is often no more than a decibel and a half.
DIGITAL EQUALIZERS
One of the great benefits of digital signal processing (DSP) is the ease with which equalization and filtering can be synthesized. Normally, you will not encounter stand-alone digital equalizers, but you will find them nested in digital audio workstations (DAWs) or digital postproduction consoles. Many well known analog equalizer models from the past have been carefully emulated and are available as digital "plug-ins" for use in the postproduction environment.
While we are used to knobs and switches to adjust the settings on an equalizer, the graphic user interface (GUI) of a digital realization of an equalizer generally offers a speedier way to arrive at a given response curve. The on-screen view of a four-section parametric equalizer is shown in Figure 14-7. As you can see, there are no knobs. When you want to make a setting change, you click on the parameter blocks at the bottom of the figure and enter the data you wish. An alternate way of data entry is to use conventional point-click-drag techniques with a computer mouse. Each of the four equalizer sections is represented by a small white "handle" in the figure which can be moved along the frequency axis (for adjusting frequency) and up or down (for adjusting the amount of peak or dip).
Figure 14-7. Graphic user interface: a 4-section parametric equalizer. (Data courtesy BSS)
In a sense, you are actually drawing the response curve that you want. In this example, filter sections 1 and 4 have been set respectively for low shelf and high shelf action, while the two middle sections have been set for typical MF peak and dip functions, each at different values of Q or width.
An example of a digitally realized 5-band parametric equalizer with end correction sections is shown in Figure 14-8. The frequency response is shown at the top of the figure, and the on-screen GUI is shown below. Here, the user clicks on a given control in order to make changes.
Figure 14-8. Graphic user interface: a 5-section parametric equalizer with end sections. (Data courtesy GML and Sony Corporation)
The operational advantages of digital equalizers are:
1. They are space-saving; the equalizer is on-screen only when you wish to make a setting change or to view the response curve.
2. Curves and settings may be paged through rapidly.
3. Multiple settings for a given equalizer can be stored and recalled with precision.
The disadvantages are:
1. Changes made "on the fly" are tricky and may need to be rehearsed.
2. You are "flying blind" much of the time; you can't simply glance over the console and identify how much boost or cut you have applied to a given input channel.
In any event, digital equalizers will require a bit of getting used to that will remind you of your early days at the computer. Difficult at first, but speedy in the long run.
Chapter 15 DYNAMICS CONTROL
INTRODUCTION
In this chapter we will discuss compressors, limiters, noise gates, and other signal processing devices that perform operations on the dynamic range of audio programs. The need for these devices comes from the fact that speech and music programs often occupy an overall dynamic range that is too great for their intended purposes. For example, live music almost always exhibits a dynamic range too wide for reproduction in the average home environment, and this has led to the general practice of signal compression and limiting during postproduction stages. While an experienced recording engineer can "ride gain" on a program manually, things can get out of hand very quickly. In broadcasting, there are times when no engineer is on duty, and it is to the station's advantage to maintain a uniform broadcast level. There is also the requirement in broadcasting that maximum signal modulation not exceed legal limits. In this chapter we will discuss various means of wide-band audio level control as well as specialized tools for operating on specific portions of the audio spectrum.
ANATOMY OF A COMPRESSOR Figure 15-1 shows a block diagram of a compressor. The direct path between input and output is through a voltage controlled amplifier (VCA), whose control voltage is determined through signal processing in the side chain in the bottom portion of the figure. Program level is sensed, and a dc control voltage is produced that lowers the gain of the VCA as the input signal increases. Some compressors have input and output faders, as shown here. The input fader, since it is ahead of the side chain, will determine the amount of signal going to the side chain, thus determining the amount of gain reduction. The output fader acts only as a final gain adjustment for the device. The meter is switchable between the signal output and the side chain so that the engineer can read either the actual output signal level or the amount of gain reduction at a given time (this function is normally calibrated in dB).
Figure 15-1. Simplified signal flow diagram for a compressor.
The side chain functions labeled attack time, threshold, and release time determine the speed of the compressor's action and the program level above which that action will take place.
Gain curves
Figure 15-2 shows gain curves for a compressor. The diagonal line running from lower left to upper right represents the constant gain of a normal amplifier. For each input signal increase there will be a corresponding output signal increase.
Figure 15-2. Typical gain curves for a compressor.
A compressor operates very much like a linear amplifier at low signal levels, but when a predetermined threshold has been reached the compression action takes over and the overall gain is reduced. The point on the gain curve where the compression action begins is called the threshold of compression. (Some engineers refer to that point on the curve as the "knee.")
The compression ratio is related to the slope of the gain curve in the region of compression. Several gain curves are shown in the figure. The two-to-one curve shows that for each signal increase of two dB at the input, the output will only increase by one dB. The gain curve for a four-to-one compression ratio indicates that for an increase of 4 dB at the input, the output will increase only 1 dB.
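The static behavior just described reduces to a few lines of arithmetic, sketched below; the threshold value is an illustrative assumption. Note that with a 4-to-1 ratio, a 4 dB rise above the threshold yields only a 1 dB rise at the output.

    # Static compressor gain curve: linear below threshold, then
    # 1 dB of output rise per `ratio` dB of input rise above it.
    def compressed_level(in_db, threshold_db=-20.0, ratio=4.0):
        if in_db <= threshold_db:
            return in_db                 # unity gain below the knee
        return threshold_db + (in_db - threshold_db) / ratio

    print(compressed_level(-16.0))       # -> -19.0 (4 dB in, 1 dB out)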
Attack and release action
During compression, the actual gain through the compressor is constantly varying, just as though an engineer were constantly manipulating a fader at the console. Such changes should not be made instantaneously; if a gain setting is suddenly altered, the action will be quite obvious to the ear. The gain changes made by the compressor must be fast enough to catch sudden program peaks (attack time), but slow enough to allow a gentle return to the previous setting when the peak has passed (recovery time).
The effects of attack and recovery time are illustrated in Figure 15-3. At A, you see a signal that suddenly increases in level and then later drops back to the previous level. When the input signal switches to a higher level (t1) which is within the range of compression, the gain of the compressor is reduced as shown at B and, after a slight amount of "overshoot," the compressed output signal drops accordingly. When the input signal returns to its original value (t2), the gain of the compressor is restored to its original value. You can see that both attack and release actions are not instantaneous; the attack time may be fairly fast, but the recovery time is relatively slow. The overall shift in compressor gain is shown at C.
Many compressors have user adjustments for both attack and release time, while other models have both functions fixed internally. Attack times are normally in the range of 100 microseconds to 1 millisecond, while recovery times may vary from 0.5 second to about 2 or 3 seconds. While a very fast attack time would seem to be desirable, it often comes with the penalty that it can be heard as such. Most modern compressors have advanced circuitry that enables nearly instantaneous inaudible gain changes to be made. A zero-crossing detector can be used to ensure that the gain change is made at an instant when the audio signal has a value of zero—thus minimizing the audibility of the gain change as such.
Figure 15-3. Compression action. Input signal (A); gain changes, showing effects of attack and recovery time (B); plot of compressor gain (C).
For special applications, some compressors delay the main signal path through the VCA very slightly, while allowing the side chain to operate on the undelayed signal. If this is done carefully, the command from the side chain lowers the VCA gain before the program signal itself reaches the VCA, thus avoiding overshoot when the program peak occurs. Some "smart" compressors provide modified gain control based on the immediate signal history. For example, a sudden pause in the program would normally indicate a return to an uncompressed state in a conventional compressor design. A more sophisticated compressor would wait until the input signal resumed before making such a decision.
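The attack/release behavior can be sketched as a one-pole smoother applied to the desired gain reduction, with a fast coefficient when gain reduction is increasing and a slow one when it is decreasing. The time constants below are illustrative assumptions within the ranges quoted above.

    # Attack/release smoothing of a per-sample gain-reduction target (dB).
    import numpy as np

    def smooth_gain_reduction(target_gr, fs=48000,
                              attack_s=0.001, release_s=1.0):
        a_att = np.exp(-1.0 / (attack_s * fs))   # fast: catch program peaks
        a_rel = np.exp(-1.0 / (release_s * fs))  # slow: gentle recovery
        out = np.zeros_like(target_gr)
        state = 0.0
        for n, t in enumerate(target_gr):
            coeff = a_att if t > state else a_rel
            state = coeff * state + (1 - coeff) * t
            out[n] = state
        return out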
THE LIMITER
Basically, a limiter is a compressor with a built-in compression ratio of ten to one or higher, and with relatively fast attack and recovery times. The high compression ratio ensures that the program signal, once the threshold of limiting has been attained, will not increase substantially. Limiters are most often used to prevent accidental overload of transmission channels. For example, a limiter is the last signal processing element in a broadcast system, its chief function being that of preventing inadvertent overmodulation of the transmitter. LP discs are often mastered with a dedicated high-frequency limiter in the circuit for similar reasons.
MULTI-BAND COMPRESSORS The compressors we have discussed so far have been single-band; action takes place uniformly over the entire audio band, and this is ideal for compression of individual tracks. However, for more complex signals a multi-band compressor may be a better choice. As shown in the simplified signal flow diagram of Figure 15-4, the signal is divided into four adjacent frequency bands, and compression action is individually adjusted for each band. The advantage here is that heavy compression action in one band will not influence the gain in the other bands; this allows for greater overall program compression with minimum audibility as such.
Figure 15-4. Simplified signal flow diagram for a multi-band compressor.
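A rough sketch of the band-splitting stage is given below; the crossover frequencies are assumptions, and a production design would use phase-matched (for example, Linkwitz-Riley) crossovers so that the compressed bands sum back together cleanly.

    # Split a signal into four bands for independent compression.
    from scipy.signal import butter, sosfilt

    fs = 48000
    edges = [(None, 200), (200, 1000), (1000, 5000), (5000, None)]  # Hz

    def split_bands(x):
        bands = []
        for lo, hi in edges:
            if lo is None:
                sos = butter(4, hi, "lowpass", fs=fs, output="sos")
            elif hi is None:
                sos = butter(4, lo, "highpass", fs=fs, output="sos")
            else:
                sos = butter(4, [lo, hi], "bandpass", fs=fs, output="sos")
            bands.append(sosfilt(sos, x))
        return bands   # compress each band separately, then sum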
THE "DE-ESSER'' The de-esser is a special HF limiter that is used in vocal and speech recording to reduce the "splattering" effect of loud sibilant ("s" and "sh") sounds. Some singers and announcers have, for whatever reason, strong sibilants that can cause problems in HF overload in some recording chains. The primary frequency range of strong sibilant sounds is between 6 and 8 kHz, and the de-esser is designed to limit signals in that range. The threshold must be adjustable, but the attack and release times are normally fixed. A signal flow diagram is shown in Figure 15-5.
Figure 15-5. Simplified signal flow diagram for a "de-esser."
CONTROL FUNCTIONS ON COMPRESSORS AND LIMITERS Some operational suggestions for setting up and using compressors and limiters are given below:
Input level control This control is of limited usefulness since it basically interacts with the threshold control. If more compression is desired, it is best to achieve it by readjusting the threshold control.
Threshold control For a fixed signal input setting, advancing the threshold control will cause the device to go into compression at progressively lower input signal levels. This is a critical adjustment and should be set so that the onset of compression will occur just as the signal is tending to become too loud or prominent in the program.
Compression ratio This control determines the departure from natural dynamic relationships existing in the program input signal. Low compression ratios will not materially detract from natural program dynamics; high compression ratios can sound quite unnatural. An experienced ear is required in making this setting.
Attack time The general rule here is to use as short an attack time as possible without having it become audible as such. In fast moving music, short attack times may be more appropriate than in slower music.
Release time
This is perhaps the most subjective adjustment of all. It should be set so that no "breathing" or "pumping" becomes audible due to modulation of the program's noise floor by rapid gain changes.
Output level control This control merely sets the signal level which feeds subsequent devices.
Metering
The meter normally has two functions. One of these indicates the output program level, and this is useful in determining the maximum level through the device. The other function lets the engineer know how much gain reduction is employed at any given instant. Good engineering practice, and good taste, dictate that you should use no more action than necessary.
Stereo ganging
Many compressors can be ganged together to act in unison on a stereo program. This ensures that there will be no image shifting due to unequal gain changes between the stereo channels. In this mode, both VCAs are controlled by the same dc voltage. For surround sound applications you will need as many ganged compressors as there are channels in the system.
APPLICATIONS OF COMPRESSORS AND LIMITERS In recording, compressors have many uses, including the following:
Variations due to performer movements
A performer who tends to move toward and away from the microphone can produce wide variations in level. A properly adjusted compressor can smooth out much of this variation, resulting in a recorded track that can be more easily
processed later. Vocalists are likely to be the most problematic. The compressor should be inserted ahead of the input fader so that the engineer has wide control of overall level. The choice of compression ratio is a matter of taste; in general it should be as low as possible, while still accomplishing your desired purpose.
Variations in musical output Variations in the output of an electric bass can be easily smoothed by the application of gentle compression, thus providing an even and solid bass line. If the recovery time is long compared with the natural decay rate of the instrument, then the original character of the instrument will be preserved.
Adjustment of release time In the preceding example, if the recovery time of the compressor is fast compared with the natural decay of the instrument, then the timbre of the bass will be transformed into a sustained, organ-like sound, exhibiting little of the instrument's natural decay characteristic.
Heavy limiting A similar effect can be obtained by applying heavy limiting with as short a recovery time as possible to cymbals. Heavy limiting implies that the input signal is always above the limiting threshold, so that the program will have a very low dynamic range. The effect is bizarre and often sounds like cymbal tracks played backwards.
Voice-over activities
Voice-over compression is a method of causing background music or effects to diminish in level when an announcer speaks, allowing the level to return to normal when the announcer stops speaking. This is a special arrangement in which the signal to the side chain is derived not from the signal to be compressed, but rather from the announcer's signal. Details of this are shown in Figure 15-6.
Program compression
In many broadcast operations there is the need for long-term program compression in which programs from many sources need to be fairly well matched in overall loudness and consistency. This is a tall order for many compressors, in that the signals should ideally be first adjusted to the same reference level before compression takes place. There are special compressors (some including low-signal expansion to duck out noise) which address these specific problems, and they should be considered for these special applications.
Figure 15-6. A ducking circuit for voice-over activities.
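The essential point of Figure 15-6—that the side chain listens to the announcer while the gain change is applied to the program—can be sketched as below. The threshold, ducking depth, and release time are all illustrative assumptions.

    # Ducker: program gain is keyed by the announcer's level.
    import numpy as np

    def duck(program, voice, fs=48000, depth_db=-12.0,
             threshold=0.01, release_s=0.5):
        env = np.abs(voice)                  # crude level sensing
        a = np.exp(-1.0 / (release_s * fs))
        for n in range(1, len(env)):         # peak-hold with slow decay
            env[n] = max(env[n], a * env[n - 1])
        gain = np.where(env > threshold, 10 ** (depth_db / 20.0), 1.0)
        return program * gain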
NOISE GATES AND EXPANDERS
An expander is basically the inverse of a compressor; it is used to increase the dynamic range of an audio program rather than decrease it. The basic form this takes is downward expansion, an action in which low-level signals are made even lower. The noise gate is a typical example.
Operation of the noise gate is shown in Figure 15-7. The gain curve is shown at A. The device acts as a unity gain amplifier at high levels, and this is indicated by the diagonal line with slope of unity. As the input level is lowered, the gating threshold is reached, and the gain of the amplifier is reduced, thus lowering the level of any noise in the input channel. Both the gating threshold and the range of gating are adjustable, as are attack and release times.
Some models of noise gates provide for external gating, and this allows one signal to be gated on and off by another for special effects. For example, you could feed a steady signal, such as that of a wind instrument, through the direct path. The gating input could then be fed with a series of bongo drum beats. The output would then be a combination of the two, with the wind instrument being gated on and off with the envelope of the bongo drum.
Figure 15-7. Operation of a noise gate. Output curve (A); setting the gating threshold (B); plot of system gain during gating (C).
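In its simplest static form the gating action is just a level-dependent gain, as sketched below; the threshold and range values are assumptions, and a practical gate would also smooth the gain with attack and release time constants as in a compressor.

    # Static noise gate: unity gain above threshold, fixed attenuation
    # (the "range of gating") below it.
    import numpy as np

    def gate(x, threshold=0.01, range_db=-40.0):
        env = np.abs(x)
        gain = np.where(env >= threshold, 1.0, 10 ** (range_db / 20.0))
        return x * gain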
Chapter 16 REVERBERATION AND SIGNAL DELAY
INTRODUCTION Acoustical reverberation chambers, often referred to as "echo" chambers, date from the 1930s and were used primarily for special effects in motion picture production. Their earliest uses in the record industry date from the late 1940s. These early systems were monophonic, and natural stereo perspectives were difficult to achieve, even with a pair of mono chambers. During the late 1960s, digital hardware was developed that would ultimately simplify reverberation generation, and electronic signal delay (often called "time" delay) devices became commonplace. Today there are many excellent reverberation and delay devices that take advantage of lower cost hardware as well as advanced internal programming.
A REVIEW OF EARLY DELAY AND REVERBERATION TECHNOLOGY The reverberation chamber was discussed in Chapter 1. While these relatively small rooms could produce a fairly natural decay as such, they did not simulate the normal initial time gap at the listener between direct sound and the onset of reverberation. As pop music relied more and more on reverberation, engineers felt the need for creating the initial delay, and a number of methods were used:
Tape delay A tape recorder running at 30 ips with a record-playback head spacing of about 1.5 inches produces a tape-delayed signal of about 50 milliseconds. The system was clumsy, and tape reels had to be replaced every 15 minutes or so.
Delay tubes
For shorter delays, some engineers built delay tube systems. These consisted of pipes approximately 2 inches in diameter with a small loudspeaker driver at one end and a microphone at the other. It was important to include an acoustical termination at the microphone end in order to avoid reflections in the tube, and a well designed tube about 21 feet long could produce a delay of about 20 milliseconds. Eventually, narrow gauge coiled plastic tubing was used and the devices became relatively small.
Analog "bucket brigade" devices During the 1970s, several manufacturers used a Philips circuit element known as a charged coupled device (CCD) that stored instantaneous signal values. These signal values were sequentially clocked through the CCD, and a delayed signal was produced at the output. The systems were fairly noisy and required pre- and post-equalization to reduce the inherent noise level of the CCDs. These systems disappeared with the coming of digital technology. A number of mechanical "spring-type" reverberation systems were developed during the 1950s. These were virtually useless for critical studio applications, but they were very popular for use with electronic organs and other instruments. The first mechanical system to gain acceptance in the mainstream recording industry was the German EMT reverberation plate. It was introduced in the 1950s, and a stereo version followed shortly. Figure 16-1 shows a perspective view of the EMT 140 stereo model with one side panel removed. A Remote damper control Steel plate
Driving transducer Pickup transducers
Figure 16-L Perspective view of an EMT 140 reverberation plate.
234
Chapter 16
steel plate approximately 1 by 2 meters is mounted in a tubular frame, and its edges are undamped. The plate is driven into transverse vibrational modes, with multiple reflections taking place at its boundaries. When properly tensioned, the plate exhibits high modal density, w^ith especially good high frequency response. A moving coil driving transducer is located toward the middle of the plate, and two piezoelectric pickup transducers are positioned toward each end of the plate. On the back side (not shown in the figure) is a porous damping layer the same size as the steel plate which can be positioned over a range from about one-fourth inch to about 5 inches from the plate. Its purpose is to damp the acoustical field generated by the plate and allow its reverberation time to be adjusted. Figure 16-2A shows top and side views of the suspended plate, and typical reverberation time values for short, medium and long settings of the damping element are shown at B. The EMT units are still in use today, and as you progress in your recording career you will undoubtedly come across them. Regarding spring-based units, the Austrian AKG company introduced the very successful BX-20 model during the late 1960s. The unit consisted of two carefully constructed springs that had been "randomized" in order to diminish the effect of normal torsional modes. Driving and receiving transducers were placed at both ends of each spring. The springs were carefully looped over themselves and were long enough to generate remarkably uniform response with none of the "boinging" effect that had plagued all earlier spring systems.
Figure 16-2. Details of the EMT plate. Side and top views (A); typical reverberation times for low, medium, and high settings of the damping element (B).
INTEGRATION OF DELAY SYSTEMS AND BASIC REVERBERATORS
The reverberation systems we have discussed thus far all require the application of external signal delay in order to produce the most natural results. Figure 16-3 shows a basic plan for stereo. The intent here is to create a natural impression of ambience for a single-channel signal. Follow the direct path from the top and you will see that the direct signal appears only in the left stereo loudspeaker. The direct signal is delayed twice (Delays 1 and 2), and these are fed respectively to the left and right stereo loudspeakers. Finally, the signal to the stereo reverberator (R) is delayed (Delay 3), and the outputs of the reverberator are fed to the left and right loudspeakers.
Figure 16-3. Use of delay values in conjunction with an analog reverberation system.
If we were actually recording a sound source in a performance space, the signal at one of the microphones would resemble that shown in Figure 16-4A. However, we can create a reasonable facsimile of this, as shown at B. Properly delayed and adjusted in level, a single delay of a direct signal can create an impression of what happens naturally as shown at A. In the range of early reflections up to about 50 milliseconds there is a great deal of temporal masking taking place. We do not hear each reflection as such, and a single delay that produces approximately the same acoustical power as the early sound field will suffice if its relative level and delay are carefully adjusted.
Figure 16-4. Direct, early and reverberant fields. Sound picked up by a microphone in a large room (A); a simulation of A using discrete delay and a single-channel reverberator (B).
In application, the value of Delays 1 and 2 would be about 15 and 25 milliseconds respectively. The value of Delay 3 would be in the range of about 40 to 60 milliseconds.
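The plan of Figure 16-3, with the delay values just suggested, can be sketched as follows. The reverberator itself is left abstract (reverb_stereo stands in for any stereo unit), and the 0.7 and 0.5 mix levels are illustrative assumptions to be set by ear.

    # Figure 16-3 arrangement: direct left, two delays, delayed reverb send.
    import numpy as np

    def delayed(x, ms, fs=48000):
        d = int(fs * ms / 1000.0)
        return np.concatenate([np.zeros(d), x])[: len(x)]

    def ambience(mono, reverb_stereo, fs=48000):
        left = mono + 0.7 * delayed(mono, 15, fs)      # direct + Delay 1
        right = 0.7 * delayed(mono, 25, fs)            # Delay 2
        rl, rr = reverb_stereo(delayed(mono, 50, fs))  # Delay 3 feeds reverb
        return left + 0.5 * rl, right + 0.5 * rr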
DIGITAL REVERBERATION SYSTEMS
Figure 16-5A shows the control surface of a typical digital effects system, and a general signal flow diagram for the reverberation algorithm is shown at B. Digital reverberation units are designed around a number of delay paths that simulate the actual dispersion of sound in real spaces. Various delay settings can be chosen, as can spectral characteristics and a host of other factors. Most models today are stereo; that is, there are two inputs and at least two outputs (some models have four outputs). Even with a single input signal, the stereo models will produce two uncorrelated outputs, similar to a reverberation chamber with one loudspeaker and two microphones for pickup. As you will see, digital systems do not require the use of external delay devices, since those functions are present in the basic digital system.
Figure 16-5. Photograph of a digital effects system (A); signal flow diagram for the system's reverberation algorithm (B). (Data at A courtesy Lexicon)
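Commercial algorithms such as the one diagrammed above are proprietary, but the "multiple delay path" idea can be illustrated with the classic Schroeder structure: parallel feedback combs followed by series all-passes. All delay lengths and gains below are illustrative assumptions, not the Lexicon design.

    # Minimal Schroeder reverberator (mono in, mono out).
    import numpy as np

    def comb(x, d, g):          # feedback comb: y[n] = x[n] + g*y[n-d]
        y = np.zeros_like(x)
        for n in range(len(x)):
            y[n] = x[n] + (g * y[n - d] if n >= d else 0.0)
        return y

    def allpass(x, d, g):       # y[n] = -g*x[n] + x[n-d] + g*y[n-d]
        y = np.zeros_like(x)
        for n in range(len(x)):
            xd = x[n - d] if n >= d else 0.0
            yd = y[n - d] if n >= d else 0.0
            y[n] = -g * x[n] + xd + g * yd
        return y

    def schroeder(x):
        wet = sum(comb(x, d, 0.84) for d in (1422, 1491, 1557, 1617))
        for d in (225, 556):
            wet = allpass(wet, d, 0.7)
        return wet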
A typical high quality reverberation system today offers the user control over many variables, including the following:
Program choice The user may choose among programs that are specifically modeled on physical spaces, such as concert halls, houses of worship, small reverberation chambers, or even reverberation plates. Within each program there may be other options, such as simulated room size and diffusion.
Predelay This allows the user to delay the onset of reverberation, usually up to a value of 100 milliseconds or so, in order to simulate the early time gap in a physical space.
Early reflections Pre-echo delays and level setting give the user further flexibility in simulating early reflections.
Low and mid-frequency reverberation time These controls enable the user to select different decay rates for low and mid frequencies. The transition frequency between low and mid can also be chosen, giving the user flexibility in simulating spaces with specific absorption characteristics.
High frequency rolloff This option lets the user determine the frequency above which the reverberant decay is quite rapid.
Decay shape Normal decay is exponential, but other options may be useful. For example, the decay can be programmed to build up in level before it begins to decay. Such a program variation might be useful in pop or rock recording as a special effect.
Mode density Some programs are calibrated directly in terms of room size, and increasing room size will increase modal density.
Wet/dry mix Normally, the reverberant signal is fed back into the console and mixed into the program by the engineer. In some applications, the dry (unreverberated) signal can be balanced with the wet (reverberant) signal at the output of the reverberation unit.
SAMPLING REVERBERATION DEVICES
The traditional approach in designing a reverberation algorithm is to analyze what physically happens in a given space, and then model that space through delays and various feedback paths that simulate second and higher order reflections. The user has direct access to many of these variables, and a given program can be "fine-tuned" by the user as required.
A recent development is the sampling reverberator. In this approach, a room or performance space is actually sampled through the technique shown in Figure 16-6. A wide-range loudspeaker is placed on-stage and a set of spaced microphones are located in the audience seating area. A test signal is fed to the loudspeaker and picked up by each microphone. That data is recorded and converted into what is known as the impulse response of the room. In this case we have four such impulse responses, and, through a mathematical process known as convolution, an incoming dry signal can be reverberated as it would actually have sounded at each of the microphone locations. The approach has great promise, and typical models may be sold with a library of sampled spaces, including some of the world's great performance venues, both indoors and out. There is no reason to think that sampling technology will replace conventional reverberation algorithms; the two will certainly coexist.
Figure 16-6. The sampling reverberator. Method of gathering room reverberation impulse response (upper figure); use of impulse response to create a set of reverberated signals from a single input (lower figure).
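The convolution step itself is compact: given the measured impulse responses, the dry signal is convolved with each one. A sketch using scipy's FFT-based convolution (variable names are illustrative):

    # Apply a sampled room: one reverberated output per microphone position.
    from scipy.signal import fftconvolve

    def convolve_room(dry, impulse_responses):
        return [fftconvolve(dry, ir)[: len(dry)] for ir in impulse_responses]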
Chapter 17 SPECIAL TECHNIQUES IN SIGNAL PROCESSING
INTRODUCTION In this chapter we will discuss a number of signal processing techniques that do not fit neatly into the subject areas of the three preceding chapters. Some of the techniques are rather complex and may not be normally accessible to the engineer; however, the engineer should know how they work and what useful things can be accomplished with them. The following techniques will be discussed: phasing, out-of-band signal generation, pitch and tempo regulation, chorus generation, vocoders, stereo image manipulation and all-pass phase shift networks. The techniques discussed here have all had their basis in analog signal processing, although most of them now take advantage of digital technology.
PHASING (FLANGING)
Phasing is a technique dating from the 1960s. Originally, it was done by feeding two tape recorders the same signal and combining their outputs. Any variation in the speed of one machine relative to the other results in a time difference between the two outputs, and the recombined signal exhibits comb filtering, which can be made to vary over a wide range.
The basic phasing process is shown in Figure 17-1. At A, the term T represents the fixed delay of each tape recorder and is the time gap between record and playback heads. The term Δt represents the difference in delay between the two machines and is the net value of delay that causes the comb filter response. The value of Δt can be varied electrically by driving one tape recorder with an external ac power source whose frequency can be shifted around 60 Hz. Another method of varying Δt is for the operator to place a thumb on the flange of the tape feed reel, thus slowing it down. This practice gave rise to the term "flanging" and is synonymous with phasing.
The above techniques for phasing are cumbersome and introduce a fixed time delay into the signal path. So-called instant phasing is possible through
the use of a delay system whose total delay can be varied in small steps over a wide range, or through the use of a variable phase shift network. When direct and delayed outputs are combined, the effect is very similar to that using the two tape recorders. The sound of phasing is hard to describe. It is most effective on broad band program material, such as cymbals and snare drum. It produces a bizarre "swishing" sound as the peaks and dips move up and down the spectrum. On vocal tracks, the effect often has a "disembodied and ghostlike" quality.
Figure 17-1. Principle of flanging: Basic arrangement (A); creation of comb filtering (B).
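A sketch of "instant phasing" in the digital domain appears below: the input is summed with a copy whose delay is slowly swept, producing the moving comb filter of Figure 17-1B. The sweep depth and rate are assumptions.

    # Flanger: direct signal plus a swept-delay copy.
    import numpy as np

    def flange(x, fs=48000, max_ms=5.0, rate_hz=0.3):
        n = np.arange(len(x))
        sweep = 0.5 * (max_ms / 1000.0) * fs * \
                (1 + np.sin(2 * np.pi * rate_hz * n / fs))  # delay, samples
        idx = np.maximum(n - sweep.astype(int), 0)
        return 0.5 * (x + x[idx])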
ENVELOPE SHAPING An envelope shaper is basically a voltage controlled amplifier with a detector (rectifier) ahead of the control signal input. This enables the envelope of one instrument to be superimposed on another instrument. In the example shown in Figure 17-2, the main signal input is a steady tone, while the control input is a percussive sound. The detector extracts the envelope of the control signal and uses this to modulate the steady tone, giving it a percussive sound.
Figure 17-2. Principle of envelope shaping. Basic arrangement (A); input and output waveforms (B).
OUT-OF-BAND SIGNAL GENERATION It is possible to generate program-related signals both above and below the normal bandwidth of a program signal. These devices are useful both in the special effects department as well as in adding greater realism to older, bandlimited recordings.
High-frequency generation
Many times, a given track on a multitrack recording may be inherently band limited. For example, a vocal track may not extend much beyond 4 or 5 kHz. Any attempt to boost frequencies in the 8 kHz range will produce nothing, since there is nothing there to boost. The ear is relatively insensitive to pitch information at both the highest and lowest parts of the normal frequency range, and there are fairly direct methods for generating natural sounding harmonic or subharmonic signals at the extremes of a program channel. Obviously, such techniques provide an effect that no amount of equalization can accomplish.
Figure 17-3 shows details of a signal processor for generating an extra octave above 7.5 kHz. Program in the octave between 3.75 and 7.5 kHz is used to produce a synthetic top octave between 7.5 and 15 kHz. The distortion network in the side channel can be adjusted both in the degree of distortion it produces and the amount that is returned into the main channel. The distortion network also produces low-frequency components which, if allowed back into the main channel, would be quite audible, and the purpose of the high-pass filter is to remove these components.
Figure 17-3. Circuit for producing a synthetic upper octave.
Such devices are used mainly on vocal tracks in pop-rock recording, as heard in some of the early Steely Dan recordings. The first device of this kind to gain wide favor was the Aphex "Aural Exciter."
Low-frequency generation At the lower end of the frequency range the task is more difficult. What we want to do is halve the frequencies in the next-to-bottom octave to simulate
the bottom octave itself. The dbx Company designed the "Boom Box" for this purpose, and simplified details are shown in Figure 17-4. Subharmonic generation involves complex wave shaping, filtering, and gain control, as opposed to the relatively simple demands of high-frequency harmonic generation. The device has found wide application in discotheque installations, where the extra LF signal is greatly appreciated.
Figure 17-4. Circuit for producing a synthetic bottom octave.
PITCH AND TEMPO REGULATION
One of the characteristics of normal recording technology is that changing the playback speed of a recording, fast or slow, will alter both pitch and playing time by reciprocal ratios. That is, if you shorten the playing time to 0.8 its original value, you will raise the pitch by a factor of 1/0.8, or 1.25. There is no direct analogy between sound recording and cinematography, where an adjustment of frame rate provides a totally satisfactory manipulation of the time domain in the form of fast or slow motion.
By a slight stretch of the imagination, you can separate musical events into those that are perceived as unfolding in time, and those that are clearly perceived as functions of frequency. Elements of rhythm, the characteristics of vocal vibrato, and so forth, are clearly perceived as events in time. In fact, musical events that happen no more than 8 or 12 times per second can be clearly classified as events in the time domain. Events that occur more than, say, 20 to 25 times per second are perceived as having definite pitch and are clearly in the frequency domain. Those events between frequencies of 12 and 20 Hz fall into an indeterminate zone. What we would like is some means of treating the time domain information (less than 12 Hz) as a series of "frames" to be sped up or slowed down, while leaving the frequency domain alone. This would provide us directly with a tool for attaining a tempo change with no change in pitch—or for attaining a pitch change with no tempo change.
The earliest method for achieving pitch shift was with a complex rotating playback head assembly. This provided a means of "scanning" the tape, for-
ward or back, while the overall speed of the tape through the transport remained fixed. During the 1970s, digital technology simplified the process as shown in Figure 17-5. The basic method is shown at A. An incoming digital signal is sent to a ring buffer, which has a signal storage capacity of perhaps 50 milliseconds. As the signal enters the buffer, it is clocked through the system to the end of the buffer where it is dumped. The signal output of the buffer is via the arrow shown pointing upward. If the arrow moves around the buffer in counter-clockwise motion, it will raise the pitch of the incoming signal—just as the pitch of a fire engine siren will rise through Doppler shift as the vehicle approaches. The opposite is also true; that is, scanning the buffer in a clockwise direction will lower the pitch.
Audible artifacts in pitch and tempo regulation What happens when the scanning arrow reaches the end of the buffer? It instantly goes to the other end of the buffer and begins its path over again. When this commutating action takes place, there may be a slight audible "bump" as the signal waveform is cut and re-spliced. Some newer pitch regulators use "intelligent splicing" to minimize commutation effects. In these models, the delay line is examined ahead of the signal output in order to determine a favorable edit point before the end of the line has been reached.
Tempo regulation Tempo regulation is not a real-time activity and requires the playback of an existing digital recording. It is a variation of pitch shifting and is shown in Figure 17-5D. If you want to increase the playing time of a program by, say, a factor of 1.25, you slow down the playback rate by that amount through a variable clock setting. The output signal is then raised back to its original pitch by the reciprocal factor of 1/1.25, or 0.8. Tempo regulation is often used in radio and TV commercials, where timelimited advertisements may call for speeding up of spoken segments. Its use in music is far more limited, and it works best when used the least.
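The ring-buffer method can be sketched in a deliberately naive form, shown below: the write pointer runs at the input rate while the read pointer runs faster or slower, and the discontinuity at each wrap is precisely the commutation "bump" described above (which real units hide with crossfades or intelligent splicing). The buffer length is an assumption (about 50 ms at 48 kHz).

    # Naive ring-buffer pitch shifter; wrap artifacts intentionally left in.
    import numpy as np

    def pitch_shift(x, ratio, buf_len=2400):
        buf = np.zeros(buf_len)
        out = np.zeros_like(x)
        read = 0.0
        for n in range(len(x)):
            buf[n % buf_len] = x[n]            # write pointer, fixed rate
            out[n] = buf[int(read) % buf_len]  # read pointer, variable rate
            read += ratio                      # >1 raises pitch, <1 lowers
        return out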
COMMERCIAL PITCH SHIFTERS These devices usually contain three or more pitch-shifting sections and are calibrated in musical intervals. They often go by such trademarked names as Harmonizer, Vocalizer or Vocalist. These devices, which are available as stand-alone units or as digital plug-ins for workstations, literally enable soloists to sing duets with themselves. Figure 17-6 shows a commercial model made by Digitech designed for workstation use.
Figure 17-5. Pitch and tempo shifting. Principle of pitch shifting (A); ring buffer scanning for raising pitch (B); ring buffer scanning for lowering pitch (C); principle of tempo regulation (D).
Figure 17-6. Pitch shifting in post production. Note the musical keyboard for data input. (Figure courtesy Digitech)
CHORUS GENERATORS
The purpose of a chorus generator is to take a single vocal or instrumental track and make it sound as if there were more than one performer. A characteristic of any ensemble performing the same melodic line is that there will be very slight pitch, amplitude, and timing randomness among the players. These elements create a pleasing chorus effect, offering some of the natural advantage of a large performing group over a smaller one. The diagram shown in Figure 17-7 provides a simulated chorus effect by combining a number of signals, each individually subjected to slight but random delay modulation for creating small pitch and timing randomness, as well as randomness in amplitude (gain). The method cannot make a single voice sound like a large chorus, but it can make a small group of voices sound like a much larger group. Processors of this sort have been used extensively in electronic organs to create a natural ensemble effect. Additional processing, not shown here, can result in multichannel output, providing spatial randomness as well.
Figure 17-7. Circuit for creating chorus effect from a single source.
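The structure of Figure 17-7 can be sketched as several delayed copies of the input, each with a slowly modulated delay and its own gain, summed with the original. All modulation rates, depths, and gains below are assumptions.

    # Chorus: sum of randomly modulated delay taps.
    import numpy as np

    def chorus(x, fs=48000, taps=5, seed=0):
        rng = np.random.default_rng(seed)
        n = np.arange(len(x))
        y = x.copy()
        for _ in range(taps):
            base = rng.uniform(0.015, 0.035) * fs    # 15-35 ms base delay
            rate = rng.uniform(0.1, 0.5)             # slow delay sweep (Hz)
            depth = rng.uniform(0.001, 0.003) * fs   # 1-3 ms wobble
            d = base + depth * np.sin(2 * np.pi * rate * n / fs)
            idx = np.maximum(n - d.astype(int), 0)
            y += rng.uniform(0.2, 0.4) * x[idx]      # per-tap gain
        return y / taps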
VOCODERS
A vocoder is a VOice CODER, a complex signal processor that analyzes speech into its component parts and synthesizes it again out of these components. In the process of reconstructing speech from its components, substitutions can be made, and speech modulation can be added to a variety of musical or other non-speech sounds. Early work on the vocoder was carried out by Bell Telephone Laboratories as a means of reducing the data transmission rate for speech. The basic information rate of speech is fairly low, and if speech is broken down into its basic components it can be transmitted over narrow bandwidth channels for subsequent reconstruction. The basic layout of the classic vocoder is shown in Figure 17-8.
Figure 17-8. Simplified circuit for a vocoder.
The spectrum filter banks detect the presence of speech signals in their respective bands, and that information is converted into a group of dc control signals. These filter banks are responsible for detecting the formants of
speech, the basic vowel sounds such as "ah," "ee," "oh," and so forth. The pitch/excitation analyzer determines the fundamental frequency of speech and the presence of various "noise" components of speech, such as hisses, buzzes, and plosive sounds (b, p, d, t, etc.). Upon reconstruction, the fundamental is regenerated and the filter banks in the synthesizer portion of the system are gated according to the amount of signal originally present in each band. The excitation function restores the various noise components as required. Depending upon how many band-pass filters there are, and how sophisticated the pitch tracking and excitation functions are, vocoders can synthesize speech very naturally.
More recent designs of the vocoder allow a wide variety of effects useful in electronic music and recording. The more sophisticated systems contain many filter banks for more accurate analysis and synthesis of the vocal spectrum. In addition, the pitch/excitation function provides a pitch offset function that allows a man's voice to be transformed in pitch so that it may sound like a child's voice. Further, the pitch tracking may be replaced by an external source, thus enabling a wide-band signal, such as noise or even music, to be speech-modulated! The vocoder is widely used in cinema work, where it seems ideal for special effects in futuristic films.
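A bare-bones channel vocoder along the lines of Figure 17-8 can be sketched as follows: band-pass the speech, detect each band's envelope with rectification and low-pass filtering, and use those envelopes to gate the same bands of a carrier (noise, music, and so on). The band count and smoothing frequency are assumptions, and the pitch/excitation functions are omitted for brevity.

    # Minimal channel vocoder: speech envelopes modulate a carrier.
    import numpy as np
    from scipy.signal import butter, sosfilt

    def vocode(speech, carrier, fs=48000, n_bands=12):
        edges = np.geomspace(100, 8000, n_bands + 1)
        length = min(len(speech), len(carrier))
        out = np.zeros(length)
        env_lp = butter(2, 30, "lowpass", fs=fs, output="sos")
        for lo, hi in zip(edges[:-1], edges[1:]):
            band = butter(2, [lo, hi], "bandpass", fs=fs, output="sos")
            env = sosfilt(env_lp, np.abs(sosfilt(band, speech)))  # rectify+LP
            out += sosfilt(band, carrier)[:length] * env[:length]
        return out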
STEREO IMAGE WIDENING Many times, an engineer will be asked to make a previously recorded stereo program sound more spacious. There are certain conditions that might call for this. A stereo program may have been blended by mixing the two channels together to some degree; a previous engineer may have thought that the program had too much separation. Vocals are sometimes mixed too heavily relative to the instrumental background, and the resulting recording may seem to be center-heavy. Many of these problems can be fixed, or at least alleviated. The basic method is to provide LF cross-feed between the two channels with signals of opposite polarity. The signal flow diagram in Figure 17-9 shows how this can be done, and the operation can be set up on any console that has polarity inversion switching on the input strips. Caution is advised; do not use any more of the negative cross-feed terms than absolutely necessary to produce a mild spreading of the stereo image. As a practical matter, phantom center program can be reduced by no more than about 3 dB, while negative polarity terms will appear in opposite channels down about 10 dB in level. Listen carefully!
Figure 17-9. Circuit for producing stereo image widening at lower frequencies.
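A sketch of the Figure 17-9 idea follows: a low-passed, polarity-inverted portion of each channel is fed into the opposite channel. The -10 dB cross-feed level follows the caution above; the crossover frequency is otherwise an assumption.

    # LF stereo widening via negative-polarity cross-feed.
    from scipy.signal import butter, sosfilt

    def widen(left, right, fs=48000, xover=500, feed_db=-10.0):
        sos = butter(2, xover, "lowpass", fs=fs, output="sos")
        k = 10 ** (feed_db / 20.0)
        return (left - k * sosfilt(sos, right),   # negative cross-feed
                right - k * sosfilt(sos, left))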
ALL-PASS PHASE SHIFT NETWORKS
A passive all-pass phase shift network is shown in Figure 17-10A. The lattice network provides flat frequency response but shifts the phase of the signal from 0 degrees at LF to 180 degrees at HF. Such a network can be used to change the crest factor of wide-band signals. For example, the signal shown at B clearly has higher peak values in the positive direction than in the negative direction. When this signal is passed through the network, it is modified as shown at C. The phase shifting does not affect the sound of the signal at all, but the resulting lowering of crest factor can be useful. For example, in broadcasting applications, the natural sound of many male announcers is rich in harmonics and has a fairly high crest factor. The crest factor may in fact be high enough to cause modulation problems which must be fixed. Rather than make use of limiting to contain the peak signal values, the insertion of an all-pass network in the chain may solve the problem far more simply, allowing an overall louder broadcast level for a given average signal level.
Figure 17-10. All-pass phase shifting. Passive lattice network for producing a phase rotation from 0 degrees at LF to 180 degrees at HF.
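A first-order digital all-pass can stand in for the analog lattice when experimenting with this effect; it has unity magnitude at all frequencies while its phase moves from 0 degrees at LF toward 180 degrees at HF, matching the behavior described above. The coefficient value is an assumption.

    # First-order all-pass H(z) = (a + z^-1)/(1 + a z^-1): flat magnitude,
    # frequency-dependent phase. Compare crest factor before and after.
    import numpy as np
    from scipy.signal import lfilter

    def allpass1(x, a=0.6):
        return lfilter([a, 1.0], [1.0, a], x)

    def crest_factor(x):
        return np.max(np.abs(x)) / np.sqrt(np.mean(x ** 2))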
Chapter 18 FUNDAMENTALS OF STEREO RECORDING
INTRODUCTION
Stereophonic sound, or stereo as it is usually called, refers to any system of recording or sound transmission using multiple microphones and loudspeakers. Signals picked up by the microphones are fed to loudspeakers placed in a geometrical array corresponding to the pickup zones of the microphone array. Many of the spatial aspects of the recording environment are preserved, and the listener can perceive the spatial perspectives of the original performance. Stereo is not limited to two channels. Motion picture systems have included six channels or more, but for home use stereo has traditionally been limited to two channels. Serious studies of stereo were undertaken in both England and the United States during the early thirties, but it did not become a commercial success until the advent of multichannel motion pictures during the 1950s. About that same time, two-track stereo tapes for home playback became available, and in 1957 the stereo LP was introduced.
COINCIDENT MICROPHONE ARRAYS FOR STEREO
One of the first stereo microphone arrays was developed by Alan Blumlein in the early 1930s. It consisted of a pair of figure-eight microphones positioned one on top of the other and rotated 90 degrees with respect to each other. This so-called "Blumlein array" is shown in Figure 18-1. Both positive lobes of the crossed microphones are set at 90 degrees, and the arc between the positive lobes is aimed toward the players. For positions 1 and 3, sound localization will clearly be at the left and right loudspeakers since each of the sources lies along the major axis of one microphone and the null plane (at 90 degrees) of the other. A source positioned at 2 will be picked up equally by both microphones and will appear in the center of the stereo array. The pressure amplitude of the sound picked up by each microphone will be 0.7 (-3 dB) relative to on-axis, and this corresponds exactly to the output produced by a panpot set for center localization (refer to Figure 2-9).
Figure 18-1. The basic Blumlein crossed figure-8 array. (Sources 1, 2, and 3 lie across the front quadrant; side and back quadrants are indicated.)
All sources located within the 90-degree pickup arc of the Blumlein array will appear in stereo playback uniformly positioned according to their location along the pickup arc. In addition to the exact "panning" of sources positioned in the front quadrant, the Blumlein configuration has the following characteristics:
Opposite polarity side quadrant pickup
Since the two side quadrants are picked up by both positive and negative lobes of the figure-8 microphones, any sources of sound located at the sides will appear ambiguously in the playback array and should be avoided.
Reverberation pickup
Since reverberation may enter the microphone array more or less equally through the back and the two side quadrants, it will contain both in-phase and anti-phase signals. The result is a very natural reverberant pickup with just a hint of localization extending outside the loudspeaker array.

In actual use the Blumlein array requires careful judgement in positioning. It must be far enough from the ensemble to fill the front quadrant properly, but it must not be so distant that the direct-to-reverberant relationship suffers.
OTHER COINCIDENT ARRAYS

Crossed cardioids are often used in a manner similar to the Blumlein array. It is customary to spread the angle between the major axes out to perhaps 120 degrees in order to avoid too much pickup along the central axis of the array, which would produce a strong center, or monophonic, dominance. Supercardioid and hypercardioid microphones are often used in a similar manner, and these would be the natural choice in spaces that were too live for the Blumlein array. Examples of these alternate arrays are shown in Figure 18-2.
Figure 18-2. Other coincident arrays. Splayed cardioids (A); splayed supercardioids (B).
When cardioid, supercardioid, and hypercardioid microphones are used in a coincident array, it is common to splay them at the angles at which their overlap is -3 dB, relative to on-axis. This information is given in the table below.

Pattern            Total angle between microphones for -3 dB overlap
Bidirectional      90°
Cardioid           131°
Supercardioid      115°
Hypercardioid      105°
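These angles can be derived from the standard first-order polar equation p(theta) = A + B cos(theta) by solving for the angle at which the response falls to 0.707 of the on-axis value. The short sketch below (an illustration assuming Python with numpy; the A and B coefficients are common textbook values rather than figures quoted in this chapter) reproduces the table:

    import numpy as np

    # (A, B) in p(theta) = A + B*cos(theta); assumed textbook coefficients
    patterns = {
        "bidirectional": (0.00, 1.00),
        "cardioid":      (0.50, 0.50),
        "supercardioid": (0.37, 0.63),
        "hypercardioid": (0.25, 0.75),
    }

    target = 1 / np.sqrt(2)   # -3 dB relative to on-axis
    for name, (A, B) in patterns.items():
        half_angle = np.degrees(np.arccos((target - A) / B))
        print(f"{name:13s}: total splay angle = {2 * half_angle:3.0f} degrees")

    # bidirectional: 90; cardioid: 131; supercardioid: 115; hypercardioid: 105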
STEREO MICROPHONES

A number of compound microphones have been designed that include two capsule assemblies that can be individually adjusted in pattern and in their angular orientation. Such a design is shown in Figure 18-3. Stereo microphones of the kind shown here can be used for all coincident stereo pickup arrays. Most models have a control unit that allows pattern changes to be made remotely; capsule orientation is made manually at the microphone.
Figure 18-3. Cutaway view of the Neumann SM-69 stereo microphone. (Photo courtesy Neumann USA)
MID-SIDE (MS) PICKUP

Coincident microphones are very often manipulated by way of the MS technique. MS employs a directional microphone (M component) aimed straight ahead, emphasizing the middle of the performing group. A figure-8 microphone (S component) is oriented at 90 degrees so that its two lobes emphasize side pickup.
The outputs of the MS pair are recorded but are not listened to directly. They are processed through sum and difference circuits to produce left and right signals suitable for normal stereo listening. The basic process is shown in Figure 18-4. You can see at A that the M signal from the forward-oriented cardioid is added in-phase to each of the summing amplifiers. The side-firing S signal is added to the left summing amplifier and subtracted from the right summing amplifier. When the signals are combined in this way, the result is a net stereo pair of microphones as shown at B, one oriented left and the other oriented right.
Figure 18-4. MS recording. Microphone array setup (A); resolution into stereo (B).
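In modern practice the same sum and difference operations are trivial to perform digitally. The sketch below (illustrative Python, assuming numpy; the signals are placeholder noise buffers, not real microphone feeds) resolves an MS pair into left and right, and also demonstrates the mono compatibility discussed later in this section:

    import numpy as np

    rng = np.random.default_rng(0)
    m = rng.standard_normal(48_000)   # forward-oriented (mid) microphone
    s = rng.standard_normal(48_000)   # side-oriented figure-8 microphone

    left = m + s                      # sum amplifier
    right = m - s                     # difference amplifier

    # Summing the two stereo channels for mono cancels the S component
    # completely, leaving only the forward-oriented M pickup (Figure 18-7).
    mono = left + right
    print(np.allclose(mono, 2 * m))   # True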
When used in MS form, coincident microphones are extremely flexible. By varying the amount of the S component, the apparent width of the array can be altered electrically. Increasing the M component will narrow the stage, since M represents a mono center phantom image. Increasing the S component will widen the stereo stage, but too much S component will confuse the stereo imaging. A position control allows shifting the array left or right, as desired. These circuit modifications are shown in Figure 18-5, and a typical in-studio application is shown in Figure 18-6.

Figure 18-5. Description of width and position control in MS recording.

Figure 18-6. Application of width and position control in MS recording.
Figure 18-6 shows how a typical MS recording might be made. A main MS pair (1) provides the overall ensemble pickup. A secondary MS pair (2), which would be operated at a lower level and with reduced width, provides accenting of the middle of the ensemble. Using both the stereo width and position controls, MS pair (3) could be used to highlight that part of the ensemble located at stereo right. When the MS stereo signals are summed for monophonic presentation, the S signal drops out completely, leaving only the M signal, as shown in Figure 18-7. The mono compatibility is excellent, and many broadcast engineers prefer it for this reason. Regardless of the specific pickup method they may use, some recording engineers prefer to take their stereo channels, convert them to MS, and then
convert them back again to stereo. While in the MS domain, the M and S components can be separately adjusted so that the final stereo sound stage can be widened or narrowed as desired. This technique, shown in Figure 18-8, can be used for putting the finishing touches on the "spread" of a stereo recording.
Figure 18-7. Mono compatibility of MS recording. Left and right patterns; summation of left and right channels; resultant forward-oriented cardioid.
Figure 18-8. Inserting an MS matrix into a stereo chain.
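The sketch below illustrates the idea of Figure 18-8 in code form (a hypothetical Python illustration assuming numpy; the 0.5 normalization and the width values are conventions chosen here, not taken from the text): stereo is converted to sum and difference components, the S level is scaled, and the result is matrixed back to stereo.

    import numpy as np

    def adjust_width(left, right, width):
        """width < 1 narrows the stereo stage; width > 1 widens it."""
        m = 0.5 * (left + right)      # sum (M) component
        s = 0.5 * (left - right)      # difference (S) component
        return m + width * s, m - width * s

    rng = np.random.default_rng(1)
    L = rng.standard_normal(1_000)
    R = rng.standard_normal(1_000)

    # A width of 1.0 is transparent; the original stereo is recovered exactly.
    L2, R2 = adjust_width(L, R, width=1.0)
    print(np.allclose(L, L2) and np.allclose(R, R2))   # True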
Figure 18-9 shows the relationship between various MS, or sum-difference, microphone arrays and their equivalent XY, or stereo, forms.
Figure 18-9. Various MS arrays and their corresponding stereo (XY) arrays.
IMPLEMENTATION OF MS RECORDING

Implementation of MS normally requires a special sum-and-difference arrangement of transformers or amplifiers. Such systems are available as external processors often called "matrix boxes." If you are presented with an MS recording, and you do not have a playback matrix for it, you can improvise one at the console, as shown in Figure 18-10. This arrangement calls for a polarity inversion switch in one input channel, as shown.
Figure 18-10. Resolving an MS recording on a console.
NEAR-COINCIDENT STEREO MICROPHONE ARRAYS

Near-coincident stereo microphone arrays make use of a pair of directional microphones laterally spaced no more than about 1 foot (30 cm). The purpose of these arrays is to combine some of the excellent imaging of coincident arrays with the added sense of space or ambience that small displacements between microphones can produce. Since the 1970s, a number of specific techniques have been described in detail.
ORTF technique
Developed by the French Broadcasting Organization, the ORTF technique employs a pair of cardioid microphones spaced about 7 inches (17 cm) apart and angled at 110 degrees. In detailed listening tests the system has rivaled the Blumlein array in localization acuity, while at the same time offering immunity to excessive reverberation. The array is shown in Figure 18-11A.
NOS technique
This approach was developed by the Dutch Broadcasting Organization and employs a pair of cardioid microphones spaced about 12 inches (30 cm) apart at an angle of 90 degrees. It has many of the advantages of the ORTF array, but with a shade more ambience. The array is shown in Figure 18-11B.
Faulkner technique
This approach, employing a pair of forward-facing figure-8 microphones spaced about 8 inches (20 cm) apart, was first described by a British recording engineer during the 1970s. It works best in the recording of smaller groups in fairly live spaces. The figure-eights provide good rejection of excess reverberation, and the primary use of delay cues for localization gives a slightly "soft-edged" quality to the stereo imaging. You may find that widening the spacing between microphones will enhance separation—but in no case should the spacing exceed about 2 feet.
Figure 18-11. Various near-coincident arrays. ORTF, 17 cm (6.7 in) (A); NOS, 30 cm (11.8 in) (B); Faulkner, 20 cm (7.9 in) (C).
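The "added sense of space" of these arrays comes partly from small interchannel arrival-time differences. A quick calculation (illustrative Python, not from the text, taking the speed of sound as roughly 343 m/s) shows the maximum delay each spacing can produce for a source fully to one side, which is well inside the fusion range described in the next section:

    SPEED_OF_SOUND = 343.0   # m/s, approximate value at room temperature

    for name, spacing_cm in (("ORTF", 17), ("Faulkner", 20), ("NOS", 30)):
        delay_ms = (spacing_cm / 100) / SPEED_OF_SOUND * 1000
        print(f"{name:8s}: {spacing_cm} cm spacing -> max delay ~{delay_ms:.2f} ms")

    # ORTF    : 17 cm spacing -> max delay ~0.50 ms
    # Faulkner: 20 cm spacing -> max delay ~0.58 ms
    # NOS     : 30 cm spacing -> max delay ~0.87 ms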
SPACED-APART MICROPHONES FOR STEREO PICKUP

The logical starting point for developing spaced-apart stereo microphone placement is shown in Figure 18-12. Here, we have a horizontal line of many microphones in front of a performing group. Each microphone feeds only to its corresponding loudspeaker in the playback space. As the figure clearly shows, there will be accurate wavefront reconstruction in the playback area, and an observer in that area should be able to hear stage events just as if that listener were located in the original recording space.
Figure 18-12. Waveform reconstruction using a horizontal line of microphones.
This is all fine in theory, and I have heard one instance where 16 microphones, spaced about a foot apart, were placed in front of a big jazz band. The recording was played back over 16 loudspeakers placed directly below the microphones. While the effect was certainly interesting, it was not all that accurate and hardly justified the expense required to carry it out. (Actually, the problems were more timbral than spatial; you heard instruments where they were located, but the sound quality suffered.)

Taking Figure 18-12 as a starting point, the number of channels has been reduced to three, as shown in Figure 18-13. While this is far from ideal, it works much better than you might think. You do not normally hear "three distinct sound pulses" as the figure indicates, because the relative time delays between them are fairly short, well within the range defined by Haas (1954). Haas' experiments measured what had long been referred to as the "law of the first wavefront," which states that localization will tend toward the direction of the earlier arriving sound. Over a range of about 20 to 25 milliseconds, the ears will not normally detect a delay as such, and sounds arriving over that short interval will usually coalesce into a single impression for the listener.

As motion pictures adopted stereo during the 1950s, the three-channel approach shown here was used. Three-channel magnetic tracks were adopted by the recording industry at the same time, and when the stereo LP was introduced in 1957 the basic three-channel recordings were mixed down to two channels by feeding the center channel equally, and at slightly reduced level, to the left and right channels.
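That mixdown is simple to express in code. The sketch below (a hypothetical Python illustration assuming numpy; the exact trim is not specified in the text, and -3 dB is used here because it preserves the center channel's acoustic power across two loudspeakers) folds a three-channel recording down to two:

    import numpy as np

    def three_to_two(left, center, right, center_trim_db=-3.0):
        """Fold L/C/R down to stereo: center fed equally to both sides."""
        g = 10 ** (center_trim_db / 20)     # dB trim to scalar gain
        return left + g * center, right + g * center

    # A center-only source appears equally, and in phase, in both channels:
    c = np.ones(4)
    L, R = three_to_two(np.zeros(4), c, np.zeros(4))
    print(L[0], R[0])   # both ~0.708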
Figure 18-13. Waveform reconstruction using three spaced microphones.
This approach pretty much defined stereo recording in both popular and classical fields and has been the basis of that art up to present times. The variety of current spaced-apart microphone techniques for stereo is shown in Figure 18-14. The technique shown at A is the direct use of the original three-channel approach we have just discussed. The approaches shown at B and C combine the advantages of spaced and near-coincident techniques.

The so-called Decca tree shown at D was developed by the British Decca company in the early days of stereo and remains a popular approach today. In normal application, it specifically calls for five Neumann M-50 microphones, which are omnidirectional at LF and MF, but become increasingly directional at HF. The tree itself consists of three microphones aimed left, center, and right. The middle microphone is center-panned in the stereo mix while the other two are panned hard left and right. The center microphone is normally introduced into the mix at a level some 3 to 4 dB lower. The two outrigger microphones are spaced as best fits the music and are mixed hard left and right at a somewhat lower level than the microphones on the tree.

When two spaced omni microphones are used to pick up a large group, there is a strong tendency to spread them too far apart. If they are farther apart than about six feet, then there will be a "hole" in the middle of the stereo stage. This can be easily filled in by using a center microphone of the same pattern and panning it in the center of the stereo stage. You will probably find that it will sound best when it is mixed in about 3 or 4 dB lower than the outrigger microphones.
Figure 18-14. Spaced microphone techniques for stereo. Three spaced omnis (A); two spaced omnis with center ORTF pair (B); two spaced subcardioids with near coincident array (C); Decca tree with outriggers (D).
Chapter 19

STUDIO RECORDING AND PRODUCTION TECHNIQUES
INTRODUCTION

Most studio recording activities today are done multitrack and are intended for postproduction for record, video, or film formats. A professional, competitive studio has a variety of working spaces adjacent to the control room, including vocal booths and possibly larger isolation areas. The main room itself will often have variable acoustics, ranging from very dead, or damped, to relatively live. Isolation is a prime requirement in laying down useful tracks, and this can be accomplished through proper microphone placement and the use of isolating elements.

In this chapter we will cover specific microphone techniques and procedures that are in everyday use in the industry. The techniques discussed here assume that all basic tracks are stored on multitrack recorders, with signal processing taking place only in the monitor mix and in later postproduction mixing operations. We will also assume that there are headphone monitoring facilities for the players, if required. A very important part of this chapter will be discussions of what tracks to lay down—the basic decisions of what will be useful and necessary in postproduction.
SOME ISOLATION TECHNIQUES

Figure 19-1A shows a wall construction detail you will find in many studios. Individual sections can be reversed as shown to expose either a reflective or absorptive surface, depending on the need. The view at B shows a studio with both live and damped areas. The damped area shown at the upper right would be an ideal location for an acoustical bass or a drum set as part of a small jazz group. A string section might sound better in a more reflective part of the studio, as shown at the left portion of the figure.
Figure 19-1. Isolation in the studio. Wall detail with reflective and absorptive surfaces (A); live and dead areas built into the studio (B).
There are two kinds of isolation requirements: relatively soft performers, such as vocalists, need to be isolated so that unwanted sounds will not enter their microphones; loud performers, such as heavy brass and percussion players, may need to be isolated so that their sound will not interfere with softer performers. Often, both kinds of isolation must be used together. Many groups, such as small jazz combos, are pretty much self-balancing and require no special treatment. On the other hand, a jazz big band will produce levels that will easily swamp out an acoustical bass or vocalist. This is where isolation booths and direct pickups come in very handy, and you will see how these options are chosen as we proceed with this chapter.
TRACK LOGISTICS

In preparing for a tracking session, be aware that you are establishing a plan that will follow the project through to the end. Allocate recorder tracks carefully, and combine two or more microphones into stereo pairs when that makes good sense. Make sure that your machine operator is keeping an accurate log entry for every take. You will be making a stereo monitor mix as you go, and control room playback will normally be from that source.

Analog tracking will give you a maximum of 24 tracks. By syncing two machines you can reach a maximum of 48 tracks if there are separate time code tracks. This is a complex procedure and is not casually recommended. Digital techniques allow the use of multiple modular units, and track capability is virtually unlimited. Again, as things become more complex, the likelihood of logging mistakes and machine glitches grows. You will not likely be the chief mixer on a tracking session until you have done your apprenticeship as a second engineer. And at that stage you will learn the value of concentration and orderly bookkeeping.
PERCUSSION INSTRUMENTS: RECORDING THE DRUM SET

The complex nature of the drum set requires that it be recorded in stereo, often heavily "spotted" with microphones. The modern drum set consists of the following elements, played by one person:

1. Kick drum (played by the right foot).
2. Snare drum (played with either sticks or wire brushes).
3. "High-hat" cymbal set (played with sticks or brushes and left foot).
4. Two "ride" cymbals (played with sticks or brushes).
5. Two or more tom-toms (played with sticks).

The arrangement and actual number of individual elements in the set may vary from player to player, but the setup shown in Figure 19-2 is typical. Figure 19-3A shows a basic stereo setup that would be appropriate for a jazz trio or quartet. There is an overhead spaced stereo pair along with a single kick drum microphone. The overhead pair will normally be small format condenser cardioids, and the kick drum microphone may be either a dynamic or a condenser. A number of fairly low cost dynamics have been designed specifically for kick drum use, and they are characterized by a slight LF rise in response as well as low distortion at high levels.
Figure 19-2. A basic drum set (kick drum, snare drum, hi-hat cymbals, toms, and cymbals).
Figure 19-3. Basic pickup of the drum set (A); a more complex pickup (B). Overhead condenser pair about 7 feet high; dynamic microphone on the kick drum.
Try both microphone types and, if you have the track capability, record them both. While some engineers prefer a coincident pair for overheads, most use a spaced pair as shown, since it produces a wider stereo image. In larger jazz groups, or where the drummer is to be featured, it will be necessary to add a number of spot microphones, as shown at B. Use any or all that you, the producer, and the player deem necessary. You should also keep
in mind that recording the drum set can take up many tracks, so lay out a plan for track assignments before you start. In some cases you will be able to combine certain microphones—as long as you keep in mind that, once grouped, they cannot be separated. Internal microphones should be located close to their respective sources, but they should be placed away from the player so that they will not interfere with the player's movements or be struck with a stick. Families of small clip-on microphones, as shown in Figure 19-4, are available from many manufacturers and are useful in recording the drum set.
Figure 19-4. A small electret microphone that can be clipped to the rim of a drum. (Courtesy AKG Acoustics)
When closely picked up, spurious resonances in drum sets that would normally not be a problem become a matter of concern. Drummers are well aware of these problems, and they will usually solve them during set-up before the session starts. In fact, drummers are usually the first to arrive at a session so that these details can be worked out.
OTHER PERCUSSION INSTRUMENTS

Among the tuned percussion instruments normally encountered are the xylophone, marimba, and vibraphone. Figure 19-5 shows a suggested stereo pickup for these instruments. Many engineers choose hypercardioid microphones
for this application, aiming the major lobes of the microphones at the low and high extremes of the instrument. These instruments should be allocated a pair of tracks, or mixed into a stereo pair along with other instruments if the musical role is secondary.
Figure 19-5. Suggested pickup of mallet instruments.
Most of the non-tuned percussion instruments used in studio recording are fairly small, and they tend to radiate evenly in all directions. Cardioid microphones placed 2 to 3 feet overhead will usually give excellent results. Very often, a single player will be asked to perform on a number of these instruments, and the player should be made aware of what your particular pickup requirements may be. Many Latin percussion instruments (maracas, claves, gourds) are often picked up at short distances when played at moderate levels. Determine beforehand whether they should be recorded as a stereo pair.
THE PIANO

Unless you are doing a classical date, you can assume that the piano will be picked up fairly closely, as shown in Figure 19-6. Details at A and B show the piano picked up in stereo and recorded to a pair of tracks. For better isolation, position the instrument so that its open side points away from the other players in the studio. If the studio is small, and if ambient music levels are high, it may be necessary to record the instrument on half-stick and covered with a heavy blanket, as shown at C. Most engineers prefer to record the piano using large format condenser cardioids; however, omni condensers may produce a somewhat warmer sound. Experiment with both. You may find that considerations of isolation will favor the cardioids.
Figure 19-6. Recording the piano. Top view (A); front view (B); on half-stick with blanket covering the opening (C). Pan microphone 1 left of center and microphone 2 right of center for normal stereo perspective; both microphones about one foot above the strings.
VOCALIST

If a vocal track is being laid down with a band, it is best to keep the vocalist in the studio—if you can get enough separation. Put the vocalist in a booth only as a last resort, or if the vocalist wishes to be there in the first place. The setup shown in Figure 19-7 is standard, and the microphone height should be set so that it will accommodate the vocalist standing or sitting. You will need a dedicated headphone mix for the singer, and it should include whatever instrumental tracks the singer desires, along with the singer's track—and with reverberation. Do not compress the vocal headphone feed, since it will only confuse the singer.
Figure 19-7. Recording a vocalist. Side view (A); top view (B). Microphone about 20 to 24 inches from the vocalist; goboes with see-through upper sections.
Traditionally, vocalists have gravitated toward large format condensers, and they usually prefer the older tube models. There is a lot of mystique involved here, and you should do everything necessary to satisfy the singer. Professional performers will not attempt to use a handheld microphone, but you can expect just about anything from amateur performers. Some vocalists will prefer a so-called vocal microphone, perhaps one of their favorites used in stage performances. At times like these you'll have to rely on advice from the producer and the artist's manager. Use goboes as shown if you need greater isolation. If the vocalist has any tendency to "pop" b's and p's, use a mesh pop screen, which you can attach directly to the microphone. These devices are virtually transparent to sound and they do a very good job of reducing wind noises.
THE BASS (BASS VIOL OR ACOUSTICAL BASS)

There are a number of methods of recording the acoustical bass:

1. Microphone placed on a low floor stand.
2. Nesting a microphone between the tailpiece and body of the instrument.
3. Microphone in front of the amplifier-loudspeaker unit.
4. Direct output from an instrument pickup located on the instrument's bridge.

Methods 1 through 3 are shown in Figure 19-8. Picking up the bass via a microphone on a floor stand as shown at A is generally preferred because it picks up finger articulation on the string as well as acoustical output from the body of the instrument. In jazz recording, both of these ingredients are important. The method shown at B is useful under certain live performance conditions where the instrument may be moved around to some degree. The fixed position of the microphone relative to the instrument will maintain a fixed pickup level from the instrument. The method shown at C picks up sound from a loudspeaker and may be subject to noise and hum, as well as any distortion generated by the system.

A direct line-in to the console from the instrument's built-in pickup is very useful in that it provides complete isolation from any other sound sources near the bass. The sound resembles that of a solid body bass under direct-in conditions. Make sure you have a good active direct box for this purpose. If you have the track capability, I suggest that you use methods 1 and 4. If you are restricted to a single track, record a mix of the two methods which you, the producer, and the performer agree on.
Figure 19-8. Recording the acoustic bass. Floor microphone (A); microphone mounted on instrument (B); picking up sound from the amplifier-loudspeaker (C).
THE ACOUSTIC GUITAR

The acoustic guitar is a relatively soft instrument and may need to be closely baffled to achieve the necessary isolation for microphone pickup. Figure 19-9 shows a typical spaced stereo pickup. In some cases you can achieve a stereo pickup by using one microphone (left channel) and using the instrument's direct-out (right channel). If you opt for this method, place the microphone directly in front of the instrument. As with the bass, use an active direct box.
BRASS INSTRUMENTS

Pickup of individual brass instruments is shown in Figure 19-10. When played loudly, all brass instruments produce considerable harmonic development, and the higher harmonics are quite directional along the axis of the bell. A microphone placed about 3 feet in front of a trumpet may pick up an unnaturally bright sound. You may want to roll off the HF a bit, move off-axis, or select a microphone whose HF response is rolled off. Older model ribbon microphones are quite popular for this application.
Figure 19-9. Recording the acoustic guitar. Distance 8-15 inches; spaced microphones about 15-30 inches from the guitar body, with a direct out from the instrument.
The players themselves often prefer these older microphones, since the recorded sound is more like what they hear while playing. Don't hesitate to experiment with various models and types of microphones. The French horn should never be picked up along the bell axis. As shown in the figure, place a microphone at 90 degrees above the bell, pointing downward. Very high frequencies diminish quickly at 45° off-axis.
Figure 19-10. Recording brass instruments. Trumpet (A); trombone (B); French horn (C). Sound on-axis is very bright; consider using a ribbon microphone. Baffle no closer than about 40 inches. For the horn, a cardioid microphone is aimed downward from above, approximately 90° off the axis of the bell, at a distance of 40 to 80 inches.
WOODWIND INSTRUMENTS

Sound radiation from the woodwinds is fairly complex due to the patterns of tone holes on the bodies of the instruments. Figure 19-11 shows some of the techniques that are used. In studio recording there should be no problems in putting a microphone just where you want it. However, in many live tracking situations it is customary to clip a small wireless microphone onto the bell of the instrument. If this is a requirement, then be prepared for a sound that will require a lot of equalization in postproduction.
Figure 19-11. Recording woodwinds. Clarinet-oboe (A); flute (B); saxophone (C). Microphone distances of about 12 to 20 inches; the positions marked range from good balance to "breathy" (flute) or bass heavy (saxophone).
STRING GROUPS

Because they are relatively soft instruments, strings are almost always used in multiples, and a natural stereo pickup is traditional. In popular recording, they are used primarily in large studio orchestras with the following quantities:

6-8 players, 1st violins
6-8 players, 2nd violins
4-6 players, violas
4 players, cellos

There is usually one bass, and it is picked up separately from the rest of the ensemble. Figure 19-12 shows details of picking up the string ensemble.
Figure 19-12. Recording a string ensemble.
In arraying the players, allow a space about 5 feet on a side for each stand (pair) of violins and violas. Allow a space about 6 feet on a side for each stand of cellos. The height of the microphones over the instruments is fairly critical. If it is too close, the sound will be edgy. At greater distances, these problems are solved, but we then face the added problem of excessive leakage into the string microphones from the louder instruments in the studio. Many times the best way to solve this problem is to record the louder instruments on one date, adding the strings later through multitrack overdubbing. Some large studios have the luxury of an adjacent space large enough to isolate the strings from louder instruments, while allowing eye contact through large windows directly into the control room. Headphone monitoring is essential in these cases.
ELECTRONIC AND AMPLIFIED INSTRUMENTS

In this section, we will deal with all instruments whose sound is generated or amplified for presentation over a loudspeaker. These include solid body guitars of all types as well as synthesizers. In most cases a feed directly from a pickup on the instrument can be used. It is essential that your direct feed include any signal processing that takes place in the instrument's electronics. If the instrument has stereo outputs, be sure to record both of them. In some older guitar amplifiers the processed output may only be available at the loudspeaker. Try tapping the signal off the voice coil leads; if this works, then use it. If not, you must place a microphone at the loudspeaker, as discussed earlier.
SESSION PLANNING AND TRACK ALLOCATION

Remember that the final mix will be made later. New musical ideas may come along in the meantime, and the producer and artist may want to experiment. Plan the session and identify the separate tracks that you know will be needed. Try to avoid "running out" of tracks by using the largest recording format that the budget will allow. The following examples illustrate these points.
SMALL JAZZ GROUPS

Assume that a typical jazz trio of piano, bass, and drums is to be recorded for stereo release on compact disc, and possibly in surround as well. Even at this point, we must have some conception of what the stereo stage will be, and both producer and artist will have ideas of their own.
A jazz trio
A good plan here would be to place the most prominent instrument in the center, with remaining instruments in flanking positions. A suggested studio layout is shown in Figure 19-13. The piano is picked up as a stereo pair and panned inward as needed so that it occupies the center one-third or one-half of the stereo stage. Drums will be picked up with an overhead pair as well as a microphone on the kick drum. The overhead pair can then be panned so that the left signal appears in the center of the stage, with the right signal at far right. The kick drum would best be panned center, for purposes of mono compatibility and FM broadcasting. The bass, picked up by a single microphone or by direct pickup, should be panned slightly in from the left in order to maintain overall musical and stereo stage balance.
Figure 19-13. Studio setup for a jazz trio.
The overall intent is to feature the piano, with both drums and bass as musical support functions. Ambience (room) microphones may be used if the studio is large enough to provide a good reverberant field that can enhance the recording. These microphones should be positioned fairly high and widely spaced in the studio and away from the players. While you may not be able to determine their absolute necessity during a tracking session, the mixing engineer may find them very valuable. In order to cover all postproduction possibilities, the track assignments might be as follows:

1. Piano close (left).
2. Piano close (right).
3. Bass (microphone).
4. Bass (direct in).
5. Drums (overhead, left).
6. Drums (overhead, right).
7. Kick drum.
8. Room pickup (left).
9. Room pickup (right).

Obviously we need 16-track capability, and this leaves additional tracks for future overdubbing. Or, you might want to put additional microphones on the snare drum and high-hat cymbals.

Ask yourself the question: Do we really need a 16-track recorder to make this recording? In many cases the answer may be no. During the early digital era, hundreds of jazz recordings of small groups such as this were made and mastered directly to stereo. In other words, the monitor mix was the final recording, and all balances made at the session were locked in. The rationale for doing it multitrack and remixing later is basically one of musical flexibility. A year after the CD is released, a film producer might want to use a portion of the CD in a film, and the multichannel original would be an advantage. And don't forget that a surround mix may be needed at a future date.
Piano, vocal, and bass
Here, the vocalist would be centered, with piano and bass moved out to flanking positions. Stereo reverberation is essential for the vocalist, but it should be subtle and in good perspective. A suggested studio layout is shown in Figure 19-14, and track assignments are:

1. Piano left.
2. Piano right.
3. Vocal (dry).
4. Bass (direct).
5. Bass (microphone).
6. Reverb return left.
7. Reverb return right.
Figure 19-14. Jazz trio with vocal.
MEDIUM-SIZE JAZZ GROUPS

A basic rhythm section for a medium size jazz ensemble consists of piano (or organ), bass, drums, and possibly guitar. Against this background, up to three wind instruments are often used in instrumental arrangements, or as backup for a vocalist. There are a number of stereo stage layouts for a group like this, and the selection of tracks must provide for them all. A suggested studio layout is shown in Figure 19-15, and track layouts are:

1. Drums (overhead, left).
2. Drums (overhead, right).
3. Kick drum.
4. Bass (direct).
5. Bass (microphone).
6. Guitar (microphone).
7. Guitar (direct).
8. Piano (or organ) (high).
9. Piano (or organ) (low).
10. Vocal.
11. Sax 1.
12. Sax 2.
13. Room ambience (left).
14. Room ambience (right).
Figure 19-15. Jazz group with seven musicians.
A larger group might consist of nine musicians: vocal, piano, bass, guitar, trumpet, sax 1, sax 2, drums, and synthesizer, as shown in Figure 19-16:

1. Vocal.
2. Piano (high).
3. Piano (low).
4. Bass (microphone).
5. Bass (direct).
6. Trumpet.
7. Sax 1.
8. Sax 2.
9. Guitar (microphone on body).
10. Guitar (direct).
11. Drums (overhead, left).
12. Drums (overhead, right).
13. Kick drum.
14. Synthesizer (left).
15. Synthesizer (right).
16. Room ambience (left).
17. Room ambience (right).
Figure 19-16. Jazz group with nine musicians.
The above plan allows piano, guitar, synthesizer, and drums to be picked up in stereo. This is important because they are all polyphonic sources; that is, they can all produce more than one musical line or sound at a time, making it possible to give them a natural "spread" on the stereo stage. The vocal, trumpet, and saxes are all monophonic sources, producing a single note at a time, and thus only need a single track each. With these track assignments the producer and mixer will have many options to choose from in creating final mixes.

This track listing also illustrates a problem that will occasionally arise: with 17 suggested tracks you are required to use a 24-track recorder. If the budget allows it, that's fine. However, if this creates a problem you'll have to remove a track. The obvious choice here is to go to a single bass track, creating a mix of microphone and direct pickup that you and the producer will feel comfortable with later on.
THE JAZZ BIG BAND

The usual makeup of this ensemble is:

4 trumpets
4 trombones
5 saxophones (1 baritone, 2 tenors, 2 altos)
1 bass
1 drum set
1 guitar
1 piano

This basic ensemble may be augmented in certain arrangements by added percussion players, French horn, electric organ or synthesizer, and possibly one additional player each in the wind instrument sections. The saxophone players may double on woodwind (clarinet, flute, or oboe), as required. We will focus our attention only on the basic ensemble. A typical studio layout is shown in Figure 19-17, and a suggested track layout is:

1. Drums (overhead) left.
2. Drums (overhead) right.
3. Snare drum.
4. High-hat cymbals.
5. Kick drum.
6. Piano left.
7. Piano right.
8. Bass (microphone).
9. Bass (direct).
10. Guitar (microphone).
11. Guitar (direct).
12. Sax 1.
13. Sax 2.
14. Sax 3.
15. Brass left.
16. Brass right.
17. Solo trumpet.
18. Solo trombone.
19. Ambience left.
20. Ambience right.
Figure 19-17. The jazz big band.
At the start of the session, the major challenges to the engineer are:

1. Optimizing levels to the multitrack recorder.
2. Setting up a monitor mix.
3. Providing headphone mixes for musicians.

These three tasks must be accomplished quickly. In most cases you will be working with the "home crew" in a studio, and they will be able to estimate settings based on previous sessions. You will very rarely be working "from scratch." Let them make the initial level settings to the primary multitrack recorder. This crew will also have a good idea of headphone routings and levels, and again give them that assignment. The monitor mix is your sole responsibility, and you can start working on it as soon as any musicians arrive before the session begins. As a starting point, you will know beforehand the stereo panning layout, and in most cases
this will be virtually the same as the left-to-right physical layout of the players in the studio. If you make a mistake or two, there is nothing to worry about. Set levels first, and then consider any reverb feeds that you will need. If there is a vocalist on the date, pay special attention to those requirements. It is imperative that the vocalist hear a near-perfect mix at the first playback break. Many large scoring studios have an auxiliary console located in the studio. This console has parallel feeds from your basic tracks, and an assistant engineer assigned to this job takes the entire responsibility of setting up phone feeds for the players.
THE LARGE STUDIO ORCHESTRA

The studio orchestra considered here is the typical ensemble that would be used for recording the score for a TV or motion picture soundtrack. While it appears symphonic in its overall size, the musical demands may vary from jazz performance to massed string writing in a classical context.
COMPOSITION OF THE STUDIO ORCHESTRA

A typical studio orchestra setup is shown in Figure 19-18. A group such as this would be used for film or video scoring and may be augmented with additional instruments as well as with a jazz rhythm section and possibly a vocal chorus. Synthesizers are very commonplace today. By symphonic standards, the string section is usually abbreviated to include the following:

8 1st violins
8 2nd violins
6 violas
5 cellos
1 bass viol
Figure 19-18. The large studio orchestra. (Legend: main microphones, spot microphones, direct ins.)
TRACK ASSIGNMENTS

Multitrack recording is the norm here, because of the demands for postproduction flexibility. Sound tracks are usually mixed with effects and dialog, and individual instruments may need to be emphasized. An important consideration here is how the tracks are to be used in postproduction. The engineer should know which elements will require the most flexibility in rebalancing, and ensure that those elements are on tracks of their own. Conceivably, the entire string section could be recorded on a single stereo pair of tracks, if the postproduction flexibility required only raising them and lowering them as a group. On the other hand, a lead vocal must remain isolated from background vocals, and certain critical rhythmic elements must remain separate. When mixed behind dialog or sound effects, the engineer often has to reach for musical details or thin out textures if they are to be easily heard. In live recording events, a pair of tracks may have to be assigned to pick up audience reactions.
VCA subgrouping, or automated fader subgrouping, will simplify the monitor mixing of a complex session with a studio orchestra. For example, the entire string ensemble, with all stereo and panning assignments, can be grouped under the control of a single fader. The rhythm tracks may be similarly grouped. Within such a grouping, individual microphone inputs may of course be adjusted as required.
ANTICIPATING BALANCE PROBLEMS

The single greatest difficulty in recording a large studio orchestra is keeping the sounds of the louder brass instruments from swamping out the softer string instruments. If there is a string isolation area, then you will have few problems; otherwise, you may have to resort to close string microphone placement. Avoid being so close that you run the risk of getting a strident sound; also, use the smoothest microphones you have—and don't hesitate to shelve out some HF above about 6 kHz. Generally, a competent arranger will help you avoid these problems.
KEEPING THE TAPE LOG SHEET

A poorly documented session can be a nightmare to sort out later. A log entry should be made just after a machine is put into record mode and should include the following:

1. Title.
2. Take number.
3. Time code reading at start.
4. Time code reading at end.
5. Identification of the take as a complete take, false start (FS), breakdown (BD), or any other abbreviations the producer and engineer wish to use.
6. Track content.

The recorders are not usually stopped after a false start, and the assistant should enter the new time code on the log sheet when the music commences, whether or not a new take number has been assigned. Data to be entered at the top of each log sheet should include: location, date, identification of artists, producer, and engineers, and any project numbers that are pertinent. A copy of the log sheet is usually permanently attached
to the tape box and becomes the official record of the session proceedings. Don't forget to label the tape reel, cartridge, or hard disc drive itself. Any backup copies of the session must be correctly labeled as well.

The recording engineer may occasionally ask the assistant how much time is remaining on the recording medium. With a knowledge of time code and the medium itself, the assistant should be able to estimate the time remaining to within a few minutes. Multitrack sessions require extra work in that the content of each track must be indicated on the tape log. All details of overdubbing and track combining must be scrupulously documented. The producer may or may not keep detailed notes, and may rely completely on the accuracy of the assistant engineer in this regard.
Chapter 20

CLASSICAL RECORDING AND PRODUCTION TECHNIQUES
INTRODUCTION

In this chapter, we will discuss the musical and technical factors involved in producing classical recordings for commercial release. We will discuss the selection of a recording venue, planning the sessions, placing the microphones in order to produce a desired recording perspective, and details of equipment and staffing.
THE COMMERCIAL RECORDING ENVIRONMENT

Role of the producer
As a professional classical engineer you will almost always be working with a producer. The producer may work directly for the record company, or be an independent agent engaged by the company for a given project. The producer's responsibilities may include any or all of the following:

1. Preparing a budget for the sessions and ensuring adherence to that budget.
2. Working with the artist or conductor in planning how the sessions will run. For example, does the artist or conductor feel comfortable only with long, complete takes, as in actual performance? Or, is the artist willing to work with numerous shorter takes?
3. Determining the sonic aspects of the recording. In practice, this is a joint responsibility of both producer and engineer, and many producers rely heavily on the advice and expertise of engineers they have successfully worked with. Included here are such important points as details of stereo placement of instruments and the balance of direct and reverberant sound pickup. It is essential that the producer and engineer have virtually identical conceptions of how a given recording should sound if the sessions are to be productive.
4. Musical supervision of the sessions. This involves studying the score with the artist and/or conductor well ahead of the sessions so that both will have the same goals. The producer communicates directly
with the conductor on stage via the talkback system and keeps detailed notes of which parts of the music have been covered during the course of the session.
5. Supervising all musical aspects of editing and post-production.

The producer represents the record company in matters involving the musicians union, and the producer often has to function as diplomat as well as drill sergeant during the sessions. Above all, the producer must remain cool and collected—and always in control of things. In many cases, the producer will have the sole authority to call for overtime in order to finish a recording.
The role of the engineer
The engineer has the following responsibilities:

1. Checking out and certifying recording venues. Such matters as ambient noise level, acoustical suitability, and physical comfort are covered here.
2. Taking responsibility for the performance of all recording equipment and directing the technical crew. Quick solutions of all technical problems encountered during a session are essential.
3. Translating the producer's sonic wishes into technical reality, through choice of microphones and their placement. In this area most producers are happy to leave the matter entirely in the hands of the engineer.
4. Performing all musical balancing functions at the console during the sessions.
5. Working with the producer, as required, in details of postproduction. (In many large companies, editing may be carried out by specialists working from scores previously marked by the producer.)

Like the producer, the engineer must be collected and respond quickly when technical problems arise. The engineer must know the equipment inside out and keep detailed setup notes so that a given microphone array and level settings can be accurately duplicated in the same venue at any later date.
STAFFING

There are normally three persons in the control room during a professional session: producer, engineer, and assistant engineer. The role of the assistant engineer is generally to keep the recording log, which relates the producer's
slate numbers with start-stop times on the recording medium, whether it is tape or disc. The assistant engineer should be able to take over for the engineer in case of any emergency.
STUDIOS VERSUS REMOTE RECORDING VENUES

Most classical recordings are made in remote recording venues. Orchestras normally prefer to record in their regular performance halls, but these are often unsatisfactory for the recording of large scale works. Specifically, the reverberation times in many halls are not long enough for orchestral recording. Reverberation times in the 2- to 2.5-second range are ideal. In the case of older halls with a proscenium and a deep orchestra shell, there are additional problems. The purpose of the shell is to project sound toward the audience during concerts. This may be a problem in recording, since the acoustical environment is different between the front and back of the stage. For recording, all orchestral players should be in the same acoustical space, and stage extensions are often used to move players to the front of the stage and into the house.

Among the spaces used for remote classical recording are churches, ballrooms, and a surprisingly large number of Masonic meeting halls throughout the United States. Most of the good rooms are fairly old, built when lots of concrete and plaster were used for interior surfaces. But many of these older locations are apt to be noisy, cold in winter, and hot in summer. The newer buildings are more apt to be comfortable, but they are likely to be acoustically inferior because of excessive acoustical absorption.

It is the engineer's responsibility to check out and certify remote venues, and the following are some of the points that should be considered:

1. Is the space acoustically appropriate? If it's too live, can it be partially draped to reduce reverberation time? If the room is too dead, can it be livened? (See Figure 20-13.)
2. If there is a stage, does it project well into the house, or will a stage extension be required?
3. Is the air handling quiet, or must it be turned off during actual recording?
4. What about various comfort factors and facilities for the musicians?
5. Can a control room be adequately set up close by so that the conductor or artist does not have far to walk? Do not forget to arrange for extensive damping materials to make the control room sufficiently absorptive.
6. What about external noise? Traffic around the building should be observed for at least one week, and any unusual patterns noted. Also consider any other activities that may be scheduled in adjacent spaces in the same building.
7. What about electrical service? Is it adequate and free from troublesome transient disturbances?

During a recording made in a remote location, it is essential that the assistant engineer keep a sharp ear open to extraneous noises of any kind and note them in the recording log. A private phone link between producer and conductor is essential since it enables sensitive conversation to take place. A video link between stage and control room may be desirable for larger projects.

A properly designed studio will have few or none of the noise and comfort problems you're likely to find in many remote locations. This leaves us only with the acoustical disadvantages of most studios, but in many cases, it is possible to work around these through the use of modern high-quality artificial reverberation. Studios are in fact strongly recommended for smaller musical forms, such as solo instruments and small chamber groups.
DYNAMIC RANGES OF MUSICAL INSTRUMENTS AND ENSEMBLES

It usually comes as a surprise to recording engineers that the dynamic ranges of most instruments are as limited as they are. Figure 20-1 gives a clear indication of this. The string instruments have a fairly uniform dynamic capability over their frequency range. By comparison, woodwind and brass instruments shift widely in their dynamic characteristics depending on the range in which they are playing. A string quartet, for example, may normally play with an overall dynamic range that doesn't exceed 30 dB, so there should be no problem in recording that group with a recorder capable of handling a 90-dB signal-to-noise range. The piano produces initial keystroke dynamic ranges not exceeding about 35 or 40 dB.

When an orchestra plays loudly, all players are involved, and brass and percussion will predominate. When the orchestra plays very softly, there may be only a few string instruments playing. The acoustical power output of the orchestra may range from 15 to 30 watts for the full ensemble to less than a microwatt for the softest passages, and the resulting dynamic range may be 70 to 75 dB. However, not many home environments have a low enough
ambient noise level to allow full appreciation of such a wide-range program without the high-level portions becoming extremely loud. Virtually all classical recordings today are adjusted in dynamic range, usually at the mastering stage, to ensure that the recorded product meets the buyer's expectations.
Figure 20-1. Dynamic range of selected musical instruments showing the span in dB between playing very soft (pp) and very loud (ff) over the normal range of the instrument. (Panels include the trumpet and the horn, pp to ff at 1 meter.)
Table 20-1 shows some of the published data regarding power output from various instruments and ensembles.
Table 20-1. Maximum power outputs and levels of various musical sources

Source          Maximum power output    SPL¹ at 10 ft (3.3 m)
Male speech     0.004 watt              73 dB
Female speech   0.002                   70
Violin          0.01                    79
Bass viol       0.07                    88
Flute           0.3                     94
Clarinet        1.0                     99
Trumpet         2.5                     106
Trombone        5.0                     109
Orchestra       15.0                    97²

Notes: 1. Calculations made assuming DI = 1. 2. Calculated for a distance of about 30 feet (10 m).
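The SPL column follows from the standard free-field relationship between sound power and sound pressure. The sketch below (illustrative Python, not from the text) computes the power level Lw = 10 log10(W / 10^-12) and then SPL = Lw - 20 log10(r) - 11 for an omnidirectional source; it lands within a few dB of the table entries:

    import math

    def spl_at_distance(acoustic_watts, r_m):
        """Free-field SPL of an omnidirectional source of given power."""
        lw = 10 * math.log10(acoustic_watts / 1e-12)   # power level re 1 pW
        return lw - 20 * math.log10(r_m) - 11          # spherical spreading

    for source, watts, r_m in (("violin", 0.01, 3.3),
                               ("clarinet", 1.0, 3.3),
                               ("orchestra", 15.0, 10.0)):
        print(f"{source:9s}: {spl_at_distance(watts, r_m):5.1f} dB SPL at {r_m} m")

    # violin   :  78.6 dB SPL at 3.3 m   (table: 79)
    # clarinet :  98.6 dB SPL at 3.3 m   (table: 99)
    # orchestra: 100.8 dB SPL at 10.0 m  (table: 97)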
RECORDING SOLO INSTRUMENTS

The piano
The solo piano is normally recorded with a pair of microphones placed fairly close to the instrument, as you can see in Figure 20-2. If you choose a coincident cardioid pair, as shown at A and B, the sound stage will be fairly narrow and ambient pickup will be low. This may however be a good choice in a live space. The spaced microphone approach shown at C and D will result in a much broader stereo sound stage. This method, using omni microphones, is preferred by most engineers and producers today since it produces a generally warmer sound. Watch the spacing between the microphones and be careful to avoid a "hole in the middle."

You should also be aware that most concert caliber pianos have been voiced to be on the bright side so that they will project well in a concert hall. These instruments usually need to be softened or "pulled back" somewhat for recording. A good technician can do this fairly quickly, and it is a good idea to keep the piano technician on stand-by during the recording to ensure that the instrument is in top shape and in tune at all times. Regarding the recorded perspective, it is ideal for the piano to be centered and occupying about one-half the total stereo stage width. Ambient program information should be perceived as coming from the entire width of the stereo stage.
The harpsichord
Many of the same principles which apply to the piano may be used here. There are several important differences, however.
Figure 20-2. Recording the solo piano. With coincident or near-coincident microphones (A and B); with spaced-apart microphones (C and D). Position 1: height 6.5 ft, distance 6.5 ft (spacing 3 ft for the spaced pair); position 2: height 8 ft, distance 10 ft (spacing 5 ft).
While the modern piano is a mechanically quiet and smoothly regulated instrument, the harpsichord action is apt to be noisy. If the instrument is recorded too closely, this will be problematic. The proximity effect of directional microphones may aggravate the problem, and in that case you may need to use sharp 50- or 80-Hz high-pass filters. Since the instrument is basically a "period piece" from the 18th century, many of its musical requirements may call for fairly reverberant spaces. Because of its relatively rich HF content and precise attack, the harpsichord may be presented against a denser reverberant background than would be appropriate for the piano.
The guitar and lute

These instruments are small, and they are normally recorded close-in. The apparent stereo width of the instrument should be about one-third the stereo stage, and coincident or near-coincident microphone arrays will do this nicely. Reverberation should convey a feeling of intimacy; that is, it should be fairly short (1 to 1.5 seconds), and there should be enough of it to support the
relatively thin texture of the instrument. Remember that the lowest string on the guitar is E2 (82 Hz). Figure 20-3 shows two methods for stereo pickup of these instruments. Proximity effect may add an unnatural LF boost if directional microphones are used, and you may have to remove some of this with an LF shelving cut.
Figure 20-3. Recording the guitar. With coincident microphones (A); with spaced microphones (B). Typical distances: x = 40 to 60 inches; y = 20 to 40 inches; z = 40 to 60 inches.
The guitar or lute may easily be recorded in a relatively dry studio, since good artificial reverberation always works well with these instruments. The HF response of the reverberant signal should be rolled off above about 3 or 4 kHz.
The harp

It is almost impossible for a good player to make an ugly sound on a well-tuned harp. There are many microphone approaches that work well. Keep in mind that the instrument is not very loud and that room noises can get in the way if you are too far from the instrument. Figure 20-4 shows one approach to placing microphones. Keep the stereo image well centered, and don't hesitate to add some reverberation to give the stereo stage added width.
Figure 20-4. Recording the harp. Top view and perspective view; distance x about 40 inches.
Many engineers prefer to pick up the harp with a pair of omnidirectional microphones, about 20 inches apart and positioned about 40 inches from the instrument.
The organ

Most organs are located in houses of worship, and those spaces often have fairly long reverberation times. The ideal reverberation time for an organ is in the 2.5- to 4-second range. Large cathedrals and European churches may have reverberation times in excess of 6 seconds. Modern organs have borrowed heavily from traditions of eighteenth-century North German and French organ design, and most instruments are placed fairly high above the floor. Figure 20-5 shows a typical installation in the rear gallery of a church.
Figure 20-5. Recording the pipe organ. Elevation and section views; distance x about 6 to 12 feet, distance y about 10 to 20 feet.
Either a coincident microphone pair or a spaced omnidirectional pair (shown in the figure) will usually provide excellent pickup. For spaced microphones, x should be in the range of 6 to 12 feet, and typical distances for y range from 10 to 20 feet from the gallery rail. If the environment is fairly reverberant, then the single microphone pair will pick up enough ambience. In less reverberant spaces, a secondary stereo pair (about 20 feet behind the main pair) will provide the necessary ambience. Microphone height should be at the average height of the instrument. The spaced omnidirectional microphones can create excellent spatiality, but without much image specificity. Many engineers, producers, and organists are willing to sacrifice precise left-right imaging for a greater sense of large-room ambience. Many large organs have low frequencies that reach down to the 20-Hz range. This is another good reason to use omnis, since their LF response is normally quite extended. You need good monitor loudspeakers to ensure that you are actually picking up these frequencies.
RECORDING CHAMBER GROUPS

Chamber groups generally range from two to about twelve players. The category includes solo vocal or instrumental with piano, string quartets, and a variety of other instruments with one performer on each part.
Seating the musicians

For public performance, musicians normally face the audience. In a recording environment, their positions can be altered as required to make for better sound pickup. While players may initially be reluctant to change their traditional seating positions, they can usually adapt to new ones—especially if there are some real recording advantages. Consider several possibilities for seating and pickup before the recording and discuss them with the producer. Between the two of you, a plan can be worked out that the musicians can adapt to.
PIANO WITH SOLO INSTRUMENT OR VOICE

Figure 20-6 shows the ideal way for a soloist to maintain good eye contact with the pianist, while allowing the engineer to get the desired balance. In recording solo instruments with piano accompaniment, it is important to keep them both in proper scale, and the method shown here allows the engineer to
separately adjust both piano and soloist levels. A recommended console setup is shown in the right portion of the figure.
Figure 20-6. Recording a vocalist or instrumental soloist with piano. Left: studio setup; right: console setup showing pan, reverb send, faders, and reverb return.
Microphones 1 and 2 can be adjusted to get an optimum pickup of the piano, while microphone 3 can be positioned for optimum solo pickup. For a vocalist the operating distance would be in the range of 2 to 3 feet. In most instances the soloist will need a touch of artificial reverberation, and that is shown in the console setup. A variant of this approach is to use a coincident or ORTF stereo pair on the soloist instead of a single microphone panned to center. The ORTF pair should be panned left and right. This approach may give just a little more feeling of space around the soloist, and allow small sideways movements to be tracked on the stereo stage. Whether you use one or two microphones on the soloist, they should be positioned slightly above the source and aimed downward toward it. Most singers can cover an extremely wide dynamic range, and you shouldn't hesitate to adjust balances as you are laying down the basic tracks. The producer will be your best guide here.
RECORDING THE STRING QUARTET

The players in a string quartet are normally seated as shown in the upper left portion of Figure 20-7. The separation between players is no greater than it has to be, since visual contact is so important to good ensemble playing. Method 1 uses a coincident or near-coincident pair on the group, and the microphones
should be placed overhead looking into the quartet. At a typical operating distance of about 6 feet, the stereo stage may seem a little narrow and the group may sound too distant.
Figure 20-7. Recording the string quartet. Performance setup (upper left); microphone placements for Methods 1, 2, and 3.
In Method 2, three omni microphones have been placed closer to the players. The sound stage will be considerably wider, and the quartet will have a closer perspective. Left and right microphones are panned hard left and right, while the center microphone is fed into the mix just enough to get a good front-back balance and "anchor" the cello a little right of center-stage. This approach works best with extremely flat microphones at a height of about seven feet, and the overall impression of the recording is that the quartet is performing in your listening room—rather than transporting you into a recital hall. Most quartet recordings you hear today are done in this way. Method 3 in a sense "deconstructs" the quartet, allowing the engineer and producer to reconstruct it in postproduction. The microphones would normally be cardioid, placed overhead and aimed at the respective instruments. The approach is often used in live performance recordings. An advantage here is that you can widen or narrow the stereo stage after the fact, and of course you will have excellent immunity from audience noises. Methods 2 and 3 normally require artificial reverberation in order to flesh out the sound.
OTHER CHAMBER GROUPS

The piano-string trio

In concert the piano trio must be recorded as shown in Figure 20-8A. In a studio setting the approach shown at C is recommended. As with the vocalist with piano we discussed earlier, the aim here is to put the players in a circle, all looking at each other. The piano pickup would be panned to create a broad center image, and each string player would be panned slightly inboard to give cohesion to the group. The string microphones should be at about the same height as you would use with a string quartet. Omnis would normally be used on the piano, and cardioids on the two strings. Artificial reverberation would normally be used.
Figure 20-8. Recording chamber groups. Trio and quintet in concert setting (A and B); in studio setting (C and D).
The piano-wind quintet

A concert setup is shown at B. In the studio the approach shown at D would be used, with the players spread out to give them ample playing and breathing room. The cardioid microphones on the winds should be placed overhead,
slightly in front of the players, and aimed downward. The horn is the exception here: place the microphone directly over the instrument at about 7 feet. Use artificial reverberation to flesh out the sound.
NOTES ON MICROPHONE LEAKAGE

Leakage is the result of pickup commonality between adjacent microphones. In the pop studio we normally try to avoid this, but in classical recording it is an advantage—if it is controlled. Taking another look at Figure 20-8C, you will observe that each omni microphone on the piano will pick up some sound from both cello and violin. This leakage will be several dB lower in level and will not interfere or conflict with the primary violin and cello pickup; in fact, the leakage will add a degree of richness and warmth to the sound, since it simulates nearby early reflections. There is one very important point you must remember: leakage is most pronounced between adjacent microphones, and those microphone outputs should be panned to adjacent positions on the stereo sound stage. Always seat the players and position their microphones from left to right as you expect them to appear in the stereo stage as heard over the control room monitors. If you need to change the basic positioning of instruments, do it in the studio—not at the console.
NOTES ON ARTIFICIAL REVERBERATION

When making a recording in a fairly live room you can place ambience microphones about 15 or 20 feet away from the players. These would normally be cardioids pointed away from the players. In many cases, however, the room is too dry to support a significant reverberant field, and you will have to use artificial reverberation. Here are some general rules and suggestions (collected into a preset in the sketch following this list):

1. Use the best digital reverberation unit you can get.
2. Use a bass multiplier setting of about 0.8 at a crossover frequency of about 500 Hz. (This will keep the texture clear and avoid any trace of muddiness.)
3. Use a reverberation time setting no greater than about 1.8 seconds.
4. Roll off both the reverberation time and the frequency response above about 3 kHz.
5. Use a predelay no longer than about 20 milliseconds.
6. Disable any randomizing functions that the program offers.
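Collected as a starting-point preset, the rules above look like this. The parameter names are hypothetical, since every reverberation unit labels its controls differently:

    # Hypothetical parameter names; map them to your unit's own controls.
    chamber_reverb_preset = {
        "reverb_time_s": 1.8,       # rule 3: no greater than about 1.8 seconds
        "bass_multiplier": 0.8,     # rule 2: keeps the LF texture clear
        "bass_crossover_hz": 500,   # rule 2
        "hf_rolloff_hz": 3000,      # rule 4: roll off the return above ~3 kHz
        "predelay_ms": 20,          # rule 5: no longer than about 20 ms
        "randomization": False,     # rule 6: disable randomizing functions
    }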
Some reverberation units provide ambience programs, which generate a set of early reflections with little if any reverberant "ringout." Such programs are very useful and may be better in some situations than an actual reverberation program.
RECORDING LARGE MUSICAL GROUPS

Choruses

Choruses, large and small, are usually best picked up by using one of the microphone arrays shown in Figure 18-14. The chorus will normally be standing on risers, and the microphones should be about 12 feet above the floor, about 3 or 4 feet in front of the first row of singers. An ambient microphone pair can be used for reverberant pickup, if needed. A soloist in the chorus singing a short passage may sing "in position" and not need a microphone. However, an extended solo with choral background will need to be picked up separately. In that case the singer should stand in front of the group so that there will be minimal leakage into the soloist's microphone. Added reverberation will be required for the soloist, and this is best done with an artificial reverberation unit.
The orchestra

Figure 20-9 shows a typical seating plan for a modern symphony orchestra. The principal players are the heads of their respective sections, and you can see that they tend to be clustered toward the middle of the orchestra.
Figure 20-9. Seating plan for a modern symphony orchestra (dots indicate positions of principal players).
Furthermore, the wind and string principals each form a quartet of players. This proximity is important because they often confer during rehearsals, and very often play as isolated quartets. The typical width of an orchestra on-stage is about 55 feet and the depth is about 30 feet. Many modern concert halls have a choral terrace just behind the orchestra, elevated about 10 to 12 feet. By way of terminology, the woodwind instruments are always referred to as "winds." The term "brass" refers primarily to trumpets, trombones, and tuba, and the French horns are always referred to as "horns."

We can think of recording the orchestra in layers. The first and most important layer is the frontal microphone array, which we discussed in Chapter 18. A second layer consists of a group of what are called spot, or accent, microphones. These are microphones placed fairly close to certain instruments in order to do one or more of the following:

1. Increase loudness. (Many extended wind and string solo lines need to be increased in volume.)
2. Increase presence. (Many orchestral elements need added presence without resorting to playing louder. Examples would include harp, celesta, orchestral piano, and various percussion instruments.)
3. Add focus to the recording. (The winds as an ensemble always need more focus than the main microphone array provides. A secondary stereo pair of microphones provides this.)

Spot microphones will all call for some degree of added reverberation. A third layer is provided by the ambience microphones, and a fourth layer consists of the chorus behind the orchestra, if there is one. A final layer would be any front-stage soloists. It is clear that each layer must be picked up separately, and yet they must all blend into a unified stereo sound stage. It is essential that all spot microphones be panned into the stereo mix at positions matching the players' actual positions on stage.
The session and setting initial balance

Prior to the recording session, the producer, conductor, and engineer will have met to discuss details of the recording, including the assignment of spot microphones. A typical list of spot microphones would include: principal first violin (concertmaster), wind pair, first stands of basses, brass overhead, horns overhead, timpani overhead, celesta/piano, and harps. The main microphones would be placed as shown in Figure 20-10. Distance A would usually be about 4 feet from the front row of players, and the height of those microphones would be between 10 and 11 feet. Distance B would be approximately one-third the frontal width of the orchestra. Microphones 5 and 6 would be 9 to 10 feet above the floor, and they would
be positioned just in front of the first row of winds, with the two cardioids aimed toward the ends of the second row of winds.
Figure 20-10. Basic microphone plan for an orchestra.
The horn and brass spot microphones should be about 10 feet above the floor and aimed directly downward. Timpani and bass spots should be placed about 8 feet above the floor and aimed downward. Harp, celesta, and piano spots are normally just a few feet away from the instruments. If you are working in a venue for the first time you will need at least one rehearsal to set your basic balances. Begin with the main pair by raising the two faders to their nominal "zero" position. Set operating levels using the trims only. Then raise the two flanking microphones, again to their zero point on the faders, and trim them so that their contribution to the overall balance is about the same in overall level as the main pair. You will have to switch each pair in and out of the bus in order to establish this. Once the four frontal microphones have been balanced to your and the producer's satisfaction, proceed with the spot microphones. The first is the wind pair. Since most of your microphones will probably have the same output sensitivity, you should be able to zero in fairly quickly on a proper balance using the trims. Next in order are the house microphones, which should be cardioids widely spaced about 20 feet back in the house and facing to the rear. Bring them into the mix using the trims, with the faders at zero. When you are finished with this procedure, all eight of the microphones adjusted so far should be set at their nominal zero positions, and the basic balance will have been made using only the trim controls. Once you have reached this point, you are virtually home free. The remaining spot microphones can be "fine tuned" as the music gets underway.
As you bring in the remaining spots, you may want to shelve out the LF response by about 3 dB below about 100 Hz. This will give you a little more leeway in manipulating them during the recording.
Overall orchestral level

I have never measured an orchestra that produced a level at the main pair greater than about 105 dB SPL, but actual peak levels will depend on the hall and the nature of the music. If you have made your initial settings too high, your first recourse is to pull down the overall levels at the group master controls. Conversely, you will have to raise the group levels if you have trimmed too low. In any event, these adjustments can usually be made on a running basis.
Document your settings

This means making a chart of the house microphone layout, indicating microphone models, their settings, and their heights. Console trims, fader positions, pan settings, and EQ settings must also be noted. The next time you work in this venue you should be up and running at the first downbeat. Even in other venues, the level settings shouldn't be too different if you are using the same console and microphones.
LARGE ORCHESTRAL RESOURCES

Figure 20-11 shows the floor plan of a large work with chorus and soloists. Many such recordings today are made during live concerts, and you will have to work around some inconveniences. The microphone plan for the orchestra is pretty much as we have already described. The chorus will need three or four microphones that function in the same way as the four frontal microphones on the orchestra. Place them as high as you can without running into a canopy shell, and aim them downward into the chorus. This will minimize your biggest problem, which may be leakage from percussion and brass in the back of the orchestra into the choral microphones. Ideally, each soloist should have a microphone and a track on the recorder so that final balances can be made in postproduction. In many cases there may not be enough track capability to do this, and you will have to balance them as you go, mixing them into a stereo pair. We have all been in these situations, and they are not easy to cope with. It is cases such as these where years of experience will pay off. Take advantage of every opportunity you may have to observe experienced classical recording engineers at work.
Figure 20-11. Recording orchestra with chorus and soloists. The chorus stands on risers behind the orchestra; the soloists are at the front of the stage.
DELAYING SPOT MICROPHONES

Figure 20-12 shows the rationale for delaying spot microphones before they are introduced into the mix. Since these microphones are fairly close to the instruments they are picking up, their output arrives in the mix before the same sound actually reaches the main microphone pair. For example, if a spot microphone is located 20 feet from the main pair, it has a time advantage of 20/1130 seconds, or about 17.7 milliseconds. It is always correct to delay the microphone signal to compensate for the acoustical delay, but it may not be necessary because of certain masking effects. In practice only percussion microphones, which may be as far away from the main pair as 25 to 30 feet, would require delay. It is customary to increase the calculated delay by about 10 milliseconds in order to avoid any tendency for comb filtering to be noticeable.
Figure 20-12. Implementation of spot microphones. Required delay = x/1130 seconds, where x is the distance in feet from the spot microphone to the main microphone array.
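The delay calculation of Figure 20-12 is simple enough to express directly. A short sketch, using the 1130 ft/s speed of sound from the text and the 10-millisecond padding suggested above:

    def spot_delay_ms(distance_ft, padding_ms=10.0):
        # Acoustic delay from spot microphone to main array, plus padding
        # to keep any comb filtering from being noticeable.
        return 1000.0 * distance_ft / 1130.0 + padding_ms

    print(round(spot_delay_ms(20), 1))   # 20-ft spot: 17.7 ms + 10 ms = 27.7 ms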
ALTERING ROOM ACOUSTICS

Many newer halls have variable acoustic control built into the structure in the form of heavy drapes that can be deployed as needed to damp the space. Some halls have associated reverberation chambers that can be opened or closed to increase the effective reverberation time of the hall. The usual problem in older halls is that they lack the reverberation that engineers and producers feel the music needs, and there is a technique, shown in Figure 20-13, that can substantially liven a large space. The procedure is to get enough 4-mil (0.004 inch) thick polyethylene plastic sheeting to cover the entire seating area and drape that material loosely over all audience seating. The before and after reverberation time measurements in a concert hall are shown in the figure. The material is available from dealers in construction materials and is normally used for paint drops and for covering carpets and seating when construction is underway. Do not use plastic sheeting thinner than 4 mils. You can hear the difference this treatment provides by comparing bands 2 and 12 on the CD "The Symphonic Sound Stage," released by Delos International D/CD 3504.
Figure 20-13. Adding liveness to a hall. Section view of hall, with crosshatching indicating areas covered by 18,000 square feet of plastic material (A); reverberation time versus frequency (25 Hz to 20 kHz), with and without the plastic treatment (B).
Chapter 21

SURROUND SOUND RECORDING TECHNIQUES
INTRODUCTION

Surround sound for music without picture is an outgrowth of the home video revolution of the 1980s. As motion pictures became available on VHS tape, multichannel soundtracks were routinely played over home stereo systems. When motion picture digital soundtracks intersected the development curve of consumer high-density optical media, the record industry saw an opportunity to expand the home music experience from stereo to surround sound. The reasoning was as follows: If consumers have invested in multiple loudspeakers in order to recreate the cinema listening experience in the home, then they will certainly want to hear music over the same surround sound listening setup. This has been true to some degree, but the home audio surround revolution hasn't really happened to the extent anticipated. Many engineers who've been around a number of years will remind us of the debacle of quadraphonic sound during the 1970s. Quad failed because of a lack of standardization, immature technology, and the lack of a consumer base on which it could grow; we should keep those reasons in mind and try not to repeat them. These problems have largely been solved today. In any event, it is important for all recording engineers to understand the principles and practices in surround recording and mixing that have developed over recent years, because the commercial opportunities in video alone are sufficient to ensure its growth.
BACKGROUND

Today, the motion picture surround experience is heard over the loudspeaker arrangement shown in Figure 21-1. In general, dialog is presented over the center channel; all three front channels carry the music and effects, usually in wide-stage, often exaggerated, stereo. The surround channels are devoted specifically to off-screen sound effects, such as fly-overs or battlefield
sounds. The surround channels are often used in extended music segments to enhance the ambience elements in the music.
Figure 21-1. Loudspeaker layout in a modern motion picture theater.
It is important that there be a number of surround loudspeakers in each channel. Each loudspeaker is operated at a fairly low level, and the ensemble effect of all of them creates a more or less random sound field in which the film patron will have difficulty in pinpointing a given sound source. The surrounds at mid- and rear-house are often delayed slightly from those near the front in order to enhance the sense of randomness. Motion picture sound has grown from four channels on magnetic-striped film during the 1950s to today's five digital tracks with stereo surround. The audio program is encoded directly on the film or on a CD-ROM accompanying the showing of the film.
TRANSLATION OF THE FILM EXPERIENCE INTO THE HOME

In a typical motion picture theater the screen loudspeakers may be spread over a distance of 15 to 20 feet, and the distances from screen channels to the surrounds may be in the range of 40 to 50 feet or more. The spatial effects heard in the theater do not directly translate into the home listening room as such. Because of the distances from the listener to the various loudspeakers in the cinema, there are greater time cues than in the home environment, and the sense of spatiality will be more significant in the theater. Also, due to different equalization standards, a mix made for the theater is likely to sound too bright in the home environment. Films that are slated for release on DVD are normally re-equalized in the process to make a better match with home systems.
LOUDSPEAKER SETUP IN THE HOME ENVIRONMENT

Figure 21-2 shows two approaches that are used. At A the three frontal loudspeakers are spaced so that they subtend an overall listening angle of about 45 to 60 degrees at the listener. The surround channels are normally placed to the sides, or slightly to the rear, and the distances from the center listener to each of the five loudspeakers should ideally be the same. Slight differences are not all that critical, however. In the setup at A, the five loudspeakers should be identical for best results. It is common to use somewhat smaller models for the surround channels, but the MF and HF "sonic signatures" should be matched to the front loudspeakers.
Figure 21-2. Home theater setups. Using five identical loudspeakers (A); using dipole loudspeakers for the rear channels (B).
The setup shown at B calls for dipole surround loudspeakers. Dipoles project sound primarily to the front and back, and the so-called null angles of the systems are aimed at the listeners. This means that the surround signal heard by the listeners will first be heard reflecting off the front and back walls of the listening room. This further diffuses the sound and to some degree helps the home listening experience to be more closely related to the theater experience.
THE 5.1 STANDARD

When digital audio was standardized for motion pictures in the early 1990s, a special effects channel was included for the purpose of enhancing low frequencies in the theater. The standard is known as 5.1 (five-point-one), which translates into five full-bandwidth channels and a band-limited channel covering the range below 100 Hz.
When the DVD video standard was set, 5.1 became part of it. Among other things this standard has brought a proliferation of subwoofers into the home, along with so-called "sub-sat" systems consisting of five small satellite loudspeakers (covering the range above 100 Hz) along with a single sub channel (covering the range below 100 Hz).
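Bass management in such a sub-sat system amounts to a 100-Hz crossover: the satellites are high-passed while their summed low frequencies feed the sub. A simplified Python sketch using SciPy (real bass managers add level alignment and phase-matched crossover slopes):

    import numpy as np
    from scipy.signal import butter, sosfilt

    def bass_manage(channels, fs=48000, crossover_hz=100):
        # channels: list of five full-bandwidth signal arrays
        lp = butter(4, crossover_hz, btype="low", fs=fs, output="sos")
        hp = butter(4, crossover_hz, btype="high", fs=fs, output="sos")
        satellites = [sosfilt(hp, ch) for ch in channels]   # feeds above 100 Hz
        sub = sosfilt(lp, np.sum(channels, axis=0))         # summed lows to the sub
        return satellites, sub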
SETTING UP THE SURROUND POSTPRODUCTION MONITORING SYSTEM

Figure 21-3 shows the loudspeaker placement recommended by the ITU (International Telecommunications Union, document ITU-R BS 775-1). There is no requirement that the home user duplicate the setup in order to enjoy surround sound; it is simply a reference setup for use in the recording industry. I can attest to the fact that product mixed using the ITU monitor format performs well in a wide variety of home surround installations. If you are using small monitors positioned on loudspeaker stands, the setup is relatively easy to make. It is essential, of course, that reference levels be accurately matched.
Figure 21-3. ITU standard loudspeaker arrangement for creating surround product.
HOW SURROUND SOUND WORKS—WHAT YOU CAN AND CAN'T DO

In stereo we have two loudspeakers. We can position musical events (real images) at either loudspeaker, or we can pan them as phantom images between the loudspeakers. We can also treat ambient information as a combination of in-phase and anti-phase signals, creating a sense of spatiality which may extend slightly outside the loudspeaker array. With surround sound there are many more options: With the frontal loudspeakers we can create three real images, and the center channel is a powerful "anchor" for center-stage events. Phantom images can be positioned between adjacent frontal loudspeakers. If you pan a side image between front and rear loudspeakers you will find that the image will be apparent only if you turn your head sideways. For a forward-facing listener the phantom will not exist, and such images should generally be avoided. However, you can produce a convincing image in motion, such as a fly-over effect in a video presentation, by panning rapidly between front and rear channels. Figure 21-4A shows some of these options.
Figure 21-4. Phantom images in surround (A). Ambience in surround (B).
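Panning a phantom image between two adjacent loudspeakers is usually done with a constant-power pan law. A minimal sketch; the sin/cos taper shown is one common choice, and consoles differ in the exact law they use:

    import math

    def constant_power_pan(position):
        # position runs from 0.0 (hard toward one speaker) to 1.0 (hard
        # toward the other); sin/cos keeps total acoustic power constant.
        theta = position * math.pi / 2
        return math.cos(theta), math.sin(theta)

    print(constant_power_pan(0.5))   # center: (0.707, 0.707), i.e., -3 dB per speaker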
One of the most convincing effects in surround is the ambient sound field that can be generated by feeding decorrelated program information to the four outside loudspeakers, front and rear, as shown in Figure 21-4B. The effect is strongest when the center channel is not used. A number of modern reverberation units can generate a sound field of this kind; or, you can use a spaced array of four microphones placed in the reverberant field of a room.
MICROPHONES AND MICROPHONE ARRAYS FOR SURROUND PICKUP

Soundfield microphone

The Soundfield microphone has long been used for direct recording of surround program information. The basic model is shown in Figure 21-5A. Housed in the microphone are four subcardioid elements oriented at equal spatial intervals, as shown at B. The outputs of these elements (A-format) are fed to a matrix section which produces the four B-format outputs, known as W, X, Y, and Z. These correspond to the following patterns:
Omnidirectional (W):     LF + RF + LB + RB
Left-right figure-8 (Y): LF - RF + LB - RB
Up-down figure-8 (Z):    LF - RF - LB + RB
Front-back figure-8 (X): LF + RF - LB - RB

(LF, RF, LB, and RB are the left-front, right-front, left-back, and right-back capsule outputs.)
These four patterns are all effectively operating at the same point in space. By combining the four outputs in various ways, any first-order cardioid microphone pointing in any direction in space can be synthesized. These and other functions are carried out by the control unit, shown in Figure 21-5C. In surround applications, the synthesized cardioid directions correspond to the playback loudspeaker positions, and the sound field existing at the microphone is virtually recreated in the listening room. In practice, the Soundfield microphone would be supplemented with additional spot microphones, suitably delayed and panned as we have discussed earlier.
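The matrixing just described reduces to sums and differences of the four capsule signals. A minimal sketch, ignoring the gain normalization and capsule equalization that a real A-to-B matrix applies:

    def a_to_b_format(lf, rf, lb, rb):
        # lf, rf, lb, rb: left-front, right-front, left-back, right-back capsules
        w = lf + rf + lb + rb   # omnidirectional
        x = lf + rf - lb - rb   # front-back figure-8
        y = lf - rf + lb - rb   # left-right figure-8
        z = lf - rf - lb + rb   # up-down figure-8
        return w, x, y, z

A first-order pattern aimed in any horizontal direction can then be synthesized as a weighted sum, for example 0.5 * (w + math.cos(az) * x + math.sin(az) * y) for a cardioid at azimuth az; scaling conventions vary between implementations.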
Figure 21-5. The Soundfield microphone. Photo of microphone (A); schematic showing capsules and basic processing (B); front panel view of controller (C). (Photo courtesy Transamerica Audio Group)
Schoeps KFM360 sphere

The KFM360, conceived by Bruck, is shown in Figure 21-6. Essentially, it is a sphere with omnidirectional microphones embedded on opposite sides.
Figure 21-6. The Schoeps KFM360 sphere. Photo of sphere (upper); analysis of patterns, with the rear channels derived as the difference of the omni and figure-8 outputs (lower). (Data courtesy Schoeps and Posthorn Recordings)
External to each omni microphone is a figure-8 microphone. In the control unit, the microphone outputs can be added or subtracted to produce both frontal and rear pairs of spaced microphones. There are essentially four outputs, but the control unit, shown in Figure 21-7, provides a matrixed center channel output. The control unit also provides delay and equalization for the rear outputs, both of which will enhance front-rear separation.
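The add/subtract processing is again a simple sum and difference per side. A minimal sketch, omitting the gain weighting and the rear-channel delay and equalization that the control unit applies:

    def kfm360_side(omni, fig8):
        # omni and fig8: signals from one side of the sphere
        front = omni + fig8   # forward-facing (cardioid-like) pattern
        rear = omni - fig8    # rear-facing pattern
        return front, rear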
Figure 21-7. DSP-4 control unit (upper); functions of the control unit, including front/rear pattern selection, front panorama, gain controls, rear delay, rear slope, front/rear and left/right balance, and Gerzon matrix (lower). (Data courtesy Schoeps and Posthorn Recordings)
Both the Soundfield microphone and the KFM360 would normally be positioned as the main pair would be in a stereo recording, and of course they would be supplemented with spot microphones.
SAM (surround ambience microphone)

This array is shown in Figure 21-8. The spacing of the microphones is sufficient to generate time cues in addition to the amplitude cues generated by the microphones' patterns. The array would normally be used only for the pickup of room ambience.
Figure 21-8. SAM microphone array.
SPL (Sound Performance Lab)

This array, shown in Figure 21-9, extends the notion further, giving the user a choice of microphone patterns and, through the use of telescoping arms, allowing variable spacing of the microphone elements. You will note the similarity to the Decca tree discussed in Chapter 20. The array would be used for primary pickup of a large group, along with spot microphones. The rear-facing microphones would normally feed the rear surround channels with no signal modification other than added delay.
Figure 21-9. SPL microphone array. Photo of array (A); drawing of array (B). (Photo courtesy Transamerica Audio Group)
FRONTAL MICROPHONE ARRAYS

You may hear the term "3-2 recording" as a general description of 5.1 techniques. The "3" refers to the front channels and the "2" to the rear channels. The designation implies the usage of surround in the direct-ambient sense, the two arrays being distinct from each other. This is a distinction that can lead to endless discussions among recording engineers. In any event, considerable attention has been given to optimizing frontal pickup, and the arrays shown in Figure 21-10 are typical. The array shown at A is proposed by Klepko; the intention is for the left and right hypercardioid patterns to minimize phantom images existing between the outside microphone pair. This approach would then leave the center channel pickup entirely to the center cardioid.
Figure 21-10. Frontal microphone arrays. Klepko array (A): a center cardioid at 0 degrees flanked by left and right hypercardioids, with about 7 inches between adjacent microphones. Schoeps OCT (Optimum Cardioid Triangle) array (B): a forward-facing center cardioid with side-facing left and right supercardioids spaced 24 to 32 inches apart, plus left and right omnis operating below 100 Hz and adjustable in level.
A similar approach, due to Schoeps, is shown at B. Here, the distance between left and right microphones has been increased for added isolation between them; the center microphone is placed slightly forward of the center line between left and right. The purpose of the two omni elements at left and right is simply to restore LF response, which may be lacking in supercardioid and hypercardioid microphones.
A CLASSICAL CASE STUDY

The plan shown in Figure 21-11 is for a piano concerto with an orchestra of strings. The session was mixed and monitored in stereo, while added track capability made it possible to create a surround mix later. All nine microphones were used in the stereo mix, and the surround plan was as follows:
Front array:
Left channel: left flank; piano, panned half-left
Center channel: ORTF pair (-4 dB)
Right channel: right flank; piano, panned half-right; bass spot, panned half-right

Rear array:
Left-rear: L-ambient (delayed)
Right-rear: R-ambient (delayed)

Figure 21-11. Recording arrangement for simultaneous stereo/surround pickup, showing the frontal array and the ambience microphones.
The piano and bass microphones were panned as indicated in the front array. The delay in the ambient microphones was increased by about 10 milliseconds in order to increase front-rear delineation. Both stereo and surround recordings are on Delos SACD 3259 (Music of Shostakovich and Schnittke).
POP MIXES FROM MULTITRACK SOURCE TAPES

As surround developed in the 1990s, many older pop multitrack tapes were "repurposed" into surround mixes for 5.1 delivery on DVD audio and SACD formats. In many cases the original mixing engineers and artists may not be available to supervise the surround mixes, or otherwise get involved in the approval cycle. It is truly amazing how good most of these surround mixes are. For the most part, the mixing engineers are putting a basic stereo "scene" at the front, relegating the rear channels to background vocals, rhythmic fills, and ambience. Many engineers are using newer reverberation programs that allow decorrelated, random ambience to appear in all channels. The problem area has been assignment of the primary vocal line. Most artists would object strongly to having the vocal line isolated in the center channel, for the obvious reason that it could be dropped out and the remainder of the mix used for other unauthorized purposes. Most engineers proceed as follows: Vocals are assigned to the center channel and also to the front left and right channels about 4 dB lower. In addition, the primary bass line is assigned to the center channel. Other elements may also be assigned to the center channel as needed but should be lower in level so that they do not conflict with any phantom images between the left and right front signals.
DOWN-MIXING

Can a 5.1 surround recording be down-mixed to stereo? Most engineers would say no, preferring to make a separate stereo mix—if there is room on the disc for it. However, there is a feature offered in DVD audio mastering that allows down-mixing using predetermined balance coefficients. This at least is a step in the right direction; it should be carefully explored and implemented only when it works.
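As a reference point, one common set of coefficients (patterned after the ITU-R BS.775 down-mix equations) attenuates the center and surround channels by about 3 dB before folding them into left and right. A sketch, with the LFE channel omitted:

    import math

    def downmix_to_stereo(left, center, right, ls, rs, k=1 / math.sqrt(2)):
        # k = 0.707 is approximately a 3 dB attenuation; products may
        # carry their own predetermined coefficients, as noted above.
        lo = left + k * center + k * ls
        ro = right + k * center + k * rs
        return lo, ro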
SPECIAL HARDWARE FOR SURROUND MIXING

Unless they have been designed for film mixing, most current consoles will not have the tools necessary to make flexible surround mixes. There are generally two problem areas: monitoring flexibility and multichannel panning. Many third-party companies make various modules that can be externally fitted into the console's architecture to provide these functions. In particular, check out any "joystick" types of panners carefully to make sure they do what you expect. While direct assignments of inputs to one of five output busses can always be made on a traditional console, the flexibility of one or more panpots will enable you to make such assignments more quickly.
Chapter 22

MIXING AND MASTERING PROCEDURES
INTRODUCTION

Over the last four decades, the creation of pop and rock recorded music has divided itself into three basic stages: tracking, mixing, and mastering.
Tracking

Tracking is the process of laying down useful material, or tracks, in the studio. The skills needed by a good tracking engineer are efficiency in the studio and knowing which microphones to use and where to put them. An engineer who specializes in tracking may be a fine mixer, too, but his clientele might not leave him much time for that activity. Mixing can be a time-consuming process and may call for certain skills which the tracking engineer may or may not have.
Mixing

The mixing activity normally takes place in a modern control room, since the in-line console provides all the facilities and signal processing capability required by the mixing engineer. The mixing process is relatively unhurried, and producer and artists will often make one or two experimental mixes, evaluating them over a period of a day or two, before a final decision is made. In some cases a decision may be made to go back into the studio to lay down additional tracks, followed by remixing.
Mastering

The mastering process comes at the end of the production cycle and is normally the point at which the product is given its "final spin," so to speak. Most mastering engineers are independents and work in an atmosphere quite different from that of a control room. Their art is an outgrowth of earlier LP disc mastering, which truly did require the hands of an expert. Some of the most famous names in recording are those of mastering engineers who grew up with the LP and have made the transition into the world of the CD. For most classical and jazz recordings, the mixdown session is a logical extension of the studio session itself. In some cases the monitor mixes from the sessions may be used directly for creating the final product, and no mixing session, as such, will be needed.
In this chapter we will assume that the mix is being made from a 24-track source tape, since this is the most common medium for multichannel recording. We will consider both stereo and surround mixdowns, and we will discuss manual mixing as well as console automation procedures.
PREPARATIONS FOR MIXING

An analog master 24-track tape will undoubtedly have been through many operations before the final mixdown session, including the addition of extra tracks and overdubbing on existing tracks. The segment of tape to be mixed should be isolated from the main reel; it should be "leadered" and placed on a new reel. (Leader tape is normally made of paper or plastic and is nonrecordable. It is used primarily for providing silent pauses or spaces between recorded segments.) Ample tape should be left at the start and end of the recorded segment. In particular, any count-downs or pertinent comments just before the start of music should be left intact. Any noise reduction reference tones or frequency alignment tones should be isolated by leader tape and kept with the multitrack master so that accurate alignment of the playback system can be carried out any time a remix operation is scheduled. The tape should then be spot-checked, track by track, to determine that the tracks are all magnetically "clean," that is, free of any signs of punching in and out. It would be wise to clean up any such spots before proceeding. Any internal edits in the multitrack analog tape should have been made with the "arrow" splice configuration discussed in Chapter 23. Because of the splice, it will be necessary to restripe the time code track (track 24) to ensure uninterrupted code for operating the automation system. If the mixdown is made to an analog two-track machine, that machine should be carefully aligned prior to the session. If the end product is a Compact Disc, then the mixdown will normally be to a digital recorder or DAW. Be sure that it has been calibrated to the house standard. Normally, this would call for a calibration level of "0" VU on the analog machine registering -20 dBFS (dB full scale) on the digital recorder.
BUILDING THE MIX

The console will be operated in its monitor mode, with all outputs from the tape machine fed into the first 24 line inputs. Signal processing should be switched so that all components are in the monitor path.
Typically, a pop/rock mix is built starting with the basic rhythm tracks. Trial panning assignments may be made at this point, but it is almost certain that they will change slightly later on. The level of this "sub-mix" should be 6 to 8 dB below normal operating level on the 2-track recorder, since program levels will rise as more ingredients are added to the mix. An important rule here is to keep the faders as close as possible to their nominal "zero" operating points. (On most consoles this is about 10 to 12 dB down from the full-on position of the fader.) At this stage, the trim controls on each input should be used to set preliminary balances. First, try to get a good mix purely in terms of level and panning adjustments; then proceed with any equalization or dynamics control, as needed. Finally, determine if there is any need for reverberation or other time-based signal processing, and apply it as needed. If the console has voltage controlled amplifier (VCA) subgrouping, this would be a good time to assign all the rhythm elements to a single console subgroup. The next element in the mix would be the basic tracks: guitars and vocals. Rather than bring these in directly over the rhythm elements, first try them alone, adding signal processing as required. Once an initial balance has been made, the rhythm elements can be brought in. Finally, any sweetening (strings, synthesizers, etc.) can be brought into the mix. The mix elements here can now be assigned to three (or more) additional VCA subgroups: vocals, guitars, and sweetening. At each new stage in building the mix, the levels on the output busses will increase, and by the time all the ingredients are in, the stereo signals should be fairly close to normal operating level. Any corrections here may be made by adjusting all the input trims by some fixed amount. Although they will certainly not stay at their zero positions throughout the mix, the notion of having a reference "home base" for all faders is important. In the way of VCA subgrouping, there may be as many subgroups as the engineer feels are necessary for making quick sectional level changes. Moving the subgroup faders will control all of the elements assigned to that subgroup, in any stereo configuration. If the engineer wishes to change a single element within the subgroup, then the fader for the individual element can be adjusted as needed. Again, the reason for starting out with all the input faders in their zero positions is to provide a point of reference for resetting them.
REHEARSING THE MIX

By the time initial balances have been set and signal processing decisions made, the producer and engineer will already have a pretty good idea of how the mix will proceed. If the music is fairly straightforward, then many elements in the mix will more or less take care of themselves. However, if it is a "busy" mix, the engineer may have to make many mental notes of what to do when—and there may be a need for an assistant engineer to help with some of the changes at the console. After a few rehearsals, an experimental mixdown can be made and replayed. Take careful notes at this first playback of those things you want to do differently the next time around. It will soon become very apparent that listening to a mix while you are making changes at the console is not quite the same thing as listening intently to a mix while your hands are still!
MONITORING CONDITIONS

Nothing takes the place of a pair of truly smooth, wide-range, low-distortion monitor loudspeakers in determining the musical values in a mix. But always remember that your mix should sound good on the lowest common denominator of playback equipment, including boomboxes and low-cost car systems. Try to avoid the thrill of playing music back at ridiculous levels in the control room. You will find that your decision-making capabilities will soon fade, and in time so will your hearing.
THE AUTOMATED MIXDOWN

From the description of the manual mixdown procedure just given, the benefits of automated mixdown will seem like a blessing. With automation it is possible to build a mix in stages, perhaps a couple of tracks at a time, storing all settings as the mix proceeds. At any point, a single track can be altered without upsetting the rest of the mix. Trial mixes may be saved simply by storing the fader positional data in memory. You can combine several mixes by merging them at any point. Only when you and the producer both agree on the quality of a mix should you save it to stereo. Even then, don't throw away any of your previous mixes until you are absolutely sure of your choice. If the automated mixdown is made from a digital multitrack source, the same basic process is followed. If you are working with two or more MDMs, you should consider dumping everything into a single disc drive, with careful attention to synchronization.
MUSICAL CONSIDERATIONS

The mixdown process should reflect the best musical judgement of both engineer and producer. It should never be done hastily and then sent out the door. Always evaluate it again the following day, no matter how tight your production schedule. In making the mix, keep in mind the following:

1. There are many ingredients on your multitrack master, and they cannot all be up-front at the same time. Determine those that are important at a particular time and keep the others at a slightly lower level. Think in terms of layers: which elements need to be in the foreground, as contrasted with those that belong in the middle or background?
2. Spectral balance. The ideal here is to have a program signal that occupies a wide spectrum. Avoid boosting too many tracks in the same "presence" frequency region (2 to 5 kHz).
3. Take advantage of stereo. Many contemporary pop/rock mixes seem very center-heavy. Spread things out a bit more than usual, and make good use of those tracks that were recorded as stereo pairs. The essence of stereo is a sense of spatiality, not a set of mono images panned to different positions on the stereo stage.
4. Do a mono "reality" check. Although your primary efforts will be to make as good a stereo mixdown as you can, do make periodic mono checks, just to make sure that the channels will sum with no problem.
5. Reverberation. Use more than one reverberation device if the program calls for it. For example, a reverb setting that may be appropriate for percussion instruments will undoubtedly be too short for a vocal—and vice versa.

Above all, listen to and study analytically the recordings of artists, engineers, and producers whom you respect.
THE CLASSICAL MIXDOWN

If the intended program release format is the compact disc, then most chamber music and solo instrumentals are recorded direct to stereo; that is, the monitor mix is the final mix. For orchestral and operatic recording, multitrack is now commonplace, not only for flexibility in making the best possible stereo mixdown, but for future possibilities in surround sound applications. Here, we will deal only with orchestral and operatic recordings in mixdown
to stereo. It will be helpful to re-read the sections of Chapter 20 dealing with orchestral recording, since this discussion relates to those techniques. The first step in building the mix in a classical orchestral recording is to listen only to the main microphone pair, and then gradually bring in the flanking pair. Do not use any more of the flanking pair than is necessary to produce a slight broadening of the orchestral sound stage, primarily in the strings. The aim here is to combine the precise localization of the main pair with the ambience afforded by the spaced pair. Stated differently, we are looking for the right combination of image specificity and spatiality. At this point, the house microphones may be introduced into the mix. Their purpose is merely to flesh out the reverberant texture in the mix. If the four main microphones were used a bit farther from the orchestra than usual, and if the recording space had more reverberation than usual, then we might not need the house microphones at all. Many engineers and producers who record with left-center-right omnidirectional microphones feel no need for the house microphones, since their basic technique normally provides enough room sound. Finally, the various accent microphones and microphone pairs can be added to the mix. At this point, the engineer and producer may experiment with digital delay of some or all of the accent microphones, as discussed in Chapter 20. If you are mixing with a digital console or workstation, adding delay is a very simple process. During the normal course of remixing a movement from a large classical work, subtle changes in level may be useful. A fairly long solo passage in the woodwinds would certainly benefit from very slight boosting of the wind microphone pair, not so much for an increase in level but for slightly more presence. Even the level of the house microphones may be changed slightly during a long work. Slow passages can often benefit from a small increase here. It is in instrumental and vocal performances with orchestra where the postproduction flexibility of changing levels is most applicable. While the producer and conductor are normally in agreement regarding solo instrument and orchestral balances, the soloist normally wants a little more of the solo instrument in the final mix. This play of musical egos has little to do with recording engineering per se, but it is an integral part of the record business—so play your part well.
STEREO SOUND STAGE PLOTS

As you are planning a mixing session, it's often very helpful to make a plot of the intended stereo sound stage. This is simply
an illustration showing the locations of the major elements in the mix, identified in terms of layers of musical dominance. A simple mix is shown in Figure 22-1. Here, we have a solo vocal accompanied by piano, drums, and bass. The vocal is layer one, and the three instrumentalists account for layer two. Reverberation and ambience constitute layer three.
Figure 22-1. Stereo sound stage plot for a small jazz group.
Normally, any element in layer one is a solo vocal or instrument and is generally positioned in the center of the mix. Elements in layer two are uniformly distributed as stereo images covering the width of the stereo stage. Reverberation and ambience are fairly constant in the mix and should be heard over the broadest extent possible. If layer one consists of a duet, then these elements should be panned slightly left and right of center so that the listener can delineate them spatially. Figure 22-2 shows a more complex mix. The elements are identified in the figure, and it is not unusual for instrumental soloists to move back and forth between layers one and two, as required. When these elements do make the move to layer one, they should be raised slightly in level. A chorus featuring an instrumental soloist should have that soloist panned to the center; however, a fairly short instrumental solo in a predominantly vocal section can remain in its layer-two position.
Figure 22-2. Stereo sound stage plot for a medium-sized pop group.
THE STEREO MASTERING PROCESS

Many pop projects are carried out using a number of studios and engineers in major recording centers around the world, and the mastering engineer is the one person who is entrusted with the task of pulling everything together into a musically and technically cohesive end product. Although there is an executive producer heading the entire project, there may actually have been several session producers as well as engineers. The mastering engineer normally works in a high-tech music listening room, as opposed to the usual control room environment. State-of-the-art signal processing gear is typical, along with a few pieces of highly regarded vintage gear. Given good sources, most of what the mastering engineer does is quite subtle. Where the sources vary from take to take, the mastering engineer can bring a consistency to all tracks, making virtually a night and day difference in the overall quality of the project. The primary tools are equalization and dynamics control. Rarely is any reverberation added, since adding reverb globally to a stereo mix usually causes more problems than it solves. In such cases it is probably better to remix the program in question. By the way, not all album mixes need to go through a final mastering stage as such. A lot depends on how well those mixes were made in the first place. If they are consistent, and largely under the control of one person, then chances are that the project can go directly into production.
SURROUND MIXING
Surround mixing is still in its developmental stages. In classical music there is one prominent model or paradigm: that of placing the orchestra or chamber group in a direct-ambient setting, such as you might perceive in a concert or recital hall. This entails putting the performing group clearly in the front three channels and relegating the rear channels basically to filling out hall ambience. If the nature of the music lends itself to various off-stage effects, such as an opera or a work that might call for an off-stage trumpet, then the rear channels may be used effectively for conveying primary directional information. In pop music, many of the great albums from the seventies and eighties have been very successfully remixed (or repurposed) into surround. The general plan is to keep lead vocals and primary rhythmic elements up front, using the rear channels for vocal and guitar fills. Ambience is usually distributed among all channels for the widest effect. Recordings of live performances are
a "natural" for surround, with audience reactions and applause wrapping around the listening area. Figure 22-3 shows the spatial layout for a typical pop surround mix. Nearly all level one elements will be presented from the front channels, along with primary rhythm, bass, and instrumental elements. Level two elements may be shared between the front channels and the rear channels. In particular, rhythm and vocal fills are very effective when presented from the rear channels. Ambience elements are most effective if they are presented in a U-shaped wrap-around array using only the left and right front, and both rear channels.
Figure 22-3. Surround sound plot for a pop group.
POP VOCALS IN SURROUND
Since surround sound gives the engineer and producer a real center channel to work with, there might be a temptation to put the vocalist only in the center channel. The problem with doing this is that a clever consumer might "deconstruct" the mix, removing the solo artist and using the remaining tracks for other purposes. This and other problems have come up in the last few years, and the solution shown in Figure 22-4 is recommended. Put the soloist in all three front channels as shown, with a predominance in the center. Ambience for the soloist can be omitted from the center channel for best effect.
THIS: Solo (0 dB) in the center channel only.
OR THIS: Solo (-6 dB) in the left channel, Solo (-3 dB) in the center channel, and Solo (-6 dB) in the right channel.
Figure 22-4. Treating a vocal soloist in surround.
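As a quick check on the recommended layout, the three-channel distribution preserves the soloist's overall level. A minimal sketch in Python, treating the channels as power-summing (a common mixing approximation; fully coherent signals can sum somewhat hotter):

    # -3 dB center plus -6 dB left and right sums to about unity power,
    # i.e., roughly the same overall level as 0 dB in the center alone.
    def db_to_power(db):
        return 10 ** (db / 10.0)

    total = db_to_power(-3) + 2 * db_to_power(-6)
    print(round(total, 3))   # ~1.004, or about 0 dB overall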
CLASSICAL SURROUND MIXES
A general approach to direct-ambient classical mixing is shown in Figure 22-5. The primary ensemble would be presented from the front three channels according to the techniques discussed in Chapter 21. Reverberation and ambience would then be added using the chosen microphone arrays or by panning ambient signals over the left front, right front, and both rear channels.
Figure 22-5. Surround sound plot for a traditional direct/ambient classical pickup (all orchestral textures across the front channels; ambience in the surrounding channels).
Operas and other large-scale works present many possibilities for distributing special effects over all five channels, as shown in Figure 22-6. Such elements as off-stage instrumental, choral, and vocal effects all appear in classical music, and you should take advantage of the opportunity to position them effectively.
Figure 22-6. Surround sound pickup for more complex classical works.
Remember that it was the motion picture industry that first introduced all of us to these audio techniques, so don't hesitate to copy a good idea from the latest Star Wars movie you might have seen. But always remember: in surround, phantom images work only among the front three channels; do not try to position phantom images along the sides, between the front and rear channels.
Chapter 23 MUSIC EDITING AND ASSEMBLY
INTRODUCTION
The advent of 1/4-inch (0.6 cm) magnetic tape recording after World War II brought the capability of editing, and a new creative world was opened to the record industry. Wrong notes could be corrected, noises removed, and the best sections of musical works joined together. Not everyone considered this to be a musical advantage, but there is no denying that recording standards, not to mention consumers' expectations, have been raised through the skillful editing and assembly of the best takes. The great majority of musical artists wouldn't have it any other way. There is virtually no instructive literature on music and speech editing. Of all the aspects of recording, editing has traditionally been learned through apprenticeship. Editing analog tape is mechanical, involving the cutting of tape and splicing the desired pieces together. With the advent of digital recording, tape cutting has gone away, and the new process is entirely electronic. With digital editing have come new freedoms for the editor, including many techniques impossible in the analog domain. In this chapter we will discuss the process of editing both analog and digital recordings, with special emphasis on studio techniques which facilitate the editing and assembly processes.
BASIC ANALOG TAPE EDITING TOOLS
Why are we discussing analog editing at the start of the 21st century? The fact is that many activities, radio broadcasting among them, still use a lot of analog tape. The vast majority of audio archives are analog, and future archival transfers of this material will involve the re-doing of old splices. The editing block, razor blade, splicing tape, and marker pencil are the basic tools of tape editing. Figure 23-1A shows a quarter-inch editing block of the type designed by Tall. The depression in the center holds the tape in a slightly concave position, and the slanted (45-degree) groove allows the razor blade to be drawn across the tape, cutting it smoothly. (The vertical groove is used for making rare "butt" splices.) Splicing tape is made of a thin plastic backing with a bleed-free adhesive that inhibits sticky splices.
Figure 23-1. Splicing blocks for quarter-inch tape (A) and 2-inch tape (B); example of an arrow cut splice on wide tape (C).
A fresh supply of single-edge razor blades should be kept on hand, and old ones discarded at the first sign of dullness. Care should also be taken that the blades are free of any remnant magnetization.
Multichannel tape formats can be edited as well, and there are editing blocks for all tape widths. A 45-degree cut is not applicable here, because of the audibility of the time gap between outside tracks as the splice moves over the playback head. For editing 2-inch (5-cm) tape, the splicing block shown in Figure 23-1B is normally used. Note that there are two slants at steep angles. This facilitates making a "vee" cut in the tape, as shown at C.
MUSIC EDITING
Assume that a piece of music has a noise in the master recording at a point indicated in Figure 23-2A. Assume that in an alternate take, the same point in the music was noise-free, as shown at B. The editor can insert a portion of the alternate tape into the master tape by identifying two workable editing points in the music, as shown at C.
Figure 23-2. Principle of editing. Master tape with noise in measure 16 (A); alternate tape without noise in measure 16 (B); locating an edit-in point before the noise and an edit-out point after the noise (C).
It is easiest to edit on musical attacks or chords in which all notes change at the same time. The editor will then slowly "rock" the tape back and forth over the playback head until the beginning of the incoming attack is identified. Then the editor will "back off" from the actual attack by moving the tape a very small amount to the left, and then carefully mark that point on the tape, using a fine wax marker. Then, going to the alternate take, the procedure is repeated.
The editor then places the master tape in the editing block so that the marked point is just over the diagonal slot, and then cuts the tape with the razor blade. This step is repeated with the alternate take. Then the outgoing point of the master take is placed in the editing block, and the incoming point of the alternate take is placed in the block and butted up against it. A short piece of splicing tape is placed over that point and firmly pressed to make a sure contact, as shown in Figure 23-3.
Figure 23-3. Making the splice.
At this point in the process, the editor plays the splice to make sure that it works well and that the transition is not audible as such. If the splice works, the editor then searches for another editing point to get back into the master take, and the whole process is repeated. Note that we have used a diagonal cut for this operation. This seems to work best by providing a slight "time smear" which helps to make the transitions less audible. At a tape speed of 15 ips (38 cm/sec), the 45-degree cut results in a time smear of a little less than 16 msec with 1/4-inch (0.6 cm) tape. At a tape speed of 7.5 ips (19 cm/sec), the time interval is just over 30 msec, which is reaching the range at which the ear might detect small timing differences in transitions between the left and right channels. For this reason, fine editing at a tape speed of 7.5 ips is difficult and not recommended. In ensemble playing, an attack may be marred by one player who enters a split second before the rest of the group. Often, the leading edge of the offending note can be cut out entirely (and not replaced). The amount of tape removed may be no more than 10 or 20 milliseconds, and the gap in the music will be virtually inaudible. The resulting attack will then sound very precise and musical. Here, a butt splice might work best. On a larger scale, pauses in playing can often be tightened up slightly for better musical continuity. This, in essence, is the editing process. As simple as it appears, it is very tricky. Even experienced editors will copy a recording and experiment with a difficult edit before attempting to execute it with the master tape. If a poor edit has been made, the tapes can be reassembled and the process begun over again. There is a limit to this, however, in that the tape will bear the bruises of all of your mistrials.
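The time smear figures follow directly from the geometry: a 45-degree cut traverses the full tape width, so the transition lasts roughly the tape width divided by the tape speed. A minimal sketch in Python (nominal quarter-inch width assumed):

    # Approximate time smear of a 45-degree splice: width / speed.
    width_mm = 6.35                      # 1/4-inch tape
    for ips in (15.0, 7.5):
        speed_mm_s = ips * 25.4
        smear_ms = 1000 * width_mm / speed_mm_s
        print(ips, "ips:", round(smear_ms, 1), "ms")
    # 15 ips: ~16.7 ms; 7.5 ips: ~33.3 ms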
Leader tape is plastic base material without oxide and is used at the head and end of tape programs. Normally, at the end of a program, the tape or room noise is faded down to the lowest value before the leader tape is spliced in. If this were not done, there would be an abrupt drop of room or cumulative tape noise as the leader tape passed over the reproduce head. At the beginning of a program, it is customary to use leader tape up to the start of music if the noise floor is fairly low. If the recorded noise floor is high, many editors will "fade in" during the noise, editing the leader tape up to the start of the fade-in. Details of this are shown in Figure 23-4.
Figure 23-4. Use of leader tape. At beginning and end of program (A); Fade-in of room sound or background noise (B).
PLANNING THE RECORDING SESSION FOR EDITING
An experienced recording producer is responsible for running the session in a way that will provide the editor the needed flexibility for assembling an acceptable recording. Producers generally use their own shorthand indications of things that may have gone right or wrong in the score. On subsequent takes, or insert takes, the producer works with the artist to "cover" all problems with small sections that can be readily edited into the master take. The producer then "blocks" the score (see Figure 23-5 for an example) and then gives it to the editor to work from. Often, there will be several choices for the editor, so that the one that works best (is least audible as an edit) can be used. Many producers are able to block as they go in the studio, while others prefer to do the blocking at another time. On some occasions, the artist may wish to be present for the blocking process. The skilled producer has a number of tricks which will make editing easier and more efficient. Some of them are:
Figure 23-5. Example of blocked musical score.
Overlapping takes to eliminate gaps for breathing
On occasion, a singer or wind instrument player will want to perform a long passage without taking a breath. While this is often difficult in performance, it may be necessary musically, and it can be done quite easily in recording. Instead of identifying one take as a master and another as alternate, there are in effect two master takes starting at different times, as shown in Figure 23-6.
Figure 23-6. Editing a breath sound out of the program.
Eliminating page turn and mute noises
Page turns can be noisy, as is putting on and removing mutes on string instruments. Page turn noises can be eliminated by having the players memorize the first one or two measures on the following page and playing through to that point without actually turning the page. Then, the page is turned, and the previous one or two measures are played from memory as the next take begins. The second segment of music is played with the overlap, and the two takes can be edited without the noise of page turns. A similar technique allows string players to put on and remove mutes without noise. Similar methods can be used to remove some of the noises of registration changes on organs or harpsichords. The beginning producer soon learns that a "cold" attack cannot be edited into a continuous performance, because there will be no continuity of reverberant sound into the incoming program at the editing point. Unfortunately, this is often required, and the addition of artificial reverberation during the transition can often make such edits workable.
SPEECH EDITING
Speech editing is greatly helped by proper production techniques in the studio. Often, a speaker can be taught to correct mistakes by simply pausing, and repeating a sentence or portion thereof. The resulting segments can then be easily assembled by the editor. Many talkers will inadvertently emphasize a passage of speech as they are correcting it, and both mood and inflection may be different from what was originally intended. The producer must be quick to identify and correct this. Experienced narrators generally do not fall into this trap. The editor must learn to recognize the sounds of consonants with the tape playing at half or quarter speed, since this will be of great help in identifying an edit point as the tape is rocked back and forth over the reproduce head. Normally, speech editing is done at the beginnings of consonants, so it is essential that they be identified correctly. Vowels and diphthongs are far more difficult to identify at slower speeds and will require more practice.
DIGITAL EDITING
The DASH and ProDigi reel-to-reel digital recording formats provided for normal razor blade editing, and the techniques presented thus far in this chapter are workable. A far better approach, however, is to transfer the audio material
to a hard disc format and then do all editing on a DAW. Everything you have studied before in this chapter will work on the DAW, along with many more advantages:
Change of level, channel balance, or EQ of the incoming segment
Often, a better edit can be made if the parameters of the incoming signal can be adjusted as needed.
Variable crossfade times
As you identify a new editing point you can adjust the incoming and outgoing crossfade times separately. Many times, a long incoming crossfade can be used to match slight differences in room ambience between two takes.
Variable in and out times
Many times, finding the right edit point requires fixing the out-point while leaving the in-point flexible, or vice versa. This can be done easily in any DAW environment and helps you refine the edit. Perhaps the greatest thing about digital editing is that it is completely non-invasive. You never have to worry about ruining a master tape! However, the more you know about razor-blade editing the quicker you'll adapt to digital techniques.
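What a DAW actually does at an edit point is blend the outgoing and incoming takes over the chosen fade length. A minimal sketch in Python of one common fade shape, the equal-power crossfade (the function name and the sine/cosine curves are illustrative; real workstations offer many fade shapes and independent in/out fade times):

    import numpy as np

    def equal_power_crossfade(outgoing, incoming, fade_len):
        # Outgoing take ramps down on a cosine; incoming ramps up on a
        # sine, so summed power stays roughly constant across the edit.
        t = np.linspace(0.0, 1.0, fade_len)
        blended = (outgoing[-fade_len:] * np.cos(t * np.pi / 2) +
                   incoming[:fade_len] * np.sin(t * np.pi / 2))
        return np.concatenate(
            [outgoing[:-fade_len], blended, incoming[fade_len:]])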
ASSEMBLY OF MUSIC FOR COMMERCIAL RELEASE
In this section we will deal with the problems of taking previously recorded archival program segments from various sources and compiling them for commercial release. Such matters as relative transfer level, signal processing, and even identification of the earliest sources will be discussed. As the new era of surround sound audio develops we can expect to see many older multitrack source tapes being "repurposed" for surround sound, just as stereo masters of the past two decades have been issued on CD.
SOURCE IDENTIFICATION
In compiling programs for reissue, the engineer often will have to work with material that is quite old, perhaps three decades or more. When dealing with record companies who archive their material under vault conditions, the
original tapes are usually easy to find and identify. However, where recorded material has over the years been sublicensed or sold, such identification may not be easy. More often than not, the original tapes were transferred, and the copies exchanged in the licensing procedure. Older tape formulations were not stable, and many older recordings have been ruined by progressive delamination of oxide from the base material. In such cases, the only working copy of the recording may be one or two generations removed from the master. On other occasions, the old tape will be playable, perhaps for only a few minutes at a time, due to the build-up of oxide on the reproduce head, with consequent "squealing" of the tape as it passes over the head in violin bow fashion. There may be a need for tape baking treatment, as discussed in Chapter 11. The following points should be noted:
Tape wind and reel condition
Examine the tape carefully, noting the quality of the wind. If it is smooth, then it is likely that the tape is wound tail out. It should not be rewound quickly, because this may cause damage if the tape has any tendency for adjacent layers to stick to themselves. Place the tape reel on the feed side of the machine and rewind it by playing it at the slowest tape speed. (The tape lifters may be engaged so that there will be no unnecessary head wear.) Stand over the tape during this process and be prepared to stop the machine on a moment's notice if anything goes wrong.
Splice condition
This is also a good time to check the condition of splices. Anything questionable here should be redone using fresh splicing tape. Some engineers use a slight amount of talcum powder to dry a sticky splice; this is not recommended, since the powder and adhesive will eventually collect on the heads and cause other problems. If a splice is sticky, redo it.
Tape equalization
Check the tape equalization carefully. Most 7.5- and 15-ips tape recordings made in the United States will require NAB equalization. The occasional rare 30-ips tape may, if made before the middle 1970s, be of unknown format, requiring equalization strictly by ear. Every attempt should be made to correct for the various CCIR (European) playback equalization curves by having switchable playback electronics for the purpose. If this cannot be done, the correction curves shown in Figure 23-7 can be used as a broad guideline for making transfers.
Figure 23-7. Correction for playing CCIR tapes with NAB equalization. Curve A: equalization to be added when transferring 15-ips tapes; Curve B: equalization to be added when transferring 7.5-ips tapes.
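If switchable electronics are not at hand, the needed correction can also be estimated from the HF time constants of the two standards. The sketch below is a rough Python approximation using only the commonly quoted HF time constants (NAB 50 µsec at both speeds; CCIR/IEC 35 µsec at 15 ips and 70 µsec at 7.5 ips) and ignoring the low-frequency terms, so treat the numbers as a guideline, not a standard:

    import math

    def shelf_db(f, tau_us):
        # HF de-emphasis associated with a playback time constant.
        w_tau = 2 * math.pi * f * tau_us * 1e-6
        return 10 * math.log10(1 + w_tau ** 2)

    def correction_db(f, tau_playback_us, tau_tape_us):
        # Boost (+) or cut (-) to add when a tape made to one HF time
        # constant is played with equalization for another.
        return shelf_db(f, tau_playback_us) - shelf_db(f, tau_tape_us)

    for f in (1000, 2000, 5000, 10000):
        print(f, round(correction_db(f, 50, 35), 2),   # 15-ips CCIR tape
                 round(correction_db(f, 50, 70), 2))   # 7.5-ips CCIR tape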
Track format and azimuth
Check track configuration and azimuth carefully. Many older tapes made before modern standards were adopted have odd track configurations which may result in noise and level differences between channels when played back on modern machines. The exact track configuration can be observed by treating the oxide surface with a small amount of carbonyl iron solution. This liquid has very fine particles of iron in suspension and can be painted over a small portion of the tape. The iron particles will align themselves with the part of the tape that has passed over the record head. Remove the solution when the determination has been made. If the tape appears to be far out of standard, you may need to adjust the height of the playback head. Azimuth determination must be done purely by ear if there is no HF azimuth setting tone on the tape. If the recording is stereo, sum the tracks and listen carefully for maximum HF output while the azimuth adjustment is varied.
DETERMINING THE EARLIEST USABLE SOURCE TAPE
Figure 23-8 shows a hierarchy of analog tape sources as normally used in the record industry. The earliest is of course the multitrack master (A), and this is today generally recorded on 2-inch tape with 24 channels. Technically speaking, this is the "master" tape, since it represents the earliest generation of recording. It is sometimes referred to as the "original session" tape. In a practical sense, however, the two-track tape (B) that was mixed from it is what most people in the industry would refer to as the "master." This tape represents the musical judgements made during the arduous mixdown process and
is the approved recording that both artist and producer intended to be heard via the two-channel media of LPs, compact discs, cassettes, and FM radio.
Figure 23-8. Normal hierarchy of analog master tapes.
In the pre-digital era, this tape was subjected to considerable use. It was the source of the initial disc transfers for LP manufacture, the production of duplicating masters for cassette duplication, and the production of next-generation tape copies for the international licensees of the record company. No wonder that it occasionally wore out and had to be remixed again. But many times the original producer, engineer, and artists were not available for this purpose, and the new mix was apt to be different from the earliest mixdown in one respect or another. As a hedge against this problem, an "EQ'd master" (C) was often made at the time of the original disc transfers. This tape reflected any last-minute changes in equalization or dynamic range control that the producer felt would make a better long-playing record. Specifically, this tape generally compensated for shortcomings in the LP record itself, and as such it was not necessarily intended to be used for later transfers. Unfortunately, the C masters are in great abundance, and the B masters are often relegated to the back shelf. It is worth every attempt to recover the B tapes and work from them. Digital technology has changed this; the new hierarchy is shown in Figure 23-9, and all copies are assumed to be clones of their sources.
Figure 23-9. Normal hierarchy of digital master tapes (multitrack source feeding CD mastering and SACD/DVD audio mastering).

MAKING A NEW ARCHIVAL TRANSFER
Today, great care is generally taken in making archival transfers of older tape sources. In the early days of the CD, B masters were routinely used for CD mastering, with much critical damnation. Whenever possible, the original producer and/or engineer should be contracted to supervise transfers from original multitrack masters. CBS Records, RCA Records, and Mercury Records have done this with classical recordings; EMI engaged original producer George Martin to supervise the CD transfers of the early Beatles four-channel recordings. If the engineer and reissue producer know exactly what they want in terms of equalization and signal processing, the final transfer may be made incorporating these changes. More often than not, it will be better to make a straight digital transfer. From this point, any signal processing and level adjustments can be made with little or no further degradation. This is also an appropriate time for decisions regarding any computer-based processing of the digital tape for the removal of noise. If it is decided not to do this, then the tape can be assembled into a finished program, making appropriate band-to-band level adjustments and any desired equalization changes.
SIGNAL PROCESSING
It is important that the engineer and producer in charge of program assembly work with an accurate monitor system and that they routinely "calibrate their ears" with a wide variety of recordings from many sources. In most cases, a living room ambience with consumer-type loudspeakers will be a better working environment than a control room with high-power monitors.
The engineer and producer should not try to be "too creative" as they transfer older program material. That is, there should be no attempt to make the recordings sound as though they had been made just yesterday. It will be more appreciated by the buyer if the sound is simply brought up to the best standards of its time. Identify the best sound recording of the set, and attempt to make the others match it through careful broadband equalization. The options of adding reverberation and processing mono program for pseudo-stereo presentation should be exercised only after careful consideration, since both are quite obvious and may offend many listeners.
ADJUSTING BAND-TO-BAND LEVELS IN A COMPILATION
It goes without saying that band-to-band levels in a compilation of recordings should be adjusted so that the listener will not have to adjust the volume control during the playing of a CD. Where all of the assembled sources are of the same, or like, musical group, matching levels is not a difficult matter. The problem arises when a sampler disc is assembled, which may include a wide variety of program sources ranging from those that are normally small in scale to those that are large. For example, a disc containing a movement of a string quartet following a loud symphonic excerpt poses a problem. If we assume that the loud excerpt peaks at full scale (0 dBFS), then the quartet excerpt should not peak any higher than about -8 to -10 dB, if it is to be heard in natural proportion. A good starting point for adjusting sources with different levels is to set their lowest levels on a par with each other, and then make adjustments up or down from that point, as required. This is a skill that takes some practice.
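The starting point suggested above is easy to express numerically. A minimal sketch in Python (the measured minima and the -40 dBFS reference are hypothetical values for illustration):

    # Trim each band so its softest passage sits at a common reference;
    # final adjustments are then made by ear.
    def trim_db(band_low_db, reference_low_db=-40.0):
        return reference_low_db - band_low_db

    for low in (-45.0, -38.0, -42.0):        # measured minima, dBFS
        print(low, "->", round(trim_db(low), 1), "dB of trim")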
TRANSFERS FROM 78-RPM DISCS
While a few established record companies still have their vaults intact and can even make new pressings from old metal stampers, most of the reissue market for old recordings depends on the cooperation and generosity of individual record collectors and various non-profit recording archives around the country. Transferring from old discs is a special art and requires a collection of custom-made styli and tone arms, a variable speed turntable, facilities for cleaning old discs, and various equalizers and filters.
If continuous performances are to be assembled from 78-rpm records, there will be a need to match the sound from the end of each side as it is edited into the start of the next side. This is often very tricky, since it involves matching the sound quality of different cutter heads as well as the normal differences between the response of disc recordings at outer and inner diameters. Some subtle equalization and level changing will be necessary to avoid this problem. A more difficult problem here is the noise level differences that may be evident between consecutive sides, due to differences in record processing quality. You should check out the many noise removal algorithms that are available for these purposes. Companies such as Sonic Solutions and Sadie have pioneered in this area, and some of their programs are truly remarkable. Overall, the art of making transfers from old discs rests in the hands of a few dedicated persons. Companies interested in having such work done should consult the Library of Congress, the Rodgers and Hammerstein archives at Lincoln Center, the Stanford University archives, and the University of Syracuse archives, since these organizations can identify skilled engineers in the art as well as provide sources for old recordings.
|>j
Pitch adjustment, stylus contour choice
"De-ticking;" noise elimination; convolution Final master tape i'l'l'l'T't't't'l't't't'lVl't'l
Figure 23-10. Flow diagram for transfer from disc sources
Chapter 24 RECORDED TAPE PRODUCTS FOR THE CONSUMER
INTRODUCTION
The first recorded two-track stereo tape product for the consumer was introduced in 1953. It was in the form of 1/4-inch-wide (6.4 mm) tape running at 7.5 in/sec (19 cm/sec) on a 7-inch (17.8 cm) reel-to-reel format. The programs were less than one hour, and the costs for longer tapes were in the range of 20 to 30 dollars. Such were the prices that could be charged when tape was the only medium for stereo. When the stereo LP was introduced in 1957 the original tape format quickly died out and was later replaced by a 1/4-inch four-track reel-to-reel format with stereo program recorded in each direction. The high costs of tape duplication could not keep pace with the improvements in stereo LP production, and eventually the four-track configuration gave way to the Philips Compact Cassette. The cassette began as a low-performance medium for dictating machines. Through many improvements over the years it has reached a degree of technical respectability unforeseen when it was introduced during the late 1960s. Today, the very mention of recorded tape brings to mind the cassette, and the format has exhibited remarkable staying power in this age of the compact disc.
PHYSICAL PROFILE OF THE CASSETTE
A major problem with reel-to-reel formats was the nuisance of threading the tape through the intricate guides of tape machines. The convenience of the cassette is that the tape is contained within a shell; the consumer simply inserts the shell into the player, and the machine does the rest, often including automatic reversing of the tape at the end of a side. Figure 24-1A shows a view of the cassette. A functional view is shown at B, and the track format is shown at C. The tape speed is 1 7/8 in/sec (4.75 cm/sec). In some record/playback machines, the tape must be removed from the machine and turned over to record or play on the alternate set of tracks. A few machines have dual drive capstans and require no removal from the machine.
Figure 24-1. View of the cassette (A); dimensions and internal details (B); track configuration for the cassette (C). (Tape width 3.8 mm; tape speed 47.6 mm/sec, or 1 7/8 ips.)
ELECTRICAL ASPECTS
There are two primary HF playback characteristics (70 and 120 µsec) for the cassette, and they are determined basically by the magnetic characteristics of the tape stock. Representative playback curves are shown in Figure 24-2. Many machines are capable of recording on standard tape, chromium dioxide tape, and various metal tape formulations. Since the recording equalization requirements are different for the three types of tape, recording equalization must also be switchable.
Figure 24-2. Standard playback curves for the cassette. Curve A: standard ferric oxide formulation (120 µsec); Curve B: chrome and metal formulations (70 µsec).
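The transition frequencies of the two curves follow directly from the time constants, f = 1/(2πτ). A minimal check in Python:

    import math

    def transition_hz(tau_us):
        return 1.0 / (2 * math.pi * tau_us * 1e-6)

    print(round(transition_hz(120)))   # ~1326 Hz, ferric (curve A)
    print(round(transition_hz(70)))    # ~2274 Hz, chrome/metal (curve B)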
The best overall performance is obtained with metal tape. This formulation requires bias drive capability in excess of the normal oxide and chromium formulations and must be used on machines specifically designed for it. Metal tapes can be played back, however, on standard machines. Dolby noise reduction is applicable to all tape formulations, and Dolby B-type noise reduction is widely used today. For the quality-conscious consumer, C-type and S-type noise reduction offer improved performance. Figure 24-3 shows a family of encoding curves for Dolby B, C, and S-type noise reduction. These curves show the maximum amount of HF boost that takes place at lower recording levels, but they do not necessarily indicate noise reduction action for specific program input conditions.
Figure 24-3. Dolby encoding action for low-level signals. B-type NR (A); C-type NR (B); Dolby S (C).
TAPE CHARACTERISTICS
Recorded wavelengths on cassette tape are one-eighth those of the same frequency recorded at 15 ips (38 cm/sec). At 20 kHz, the recorded wavelength at 15 ips is 20 microns; at 1 7/8 ips the recorded wavelength is 2.4 microns. Tape for cassette use is optimized for short wavelength recording, and this requires that the magnetic layer be quite thin in order to minimize recording losses at those short wavelengths. Since thin magnetic coatings tend to increase noise, much of the research in cassette tape over the last three decades has addressed higher signal output capability. In particular, metal formulations are excellent in this respect.
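Recorded wavelength is simply tape speed divided by signal frequency. A minimal check in Python:

    # wavelength = speed / frequency, converted from cm to micrometers
    def wavelength_um(speed_cm_s, freq_hz):
        return speed_cm_s / freq_hz * 1e4

    print(round(wavelength_um(38.1, 20000), 1))    # 15 ips: ~19 um
    print(round(wavelength_um(4.76, 20000), 1))    # 1 7/8 ips: ~2.4 um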
HX PRO
In cassette recording, high frequencies at high program levels reaching the record head can act as additional bias, and the resultant "over-biasing" at short wavelengths can lead to diminished HF program output. HX Pro is based on developments by Dolby Laboratories and Bang & Olufsen and is a method of controlling the primary bias signal during heavy modulation so that the effective bias current operating on the signal at the record head is more or less constant. HX Pro is provided by the circuit shown in Figure 24-4. The incoming audio signal is summed with a bias signal whose level has been determined by a voltage controlled amplifier (VCA), modulated by a signal derived from the record head itself. The filter-rectifier detects the amount of HF program and alters the bias oscillator output so that the net effective bias at the record head remains uniform. A tape recorded with HX Pro will play back on any cassette machine and will exhibit more accurate HF response at high recording levels. Using the best available tape, along with Dolby C or S-type noise reduction and HX Pro, the quality of sound produced by a carefully made cassette is excellent. Great care must be taken, however, that the input program spectrum does not stress the system at high frequencies.

Figure 24-4. Circuit for Dolby HX Pro (summing amplifier for signal and bias; VCA-controlled bias oscillator; filter-rectifier sensing the record-head signal).
HIGH-SPEED DUPLICATION OF CASSETTES
A large part of the success of the cassette derives from its relatively low duplication costs, and much of this is due to the time-saving aspects of high-speed duplication. Figure 24-5 shows a high-speed duplication system. The duplicating master is recorded at 7.5 ips (19 cm/sec) and runs in an endless loop tape bin, shown at the left in the photo. The master normally runs at 240 in/sec, resulting in a duplication ratio of 32 to 1. A 64-to-1 duplication ratio is possible if the duplicating master is recorded at 3 3/4 in/sec, with some reduction in quality. Higher duplicating speed ratios are obviously desirable because of the shorter manufacturing times involved.
Figure 24-5. High-speed duplicating system for cassettes. (Courtesy Gauss)
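The ratio arithmetic is straightforward, as the short Python sketch below shows; the slave-speed figure is implied by the numbers rather than stated in the text:

    # Duplication ratio = master playback speed / recorded program speed.
    master_speed = 240.0    # in/sec, running master in the loop bin
    program_speed = 7.5     # in/sec at which the master was recorded
    ratio = master_speed / program_speed
    print(ratio)            # 32.0
    print(ratio * 1.875)    # slave (copy) tape runs at 60 in/sec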
Of all the consumer media, the cassette is the only one that is duplicated in a manner strictly analogous to making a cassette in the home. The basic differences are purely those of scale and the fact that the tape is duplicated in bulk on hubs, called "pancakes," which are later loaded into empty cassette shells after quality checks have been completed.
A recent improvement in duplicating technology involves the use of digital storage of the entire program in high-speed digital memory. This permits a duplicating ratio up to 80 to 1, while bypassing another generation of analog tape copying altogether.
MASTERING FOR CASSETTE DUPLICATION
Most cassette duplicating facilities prefer to make their own duplicating (running) master tapes, since this gives them complete control over such matters as overall tape level and HF level in particular. An incoming program is auditioned and potential trouble points are noted. When the program is transferred to the duplicating master, limiters which have been previously calibrated will reduce the level of HF passages in the program that might be troublesome. The alternative to such treatment is to reduce the overall duplicating level so that all signals can be accommodated. For classical music, the spectral characteristic is generally as shown in Figure 8-4, and little if any HF treatment should be necessary. If the overall spectrum is flat, as is so often the case with rock program material, then limiting is required. HX Pro can be implemented at the duplicating stage just as it can be in consumer recorders.
Chapter 25 OPTICAL MEDIA FOR THE CONSUMER
INTRODUCTION
The introduction of the CD in the early 1980s heralded a new era of consumer enjoyment of recorded sound. While the LP had served so well for many decades, the lingering problems of ticks, pops, and inevitable record wear were liabilities. The emergence of a digital playback medium came after many years of development in the fields of high-speed computation and digital signal processing. At first the manufacturing costs were high and the players were expensive. Both of these have fallen dramatically, and the CD is as commonplace now as the LP and compact cassette have ever been. The format has been expanded into such applications as the CD-ROM (read-only memory) for data storage, the CD-I for interactive video game purposes, and the Photo-CD for storage of photographs. While the basic CD has an information capacity of about 700 megabytes, its close relative, the DVD, has a capacity of 4.7 gigabytes, nearly seven times greater. The DVD attains its higher capacity through a combination of finer pit size and a shorter wavelength laser. Many home computer systems are now provided with CD "burners" enabling users to copy audio CDs as well as make backup files on CD. The CD-ROM has virtually replaced the 3.5-inch floppy disc as the medium of information currency in the PC and Mac worlds. While DVD video has made great inroads as a carrier of video programs, DVD audio and super audio compact disc (SACD) formats are supporting high density audio for both stereo and surround sound (5.1) reproduction. These high-density audio formats are currently struggling to create a viable market segment for consumer surround sound. The SACD is available in a hybrid dual-layer form that has a standard CD layer, which will play on any conventional CD player, as well as a high density layer for carrying both high bit rate stereo and surround programs.
PHYSICAL PROFILE OF THE CD
Table 25-1 presents pertinent specifications for the compact disc, and Figure 25-1 shows an overall view of the CD. Program is recorded on one side only, in a spiral of pits of varying length. The pit surface is metalized
so that it will reflect light, and it is covered with a coating of clear plastic for protection. The pits are read sequentially with a fine laser beam. The disc rotates at a constant linear velocity, so the rpm is variable, as required. The disc plays from the inside to the outside.

Table 25-1. Specifications for the compact disc
Playing time: nominal limit of 80 minutes
Rotation: counterclockwise when viewed from readout surface
Rotational speed: 1.2-1.4 m/sec
Track pitch: 1.6 µm
Diameter: 120 mm
Thickness: 1.2 mm
Center hole diameter: 15 mm
Material: polycarbonate (1.55 refractive index)
Minimum pit length: 0.833 µm (1.2 m/sec) to 0.972 µm (1.4 m/sec)
Maximum pit length: 3.05 µm (1.2 m/sec) to 3.56 µm (1.4 m/sec)
Pit depth: approximately 0.11 µm
Pit width: approximately 0.5 µm
Standard wavelength: λ = 780 nanometers
Focal depth: ±2 µm
Digital quantization: 16 bits
Sampling frequency: 44.1 kHz
Frequency response: flat to 20 kHz
Signal-to-noise ratio: greater than 90 dB
Channel capacity: 2 (4 with reformatting)
Figure 25-1. View of the CD (120 mm overall diameter; 15 mm center hole).
OPTICAL DETAILS
A scanning electron microscope view of the pit structure of the CD is shown in Figure 25-2, and Figure 25-3 shows a simplified view of the laser reading assembly. The pit depth and laser wavelength are chosen so that light reflected from the metalized surface between pits will be constructively reinforced, while light reflected from the pit itself will be cancelled. Thus, it is possible to distinguish between the pit and the surrounding disc surface and recover digital data. No details of the tracking system are given here, but all sources of tracking error must be detected and compensated for during the playback process. These errors include departures from concentricity of the disc as well as up-down motions resulting from deviations from flatness of the disc. The digital readout from the mechanism is subjected to signal conditioning and error correction as discussed in Chapter 12. Many players have a serial digital output port (SPDIF format) enabling the player to be used with an external digital processor.
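The cancellation condition puts the pit depth near a quarter wavelength of the laser light as measured inside the polycarbonate. A minimal check in Python (the actual depth is set slightly away from the ideal quarter-wave value to aid tracking):

    # Quarter-wave depth: lambda / (4 * n), with n the refractive index
    # of the polycarbonate through which the beam travels.
    wavelength_nm = 780.0
    n = 1.55
    print(round(wavelength_nm / (4 * n)))   # ~126 nm, i.e., ~0.11-0.13 um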
Figure 25-2. Photomicrograph of pit structure (Courtesy University of Miami)
Figure 25-3. The basic laser reading system (laser, half-silvered mirror, optical system, and tracking/focus detection).
CD REPLICATION
The replication process for CDs is similar to that used for LPs. The basic process is shown in Figure 25-4. A photoresist coating is placed on an optically flat glass substrate, and a laser beam, fed by a properly formatted signal derived from the digital program source, exposes the photoresist material. The surface is then developed, and exposed areas are etched away. Metal deposition then takes place, and this initiates a series of repetitive negative-positive replication cycles, which are similar to conventional LP production.
Figure 25-4. Replication of CDs (laser recording of the photoresist; developing and metal deposition; electroforming of metal master, mother, and stamper; stamping; metalization and protective coating of the finished disc).
SUBCODE INFORMATION ON CDS
Along with the digital audio signal, time code and other subcodes are included. These include information on lead-in, lead-out, track numbers, playing times, copy inhibit, and the like. Figure 25-5 shows a typical program code sheet, which is sent to the disc manufacturer along with the digital master tape. SMPTE time code is used to indicate the start and end of program, along with the exact frame timings for the start of each band. In this case, the program is indexed for continuous music presentation from beginning to end. That is to say, there is continuous "room sound" between the actual banded segments of the program. Note that sub-banding is possible, enabling specific points within a musical movement to be accessed by the player. (Most pop/rock recordings have a segment of silence between bands.)
Figure 25-5. Typical programming sheet for a CD. (Courtesy Delos International)
Today, all CD-pressing facilities will accept program material that has been "burnt" by the mastering studio onto a standard CD-recordable (CD-R) blank.
PROGRAMMING FOR THE CD
The CD can accommodate a flat power bandwidth signal. That is, the medium requires no pre- or de-emphasis, as do analog tape and the LP disc. This means that the program signal does not have to be compressed, or otherwise limited in any part of its frequency range, in order to be accommodated on CD.
The system specifications do provide, however, an optional 10-dB recording pre-emphasis. When this is employed, a flag is entered into the digital word structure so that complementary de-emphasis will be automatically engaged during playback. It is the general consensus among mastering engineers that emphasis is not needed and may be a source of confusion at some later stage in the mastering process. You will generally find emphasis used only in the earliest CDs. Since the CD does not "know the difference" between 20 Hz and 20 kHz in a musical signal context, the mastering engineer does not have to worry about compression of high frequencies, or any of the other problems which have plagued analog media since their inception. This is both blessing and pitfall. During the early rush to get product into the marketplace, many record companies routinely used equalized copies of master tapes to make CDs. The problem here was that those tapes had in many cases been "shaped" for the stereo LP, with its special characteristics and limitations. When used for making CDs, the resulting sound was often too bright, and even harsh. This was probably the leading cause of critical objection to the earliest CD releases. Good recording practice dictates that the earliest studio sources be used, and remixed if necessary, in order to do justice to the CD medium.
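For reference, the optional emphasis is the standard first-order 50/15-µsec characteristic, whose asymptotic HF boost of 20·log10(50/15) ≈ 10.5 dB accounts for the 10-dB figure quoted above. A minimal sketch in Python:

    import math

    def emphasis_db(f, t1_us=50.0, t2_us=15.0):
        # Boost of a first-order 50/15-us pre-emphasis network.
        w = 2 * math.pi * f
        num = 1 + (w * t1_us * 1e-6) ** 2
        den = 1 + (w * t2_us * 1e-6) ** 2
        return 10 * math.log10(num / den)

    print(round(emphasis_db(20000), 1))   # ~9.5 dB boost at 20 kHz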
HIGH-DENSITY MEDIA
Figure 25-6 shows a section view of a hybrid SACD with a CD layer and a high density layer. The relative pit size will give you a good idea of the difference in scale between the standard CD and the high-density nature of the DVD. Note also that different laser wavelengths are used for the different pit sizes.
Figure 25-6. Laser reading of a dual-layer hybrid SACD disc. The HD pickup (650 nm wavelength, 0.6 aperture) focuses only on the high-density layer, which reflects the 650 nm beam but is penetrated by the 780 nm beam; the CD pickup (780 nm wavelength, 0.45 aperture) focuses only on the entirely reflective CD layer. (Data after Sony)
In the hybrid SACD, the base layer (top in Figure 25-6) of the disc is the standard CD layer. The high density layer is semi-transparent and is ignored by the 780-nm (nanometer) wavelength laser. When the high-density program is played, the 650-nm wavelength laser is directed at the semi-transparent layer. Figure 25-7 shows the program allocation on the two layers of a hybrid SACD. You will note that the high-density layer contains both stereo and surround sound program, along with optional visual on-screen data such as lyrics, program notes, and still pictures.
Figure 25-7. Program allocation on the two layers of a hybrid SACD. The CD layer carries PCM stereo; the high-density layer carries DSD stereo, DSD multichannel, and lyrics, graphics, and video. (Data after Sony)
APPLICATION OF LOSSLESS DATA PACKING IN HIGH-DENSITY MEDIA
Both DVD audio and SACD make use of lossless data packing in order to accommodate the extremely high data rates those systems require. Those of you who are familiar with ZIP file compression on computers will already know something about data packing. These systems make use of redundancy in digital signals to achieve a net overall space saving of about two to one. DVD audio uses a technique known as Meridian lossless packing (MLP), while the SACD high-density layer uses a proprietary algorithm.
THE MINIDISC
Sony's MiniDisc is very popular in both Japan and Europe as a recordable small-format optical disc. It hasn't fared well in the United States, and some readers may even be unaware of it. A profile of the MiniDisc is shown in Figure 25-8. Like a computer 3.5-inch floppy disc, the MiniDisc is self-contained in a small shell for protection. The medium makes use of a technique called adaptive transform acoustic coding (ATRAC) that falls under the general category of perceptual coding. Other examples of perceptual coding are found in MP3 and in a number of coding systems, such as AC-3, DTS, and SDDS, that are used for motion picture digital soundtracks.
Figure 25-8. Profile of the MiniDisc.
PERCEPTUAL CODING
Perceptual coding algorithms take advantage of psychoacoustical masking to "simplify" the audio signal's data requirements. (You will also hear the term low bit rate transmission to describe perceptual coding.) Our ears analyze sounds in small frequency intervals known as critical bands. Over most of the MF and HF ranges the width of a critical band is roughly a third of an octave. Within a critical band, softer tones tend to be masked by louder ones and may not be heard as such. Figure 25-9 shows a typical example. A tone of 1 kHz at a listening level of 60 dB SPL will exhibit masking thresholds at frequencies above and below the tone as shown.
A secondary tone of about 1.6 kHz lies just on the threshold, as indicated. That tone will not be heard as such, nor will any other tones that lie on or below the threshold curve. Rather than recording the 1 kHz signal using a full 16- or 24-bit digital word, we can get by in this case with only about 10 bits.
Figure 25-9. Masking thresholds in perceptual coding.
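The 10-bit figure follows from the familiar rule that each bit of quantization buys about 6 dB of signal-to-quantizing-noise ratio. A minimal sketch in Python (the 6.02 dB/bit rule is standard; the masked example on the second line uses hypothetical numbers):

    import math

    def bits_needed(signal_db_spl, noise_floor_db_spl):
        margin = signal_db_spl - noise_floor_db_spl
        return math.ceil(margin / 6.02)

    print(bits_needed(60, 0))    # 10 bits for a 60 dB SPL tone
    print(bits_needed(60, 35))   # ~5 bits if masking raises the floor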
In perceptual coding, the program is broken down into sub-bands, and a frequency analysis is made in each of about 25 to 30 bands covering the entire program frequency range. A determination is then made, depending on the levels of the signals in each band, of just how many bits may actually be required to record that band. The analysis process is repeated at time intervals of about 2 to 10 milliseconds, and bits are allocated as needed. A generalized block diagram of the basic recording and playback process is shown in Figure 25-10. In the encoder there is a primary path that includes frequency division into sub-bands. A side path performs a fast Fourier transform (FFT) on the input signal, and this information is fed to a psychoacoustic "look-up" table that determines the amount of masking provided in each sub-band. This information is then fed to the quantizing section, which allocates bit depth in each band as needed. The signal is then formatted, along with error correction data, and fed to the recorder. The decoding function is basically the inverse of the coding function. The sub-bands are individually decoded, corrected, and recombined. This, in a nutshell, is how the basic code-decode system works; obviously there are many subtleties in actual operation, and the systems have been fine-tuned through many listening tests to be psychoacoustically "transparent."
Figure 25-10. Signal flow diagram for perceptual coding and decoding.
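The quantizing step described above can be caricatured in a few lines. A minimal sketch in Python of per-band bit allocation for one analysis frame (the function, the 6.02 dB/bit rule of thumb, and the example numbers are illustrative, not any specific codec's algorithm):

    import numpy as np

    def allocate_bits(band_levels_db, masking_db, db_per_bit=6.02, max_bits=16):
        # Quantize each sub-band just finely enough that its quantizing
        # noise stays below the masking threshold from the side analysis.
        margin = np.asarray(band_levels_db) - np.asarray(masking_db)
        bits = np.ceil(np.clip(margin, 0, None) / db_per_bit).astype(int)
        return np.minimum(bits, max_bits)

    # Hypothetical frame: four of the ~30 sub-bands
    print(allocate_bits([60, 42, 15, 70], [10, 30, 25, 20]))   # [9 2 0 9]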
The net result for stereo music is an advantage of about four to one in bit savings as compared with standard transmission. The various motion picture low bit rate systems have even greater bit saving ratios due to further simplifications resulting from a joint multichannel analysis of the signal. In some cases the data reduction ratio may approach 10 or 12 to 1, depending on the specific nature of the audio program. MP3, the scheme used for music transmission over the internet, often operates at even higher ratios.
Chapter 26 THE STEREO LONG-PLAYING (LP) RECORD
A BRIEF HISTORY
The stereo LP has rapidly declined in sales due to the immense success of the compact disc. From 1947 to the present day, however, the LP has represented the longest period of compatibility between product and players in the history of consumer audio, exceeding the era of the 78-rpm disc (1895 to 1947). As the 21st century gets under way, the LP continues to hold its ground in the disk jockey driven world of dance music, where the rapid manual cueing capability of the LP is still a very important factor. Technologically, the disc is an outgrowth of Edison's original cylinder medium, which dominated recording during the last quarter of the 19th century. Berliner's disc rapidly overtook the cylinder in the early years of the 20th century, primarily because of enormous manufacturing advantages. Until the late 1920s, recording and playback remained an acousto-mechanical process. At that time, Maxfield and Harrison developed electrical recording, and the major problems of bandwidth and distortion were solved. In 1947, Peter Goldmark of CBS combined the advantages of a quiet vinyl plastic pressing material with microgroove geometry and 33 1/3 rpm to produce playing times up to about 25 minutes. The stereo LP had been invented conceptually by Blumlein in the early 1930s, when he demonstrated that two independent modulation channels could be cut at ±45 degrees to the surface of the disc. But it wasn't until 1957 that the stereo disc became a commercial success, and during the golden era of the stereo LP (1960 to 1985) many significant improvements were made in the electromechanical aspects of both cutting master discs and playing the pressed discs. As a result, the medium attained audiophile status, compromised only by occasional pressing problems along with ticks and pops. Consumers were inclined to overlook these defects because of the high level of audio quality that was otherwise obtained. The CD and other optical media have been in the ascendancy for about 20 years, but the LP is far from gone. While the great recordings of the past have been reissued in various new formats, there are many other relatively obscure LP recordings which are not likely to be reissued in any form. For this reason alone, there will be LP enthusiasts for decades to come.
As you proceed through this chapter you will develop an appreciation of just how complex and highly engineered the entire disc cutting and playback process has become.
PHYSICAL PROFILE OF THE LP DISC
Figure 26-1 shows physical details of the LP. The diameter is 12 inches (305 mm), and the maximum thickness in the center (label) portion is 0.15 inches (3.8 mm). The recorded portion of the disc is thinner than the center and outer diameter; this contouring saves vinyl material and provides some degree of protection for the grooves when the discs are stacked on a record changer. The various starting and stopping diameters of recording are standardized, as are the pitches of lead-in and lead-out grooves. (Pitch here refers to the number of grooves per unit radius, not the frequency of a signal.)
STEREO MICROGROOVE GEOMETRY AND REFERENCE LEVELS

Figure 26-2 shows the basic movements of the cutting and playback styli in the plane of the master disc. Lateral motion (A) results from identical signals fed to the 45-degree/45-degree cutting coils. Motions at B and C represent right channel only and left channel only, respectively. The motion shown at D results from an anti-phase relationship between the two input signals. Figure 26-3 shows a scanning electron microscope view of typical stereo modulation. Note that each groove wall is independently modulated. The outer groove wall of the stereo disc is modulated by the right channel and the inner groove wall by the left channel.

The cutting stylus is chisel shaped and is made of sapphire or diamond. The nominal width of an unmodulated groove is about 0.0025 inches (0.064 mm). During heavy modulation, the groove width and depth may increase by a factor of about three, while on upward swings of the cutting stylus the width can be as small as 0.001 inch (0.025 mm).

In the early days of disc recording, a wax formulation was used as the recording medium. Since the early 1940s, a lacquer formulation on an aluminum substrate has been used. It is customary to use a stylus heated by a small coil to facilitate cutting the lacquer material and to reduce noise that would otherwise be generated in the process.
Stereo Microgroove Geometry and Reference Levels
373
Figure 26-1. Physical views of the LP. Surface view (A); section view (B). Key dimensions: 12 inches (305 mm) overall diameter; 11.5 inches (292 mm) maximum recorded diameter; 4.76 inches (121 mm) label area.
The normal zero reference level in stereo disc cutting is defined as a lateral peak stylus velocity of 7 cm/sec at 1 kHz. On a per-channel basis, this corresponds to a peak velocity of 5 cm/sec.
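Because the two channels are cut at ±45 degrees, a purely lateral motion projects equally onto both channel axes, each reduced by a factor of the square root of two. As a quick check of the figures above: 7 cm/sec ÷ √2 ≈ 4.95 cm/sec, which rounds to the stated 5 cm/sec per-channel reference.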
Figure 26-2. Basic stylus motions shown in section view. Lateral (mono) motion (A); right channel only (B); left channel only (C); vertical motion (D).
RECORDING AND PLAYBACK EQUALIZATION

The input signal to the record head is pre-emphasized 34 dB over the range from 30 Hz to 20 kHz. The need for pre-emphasis was established early in the art, when it was determined that high signal levels at low frequencies tended to cause excessive groove excursion, thus consuming valuable recording space. High-frequency signals, on the other hand, tended to get lost in the basic noise level of the medium. With a reduction of low-frequency signals and a corresponding boost of high frequencies, the stereo disc exhibits a desirable trade-off between adequate playing time and reproduced noise.
Recording and Playback Equalization
375
Figure 26-3. Scanning electron microscope photograph of stereo grooves, 100-times magnification. Note the independent modulation of each groove wall. (Courtesy Victor Company of Japan)
During the early years of the LP, there was no clear agreement among manufacturers regarding signal pre-equalization, and the proliferation of recording curves of the day is shown in Figure 26-4.
Figure 26-4. Disc recording pre-emphasis characteristics of the early 1950s.
Today, the universal playback curve for both stereo and mono LPs is as shown in Figure 26-5. This is literally a worldwide standard and is known in the United States as the Recording Industry Association of America (RIAA) de-emphasis, or playback, curve.
Figure 26-5. The RIAA playback de-emphasis characteristic. The three principal transition frequencies are: 50 Hz (3180 µs), 500 Hz (318 µs), and 2120 Hz (75 µs). An optional roll-off below 20 Hz reduces effects of turntable rumble.
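The playback curve can be computed directly from the three time constants given in the caption. The following is a minimal sketch, not a production filter design: it evaluates only the magnitude of the analog transfer function, normalizes to 0 dB at 1 kHz, and omits the optional infrasonic roll-off. The function name and output formatting are our own.

```python
import math

# RIAA time constants, from the Figure 26-5 caption
T1, T2, T3 = 3180e-6, 318e-6, 75e-6   # seconds

def riaa_playback_db(freq_hz, ref_hz=1000.0):
    """RIAA de-emphasis magnitude in dB, normalized to 0 dB at 1 kHz.
    The record pre-emphasis curve is the mirror image of this response."""
    def h(f):
        s = 2j * math.pi * f
        return (1 + s * T2) / ((1 + s * T1) * (1 + s * T3))
    return 20 * math.log10(abs(h(freq_hz)) / abs(h(ref_hz)))

for f in (20, 50, 500, 1000, 2120, 20000):
    print(f"{f:>6} Hz: {riaa_playback_db(f):+6.1f} dB")
```

As a check, the sketch gives about +19 dB at 20 Hz and −20 dB at 20 kHz relative to 1 kHz, in line with the published RIAA characteristic.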
OVERLOAD IN DISC SYSTEMS

Since the disc cutting and playback processes are mechanical, there are fundamental geometrical limits imposed on the system. These fall into three areas:
Displacement overload

At low frequencies, excessive signal input can result in one groove intersecting the preceding one, resulting in overcut. The remedy for this is to increase the recording pitch so that the extra space needed by the excessive modulation is provided. The trade-off here is playing time; if a long recording is required, the overall signal level will have to be lowered to accommodate it. In severe cases, the cutting stylus may lift out of the cutting medium entirely (so-called cutter lift) if there is an excess of anti-phase (vertical) program content at low frequencies.
Slope overload

In the midrange, high signal levels may cause such rapid side-to-side (lateral) motion that the back facets of the cutting stylus hit the newly cut groove and scuff it. As a practical matter, a stylus motion of 45 degrees with respect to the plane of modulation is considered an upper limit. Details are shown in Figure 26-6.
Figure 26-6. Cutting styli. View of stylus showing major surfaces, including the burnishing facets (A); section view showing cutting action (B); normal view of disc during cutting action (C).
Curvature overload

At high frequencies it is possible for an overdriven stylus to cut a groove whose curvature exceeds that of the playback stylus tip. When this is encountered in the playback process, the result is high distortion and accelerated record wear. The effect is more problematic at inner recording diameters, since recorded wavelengths diminish as the tangential groove velocity falls toward the center of the disc. As an aid to smooth cutting, the sides of the cutting stylus have very thin burnishing facets, which help to form a smooth groove wall. In modern disc cutting systems these forms of overload are monitored and compensated through careful signal limiting.
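To see why inner diameters are the more vulnerable, note that the recorded wavelength is simply the tangential groove velocity divided by the signal frequency. A minimal sketch follows; the start and stop diameters are taken from the recorded band of the LP (11.5 and 4.75 inches), and the function name and units are our own.

```python
import math

def recorded_wavelength_um(freq_hz, diameter_in, rpm=100/3):
    """Recorded wavelength = tangential groove velocity / frequency."""
    circumference_m = math.pi * diameter_in * 0.0254   # groove circumference
    velocity_m_s = circumference_m * rpm / 60.0        # tangential velocity
    return velocity_m_s / freq_hz * 1e6                # wavelength in micrometers

for d in (11.5, 4.75):
    print(f"{d} in diameter: 10 kHz wavelength = {recorded_wavelength_um(1e4, d):.0f} um")
```

At 10 kHz the wavelength falls from roughly 51 µm at the outer diameter to about 21 µm at the innermost diameter, at which point it is comparable to the bearing radius of a typical playback tip.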
COMPLEMENTARY PREDISTORTION OF THE STEREO SIGNAL

The simplest predistortion of the stereo signal comes from mechanically tilting the cutting stylus so that it moves in a non-perpendicular direction relative to the disc, matching the motion of the playback stylus. The action is shown in Figure 26-7, and the degree of mechanical tilt is in the range of 20 to 25 degrees. If this precaution were not taken, the mismatch between cutting and
playback vertical angles would produce undesirable self-modulation of the signal during the playback process.
Figure 26-7. Controlling the cutting angle. Typical playback cartridge tracking angle (A); cutter heads tilted to match (B and C); typical input signal is skewed as it is cut into the groove, and playback tracking angle matches cutting angle (D).
Another improvement can be made by further predistorting each groove wall so that the discrepancy between the fine-edged cutting stylus and the spherical-tipped playback stylus is compensated. Referring to Figure 26-8, if we wish to replicate an accurate sine wave, we must predistort the signal cut into the disc (shown by the solid curve). This signal has been distorted electrically so that, when traced by a playback tip of standard radius, the trajectory of the playback stylus will produce a sine wave (shown by the dashed curve).

Figure 26-8. Tracing simulation. The cutting stylus must be fed a predistorted signal. For a sine wave input, as shown, the cutting stylus must execute the solid curve. The spherical radius of the playback stylus will then trace a replica of the sine wave (dashed curve).
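The tracing geometry can be simulated numerically. The sketch below is illustrative only: it models one groove wall as a sine profile, sweeps a circular tip across it to find the path the playback stylus actually follows, and reports the tracing error that predistortion would be asked to cancel. All dimensions are assumed round numbers, not standards.

```python
import numpy as np

wavelength = 50.0    # um, recorded wavelength (assumed)
amplitude = 5.0      # um, groove modulation amplitude (assumed)
tip_radius = 18.0    # um, spherical bearing radius of playback tip (assumed)

x = np.linspace(0, 2 * wavelength, 2000)
groove = amplitude * np.sin(2 * np.pi * x / wavelength)

def traced_path(x, surface, r):
    """Path followed by the bottom of a circular tip of radius r resting
    on the surface: at each position, the tip rides on whichever nearby
    point of the surface holds it highest."""
    out = np.empty_like(surface)
    for i, x0 in enumerate(x):
        mask = np.abs(x - x0) <= r
        out[i] = np.max(surface[mask] + np.sqrt(r**2 - (x[mask] - x0)**2)) - r
    return out

traced = traced_path(x, groove, tip_radius)
error = traced - groove   # the distortion a spherical tip introduces
print(f"peak tracing error: {error.max():.2f} um")
# To a first approximation, complementary predistortion cuts (groove - error)
# so that the traced path, rather than the cut groove, is the intended sine wave.
```

The closing comment states the idea of Figure 26-8 in numerical terms: subtracting the computed error from the drive signal makes the traced trajectory, rather than the cut groove, replicate the input.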
CUTTING HEAD DESIGN

All modern stereo cutting heads are of the moving coil type and make use of inverse feedback to reduce distortion. The basic feedback process is shown in Figure 26-9A.
Figure 26-9. Principle of motional feedback. Cutting stylus motion is monitored via a feedback coil, and any errors are corrected electrically (A); without feedback, the moving system has a pronounced resonant peak (B); phase response of system with feedback (C); with feedback, the resonance is damped (D); drive current requirement with feedback (E).
Without feedback, the motion of the stylus reflects its primary resonance, f0, as shown at B. When inverse feedback from the sensing coil is returned at negative polarity, the signal can be corrected so that its amplitude response will be flat (D). The process produces phase and amplitude response in the signal drive current as shown at C and E.

A section view of the Neumann SX-74 cutting head is shown in Figure 26-10A. Note that the cutting stylus cantilever is actuated at angles of ±45 degrees relative to the disc surface. The Ortofon design shown at B uses an isosceles T-bar to translate vertical motions of each coil into 45-degree groove motion. Figure 26-10C presents a view of the Neumann SX-74 cutter head as seen slightly below the horizontal plane. The front-back dimension of the cutter head is approximately 2.5 inches (6.4 cm).
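The effect of motional feedback on the resonance can be illustrated with a simple second-order model. This is a minimal sketch under assumed values (resonance frequency, Q, and loop gain are all invented for illustration); real cutter-head loops are more elaborate.

```python
import numpy as np

f = np.logspace(1, 4.3, 400)            # 10 Hz to 20 kHz
w = 2 * np.pi * f
f0, Q, loop_gain = 1000.0, 8.0, 10.0    # assumed values
w0 = 2 * np.pi * f0

# Second-order moving system: stylus response per unit drive
P = w0**2 / (w0**2 - w**2 + 1j * w * w0 / Q)
# Closed loop with inverse (negative) feedback from the sensing coil
H = P / (1 + loop_gain * P)

peak_open = 20 * np.log10(np.max(np.abs(P)))
peak_closed = 20 * np.log10(np.max(np.abs(H)) / np.abs(H[0]))
print(f"resonant peak without feedback: {peak_open:.1f} dB")
print(f"residual peak with feedback:    {peak_closed:.1f} dB")
```

With the assumed numbers the 18-dB open-loop peak at resonance essentially disappears; the price, as panel E of Figure 26-9 suggests, is that the drive amplifier must supply correspondingly more current away from resonance.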
PLAYBACK STYLUS-GROOVE RELATIONSHIPS

Figure 26-11A shows a normal view of a recorded disc. The section view of the cutting stylus is shown at A. Further section views show three kinds of playback styli: line contact at B, elliptical contact at C, and conical contact at D. The details in Figure 26-12 show contact areas and section views in the vertical plane for the three kinds of playback styli. The best reproduction generally results when the overall area of contact is large and when the contact dimension is small in the direction of groove motion. These considerations favor the Shibata and elliptical stylus designs, as opposed to the simple conical tip designs. The large contact area results in minimal record wear, and the small contact dimension in the direction of groove motion results in improved resolution of short recorded wavelengths.
Figure 26-10. Section views of two cutting heads showing drive and feedback coil locations. Neumann SX-74, courtesy Georg Neumann GmbH (A); Ortofon, courtesy Ortofon (B); photo of Neumann SX-74 cutting head (C).
Figure 26-11. Playback stylus and groove relationships. Trajectory of cutting stylus (A); trajectory of Shibata playback stylus (B); trajectory of elliptical playback stylus (C); trajectory of conical stylus (D). (Data courtesy Audio Technica)
Figure 26-12. Details of stylus-groove contact during playback. (Courtesy Audio Technica)
DISC TRANSFER SYSTEMS

A disc transfer system is a specialized audio chain whose function is to transfer a master tape to disc with the required signal processing. Ideally, a master tape that reaches this stage in production has been carefully prepared and should not require extensive last-minute treatment. This may not always be the case, and a flexible transfer system should be able to provide the following functions:
1. Signal Processing
   a. Comprehensive equalization and filtering; all functions easily resettable.
   b. Compression and limiting; all functions easily resettable.
2. Signal Routing
   a. Stereo input to stereo disc transfer mode.
   b. Stereo input to mono disc transfer mode.
   c. Left or right input to mono disc transfer mode.
   d. Patching facilities in and out of major blocks in the system.
3. Monitoring and metering points
   a. Tape output.
   b. Preview (advance head) output.
   c. Signal processing output (cutter drive input).
   d. Cutter feedback signal.
   e. Disc playback.
4. Signal conditioning
   a. Tracing simulation.
   b. Slope and curvature limiting.
   c. LF vertical limiting.
5. Calibration facilities
   a. Provision for constant velocity cutting and playback above 500 Hz.
   b. Noise weighting filter and gain adjustment for reading low-level noise signals.
6. Mechanical functions
   a. Flexibility for accommodating various tape speeds, disc diameters, and disc speeds.
   b. Comprehensive control of groove pitch and depth of cut (including duplication of signal processing elements in the preview system).

A block diagram for a complete disc transfer system is shown in Figure 26-13.
Figure 26-13. Signal flow diagram for a complete disc-cutting system.
VARIABLE PITCH AND DEPTH CONTROL

If you take a careful look at an LP recording you will see that groove spacing varies throughout the disc. Variable groove pitch and depth control optimize both playing time and level capability on longer discs through the efficient use of space on the disc. Essentially, grooves are made narrower and spaced closer together when the signal level is low, and they are deepened and spaced farther apart as the signal level increases. It is a complex task to do this efficiently, and early methods of pitch and depth control were fairly coarse and rudimentary in their operation.
In the most advanced cutting systems, the groove depth requirements are determined by the lateral and vertical components of the program being transferred. The information for this is picked up by the preview head on the tape playback transport and stored until needed half a revolution later. Groove pitch requirements are determined by three factors:

1. Left groove wall requirements. The left channel program input determines this need, and the information is stored for one revolution so that the right wall of the following groove will not interfere.
2. Right groove wall requirements. The right preview signal determines this need.
3. Pitch change as a result of depth increase. The preview vertical component determines this need; the information is stored for one-half revolution until it is needed.

The above actions are indicated in Figure 26-14A, and typical performance is shown at B. A simplified sketch of the pitch logic is given below.
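The look-ahead bookkeeping can be sketched in a few lines of code. This is not Neumann's actual control algorithm, only an illustration of the logic just described: the spacing allotted to each revolution must cover its own left-wall demand and the right-wall demand of the following revolution, which is known early via the preview head. All names and constants are invented for illustration.

```python
def pitch_schedule(left_peaks, right_peaks, base_spacing_um=85.0, k_um=60.0):
    """Per-revolution groove spacing from per-revolution peak modulation.
    left_peaks[i]  : left-wall demand of revolution i (program signal)
    right_peaks[i] : right-wall demand of revolution i (preview signal)"""
    spacing = []
    for i in range(len(left_peaks)):
        left_demand = left_peaks[i]
        # right wall of the *next* groove, known in advance from the preview head
        right_demand = right_peaks[i + 1] if i + 1 < len(left_peaks) else 0.0
        spacing.append(base_spacing_um + k_um * max(left_demand, right_demand))
    return spacing

# One loud revolution on either channel widens the spacing around it:
print(pitch_schedule([0.1, 0.9, 0.1, 0.1], [0.1, 0.1, 0.1, 0.8]))
```

In the example output, the second revolution is widened by its own left-wall modulation, and the third by the preview of the fourth revolution's right wall, just as the caption to Figure 26-14 describes.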
DIAMETER LOSSES

An additional effect of mechanical cutting of the disc is the attenuation of HF signals at inner cutting diameters. The reduced tangential velocity of the groove at inner diameters increases wavelength-dependent losses, as shown in Figure 26-15. Here, we have plotted the losses versus recorded diameter at several frequencies for three disc sources: a vinyl pressing, a master lacquer, and a metal mother (a step in the replication process). The losses are due to several phenomena. Scanning losses result from the finite width of the burnishing facets on the cutting stylus and of the playback stylus contact in the direction of groove motion. Deformation losses result from the plasticity of the lacquer or vinyl recording medium. The master lacquer recording and the pressing behave very much alike as regards these losses; the metal mother, however, is virtually free of deformation effects.
Figure 26-14. Variable pitch and depth. In the Neumann VMS-70 cutting system, three signals, as shown at (A), are used to determine pitch and depth requirements. Typical action of the system is shown at (B). The right channel modulation in groove 2 requires a decrease in pitch substantially ahead of the modulation so that there will be no overcut into groove 1. The decrease in pitch must be maintained for one revolution so that groove 3 can be accommodated without overcut. Modulation on the left wall of groove 4 does not require a preview signal for proper pitch decrease; the signal that controls this is the left program input. Again, the decrease in pitch must be maintained for one revolution in order to make room for groove 5. (Courtesy Georg Neumann GmbH)
Figure 26-15. Diameter losses in disc recording, plotted at 7, 10, and 14 kHz over recording diameters from 11.5 to 4.75 inches for a vinyl pressing, a master lacquer, and a metal mother.
THE CUTTING PROCESS

When a stereo master tape is received at the cutting facility, the lacquer mastering engineer runs the tape down, noting those sections that may be extremely loud. Any basic signal processing, such as limiting or equalization, may be noted at this stage. The playing time of the side is carefully noted, since it will influence the actual cutting level on the disc. Banding requirements are also noted, along with any anticipated band-to-band changes in signal processing. An experimental cut is often made at this stage to ensure that all settings are workable.

In preparation for mastering, the engineer places a 14-inch-diameter lacquer master blank on the lathe's turntable. The outer portion of the blank can be used for a short test cut to ensure that the hot stylus current and stylus depth of cut are within standards. The freshly cut groove is examined with a microscope to ensure that everything is working correctly.

As the final cut is made, the engineer lowers the cutter head into the rotating lacquer. The "chip" is the portion of the lacquer material that is actually cut from the disc; it must be immediately picked up by the suction tube, otherwise it may become ensnared in the stylus-heating coil assembly. The styli are made of sapphire or diamond material and are good for a number of cuts. If everything goes well, the mastering engineer examines the freshly cut disc
with the microscope to ensure that there are no cutter lifts or overcut that may cause processing problems in the pressing plant. If the master disc passes this test it is carefully packed and sent to the plant. The approximate playing times on an LP side as a function of average cutting pitch are given below:

Pitch (lines per inch)    Approximate playing time
300                       30:20
250                       25:20
200                       20:15
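The table can be reproduced from simple geometry: the recorded band is the difference between the starting and stopping radii, the number of groove lines is that band width times the pitch, and each line takes one revolution at 33 1/3 rpm. A minimal sketch, assuming the standard 11.5-inch start and 4.75-inch stop diameters:

```python
def lp_playing_time_min(pitch_lpi, start_diam_in=11.5, end_diam_in=4.75, rpm=100/3):
    """Approximate playing time of one LP side, in minutes."""
    band_in = (start_diam_in - end_diam_in) / 2    # radial width of recorded band
    revolutions = band_in * pitch_lpi              # one groove line per revolution
    return revolutions / rpm

for p in (300, 250, 200):
    t = lp_playing_time_min(p)
    print(f"{p} lines per inch: {int(t)}:{round((t % 1) * 60):02d}")
```

The computed times (30:22, 25:19, and 20:15) agree with the table above to within a few seconds.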
You can easily appreciate that the disc mastering engineer is a person of considerable skill and mechanical sensibilities who has to deal with fussy producers and artists on the one hand and with inspectors at the pressing plant on the other.
RECORD PRESSING

The processing of a master lacquer disc through the various metal-to-metal replication operations and finally to the vinyl pressing is a very intricate one involving many disciplines, including metal plating, plastic formulation, and plastic forming. The basic operations in the three-step process are shown in Figure 26-16.

The master lacquer is carefully inspected, cleaned, sensitized, and "silvered" by reduction of silver nitrate on its surface. This renders it electrically conductive. It is then preplated at low electrical current density to build up a thin nickel surface, which is a negative representation of the lacquer surface. Then the current density is increased to produce a substantial backing of nickel. The metal negative so produced is called the metal master. It is further treated so that a metal mother can be grown from it. The mother is a positive and can be played to check for problems in transfer. Minor defects can often be repaired. Finally, the mother is plated and a stamper is produced. This is a metal negative part which is used for final production.

The stampers are ground smooth on their backside so that they will fit snugly into the press. The edges are crimped and the parts carefully centered in the press. The pressing cycle begins by placing a charge of hot vinyl plastic between the stampers, along with the labels.
Figure 26-16. The three-step disc replication process.
Pressure and heat are applied, and the plastic is molded to conform with the stampers. When the molding cycle is completed, cold water is run through the channels of the molds, cooling the record so that it can be removed from the press without warping or other deformation. The remaining plastic around the edge of the disc, referred to as "flash," is trimmed, and the process is finished.
DIRECT METAL MASTERING (DMM)

Under the trade name Direct Metal Mastering, the Teldec company of Germany introduced a process of cutting master discs directly on freshly plated amorphous copper, eliminating two steps in the replication process. Their efforts have been complemented by those of Georg Neumann GmbH in the areas of lathe and cutter head development. The technology differs from the traditional approach in the following ways:

1. The cutting is done on a copper layer, which directly becomes the metal mother for subsequent production of stampers. Figure 26-17 shows a view of the cutting lathe with a freshly cut master disc on the turntable.
Figure 26-17. Lathe for cutting Direct Metal Mastering (DMM). (Courtesy Georg Neumann GmbH)
2. There is no spring-back effect in the metal, as there is with lacquer, and deformation effects, such as "groove echo," are virtually eliminated.
3. The diamond cutting stylus does not require burnishing facets, and HF recorded detail is much greater than with conventional cutting.

4. A new, more powerful cutting head is required to engrave the signal, and the physical cutting angle is about 5 degrees. This necessitates electronic processing of the stereo signal by delay modulation to produce an effective net cutting angle of 20 degrees. Details of this process are shown in Figure 26-18. The maximum required time shift is td(max) = (a/v)(sin α − sin ε), where a is the stylus excursion, v the groove velocity, α the effective vertical tracking angle (20 degrees), and ε the physical cutting angle (5 degrees); for a maximum excursion of 50 µm at a minimum groove velocity of 200 mm/sec, td(max) is about 65 µsec.

Figure 26-18. Effect of vertical tracking angle converter for DMM. (Courtesy Georg Neumann GmbH)
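The delay-modulation requirement can be computed directly from the relation reconstructed above. A minimal sketch; the numerical values are those quoted in the figure, and the function name is our own.

```python
import math

def vta_delay_us(excursion_um, groove_speed_mm_s, alpha_deg=20.0, eps_deg=5.0):
    """Time shift that makes a physical cutting angle of eps degrees behave
    like an effective vertical tracking angle of alpha degrees:
    td = (a / v) * (sin(alpha) - sin(eps))."""
    a_m = excursion_um * 1e-6
    v_m_s = groove_speed_mm_s * 1e-3
    td = (a_m / v_m_s) * (math.sin(math.radians(alpha_deg)) -
                          math.sin(math.radians(eps_deg)))
    return td * 1e6   # microseconds

print(f"{vta_delay_us(50, 200):.0f} us")   # worst case quoted in the figure
```

For a 50-µm stylus excursion at the minimum groove speed of 200 mm/sec, the required shift works out to roughly 64 µs, matching the approximately 65 µs maximum stated in Figure 26-18.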
Further advantages of the system include up to 10-dB improvement in noise performance and the ability to accommodate a general 15-percent increase in playing time, depending on the nature of the program.
Chapter 27 RECORDING STUDIO DESIGN FUNDAMENTALS
INTRODUCTION

No book on recording engineering would be complete without a chapter devoted to the principles of recording studio design—yet no other aspect of the overall facility is more in need of expert guidance than studio layout and acoustical design. Regardless of scale, the requirements for successful studio operation are basically the same. A low-cost studio in the home environment deserves the same acoustical and technological considerations as a multichannel installation in a major metropolis if both are to successfully deliver useful services. In this chapter we will cover some of the basic acoustic considerations in building a studio. More fundamentally, we will address the problems of site selection and the development of a business plan. We will discuss the nature of professional equipment, its costs, and its rapid obsolescence.
PLANNING FOR BUSINESS

It has long been the dream of many engineers to build and operate a recording studio. Too often, the dream gets ahead of sensible financial planning, and the studio, whether built from the ground up or acquired through other channels, runs into trouble. The biggest pitfall is usually investing too much money and real estate in a large recording space, overlooking the income possibilities of more, and smaller, work spaces. A conservative business plan is essential. Essentially, a business plan is an outline of income potential minus the costs of getting into, and staying in, business. The first thing to do is determine what the income potential really is. Questions such as these need to be answered:

1. Is there a large enough client base in your business area to adequately "load" the facility at its break-even point? It is important to be conservative in these estimates. Observe how other studios in your area have done. In particular, take a good look at those that have
failed, and determine why. Likewise, analyze those that have succeeded, and determine why.

2. What is the income growth potential for the studio? For example, if the intended specialty of the studio is advertising or video related work, find out what the long-range growth potential is in your area for such activity. Any responsible lending institution will expect an analysis prepared by a financial specialist who understands the field.

3. What are your own qualifications for specifying, building, and operating a studio? A realistic appraisal is essential here. Identify those consultants in studio design and construction who can help you.

The next step is to outline the initial and ongoing costs of being in business:

1. Outline in detail the costs of leasehold or property development for the studio, and note the time required for this work.

2. Identify the equipment that must be purchased outright as opposed to equipment that may be leased against eventual purchase option. Keep an eye out for equipment auctions and other sources of competitive equipment in excellent shape.

3. Outline all personnel salary and benefit requirements for running the studio, as well as all items of overhead. (Incentives are a common thing in hiring competent recording engineers, and commissions are routinely paid for work which experienced engineers can bring to the studio. Deal with this openly and professionally.)

4. Take note of building codes, licenses, and other legal details.

5. Identify legal and financial counsel who have experience in this or in related fields.

Financial data should be presented in a balance sheet format projected on a quarterly basis. Inflationary factors should be considered along with other factors affecting the community at large. Do not rely on a "rich uncle" who wants to finance an operation which the banks have turned down. The banks are probably right.
SITE SELECTION AND DEVELOPMENT

In most cities, studios are considered commercial enterprises and must be located in areas zoned for such purposes. Many small studios exist in residential locations only because they do not create a noise or traffic nuisance to the extent that neighbors have complained. In picking a site for a studio, take careful note of who your commercial neighbors will be and what their business hours are. It is expensive to acoustically isolate a studio, and the ideal situation is to have neighbors who do not make noise themselves and who are tolerant of moderate noise levels. Avoid proximity to busy thoroughfares and approach paths to airports.

A little homework is advised in checking out future state and federal plans for a given site. For example, a freeway passing near the site could profoundly influence prospects, and not always for the better. Ensure that the neighborhood is stable by checking with those businesses already in place.

The availability of space in light industrial parks attracts many in the studio business. Landlords are quite willing to make changes, but the major problem is likely to be insufficient total floor-to-roof distance. It is best to make such arrangements while the facility is still in the design stages. Another concern here is parking; make certain that there is enough space for your purposes.
ACOUSTICAL CONSIDERATIONS

The essential advice here is to engage a qualified acoustical consultant at the outset of the project. Pick one who has a good track record and good references. The acoustician can be of great help in laying out the basic floor plan for the facility, and with proper care at this stage many potential acoustical problems can be avoided entirely.

Figure 27-1 shows both small and large studio layouts. In either case, the live recording areas, which require good acoustical isolation, are separated from each other by buffer zones, such as storage areas, hallways, and maintenance activities. This simple approach means that the acoustical transmission loss (TL) between adjacent live performance areas has to be only sufficient to reduce transmission of sound from the buffer zone to the recording area to the necessary degree. If the noise in the buffer zone can be controlled, then the TL between that zone and the studio can be minimal, and considerable cost savings can be realized. Many times, the buffer zone may simply be air space between two structural walls.
Figure 27-1. Suggested floor plans for recording facilities. Smaller facility (A); larger facility (B).
When a buffer zone has been defined, care must be taken that there are no flanking acoustical paths between recording areas which would defeat the buffer zone. Figure 27-2 shows some details:
Figure 27-2. Studio isolation. Single studio on concrete (A); adjacent studios on timber construction (B); adjacent studios on concrete (C); studios on adjacent floors (D).
At A we see the ideal case of a studio located on a concrete slab. There is virtually no transmission into or out of the studio through the floor, and the transmission loss (TLs) through the walls and ceiling may be, depending on construction, between 35 and 55 dB.
At B we see what may happen when two studios are located on the same floor where the substructure is made of timber. The airborne sound isolation between the studios will be about twice the TL noted at A; however, the structure-borne TL through the common floor will be significantly less and will effectively defeat the excellent isolation you may have provided through the walls.

At C the excellent airborne sound isolation between studios will be preserved because of the great inertance (mass) of the concrete slab. It is generally massive enough that there will be very little structure-borne sound transmitted through it. However, any direct hammering or mechanical impact on the slab is likely to be heard anywhere on the slab.

At D we show two studios on adjacent floors. This approach works best when one studio is on a ground floor slab, where structure-borne sound transmission will be minimal. On adjacent floors of wood or steel structures, great care must be taken so that structure-borne sound transmission does not nullify an otherwise good job of airborne sound isolation.

It is in such matters as these that a consultant can be invaluable. Do not take seriously the casual advice, however honestly offered, of building contractors who have not had experience in studio building. They are likely to tell you that you are spending too much money on sound isolation. If anything, the opposite tends to be the case.

Allow for final space requirements in choosing a site. An undeveloped space may seem quite large to start with, but it will be much smaller after all aspects of isolation and acoustical treatment have been dealt with. Depth requirements for bass traps in the control room can easily be in the range of 3 feet. Dropped ceilings and air conditioning ductwork can also take up much space. As a general rule, no undeveloped space with a ceiling height less than 18 feet should be considered for first-rate studio design.

Where existing sites are being considered for conversion to studio use, existing air conditioning systems should be carefully assessed as follows:

a. Is the cooling effort equal to the maximum anticipated load placed on it?
b. Does the noise level of the system meet studio requirements? (More about this later.)
c. Does the ductwork provide sufficient sound isolation between the studio and adjacent work areas? This question needs to be answered in terms of possible interference in either direction.
A facility that consists of a single studio is relatively simple to deal with; the cooling load must be carefully determined and the ductwork sized to avoid audible turbulence at the vents. Fans must be isolated so that their noise will not be heard. Where multiple studios are planned, there are specific problems to watch for. Figure 27-3A shows the worst case: two studios on a single air conditioning feed. While the physical isolation of the two studios may be sufficient, sound leakage through the common air conditioning feed may well swamp it out. At B we show a significant improvement; here, both studios have separate feeds from a common cooling plant, and the isolation will usually be sufficient. The best solution is shown at C. While more expensive in construction, it may also be the most energy efficient in the long run. Again, watch out for free advice from air conditioning contractors who have not had studio construction experience.
Figure 27-3. Air handling. Adjacent studios on the same feed (A); adjacent studios on separate feeds from a single plant (B); adjacent studios fed from separate plants (C).
STUDIO NOISE LEVEL REQUIREMENTS

The foregoing remarks on studio noise levels are aimed at helping prospective studio builders avoid existing sites and structures that are basically unworkable or appear to present extreme difficulties. Once a site has been chosen, the remaining problems of studio isolation and quieting must be solved. Noise levels are customarily measured according to accepted noise criteria (NC) curves, which are shown in Figure 27-4. Note the resemblance to the Robinson-Dadson equal loudness contours discussed in Chapter 2. Like the Robinson-Dadson curves, the NC curves take into consideration the ear's relative insensitivity to LF noise at low levels.
Figure 27-4. Noise criteria (NC) curves.
With the help of a sound level meter and an octave band analyzer, the acoustical consultant can determine the existing NC rating in the studio space, as well as in the various spaces outside the studio area. A good studio should have an NC rating of 10 to 15. If the maximum noise levels outside the
studio area can be measured or estimated with reasonable accuracy, the NC-10 values can be subtracted directly from the outside noise levels on an octave band basis to give the actual isolation requirements for the studio.

If the studio is to be located in relatively quiet surroundings with office areas close by, then the main problem may not be transmission from the outside into the studio, but rather leakage from studio activity into the relatively quiet surroundings. This is especially likely to be the case with pop-rock recording. The acoustical consultant must then make another set of measurements, this time assuming a reasonable NC rating for the office areas and estimating the highest levels to be encountered in the studio. Both sets of isolation requirements, outside-to-studio and studio-to-outside, are then compared and the maximum values within each octave band noted. These maximum values then define the isolation requirements for the studio, taking into account the possibility of sound transmission in both directions. A numerical sketch of this bookkeeping appears at the end of this section.

The next step is to identify the wall and ceiling structures that will provide the required overall degree of isolation. For general architectural purposes, various wall structures are rated according to their sound transmission class (STC) value. These are a set of weighted curves that take into account the ear's relative insensitivity to low frequencies at low levels. The family of STC curves is shown in Figure 27-5. Note that these curves are roughly the inverse of the NC curves, a fortunate situation inasmuch as wall structures are usually more effective in isolating the relatively short wavelengths of MF and HF sound than the long wavelengths of LF sound. The next step is to compare the maximum isolation requirements with the family of STC curves and choose structures that will meet the most stringent requirements.

At this point, realistic construction costs for the studio complex can be estimated. It is possible that the target isolation requirements may put the project well beyond your intended budget. In that case, the acoustical consultant will have to make adjustments, possibly rearranging the studio floor plan to lessen some of the isolation requirements. This is the point in the design procedure where such decisions must be made. Careful attention must be given to doors and their fittings, using only those whose STC ratings are comparable with the wall, ceiling, and floor requirements of the space. A single small leakage path between adjacent areas is all that it may take to defeat an otherwise excellent job of sound isolation.
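The octave-band bookkeeping described above reduces to a max operation per band, as the following sketch shows. All of the numbers in it are illustrative placeholders (the NC-10 contour values in particular are rough assumptions, not a standards table); an actual design would use measured levels and published NC data.

```python
bands_hz        = [63, 125, 250, 500, 1000, 2000, 4000, 8000]
nc10_target_db  = [43, 33, 25, 18, 13, 10, 8, 7]       # assumed NC-10 contour
outside_levels  = [72, 70, 68, 66, 63, 58, 52, 45]     # measured outside the studio
studio_levels   = [95, 98, 100, 100, 98, 95, 90, 85]   # loudest program in studio
office_nc_db    = [55, 47, 40, 35, 32, 30, 29, 28]     # assumed office NC target

for i, f in enumerate(bands_hz):
    need_in = outside_levels[i] - nc10_target_db[i]    # outside -> studio
    need_out = studio_levels[i] - office_nc_db[i]      # studio -> outside
    required_tl = max(need_in, need_out)
    print(f"{f:>5} Hz: required TL = {required_tl} dB")
# The wall and ceiling structure chosen must meet required_tl in every band;
# in practice these maxima are laid against STC-rated assemblies (Figure 27-5).
```

The per-band maxima are then compared with the STC family of Figure 27-5 to pick a construction that clears the worst band.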
Figure 27-5. Sound transmission class (STC) curves.
IMPACT NOISE ISOLATION

In the previous paragraphs we have considered only airborne noise. Impact noise is that which is generated directly by vibration against structural members of a building. Some of the problems are:

a. Footsteps on wood floors or thin concrete slabs in multistory buildings.
b. Poorly isolated motors associated with air conditioners or elevators.
c. Noisy plumbing fixtures.
It is surprising how far impact noise can travel in buildings; tapping and hammering on uncarpeted floors can be heard up to three or four floors away in some reinforced concrete structures. While plumbing and motor noise may be expensive to isolate, the noise of footsteps on the floor above may be easily solved by offering the tenant the carpeting and padding required to isolate them. Impact noise is often one of the most elusive aspects of acoustical design, and proper assessment of it may require long-term monitoring of a proposed site. For example, a site survey made during the winter months may fail to pinpoint a problem due to air conditioning compressors located on a nearby roof. A survey limited to daytime hours may fail to identify noisy weekend or night-time activities on adjacent floors.
ACOUSTICAL CONDITIONS IN THE STUDIO

So far, we've dealt only with isolation of the studio from its surroundings. Let's now consider the treatment required to attain the necessary control over sound produced in the studio. Most pop-rock studios are acoustically fairly dry, with reverberation time on the order of half a second or less. There are good reasons for this. First, the relatively absorptive conditions provide good separation between instrumental pickups; second, a small live studio has a tendency to sound "boxy" through the predominance of widely spaced midrange room modes, as discussed in Chapter 1. But make sure that the studio is not too dead, since this will make musicians uncomfortable. The best working balance usually results from a combination of moderately dry acoustics and isolation baffles. The baffles are used to provide added separation, and they can be moved out when not needed.

Many engineers prefer a combination of live and dry acoustics in the same space. Variable wall panels that are absorptive on one side and reflective on the other can facilitate this. Since so much monitoring in the studio is done over headphones, we could easily imagine a very dead studio in which all the musicians are immersed in an artificial and arbitrary acoustical environment. This is often the case; but there will always be times when the headphones come off and the room must come into play. Again, use a good consultant in determining these options.

When it is set for its most reflective acoustical condition, the studio should have plenty of surfaces that promote sound diffusion or scattering. Surfaces such as those shown in Figure 27-6 are often used. These can range from simple random textures, as shown at A, to the elaborate quadratic residue diffusor
shown at B, which promotes uniform diffusion over a fairly large range of middle and high frequencies. In recent years there has been a trend toward recording drum tracks in a very live environment, creating a sound that no reverberation unit can match. For this purpose a live, but not too large, space is required. During the mixing process the individual tracks are normally gated to reduce excessive reverberation and ring-out.

Figure 27-6. Diffusing surfaces. Random (A); quadratic residue diffusor (B).
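The quadratic residue diffusor mentioned above follows a simple number-theoretic recipe: the depth of well n in an N-well period is proportional to n² mod N. A minimal sketch, assuming N = 7 and a 500-Hz design frequency purely for illustration:

```python
def qrd_well_depths_mm(n_wells=7, design_freq_hz=500.0, c_m_s=343.0):
    """Well depths for one period of a quadratic residue diffusor:
    depth_n = (n^2 mod N) * wavelength / (2N)."""
    wavelength_m = c_m_s / design_freq_hz
    return [((n * n) % n_wells) * wavelength_m / (2 * n_wells) * 1000
            for n in range(n_wells)]

print([round(d) for d in qrd_well_depths_mm()])
# residues for N = 7 run 0, 1, 4, 2, 2, 4, 1 -> depths from 0 to 196 mm
```

The symmetric sequence of depths (0, 49, 196, 98, 98, 196, 49 mm for these assumptions) is what gives the device its uniform scattering over a broad band above the design frequency.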
DECOR AND ATMOSPHERE

Avoid making the studio look clinical and institutional. The feeling should be one of warmth, so that the musicians will feel comfortable and even at home. Selection of colors and textures should be made with the help of a studio designer so that those choices can satisfy both appearance and functional needs. Lighting is an important element in creating the right atmosphere. Provision should be made for high illumination when needed, while recording activities will likely take place at moderate to low lighting levels.
THE CONTROL ROOM

From the point of view of monitoring, Figure 27-7 illustrates the goal of maintaining a listening area at the console that is substantially free of early, fairly high level reflections. Such reflections can cause timbre shifts that may vary significantly from one position to another at the console. Extending the design further, Figure 27-8 shows plan and side elevation views of a modern control room, which we'll now discuss in detail:
Figure 27-7. A reflection-free zone around a console.
Acoustical requirements

The control room must be acoustically neutral with no audible standing waves, and this requires distributed absorption on the side walls and ceiling. The back wall often includes bass traps in the corners and diffusing elements over a large portion of the center. The front wall sections surrounding the loudspeakers are often hard wood. The combination of absorptive and diffusive materials ensures that the control room will not be acoustically dead, and that is an important requirement. Actually, detailed measurements will show that the amounts of direct and reflected acoustical power at the prime listening positions are about equal.
Figure 27-8. Modern control room design in plan (A) and side elevation (B) views. (Data courtesy G. Augspurger and Perception, Inc.)
Appearance factors

The modern control room often makes use of cloth scrim on walls and ceiling to hide the specific absorptive materials. Diffusing elements are very often left in the open because of their functional appearance. Floor treatment in the console area is normally hard wood, with carpet elsewhere. Lighting should be flexible, ranging from small spots on the working surfaces to bright floods for cleanup and maintenance work.
Space requirements

The modern control room is generally larger than older designs. In addition to normal tracking activities, the control room will be used for mixing as well,
and often has to accommodate a number of people in complete comfort. The equipment island often doubles as a work surface for producers as well as keyboard players. The width of the room ranges from about 17 feet at the front to about 22 feet at the rear; height ranges from about 9 feet at the front to about 11 or 12 feet at the rear. Large pieces of recording gear are no longer housed in the control room. They now go into machine rooms adjacent to the control room and are accessed by the engineer via remote control.
Monitoring options

A center loudspeaker is often soffit mounted in the front along with the traditional large stereo loudspeakers; this is to facilitate film work. Some control rooms will have multiple small surround loudspeakers mounted high on the side and back walls behind scrim. For many commercial surround mixes, smaller consumer-type stand-mounted loudspeakers will be moved into place as needed, and the control room must be large enough to accommodate them without crowding. There will also be a number of popular loudspeaker models available for temporary mounting on the meter bridge of the console to suit the needs of producers and artists.
BIBLIOGRAPHY
PHYSICAL AND PSYCHOLOGICAL ACOUSTICS

Altschuler, M. "Balanced Attenuation Ear Protection," Sound & Communications, vol. 35, no. 3 (1989).
Ando, Y. Concert Hall Acoustics, Springer-Verlag (1985).
Augspurger, G. "More Accurate Calculation of the Room Constant," J. Audio Engineering Society, vol. 23, no. 5 (1975).
Barron, M. "The Subjective Effects of First Reflections in Concert Halls—The Need for Lateral Reflections," J. Sound and Vibration, vol. 15, pp. 475-494.
Bauer, B. "Phasor Analysis of Some Stereophonic Phenomena," J. Acoustical Society of America, vol. 33, no. 11 (1956).
Benade, A. "From Instrument to Ear in a Room: Direct or Via Recording," J. Audio Engineering Society, vol. 33, no. 4 (1985).
Benade, A. Fundamentals of Musical Acoustics, Oxford University Press, New York (1976).
Beranek, L. Acoustics, McGraw-Hill, New York (1954).
Beranek, L. Concert and Opera Halls, American Institute of Physics, New York (1996).
Blauert, J. Spatial Hearing, MIT Press, Cambridge, MA (1997).
Cremer, L. and Mueller, H. Principles and Applications of Room Acoustics, Applied Science Publishers (1982; translated by T. Schultz).
Cooper, J. Building a Recording Studio, Synergy Group, Los Angeles (1996).
Damaske, P. "Subjective Investigation of Sound Fields," Acustica, vol. 22, no. 4 (1967-68).
Doelle, L. Environmental Acoustics, McGraw-Hill, New York (1972).
Eargle, J. Electroacoustical Reference Data, Kluwer Academic, Boston (1994).
Eargle, J. Music, Sound, & Technology, Kluwer Academic, Boston (1995).
Forsyth, M. Buildings for Music, MIT Press, Cambridge, MA (1985).
Gelfand, S. Hearing, an Introduction to Psychological and Physiological Acoustics, Marcel Dekker, New York (1981).
Haas, H. "The Influence of a Single Echo on the Audibility of Speech," reprinted in J. Audio Engineering Society, vol. 20, no. 2 (1972).
Kinsler, L. et al. Fundamentals of Acoustics, Wiley, New York (1982).
Knudsen, V. and Harris, C. Acoustical Designing in Architecture, Acoustical Society of America, New York (1978).
Kuttruff, H. Room Acoustics, Applied Science Publishers, London (1979).
Norris, R. "A Derivation of the Reverberation Formula," published in Appendix II of V. Knudsen, Architectural Acoustics, John Wiley & Sons, New York (1939).
Occupational Safety and Health Act (OSHA), 1970.
Peutz, V. "Quasi-steady-state and Decaying Sound Fields," Ingenieursblad, vol. 42, no. 18 (1973, in Dutch).
Rathe, E. "Note on Two Common Problems of Sound Propagation," J. Sound and Vibration, vol. 10, pp. 472-479 (1969).
Robinson, D. and Dadson, R. British Journal of Applied Physics, vol. 7, p. 166 (1956).
Roederer, J. Introduction to the Physics and Psychophysics of Music, page 29, Springer-Verlag, New York (1973).
Sabine, W. "Collected Papers on Acoustics," Harvard University Press (1927).
Schroeder, M. "Progress in Architectural Acoustics and Artificial Reverberation: Concert Hall Acoustics and Number Theory," J. Audio Engineering Society, vol. 32, no. 4 (1984).
Schubert, E. (ed.). Psychological Acoustics, Dowden, Hutchinson, and Ross, Stroudsburg, PA (1979).
Schultz, T. "Acoustics of Concert Halls," IEEE Spectrum, vol. 2, no. 6 (1965).
Terhardt, E. "Calculating Virtual Pitch," Hearing Research, vol. 1, p. 155 (1979).
Winckel, F. Music, Sound, and Sensation, Dover Publications, New York (1967).
MICROPHONES

Bauch, F. "New High-Grade Condenser Microphones," J. Audio Engineering Society, vol. 1, no. 3 (1953).
Bauer, B. "A Century of Microphones," Proceedings of the IEEE, vol. 50, pp. 719-729 (reprinted in J. Audio Engineering Society, vol. 35, no. 4, 1987).
Beranek, L. Acoustics, McGraw-Hill, New York (1954). (Reprinted by the Acoustical Society of America, 1993.)
Blumlein, A. "British Patent Specification 394,325 (Directional Effect in Sound Systems)," J. Audio Engineering Society, vol. 6, pp. 91-98 (reprinted 1958).
Bore, G. Microphones, Georg Neumann GmbH, Berlin (1989).
Bore, G. and Temmer, S. "MS Stereophony and Compatibility," Audio Magazine (April 1958).
Borwick, J. Microphones: Technology and Technique, Focal Press, London (1990).
Ceoen, C. "Comparative Stereophonic Listening Tests," J. Audio Engineering Society, vol. 20, pp. 19-27 (1972).
Clark, H. et al. "The 'Stereosonic' Recording and Reproducing System," J. Audio Engineering Society, vol. 6, pp. 102-133 (1958).
Braunmühl, H. and Weber, W. Hochfrequenztechnik und Elektroakustik, vol. 46, pp. 187-192 (1935).
Dickreiter, M. Tonmeister Technology, Temmer Enterprises, New York (1989).
Dooley, W. and Streicher, R. "MS Stereo: A Powerful Technique for Working in Stereo," J. Audio Engineering Society, vol. 30, pp. 707-717 (1982).
Eargle, J. The Microphone Book, Focal Press, Boston (2001).
Gayford, M., editor. Microphone Engineering Handbook, Focal Press, Oxford, England (1994).
Gerzon, M. "Periphony: With-Height Sound Reproduction," J. Audio Engineering Society, vol. 21, no. 1 (1973).
Hibbing, M. "High Quality RF Condenser Microphones," in M. Gayford, ed., Microphone Engineering Handbook, Chapter 4, Focal Press, London (1994).
Hibbing, M. "XY and MS Microphone Techniques in Comparison," Sennheiser News, June 1989 (Sennheiser Electric Corporation).
Huston, C. "A Quadraphonic Microphone Development," Recording Engineer/Producer, vol. 1, no. 3 (1970).
Microphone,"
Recording
Mosely, J. "Eliminating the Stereo Seat," J. Audio Engineering Society, vol. 8, no. 1 (1960).
Olson, H. "Directional Microphones," J. Audio Engineering Society, vol. 15, no. 4 (1967).
Olson, L. "The Stereo-180 Microphone System," J. Audio Engineering Society, vol. 27, pp. 158-163 (1979).
Robertson, A. Microphones, Hayden Book Co., New York (1963).
Sank, J. "Microphones," J. Audio Engineering Society, vol. 33, no. 7/8 (1985).
Sessler, G. and West, J. "Condenser Microphones with Electret Foil," J. Audio Engineering Society, vol. 12, no. 2 (1964).
Wong, G. and Embleton, T., editors. AIP Handbook of Condenser Microphones, American Institute of Physics, New York (1995).
Woram, J. Sound Recording Handbook, H. Sams, Indianapolis (1989).
Woszczyk, W. "A Microphone Technique Applying the Principle of Second-Order-Gradient Unidirectionality," J. Audio Engineering Society, vol. 32, no. 7/8 (July/August 1984).
Various. Microphones (an anthology of articles on microphones from the pages of the J. Audio Engineering Society, vol. 1 through vol. 27).
STEREOPHONY AND SURROUND SOUND

Bartlett, B. Stereo Microphone Techniques, Focal Press, Boston (1991).
Bauer, B. "Some Techniques Toward Better Stereophonic Perspective," J. Audio Engineering Society, vol. 17, no. 4 (1969).
Borwick, J. Microphones, Technology and Technique, Focal Press, London (1990).
Bruck, J. "The KFM 360 Surround Sound - A Purist Approach," presented at the 103rd AES Convention, New York, November 1997, preprint number 4637.
Ceoen, C. "Comparative Stereophonic Listening Tests," J. Audio Engineering Society, vol. 20, no. 1 (1970).
Cohen, E. and Eargle, J. "Audio in a 5.1 Environment," presented at the 99th Audio Engineering Society Convention, New York, October 1995, preprint 4071.
Cooper, D. and Bauck, J. "Prospects for Transaural Recording," J. Audio Engineering Society, vol. 37, no. 1/2 (1989).
Culshaw, J. Ring Resounding, Viking Press, New York (1957).
Culshaw, J. Putting the Record Straight, Viking Press, New York (1981).
Damaske, P. "Head-Related Two-Channel Stereophony with Loudspeaker Reproduction," J. Acoustical Society of America, vol. 50, pt. 2, pp. 1109-1115 (October 1971).
Davis, J. "Practical Stereo Reverberation for Studio Recording," J. Audio Engineering Society, vol. 10, no. 2 (1962).
Eargle, J. "Evolution of Artificial Reverberation," Recording Engineer/Producer, vol. 18, no. 2 (1987).
Eargle, J. "Stereo/Mono Disc Compatibility," J. Audio Engineering Society, vol. 19, pp. 552-559 (1969).
Franssen, N. Stereophony, Philips Technical Library, Eindhoven, The Netherlands (1964).
Gardner, M. "Some Single- and Multiple-Source Localization Effects," J. Audio Engineering Society, vol. 21, pp. 430-437 (1973).
Gerzon, M. "Periphony: With-Height Sound Reproduction," J. Audio Engineering Society, vol. 21, no. 1 (1973).
Harvey, F. and Schroeder, M. "Subjective Evaluation of Factors Affecting Two-Channel Stereophony," J. Audio Engineering Society, vol. 9, pp. 19-28 (1961).
Holman, T. 5.1 Surround Sound—Up and Running, Focal Press, Boston (2000).
Jacobson, L. "Psychoacoustic Satisfaction," Mix Magazine, October 1991, pp. 40-55.
Jecklin, J. "A Different Way to Record Classical Music," J. Audio Engineering Society, vol. 29, no. 5 (1981).
Lipschitz, S. "Stereo Microphone Techniques: Are the Purists Wrong?" J. Audio Engineering Society, vol. 34, no. 9 (1986).
Meyer, J. Acoustics and the Performance of Music, Verlag Das Musikinstrument, Frankfurt (1978). (Translated by Bowsher and Westphal.)
Mitchell, D. "Tracking for 5.1," Audio Media Magazine (October 1999).
Rumsey, F. Spatial Audio, Focal Press, Boston (2001).
Schroeder, M. "An Artificial Stereo Effect Obtained from a Single Channel," J. Audio Engineering Society, vol. 6, pp. 74-79 (1958).
Schroeder, M. "Models of Hearing," Proc. IEEE, vol. 63, pp. 150-155 (1963).
Snow, W. "Basic Principles of Stereophonic Sound," J. Society of Motion Picture and Television Engineers, vol. 61 (November 1953).
Theile, G. "Multichannel Natural Recording Based on Psychoacoustical Principles," presented at the 108th AES Convention, Paris, 2000, preprint number 5156.
Streicher, R. and Everest, F. The New Stereo Soundbook, Audio Engineering Associates, Pasadena (1998).
Williams, M. "Microphone Array Analysis for Multichannel Sound Recording," presented at the 107th AES Convention, New York, September 1999, preprint number 4997.
Various authors. Stereophonic Techniques (an anthology prepared by the Journal of the AES, 1986).
SIGNAL TRANSMISSION AND PROCESSING

Aiken, W. and Swisher, C. "An Improved Automatic Level Control Device," J. Audio Engineering Society, vol. 16, no. 4 (1968).
Ballou, G., editor. Handbook for Sound Engineers, Focal Press, Boston (2001).
Bartlett, B. "A Scientific Explanation of Phasing," J. Audio Engineering Society, vol. 18, no. 6 (1970).
Berger, J. et al. "Removing Noise from Music Using Local Trigonometric Bases and Wavelet Packets," J. Audio Engineering Society, vol. 42, no. 10 (October 1994).
Borwick, J., editor. Sound Recording Practice, Oxford (1987).
Connor, D. and Putnam, R. "Automatic Level Control," J. Audio Engineering Society, vol. 16, no. 4 (1968).
Davis, D. and C. Sound System Engineering, Focal Press, Boston (1986).
Dickreiter, M. Tonmeister Technology, Temmer Enterprises Incorporated, New York (1989).
Dolby, R. "An Audio Noise Reduction System," J. Audio Engineering Society, vol. 15, no. 4 (1967).
Dolby, R. "The Spectral Recording Process," J. Audio Engineering Society, vol. 35, no. 3 (1987).
Frayne, J. and Wolfe, H. Sound Recording, Wiley, New York (1949).
Giddings, P. Audio System Design and Installation, Focal Press, Boston (1990).
Jung, W. IC Op-Amp Cookbook, Sams, Indianapolis (1974).
Monforte, J. "A Dynamic Phase Meter for Program Material," J. Audio Engineering Society, vol. 36, no. 6 (1988).
Perkins, C. "Microphone Preamplifiers—A Primer," Sound & Video Contractor, vol. 12, no. 2 (February 1994).
Springer, A. "A Pitch Regulator and Information Changer," Gravesano Review, vol. 11/12 (1958).
Springer, A. "Acoustic Speed and Pitch Regulator," Gravesano Review, vol. 13 (1959).
Thiele, N. "Three-Level Tone Signal for Setting Audio Levels," J. Audio Engineering Society, vol. 33, no. 12 (1985).
Tremaine, H. The Audio Cyclopedia, Sams, Indianapolis (1969).
Werner, R. "On Electrical Loading of Microphones," J. Audio Engineering Society, vol. 3, no. 4 (October 1955).
Woram, J. Sound Recording Handbook, Sams, Indianapolis (1989).
Various authors. Motion Picture Sound Engineering (prepared by the Research Council of the Academy of Motion Picture Arts and Sciences), D. Van Nostrand, New York (1938).
LOUDSPEAKERS

"Symposium on Auditory Perspective," Electrical Engineering, pp. 9-32, 214-219 (January 1934).
Acoustics Handbook, Hewlett-Packard Application Note 100 (November 1968).
Augspurger, G. "The Importance of Speaker Efficiency," Electronics World (January 1962).
Augspurger, G. "Versatile Low-Level Crossover Networks," db Magazine (March 1975).
Augspurger, G. "Loudspeakers in Control Rooms and Living Rooms," Proceedings of the 8th AES International Conference, "The Sound of Audio," Washington, DC, 3-6 May 1990.
Benson, J. "Theory and Design of Loudspeaker Enclosures," AWA Technical Review, vol. 14, no. 1 (August 1968).
Benson, K. Audio Engineering Handbook, McGraw-Hill, New York (1988).
Beranek, L. Acoustics, McGraw-Hill, New York (1954), pp. 313-322.
Blauert, J. and Laws, P. "Group Delay Distortion in Electroacoustical Systems," J. Acoustical Society of America, vol. 63, no. 5 (May 1978).
Borwick, J. Loudspeaker and Headphone Handbook, Focal Press, Boston (2001).
Colloms, M. High Performance Loudspeakers, Pentech Press, London (1985).
Eargle, J. The Loudspeaker Handbook, Kluwer Academic Press, Boston (1996).
Eargle, J. "Equalizing the Monitoring Environment," J. Audio Engineering Society, vol. 21, no. 2 (1973).
Eargle, J. "Requirements for Studio Monitoring," db Magazine (February 1979).
Eargle, J. and Engebretson, M. "A Survey of Recording Studio Monitoring Problems," Recording Engineer/Producer, vol. 4, no. 3 (1973).
Engebretson, M. "Low Frequency Sound Reproduction," J. Audio Engineering Society, vol. 32, no. 5 (1984).
Molloy, C. "Calculation of the Directivity Index for Various Types of Radiators," J. Acoustical Society of America, vol. 20, pp. 387-405 (1948).
Smith, D., Keele, D., and Eargle, J. "Improvements in Monitor Loudspeaker Design," J. Audio Engineering Society, vol. 31, no. 6 (1983).
Various authors. Loudspeakers (an anthology of articles from the Journal of the Audio Engineering Society, 1953 to 1977; eds. Cooke, R. and Gander, M.).
MAGNETIC TAPE RECORDING AND SYNCHRONIZATION

Camras, M. Magnetic Recording Handbook, Van Nostrand Reinhold, New York (1988).
Carlson, W., et al. U.S. Patent 1,640,881 (1927).
Dolby, R. "An Audio Noise Reduction System," J. Audio Engineering Society, vol. 15, no. 4 (1967).
Eilers, D. "Development of a New Magnetic Tape for Music Mastering," J. Audio Engineering Society, vol. 18, no. 5 (1970).
Ford, H. "Audio Tape Revisited," Studio Sound, vol. 21, no. 4 (1979).
Griesinger, D. "Reducing Distortion in Analog Tape Recorders," J. Audio Engineering Society, vol. 23, no. 2 (1975).
Gundry, K. "Headroom Extension for Slow-Speed Magnetic Recording of Audio," AES Convention preprint 1534 (1979).
Gundry, K. and Hull, J. "Introducing Dolby S-type Noise Reduction," Audio Magazine, June 1990.
Hickman, W. Time Code Handbook, Cipher Digital, Boston (1984).
Jorgensen, F. The Complete Book of Magnetic Recording, TAB, Blue Ridge Summit, PA (1980).
Katz, S., et al. "Alignment," Recording Engineer/Producer, vol. 6, no. 1 (1975).
Kempler, J. "Making Tape," Audio Magazine, April 1975.
Lowman, C. Magnetic Recording, McGraw-Hill, New York (1972).
McKnight, J. "Operating Level in the Duplication of Philips Cassette Records," J. Audio Engineering Society, vol. 15, no. 4 (1967).
Martin, M. "Some Thoughts on Cassette Duplication," J. Audio Engineering Society, vol. 21, no. 9 (1973).
Mee, C. and Daniel, E. Magnetic Recording, Volume III, McGraw-Hill, New York (1988).
Mills, D. "The New Generation of High Energy Recording Tape," Recording Engineer/Producer, vol. 5, no. 6 (1974).
Moog, R. "MIDI: Musical Instrument Digital Interface," J. Audio Engineering Society, vol. 34, no. 5 (May 1986).
Poulsen, V. "The Telegraphone: A Magnetic Speech Recorder," Electrician, vol. 46, pp. 208-210 (1900).
Robinson, D. "Production of Dolby B-type Cassettes," J. Audio Engineering Society, vol. 20, no. 10 (1972).
Rumsey, F. MIDI Systems & Control, Focal Press, Oxford (1994).
Smith, O. "Some Possible Forms of Phonograph," Electrical World, vol. 12, pp. 116-117 (1888).
Vogelgesang, P. "On the Intricacies of Tape Performance," db Magazine, vol. 13, no. 1 (1979).
Woram, J. Sound Recording Handbook, Sams, Indianapolis (1989).
DIGITAL AUDIO

Blesser, B. "Digitization of Audio," J. Audio Engineering Society, vol. 26, no. 10 (1978).
Blesser, B. and Lee, F. "An Audio Delay System Using Digital Technology," J. Audio Engineering Society, vol. 19, no. 5 (1971).
Bloom, J. "Into the Digital Studio Domain," Studio Sound, vol. 21, no. 4/5 (1979).
Brandenburg, K. and Stoll, G. "ISO-MPEG-1: A Generic Standard for Coding of High-Quality Digital Audio," J. Audio Engineering Society, vol. 42, no. 10 (October 1994).
Camras, M. Magnetic Recording Handbook, Van Nostrand Reinhold, New York (1988).
Gerzon, M. and Craven, P. "A High-Rate Buried-Data Channel for Audio CD," J. Audio Engineering Society, vol. 43, no. 1/2 (Jan/Feb 1995).
Lambert, M. "Digital Audio Interface," J. Audio Engineering Society, vol. 38, no. 9 (1990).
Lipshitz, S., Vanderkooy, J., and Wannamaker, R. "Minimally Audible Noise Shaping," J. Audio Engineering Society, vol. 39, no. 11 (November 1991).
Nakajima, H., et al. Digital Audio, TAB Books, Blue Ridge Summit, PA (1983).
Nyquist, H. "Certain Topics in Telegraph Transmission Theory," Transactions of the AIEE (April 1928).
Oppenheim, A. Applications of Digital Signal Processing, Prentice-Hall, Englewood Cliffs, NJ (1978).
Pohlmann, K. Principles of Digital Audio, Sams, Indianapolis (1985).
Pohlmann, K. Advanced Digital Audio, Sams, Indianapolis (1991).
Rodgers, H. and Solomon, L. "A Close Look at Digital Audio," Popular Electronics (September 1979).
Shannon, C. "A Mathematical Theory of Communication," Bell System Technical Journal (October 1948).
Stockham, T., et al. "Blind Deconvolution Through Digital Signal Processing," Proc. IEEE, vol. 63 (April 1975).
Vanderkooy, J. and Lipshitz, S. "Resolution Below the Least Significant Bit in Digital Systems with Dither," J. Audio Engineering Society, vol. 32, no. 3 (1984).
Vaseghi, S. and Frayling-Cook. "Restoration of Old Gramophone Recordings," J. Audio Engineering Society, vol. 40, no. 10 (October 1992).
Watkinson, J. The Art of Digital Audio, Focal Press, London (1989).
Wright, M. "Putting the Byte on Noise," Audio Magazine (March 1989).
Various authors. Digital Audio (collected papers from the AES Conference, Rye, NY, 3-6 June 1982).
Various authors. Papers on Digital Audio Bit Rate Reduction, edited by Gilchrist, N. and Grewin, C., Audio Engineering Society, New York (1995).
ANALOG DISC RECORDING

Bogantz, G. and Ruda, J. "Analog Disk Recording and Reproduction," Chapter 8 in K. Benson, Audio Engineering Handbook, McGraw-Hill, New York (1988).
Braschoss, D. "Disc Cutting Machine—Computer Controlled," Radio Mentor (October 1966).
Eargle, J. "Record Defects," Stereo Review (June 1967).
Eargle, J. "Performance Characteristics of the Commercial Stereo Disc," J. Audio Engineering Society, vol. 17, no. 4 (1969).
Fox, E. and Woodward, J. "Tracing Distortion—Its Cause and Correction in Stereo Disc Recording," J. Audio Engineering Society, vol. 11, no. 4 (1963).
Hirsch, F. and Temmer, S. "A Real-Time Digital Processor for Disc Mastering Lathe Control," presented at the 60th Convention, Audio Engineering Society, May 1978.
Narma, R. and Anderson, N. "A New Stereo Feedback Cutterhead System," J. Audio Engineering Society, vol. 7, no. 4 (1959).
Nelson, C. and Stafford, J. "The Westrex 3D Stereo Disk System," J. Audio Engineering Society, vol. 12, no. 3 (1964).
Read, O. and Welch, W. From Tin-Foil to Stereo, Sams, Indianapolis (1958).
Stafford, J. "Maximum Peak Velocity Capabilities of the Disc Record," J. Audio Engineering Society, vol. 8, no. 3 (1960).
Woodward, J. and Fox, E. "A Study of Program-Level Overloading in Phonograph Recording," J. Audio Engineering Society, vol. 11, no. 1 (1963).
Various authors. Disk Recording, Volumes 1 and 2, Audio Engineering Society, New York.
INDEX
-A-
Absorption coefficient, 11
Acoustical guitar, 275-276
Acoustical levels, addition, 100-102
Acuity of localization, 34
Aliasing, 189
Alpha (α), 11
Ambience microphones, 280
Amplitude, 1
Amplitude response of equalizers, 217
Aphex Aural Exciter, 245
Archival transfers, 348-349
Artificial (dummy) head, 34
ATRAC (adaptive transform audio coding), 368
Attack time, compressors and limiters, 223
Attenuation of sound with distance, 17-19
Audio chain, 98
Audio oscillator, 100
Automated mixdown, 329
Automatic indexing, 167-168
Auxiliary buses and outputs, 107
Average absorption coefficient, 12-15
Average value, 7
Azimuth, 175
-B-
Backplate, 48
Baffles, studio, 9-11
Balanced audio lines, 136
Balanced versus unbalanced lines, 136-137
Ballistics, metering, 133-134
BBC meter, 133
BCD (binary coded decimal), 178
Binaural hearing, 33-35
Bi-phase modulation, SMPTE time code, 178
Blumlein array, 254-255
Boltzmann's constant, 108
Booms, microphones, 84-85
Bouncing of tracks, 124
Boundary microphones, 70-72
Brass instruments, 275-276
Braunmühl-Weber microphone, 60-62
Breathing and pumping effects in compressors, 228
"Bucket brigade" delay systems, 233
Building the mix, 327-328
Bus, 107
-C-
Cable hanging mounts for microphones, 89
Cables, 137
Calibration of meters, 134-135
Capacitor (see condenser)
Capstan, 160
Capsule, condenser microphone, 48
Carbon microphone, 46-47
Carbonyl iron solution, 347
Cardioid pattern, 53-60
Cassette equalization, 354
Cassette high-speed duplication, 357-358
Cassette mastering, 358
Cassette tape characteristics, 356
CD (compact disc), 184, 359
  optical details, 362-363
  pre-emphasis (optional), 366
  profile, 359-361
  replication, 363-364
  subcode information, 365
CD-I, 359
CD-R, 365
CD-ROM, 359
Cents (frequency), 32
Chamber groups, 299-303
Channel path, 122-123
Channel, definition, 94
Charge-coupled devices (CCD), 233
Chorus, 304
Chorus generators, 249
Classical recording
  chamber groups, 299-303
  delay of spot microphones, 308-309
  documentation, 307
  dynamic range, considerations, 293-295
  guitar and lute, 296-297
  harpsichord, 295-296
  large choruses, 304
  large orchestra, 307-308
  livening the venue, 309-310
  mixing, 330-331
  orchestras, 304-308
  panning of spot microphones, 305
  piano, 295
  piano technicians, 295
  pipe organ, 298-299
  remote venues, 292-293
  role of the engineer, 291
  role of the producer, 290-291
  setting initial balances, 305
  spot (accent) microphones, 305
  staffing, 291-292
  string quartet, 300-301
  use of artificial reverberation, 303-304
  vocalists in stereo, 300
Clips (mounts) for microphones, 85
Clock, digital, 193
Cloning, 184
Coincident microphone techniques, 254-255
Comparator, automation systems, 132
Complex waves, 7
Compression, 222-225
Compression ratio, 223
Concert halls, 40-43
Condenser
  backplate, 48
  capsule, 48
  diaphragm, 48
Console automation, 130-133
  comparator, 132
  read/write modes, 131-133
  recall, 131
  servo faders, 130-133
  touch-sensitive faders, 132
  update mode, 131
Console circuitry, 108-109
Console ergonomics, digital, 210
Convolution in reverberation systems, 240-241
Correlation metering, 105-106
Critical distance (Dc), 17-19
Crosstalk, 178
Crosstalk cancellation, 38-39
Current draw of microphones, 79
Custom monitor loudspeakers, 139-140
-D-
DAW (digital audio workstations), 201, 219, 345
dBFS (dB full-scale, digital recording), 94, 135, 327
dbx boom box, 245-246
dbx noise reduction, 169
"Dead" spaces, 9-13
Decay of sound, 11-13
Decibel, 2-4
De-esser, 227
Degaussing (see demagnetization)
Delay tubes, 232-233
Delay units, 232
Demagnetization, 175
DI (directivity index), 145
Diaphragm, 48
Diffraction, 21-24
Digital editing, 344-345
Digital equalizers, 219
Digital meters, 134
Digital reverberation, 238-239
Digital technology
  AES-EBU, 199-200
  aliasing, 189
  bit, 187
  buffer, 185
  byte, 181
  clocking function, 193
  cloning, 184
  console ergonomics, 210
  DASH format, 200
  DAW (digital audio workstation), 201
  dBFS, 192
  digital word structure, 187
  dither, 185, 189, 192-193
  editing, 205-206
  EDL (edit decision list), 208
  EIAJ (Electronics Industry Association of Japan) format, 200
  equalization, 185
  error concealment, 192
  error correction, 185, 189
  filter ripple, 194
  filtering, 185, 187
  interleaving of data, 191
  jitter, 193-194
  linearity, 194
  MADI (multichannel audio digital interface), 200
  MSB (most significant bit), 181, 187
  noise, 189
  noise removal, 207-208
  noise shaping, 192-193
  Nyquist frequency, 187
  Nyquist rate, 187
  parity bits, 191-192
  PCM (pulse code modulation), 187
  PD (Pro-Digital), 200
  random access, 206
  R-DAT (or DAT) format, 200
  recording media, 195
  redundancy, 189
  sampling, 184
  sampling rate, 187
  SDIF-2, 200
  standards, 199-200
Direct boxes, 274
Direct incidence response, 69-70
Direct inputs to console, 274
Directivity factor (Q), 19
Distance factor (DF), 58-60
Distortion in microphones, 75-76
Dither, 185, 189, 192-193
Dolby HX Pro, 356-357
Dolby noise reduction, 169-172
Dolby noise reduction (B, C, and S type), 354-355
Dolby SR (Spectral Recording), 169-172
Drop frame, time code, 180
Drum recording, 269-271
Dry mix, 126
DSD (Direct Stream Digital), 194-195
DSP (digital signal processing), 219
DVD audio, 359
DVD video, 359
Dynamic microphone, 50
Dynamic range, 293-295
Dynamic range of tape, 171-173
Dynamics control
  applications, 228-229
  attack time, 223
  breathing and pumping effects, 228
  compression, 222-225
  compression ratio, 223
  de-esser, 227
  expanders, 230
  gain riding, 222
  limiters, 226
  multi-band, 226
  noise gates, 230-231
  release time, 223
  side channel, 222
  stereo ganging, 228
  threshold, 223
  VCA (voltage-controlled amplifier), 222
  zero-crossing detection, 224
-E-
Early reflections, 41, 235
EBU (European Broadcasting Union), 177
Echo (reverberation) chambers, 26-27, 107, 232
Editing "tricks of the trade," 343-344
Editing speech, 344
Editing, digital, 205-206
Editing, principles, 340-342
EDL (edit decision list), digital editing, 208
EIN (equivalent input noise), 107-108
Electrical signal summation, 101-102
Electronic instruments, 278
EMT reverberation plate, 233
End-cut filters, 219
Equalizers
  amplitude response, 217
  digital type, 219
  end-cut type, 219
  filters, 213
  graphic type, 213
  notch type, 219
  parametric type, 215-216
  peak-dip type, 215-216
  phase response, 217
  program type, 213
  shelving type, 219
Erase heads, 154
Error correction, 185, 189
Expanders, 230
-F-
Fall-back time, 134
Faulkner stereo array, 262-263
FFT (fast Fourier transform), 369
Figure-eight microphone, 50
Filter Q (width), 219
Filters, 213
First-order patterns, 58
Five-point-one (5.1) standard, 313-314
Flanging, 242-243
Fletcher-Munson equal loudness contours, 28
Flutter, 161
Formant frequencies, 250-251
Frequency, 1
Frequency response of monitors, 144-146
Fundamental, 7

-G-
Gain riding, 222
Goboes and baffles, 9-11, 22, 274
Graphic equalizers, 213
Graphic level recorder, 17
Ground loops, 136
GUI (graphic user interface), 219
Guitar and lute, 296-297

-H-
Half-normaled jack, 111
Harmonics, 7
Headphone monitoring, 273, 278
Headroom, 95
Hearing protection, 43-45
Hertz (Hz), 1
HF harmonic generation, 245
Hum, 136
Humidity, effects of, 21
Hypercardioid microphone pattern, 58
-I-
I/O module, 121-124
IEC tape standards, 164-165
Image widening, 251-252
Impulse response of reverberation systems, 240-241
In-line console
  bouncing tracks, 124
  channel path, 122-123
  "dry" mix, 126
  generic layout, 122
  I/O module, 121-124
  monitor path, 122-123
  overdubbing, 124
  tracking, 124-126
  "wet" mix, 126
Input modules, 114-115
Insert takes, 342
Interference effects and reflections, 70-71
Interference tube microphone, 63
Intimacy, 28, 40-41
Inverting amplifiers, 109
Inverting/noninverting devices, 99
Isolation in the studio, 267-268
Isolation rooms, 24
ITU surround loudspeaker placement, 314
-J-
Jack, 111
Jack, half-normaled, 111
Jazz recording, 279-286
-K-
Kilohertz (kHz), 1
Klepko microphone array, 322
-L-
Large studio orchestra, 286-288
Law of first arrival, 33
Lead factor, metering, 134
Leader tape, 327, 342
Level diagrams, 120-121
Level, definition, 3
LF sub-harmonic generation, 245-246
Limiters, 226
Line level, 111
Linear frequency scale, 5
Lissajous figures, 104
Live spaces, 9-13
Liveness, 41
Load impedance of microphones, 76-77
Localization, amplitude and phase effects, 28, 33-35
Logarithmic frequency scale, 4-5
Lossless data packing, 367
Loudness, 95-97
Loudness contours, 28
Loudness control, 30
Loudness versus signal duration, 30, 32
Low bit rate transmission, 368
LP (long-playing) disc
  history, 371-372
  microgroove geometry, 372-374
  profile, 372-373
  burnishing facets, 377
  chip removal, 389
  curvature overload, 377
  cutter lift and overcut, 390
  cutting angle, 377-378
  cutting head design, 380-384
  cutting process, 389-390
  deformation losses, 387-389
  diameter losses, 387-389
  Direct Metal Mastering (DMM), 392-393
  displacement overload, 376
  groove echo, 392
  heated stylus cutting, 372
  inverse feedback, 380-381
  master lacquer, 389-390
  metal master, 390
  metal mother, 390
  playback-stylus relationships, 381-384
  record pressing, 390-391
  recording and playback equalization, 374-375
  reference level, 373
  scanning losses, 387-389
  slope overload, 376
  spring-back effect, 392
  stamper, 390
  stylus motion, 372-374
  tracking distortion, 377, 379
  transfer systems, 384-388
  variable pitch and depth, 386-387
-M-
MAF (minimum audible field), 29
Magnetic induction, 50
Magnetic recording
  ac bias, 154-156
  acicular oxide particles, 157
  alignment procedures, 173-177
  alignment tapes, 165, 173
  automatic indexing, 167-168
  azimuth, 157, 175
  baking of tape, 160
  basic elements, 155
  bias adjustment, 159
  bias oscillator, 154
  capstan, 160
  delamination of tape, 160
  demagnetizing, 175
  drop-outs, 158
  dynamic range of tape, 171-173
  eddy current losses, 157
  erase heads, 154
  flutter, 161
  gamma ferric oxide formulation, 157
  gap length, 157
  head construction, 157
  IEC standards, 164-165
  jog mode, 168
  magnetic induction, 163
  NAB standards, 164-165
  noise, 155
  overdubbing, 165-167
  playback curves, 164-165
  playback heads, 154
  post-echo, 158
  pre-echo, 158
  pressure roller, 160
  print-through, 158
  punching in, 167
  record heads, 154
  rehearsal mode, 167-168
  scrape flutter, 162
  Sel-Sync, 165-166
  shuttle function, 168
  storage of tape, 160
  surface magnetic fluxivity, 165
  tape, 157-159
  tape base material, 158
  tape guides, 175
  tape idlers, 175
  tape lifters, 161
  tape sensitivity, 158
  tape skew, 175
  tension sensing, 160
  test tapes, 175
  track width standards, 172-174
  transports, 160-163
  wow, 161
Masking, 28
Mastering, definition, 326
Matrix box, MS, 261
MDM (Modular Digital Multitrack), 184, 198
Mean free path (MFP), 16
Measurement, reverberation time, 17
Metering
  ballistics, 133-134
  BBC meter, 133
  calibration, 134-135
  digital, 134
  fall-back time, 134
  lead factor, 134
  PPM (peak program meter), 133
  rise time, 134
  VU (volume unit) meter, 133
Microphone
  bass rolloff, 69
  battery powering, 80
  booms, 84-85
  boundary type, 70-72
  Braunmühl-Weber type, 60-62
  cable hanging mounts, 89
  carbon type, 46-47
  cardioid pattern, 53-60
  current draw, 79
  direct incidence response, 69-70
  distortion, 75-76
  dynamic type, 50
  figure-eight, 50
  first order, 58
  high-pass filter, 93
  hypercardioid pattern, 58
  interference effects and reflections, 70-71
  interference tube, 63
  load impedance, 76-77
  mounts (clips), 85
  noise floor, 75
  off-axis response, 69-70
  pads (attenuation), 75-76, 93
  phantom (P48) powering, 78-79
  phase (polarity) inversion, 93
  piezoelectric, 46-47
  polar patterns, 51-59, 64
  powering, 78-79
  proximity effect, 65-69
  random efficiency, 58-59
  random incidence response, 69-70
  RF interference, 81
  ribbon type, 50
  rifle (shotgun) type, 62-64
  sensitivity, 74-75
  shock mounts, 87
  shroud, wind, 90-91
  signal de-emphasis, wireless type, 81
  signal diversity, 82-83
  signal pre-emphasis, wireless type, 81
  source impedance, 76-77
  splitters, 90, 92
  stand-alone preamps, 78
  stands, 84-85
  stereo mounts, 87-88
  subcardioid, 57
  T-powering, 80
  transducers, 46
  transformers, 93
  turn-arounds, 93
  variable patterns, 60-62
  wind/pop screens, 89-91
  wireless type, 80-83
MIDI sequencer
MIDI time code, 167
MiniDisc, 368
Minimum audible field (MAF), 1
Mixing, 126
Mixing, definition, 326
Modal density in reverberation systems, 239
Monitor mix, 269
Monitor path, 122-123
Monitor system equalization, 100-101
Monitoring conditions, 329
Monitors, 139-140
  alignment shifts, 148
  angular coverage, 144-146
  axial response, 144-146
  beamwidth, 144-145
  damping factor
  directivity index, 145
  distortion, 147
  effects of room reflections, 146
  electrical details, 142-143
  frequency response, 144-146
  impedance, 149
  near-field, 141
  power compression, 148
  specifications, 143-144
  time domain accuracy, 146-147
Mono compatibility of stereo, 259-260
Monophonic (mono), 94
MP3, 368, 370
MS (mid-side) stereo pickup, 257-262
MSB (most significant bit), 181
Multi-band compression, 226
Multichannel, 94
Music assembly, 345-347
-N-
NAB tape standards, 164-165
NC (noise criterion) curves, 401
Near-coincident stereo pickup, 262-263
Near-field monitors, 141
Noise, 189
Noise floor of microphones, 75
Noise gates, 230-231
Noise induction, 136
Noise removal, digital, 207-208
Noninverting amplifiers, 109
Non-tuned percussions, 272
Normaled patch points, 111-112
NOS pickup of stereo, 262-263
Notch filters, 219

-O-
Off-axis microphone response, 69-70
Opamps (operational amplifiers), 108
Orchestral recording, 304-308
ORTF pickup of stereo, 262-263
Output modules, 116-117
Overdubbing, 124, 165-167

-P-
Pads in microphones, 75-76
Panpot (panoramic potentiometer), 37-38, 109, 254
Parametric equalizers, 215-216
Patch bay (jack field), 111-112
Peak value, 7
Peak-dip equalizers, 215-216
Perceptual coding, 368
Period, 1, 6
Periodic waves, 7
Phantom images, 35-37
Phase, 1
Phase response of equalizers, 217
Phasing, 242-243
Philips Compact Cassette, 352-353
Phoenix (Euroblock) connectors, 138
Phon, definition, 28
Photo-CD, 359
Piano, 272-273, 295
Piezoelectric microphone, 46-47
Pink noise, 100
Pinna transforms, 33
Pipe organ, 298-299
Pitch adjustment and regulation, 246-248
Pitch versus frequency and loudness, 31-33
Playback heads, 154
Polar patterns, 51-59
Polar response, 64, 72-73
Powering of microphones, 78-79
PPMs, 133
Pre-delay in reverberation systems, 239
Presence, 330
Program equalizers, 213
Proximity effect, 65-69
Psychoacoustics, 28
Pulse code modulation (PCM), 187
PZM (pressure zone microphone), 70-71
-Q-
Q (directivity factor), 19

-R-
Random access, digital editing, 206
Random efficiency (RE), 58-59
Random incidence response, 69-70
Read/write modes, 131-133
Real images, 35-37
Recall, 131
Record heads, 154
Recording tape, 157-159
Reel-to-reel tape, 352
Rehearsing the mix, 328-329
Release time, compressors and limiters, 223
Reverberation, 12, 303-304
Reverberation and delay
  applications, 236-238
  "bucket brigade" systems, 233
  charge-coupled devices (CCD), 233
  convolution, 240-241
  delay tubes, 232-233
  delay units, 232
  digital systems, 238-239
  early reflections, 235
  echo chambers, 232
  impulse response, 240-241
  modal density, 239
  pre-delay, 239
  reverberation chambers, 232
  reverberation plates (EMT), 233
  sampling systems, 240-241
  spring systems, 233-234
  tape delay, 232
Reverberation chambers, 232
Reverberation pickup, 255
Reverberation time, definition, 14-17
Reverberation time, target values, 41
Reverberation time, variation with frequency, 42, 44
RF (radio frequency) interference, 81
RIAA (Record Industry Association of America) disc standards, 376
Ribbon microphone, 50, 275
Rifle (shotgun) microphone, 62-64
Ringout (decay) of sound, 12
Ring-tip-sleeve patch cords, 138
Rise time, 134
Robinson-Dadson equal loudness contours, 28
-S-
S/N (signal-to-noise) ratio, 95
SACD (Super Audio Compact Disc), 195, 359
SACD hybrid discs, 366-367
SAM microphone array, 319-320
Sampling, 184
Sampling type reverberation systems, 240-241
Schoeps KFM360 sphere, 318-319
Schoeps OCT array, 322-323
Scrape flutter, 162
Sel-Sync, 165-166
Semitone, 32
Sensitivity of microphones, 74-75
Servo faders, 130-133
Setting proper gain structure, 121
Shelving equalizers, 219
Shock mounts for microphones, 87
Shrouds for microphones, 90-91
Signal (stereo) correlation metering, 104-106
Signal envelope, 94
Signal phase (polarity), 98-100
Signal spectra, 98
Small monitor loudspeakers, 139-142
SMPTE time code, 167
Sound
  losses due to humidity, 21
  amplitude, 1
  frequency, 1
  period, 1
  phase, 1
  speed of, 6
  threshold of hearing, 1
  wavelength, 6-7
Soundfield microphone, 316-317
Source IDs, 345-346
Source impedance of microphones, 76-77
Spaced-apart stereo pickup, 263-266
Spaciousness, 28, 40
SPDIF format, 362
Spectral Recording, 170
Speech compression, 96-97
Speech spectra, 98
Speed of sound, 6
Splicing blades, 339
Splicing tape, 340-341
Split configuration consoles
  gain structure, 121
  generic layout, 113
  input module, 114-115
  level diagram, 120-121
  monitor-master module, 118-119
  output module, 116-117
Spot (accent) microphones, 305
Spring reverberators, 233-234
Stand-alone microphone preamps, 78
Stands, microphones, 84-85
STC (sound transmission class) curves, 402-403
Stereo
  amplitude versus delay, 37, 39
  Blumlein array, 254-255
  coincident microphone techniques, 254-255
  crossed cardioids, 255-256
  mono compatibility, 259-260
  MS (mid-side) pickup, 257-262
  near-coincident techniques, 262-263
  NOS pickup, 262-263
  ORTF pickup, 262-263
  panpots (panoramic potentiometers), 254
  reverberation pickup, 255
  spaced-apart pickup, 263-266
  three-channel techniques, 263-265
Stereo ganging of compressors and limiters, 228
Stereo microphones, 256
Stereo mounts for microphones, 87-88
Stereo sound stage plots, 331-332
Stereo widening, 251-252
Stereophonic localization, 35-37
Stereophonic sound stage, 36
String quartet, 300-301
Strings, 277-278
Studio acoustics, 9-17
Studio recording techniques
  acoustic guitar, 275-276
  ambience microphones, 280
  brass instruments, 275-276
  direct boxes, 274
  direct inputs, 274
  documentation, 288-289
  drums, 269-271
  electronic instruments, 278
  goboes (baffles), 274
  headphone monitoring, 273, 278
  isolation, 267-268
  jazz recording, 279-286
  large studio orchestra, 286-288
  monitor mix, 269
  non-tuned percussions, 272
  piano, 272-273
  strings, 277-278
  track allocation, 279
  track logistics, 269
  tuned percussions, 271-272
  use of ribbon microphones, 275
  VCA subgrouping, 288
  vocalists, 273-274
  woodwind instruments, 277
435
sound diffusion, 404-405 sound scattering, 404-405 structure borne noise, 399 transmission loss (TL), 396 Subcardioid pattern, 57 Sub-mixes, 328 Summing localization, 36-37 Surround mixing Surround sound, 94, 126 3-2 recording, 322 5.1 standard, 313-314 down-mixing, 324 DVD (Digital Versatile Disc), 312 effects, 315 ITU (International Telephonic Union) standard, 314 joy stick panpots, 325 Klepko array, 322 movies at home, 311-313 SAM (surround ambience microphone array), 319-320 Schoeps KFM360 sphere, 318-319 Schoeps OCT (optimum cardioid triangle) array, 322323 Soundfield microphone, 316317
Tape alignment procedures, 173-177 Tape delay, 232 Tape deterioration, 346 Tape head demagnetizers, 175 Tape playback curves, 164-165 Telcom-4 noise reduction, 169 Tempo adjustment and regulation, 246-248
436
Index
Termination loads or resistors, 110
Test tapes, 175
THD (total harmonic distortion), 75-76
Three-channel techniques, 263-265
Three-two (3-2) recording, 322
Threshold of feeling, 2
Threshold of hearing, 1
Threshold setting, compressors and limiters, 223
Time code, 133, 177-181
T-powering of microphones, 80
Track width standards, 172-174
Tracking, 124-126
Tracking, definition, 326
Transducers, 46
Transient signals, 33
Tuned percussions, 271-272

-U-
Update mode, 131

-V-
Variable microphone patterns, 60-62
VCA (voltage controlled amplifier), 222, 244
VCA subgrouping, 288, 328
Vintage tube condenser microphones, 274
Vocal booths, 24
Vocalists, 273-274, 300
Vocoders (voice coders), 250-251
Voice-over applications, 229
VU meter, 30, 133

-W-
Warmth, 28, 41
Wavelength, 6-7
Waves, average value, 7
Waves, peak value, 7
Weighting curves, 30-32
"Wet" mix, 126
White noise, 100
Wind/pop screens, 89-91
Wireless microphones, 80-83
Woodwind instruments, 277
Wow, 161

-Z-
Zero-crossing detection, compressors and limiters, 224