APPLICATIONS OF PARALLEL PROCESSING IN VISION
ADVANCES IN PSYCHOLOGY 86 Editors:
G. E. STELMACH
P. A. VROON
NORTH-HOLLAND AMSTERDAM * LONDON NEW YORK TOKYO
APPLICATIONS OF PARALLEL PROCESSING IN VISION
Edited by
Julie R. BRANNAN
Departments of Neurobiology and Neurology, The Mount Sinai Medical Center, New York, NY, U.S.A.
1992 NORTH-HOLLAND AMSTERDAM LONDON NEW YORK TOKYO
NORTH-HOLLAND ELSEVIER SCIENCE PUBLISHERS B.V. Sara Burgerhartstraat 25, P.O. Box 211, 1000 AE Amsterdam, The Netherlands
Distributors for the United States and Canada: ELSEVIER SCIENCE PUBLISHING COMPANY, INC., 655 Avenue of the Americas, New York, N.Y. 10010, U.S.A.
ISBN: 0 444 88651 6 © 1992 ELSEVIER SCIENCE PUBLISHERS B.V. All rights reserved.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior written permission of the publisher, Elsevier Science Publishers B.V., Permissions Department, P.O. Box 521, 1000 AM Amsterdam, The Netherlands. Special regulations for readers in the U.S.A. - This publication has been registered with the Copyright Clearance Center Inc. (CCC), Salem, Massachusetts. Information can be obtained from the CCC about conditions under which photocopies of parts of this publication may be made in the U.S.A. All other copyright questions, including photocopying outside of the U.S.A., should be referred to the copyright owner, Elsevier Science Publishers B.V., unless otherwise specified. No responsibility is assumed by the publisher for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions or ideas contained in the material herein. Printed in The Netherlands
Table of Contents

List of Contributors
Preface

INTRODUCTION TO PARALLEL PROCESSING
1. Parallel Retinocortical Channels: X and Y and P and M. Robert Shapley
2. Parallel Processing in Human Vision: History, Review, and Critique. Bruno G. Breitmeyer

PARALLEL PROCESSING AND VISUAL DEVELOPMENT
3. Parallel Processes in Human Visual Development. Adriana Fiorentini
4. Changes in Temporal Visual Processing in Normal Aging. Julie R. Brannan

PARALLEL PROCESSING IN HIGHER-ORDER PERCEPTION
5. M and P Pathways and the Perception of Figure and Ground. Naomi Weisstein, William Maguire, and Julie R. Brannan
6. Cooperative Parallel Processing in Depth, Motion and Texture Perception. Douglas Williams
7. Parallel and Serial Connections Between Human Color Mechanisms. Qasim Zaidi

PARALLEL PROCESSING AND VISUAL ABNORMALITIES
8. Sensory and Perceptual Processing in Reading Disability. Mary C. Williams and William Lovegrove
9. How Can the Concept of Parallel Channels Aid Clinical Diagnosis? M. Felice Ghilardi, Marco Onofrj, and Julie R. Brannan

Author Index
Subject Index
List of Contributors

Julie R. Brannan, Box 1052, Departments of Neurobiology and Neurology, The Mount Sinai Medical Center, One Gustave Levy Place, New York, NY 10029
Bruno G. Breitmeyer, Department of Psychology, University of Houston, Houston, TX 77204
Adriana Fiorentini, Istituto di Neurofisiologia CNR, Via S. Zeno 51, I-56127 Pisa, Italy
M. Felice Ghilardi, College of Physicians and Surgeons, Columbia University, Center for Neurobiology and Behavior, 722 W. 168th Street, Research Annex, Room 819, New York, NY 10032
William Lovegrove, Department of Psychology, University of Wollongong, Wollongong, New South Wales, Australia 2500
William Maguire, Department of Radiology, Long Island Jewish Medical Center, New Hyde Park, NY 11042
Marco Onofrj, Clinica Neurologica, Università G. d'Annunzio, Ospedale Ex-Pediatrico, 66100 Chieti, Italy
Robert Shapley, Center for Neural Science, Departments of Psychology and Biology, New York University, New York, NY 10003
Naomi Weisstein, State University of New York at Buffalo (on leave), 890 West End Avenue #8B, New York, NY 10025
Douglas Williams, Rockefeller University, The Neurosciences Institute, 1230 York Avenue, New York, NY 10021
Mary C. Williams, Department of Psychology, University of New Orleans, Lakefront Campus, New Orleans, LA 70148
Qasim Zaidi, Department of Psychology, Columbia University, New York, NY 10027
Preface

A fundamental property of the human visual system is that information is processed in parallel. That is, visual sensory information is analyzed simultaneously along two or more independent pathways. This insight originated in the area of color perception, with an idea first expressed by Thomas Young in 1801 and later reformulated by Hermann Helmholtz in 1909. Their theory states that our perception of color depends solely on the response of three different pigments. These three color pathways, together with their opponent interaction, are sufficient to perceive the entire chromatic continuum.

It has also been argued that, based on limitations of neural timing, massive parallel processing becomes the only biologically plausible mode of operation. For example, the "100 step rule" of Feldman and Ballard (Cognitive Science, 1982) is a criterion which demonstrates the necessity of processing our complex visual world in parallel. The maximum rate of neuronal firing is 1000 Hz, so each processing step takes at least about 1 millisecond, and simple perceptual tasks require approximately 100 milliseconds. Given these parameters, biologically feasible phenomena could include no more than 100 steps - unless parallel processing occurs.

In the past two decades, researchers have extensively used the concept of parallel visual channels as a framework to direct their explorations of human vision. Based initially on the discovery of functionally different sets of ganglion cells in cat retina, this concept has evolved over time to encompass the psychophysically based notion of a "transient-sustained" dichotomy, and (more recently) a division based on magnocellular and parvocellular layers in the monkey LGN. What these various conceptualizations have in common is a "division of labor" based on temporal, spatial, chromatic, and contrast attributes of the visual world.
Perhaps the controversy regarding the absolute validity of a strict functional dichotomy in humans is ultimately less important than the usefulness of such a dichotomy in guiding and directing vision research. For example, many researchers (and some clinicians) have found such a dichotomy applicable to the way we organize our knowledge of visual development, higher order perception, and visual disorders, to name just a few. This volume attempts to provide a forum for gathering these different perspectives.

Robert Shapley opens the volume with an overview of the anatomical and physiological data in the cat and monkey which underlie our understanding of parallel channels, and how they may relate to human perception. Bruno Breitmeyer then provides a historical perspective on the concept of parallel processing in human visual perception. The next section deals with the development of parallel pathways in vision. Adriana Fiorentini describes what we know about this process in infancy, followed by a chapter describing temporal processing changes in the aging visual system (Brannan). Next, the implications of parallel processing to higher order perception are explored. Naomi Weisstein and colleagues report recent data and a qualitative model which suggest that the activity of parallel systems may underlie our perception of figure and ground. Douglas Williams describes depth, motion, and texture perception in the context of parallel processing, while Qasim Zaidi's chapter provides a comprehensive review of parallel color mechanisms. Finally, the potential of using this concept to improve our understanding of visual disorders is discussed by Mary Williams and William Lovegrove (reading disability) and M. F. Ghilardi and associates (clinical diagnosis).

Julie R. Brannan
New York, New York
Introduction to Parallel Processing
Applications of Parallel Processing in Vision, J. Brannan (Editor). © 1992 Elsevier Science Publishers B.V. All rights reserved.
Parallel Retinocortical Channels: X and Y and P and M
ROBERT SHAPLEY
Multiple parallel neural mechanisms

There has been a great advance in our understanding of the neural basis of vision in mammals. Different types of retinal ganglion cells, the output neurons of the retina, project in parallel, separately and independently, from the retina to the brain. The ganglion cells filter visual stimuli, sending on to the brain responses to those stimuli to which they are tuned. Furthermore, each of these cell types, or classes, is distributed throughout the retina. Therefore, activity across the population of cells in each one of the classes forms a representation of the world as "seen" by that type of cell. Previously, one might have conceived of the eye as an optical device with a sensitive film (the retina) from which neural images were transmitted to the brain. Now this scheme must be modified to include the idea that the retina is made up of many neural "films" overlaid on each other. Each neural "film" transmits a separate filtered version of the optical image formed by the eye.

There has also been some excitement lately in relating psychophysical properties of visual sensitivity to neural mechanisms in the retina and in cerebral cortex, and this interest is connected to the concept of parallel retinocortical pathways. The focus of interest is the degree to which color vision and achromatic vision may be thought of as parallel and independent sensory analyses of the visual scene. There was a tradition for many years in theories of color vision to consider responses to black and white as being the result of a different neural mechanism from the one that can discriminate among wavelengths or wavelength distributions (see for example Hurvich and Jameson, 1957). This dualistic approach was reinforced by neurophysiological work by De Valois and Gouras and their colleagues in an earlier era of visual neurophysiology (reviewed in DeValois and DeValois, 1975; Gouras, 1984).
The idea arose of a separate set of color blind retinal ganglion cells that were sensitive to a broad band of the visible spectrum and responsible for the visibility of black and white patterns. The numerous class of color opponent ganglion cells was supposed to be the sole vehicle for signals about color to travel from eye to brain. Then opinions changed and hypotheses were formulated about how all of vision, both achromatic and chromatic, could be derived from the spatiotemporal and chromatic response characteristics of the color opponent type of neuron (see for example, DeValois and DeValois, 1975; Ingling and Martinez Uriegas, 1983; Kelly, 1983; Derrington, Krauskopf, and Lennie, 1984; Rohaly and Buchsbaum, 1988, 1989). More recently, among neurophysiologists there has been some return to the dual, parallel channels point of view (Shapley and Perry, 1986; Livingstone and Hubel, 1987, 1988; Lee, Martin, and Valberg, 1988; Kaplan, Lee, and Shapley, 1990). This is still an unresolved controversy (cf. Lennie, Trevarthen, Wassle, and Van Essen, 1989). The neurophysiological retinocortical channels probably do not correspond exactly with the achromatic and chromatic channels of psychophysics, and they probably interact more than some theories predict. Nevertheless, there is good reason to believe that there are two separate retinocortical pathways carrying different kinds of signals about the appearance of the outside world, one referring more to achromatic contrast, the other concentrating more on color contrast.

In discussing the data, I will begin by considering the evidence for parallel retinocortical processing in the cat. Then, the focus will shift to the monkey visual pathway, its relation to the cat's, and functional roles of primate visual channels. One question for present research in this field is whether primates, including humans, have visual pathways organized, like the cat's, into X and Y and other similar retinocortical channels. This question has been studied most extensively in the retina of macaque monkeys. Macaque retinal ganglion cells fall naturally into different cell classes which form parallel anatomical and functional channels to the brain (Gouras, 1968; Schiller and Malpeli, 1978; DeMonasterio, 1978a). The crucial point is whether or not the macaque's ganglion cell classes are functionally similar to the cat's X and Y classes.
Several lines of evidence indicate that primate M cells, the retinal ganglion cells that project to the magnocellular layers of the LGN, are functionally similar to X cells. The P cells, ganglion cells that form the input to parvocellular LGN, seem to be functionally different from cat X cells and appear to be a primate specialization for color vision.
X and Y cells and the concept of parallel channels

In the cat, two of the known functional classes of retinal ganglion cells, denoted the X and Y types (Enroth-Cugell and Robson, 1966), are believed to be of the greatest importance for pattern perception. This is because of their high sensitivity to spatial patterns (Derrington and Lennie, 1982; Enroth-Cugell and Robson, 1984; Shapley and Perry, 1986) and because of their direct connection to the lateral geniculate nucleus which relays visual information to primary visual cortex (Cleland, Dubin and Levick, 1971; Stone and Hoffmann, 1972; So and Shapley, 1979). The X cells are most sensitive to fine detail, sharply defined borders of objects, and small light or dark spots on a background. The Y cells respond most vigorously to coarse patterns, abrupt changes in diffuse illumination, and to large objects moving at high velocities. The function of these neurons can be understood by considering how they act as filters of spatial and temporal stimuli from the environment. The differences in filtering characteristics imply that the different cell types are connected to basically different retinal neural pathways. This fact means that the retina is constructed to provide separate and independent views of the world to the brain. The cat's X and Y filters are not the only retinocortical channels. There are many other functional classes, but their visual functions are less well understood than those of X and Y cells.
Spatial and temporal filtering and the X/Y classification

The initial discovery of the X and Y classes of cat retinal ganglion cells was made by Enroth-Cugell and Robson (1966, 1984). Their finding was a consequence of their investigation of these cells as spatial and temporal filters of visual signals. The activity of a ganglion cell, the modulation of its impulse rate up and down from a resting level, is strictly controlled by variations of the external visual stimulus. Therefore, each ganglion cell may be considered to be reacting to the visual environment, and passing through to the brain those stimuli to which the cell is tuned. Man-made devices of this general type are called filters. There is a general procedure for studying filters: it is known as systems analysis. In general, any neuron which can be described as having a receptive field can be conceived to be a spatial and temporal filter. Neurophysiologists who study the receptive fields of neurons are always performing a kind of systems analysis (Rodieck, 1973).

One experimental test of whether a ganglion cell is acting as a linear or nonlinear filter is to combine steady and time-varying signals at the ganglion cell and to measure whether the average impulse rate depends on the presence of the time-varying signal, as Enroth-Cugell and Robson (1966) did initially. This is a special test of the property of superposition, a necessary condition for a linear system. Superposition means that if the system responds to stimulus A(t) with response a(t), and to stimulus B(t) with response b(t), then it must respond to stimulus A(t)+B(t) with response a(t)+b(t) rather than with some nonlinear combination like a+b+2ab. One particular test of superposition is to see the effect of a time-varying stimulus on the response to a steady state stimulus. Enroth-Cugell and Robson used a drifting pattern to cause a time-varying signal in the retina.
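The logic of this mean-rate test can be illustrated numerically. The following is only a sketch under stated assumptions: the two model cells, their gain, the resting rate, and the stimulus frequencies are hypothetical illustration values, not measurements or models from the studies cited. A linear cell keeps the same average rate when a zero-mean time-varying signal is added, whereas a cell that rectifies its input shows a raised average rate.

```python
import numpy as np

RESTING_RATE = 40.0  # impulses/s; arbitrary illustrative value (assumption)

def linear_cell(signal, gain=20.0):
    """Hypothetical linear filter: rate follows the input around the resting level."""
    return RESTING_RATE + gain * signal

def rectifying_cell(signal, gain=20.0):
    """Hypothetical nonlinear filter: the input is full-wave rectified first."""
    return RESTING_RATE + gain * np.abs(signal)

def obeys_superposition(cell, a, b):
    """True if the response to a+b equals the sum of the responses to a and to b."""
    def response(s):
        return cell(s) - RESTING_RATE  # deviation from the resting level
    return bool(np.allclose(response(a + b), response(a) + response(b)))

t = np.arange(1000) / 1000.0                  # one second sampled at 1 kHz
steady = np.zeros_like(t)                     # steady background alone
drifting = 0.5 * np.sin(2 * np.pi * 4.0 * t)  # added 4 Hz time-varying signal
other = 0.3 * np.cos(2 * np.pi * 7.0 * t)     # a second stimulus for the A+B test

# Average rate without and with the time-varying signal, for each model cell.
mean_rates = {
    "linear": (linear_cell(steady).mean(), linear_cell(drifting).mean()),
    "rectifying": (rectifying_cell(steady).mean(), rectifying_cell(drifting).mean()),
}
```

In this sketch the unchanged average rate of the linear model corresponds to the behavior described for X cells, and the raised average rate of the rectifying model to the behavior described for Y cells; full-wave rectification is just one convenient nonlinearity chosen for illustration.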
The drifting pattern was presented on a background of steady light which provided a source of steady input to the ganglion cells. The retinal ganglion cells in which the average impulse rate did not vary when the time-varying signal was present, that is, the cells which behaved like linear filters and obeyed superposition, they called X cells. They found many cells in which the average impulse rate increased when a time-varying stimulus was present, behavior which is characteristic of a particular kind of nonlinear filter, and these they called Y cells.

To understand the visual function of the retina, it is not enough to know that some cells resemble linear filters and others nonlinear. One needs to know how ganglion cells combine signals from different photoreceptors. A way to answer this question is to examine the linearity of spatial summation of neural signals. A useful tool for this test is a sinusoidal grating pattern. This is a visual stimulus in which the variation in retinal illumination in the direction perpendicular to the bars of the grating is a sinusoidal function of position. The retinal illumination of a sinusoidal grating may be represented formally as

$$I(x) = I_0 + I_1 \sin(2\pi k x + \phi),$$

where $x$ is position on the retina, $I_0$ is the mean illumination of the grating, and $I_1$ is the amplitude of the variation of illumination with position; it is also the maximum illumination minus the mean. The ratio $I_1/I_0$ is the contrast of the grating, and it indicates how bright or dark the bars of the grating are compared to the mean level of illumination. The fineness or coarseness of the pattern is determined by $k$, the spatial frequency in cycles/degree of visual angle. $\phi$ is the spatial phase of the grating. The spatial phase is equivalent to the position of the grating on the retina. Since the grating is periodic, all possible positions of the grating are specified by a value of spatial phase between 0 and $2\pi$.

If a ganglion cell is simply adding up neural signals from different spatial locations in its receptive field, and there is no difference in the time course of the response from the different signal sources, then positions can be found at which introduction and withdrawal of the grating produce no response (Enroth-Cugell and Robson, 1966). These so-called "null positions" exist because the sine grating may be placed so that the zero crossing of the sine function lies over an axis of symmetry of the receptive field. Then introduction of the pattern produces as much net positive signal from one side of the field as it produces net negative signal from the other side of the field, and the two signals of equal magnitude but opposite sign cancel when added. Null positions can be found for X cells, but there are no null positions for Y cells (Enroth-Cugell and Robson, 1966). This strengthens the hypothesis that X cells act like linear spatial filters, while Y cells behave like nonlinear spatial filters.
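The null-position argument can be checked with a small numerical sketch. Everything specific here is an assumption made for illustration: the difference-of-Gaussians receptive field, its widths, and the luminance values are hypothetical and are not taken from the experiments described. The sketch builds the grating from the formula for I(x), sums it linearly over a receptive field that is symmetric about x = 0, and looks for the spatial phase at which introducing the grating produces no net signal.

```python
import numpy as np

def grating(x, mean_lum, amplitude, k, phase):
    """I(x) = I0 + I1 * sin(2*pi*k*x + phase); the contrast is I1 / I0."""
    return mean_lum + amplitude * np.sin(2 * np.pi * k * x + phase)

def receptive_field(x, center_sigma=0.2, surround_sigma=0.8, surround_gain=0.8):
    """Hypothetical difference-of-Gaussians weighting, symmetric about x = 0."""
    center = np.exp(-(x / center_sigma) ** 2) / center_sigma
    surround = surround_gain * np.exp(-(x / surround_sigma) ** 2) / surround_sigma
    return center - surround

def linear_response(weights, x, mean_lum, amplitude, k, phase):
    """Net linearly summed signal when the grating replaces a uniform field of equal mean."""
    dx = x[1] - x[0]
    change = grating(x, mean_lum, amplitude, k, phase) - mean_lum
    return np.sum(weights * change) * dx

x = np.linspace(-4.0, 4.0, 4001)        # retinal position in degrees
weights = receptive_field(x)
mean_lum, amplitude, k = 100.0, 30.0, 0.5
contrast = amplitude / mean_lum          # I1 / I0 = 0.3

phases = np.linspace(0.0, 2 * np.pi, 361)
responses = np.array(
    [linear_response(weights, x, mean_lum, amplitude, k, p) for p in phases]
)
# The null position is the phase at which the summed signal is closest to zero.
null_phase = phases[np.argmin(np.abs(responses))]
```

For this symmetric field the null falls where a zero crossing of the sine lies over the axis of symmetry (phase 0 or a multiple of π), and the response at other phases follows a sinusoid in spatial phase, which is what linear spatial summation predicts.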
Hochstein and Shapley (1976a,b) probed the functional differences between X and Y cells by studying how their sensitivities depended on the spatial phase of a sine grating not just at null positions but at a whole range of positions of the grating with respect to the cells' receptive fields. In order to study the problem systematically, they introduced the use of the contrast reversing sine grating into visual neurophysiology. This stimulus can be written formally as

$$I(x,t) = I_0 + I_1 \sin(2\pi k x + \phi)\, M(ft),$$

where $x$ is position on the retina, $t$ is time, $I_0$ is mean luminance, $I_1$ is the modulation depth, $I_1/I_0$ is the peak contrast of the contrast reversing grating, and $M(ft)$ is the temporal modulation signal, which is modulated between the values of +1 and -1 and is usually a sine wave or a square wave with a temporal frequency $f$.

The responses of X cells follow a sinusoidal function of spatial phase in response to a contrast reversing sine grating (Hochstein and Shapley, 1976a). This is illustrated in Figure 1. The reason for a sinusoidal dependence on spatial phase is as follows. When spatial frequency and contrast are held fixed and only spatial phase is varied, the modulation of local illumination at a point on the retina is a sinusoidal function of the spatial phase. Therefore, because the receptor's response is proportional to the local illumination from the grating pattern and thus is a locally linear transduction, each photoreceptor's response also is a sinusoidal function of spatial phase. The sum of such functions from all the photoreceptors which converge on the ganglion cell will also be a sinusoidal function of spatial phase. This is a crucial mathematical point worth emphasizing. A summation of sine functions, each having as its argument the sum of a fixed term and a common variable term, can be written as a single sinusoidal function of the variable term, with an amplitude and phase offset determined by the fixed terms. This is a consequence of the fact that sinusoidal functions may be viewed as complex exponential functions. This reasoning works for X cells and not Y cells because the process of signal transduction is locally linear, and summation is simple linear addition only in X cells. Null positions are the spatial phases at which the sinusoidal function of spatial phase equals zero.
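The mathematical point about sums of sinusoids can be written out in one line. This is a sketch using notation that is not in the original text: photoreceptor $j$ is assumed to sit at retinal position $x_j$ and to contribute with weight $w_j$, and $R(\phi)$ denotes the summed response amplitude as a function of spatial phase.

```latex
R(\phi) \;=\; \sum_j w_j \sin(2\pi k x_j + \phi)
        \;=\; \operatorname{Im}\!\Big[\, e^{i\phi} \sum_j w_j\, e^{\,i 2\pi k x_j} \Big]
        \;=\; A \sin(\phi + \psi),
\qquad\text{where}\quad
A\, e^{i\psi} \;=\; \sum_j w_j\, e^{\,i 2\pi k x_j}.
```

Under these assumptions the fixed terms $2\pi k x_j$ determine only the amplitude $A$ and the offset $\psi$, so the dependence on $\phi$ remains a single sinusoid, and the null positions are the phases $\phi = -\psi + n\pi$ for integer $n$.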
[Figure 1 plot not reproduced. Recoverable labels: "Fundamental" curve; horizontal axis SPATIAL PHASE, with ticks from -180 to 180.]
Figure 1. Spatial phase dependence of the response of a cat X ganglion cell. The amplitudes of the fundamental component, and the second harmonic component, in response to sinusoidal contrast reversal, are graphed as a function of spatial phase of the contrast reversing pattern. The spatial frequency of the grating was near optimal for this cell. It is clear that the fundamental amplitude data are well fit by a sinusoidal function of spatial phase, while the second harmonic data are in the noise. Redrawn from Hochstein and Shapley (1976a).

The responses of cat Y cells to contrast reversing sine gratings have a complicated and peculiar dependence on spatial phase. To understand it, one can analyze the modulation of the ganglion cell's impulse rate in response to sinusoidal modulation of the contrast reversing stimulus. An X cell's impulse rate modulation is at the temporal frequency of modulation of the contrast reversal (Hochstein and Shapley, 1976a; Victor and Shapley, 1979a; Enroth-Cugell and Robson, 1984). A Y cell's response, however, contains two frequency components: one at the temporal modulation frequency of the stimulus, the fundamental frequency, and one at twice the frequency of the stimulus, the second harmonic frequency. The fundamental, linear component of the Y cell's response varies sinusoidally with spatial phase like the X cell's, and thus appears to come from a single linear spatial summation mechanism. However, the nonlinear second harmonic component is invariant in amplitude when the spatial phase of the grating pattern is varied.

The linear component of the Y cell resolves patterns less well than X cells in its vicinity on the retina, usually two to four times less well (Hochstein and Shapley, 1976b; So and Shapley, 1979). However, the nonlinear component of the Y cell's response has a spatial frequency resolution two to four times higher than the linear component, comparable to that of the X cells in the neighborhood (Hochstein and Shapley, 1976b; So and Shapley, 1979). Furthermore, the peak of the nonlinear response is usually at a spatial frequency at which the linear component has vanished into the noise. If one graphs linear and nonlinear response components vs spatial frequency, the two curves intersect (as in Figure 2, redrawn from So and Shapley, 1981). This intersection is what has been called the "Y cell signature" (Spitzer and Hochstein, 1985).

[Figure 2 plot not reproduced. Horizontal axis: Spatial Frequency (c/deg), with ticks at 0.01, 0.10, 1.00, and 10.00.]

Figure 2. Responses of a Y cell to gratings of different spatial frequency: the "Y cell signature." These are data from two experiments on one cat Y cell. The empty circles are the fundamental amplitudes to drifting sine gratings, as a function of spatial frequency. The filled diamonds are second harmonic amplitudes of responses to contrast reversal, also as a function of the spatial frequency of the grating pattern. The Y cell signature is the crossing of these two curves at high spatial frequency.

The spatial phase invariance and the high spatial frequency resolution of the nonlinear response imply that there must be many spatially dispersed mechanisms contributing to the nonlinear response in Y cells. Therefore, the hypothesis was advanced that there are many receptive field subunits, each of which pools photoreceptor signals over a small area. Then subunit signals are summed by the Y cell after a nonlinear transduction (Hochstein and Shapley, 1976b; Victor and Shapley, 1979b).

Perhaps the most immediate question about the nonlinear subunits is, where is the nonlinearity located? Some have speculated that the photoreceptors might be intrinsically nonlinear, while others have guessed that the nonlinearity might be in the Y ganglion cell itself. Nonlinear systems analysis of Y cells has revealed that the Y subunit nonlinearity must be embedded in the network of the retina between photoreceptors and ganglion cells (Victor and Shapley, 1979b). The results on Y cells are consistent with the idea that the nonlinear transduction in the cat retina is located at the bipolar-amacrine connection (Shapley and Perry, 1986). Such amacrine cells would have to serve as a major input to Y cells, but not X cells.

It is important to note that the X/Y dichotomy as presented here and previously (Hochstein and Shapley, 1976a,b) is not based on a single property of linear or nonlinear filtering, but rather on evidence of linear spatial filtering measured across a wide range of spatial frequencies, in the case of the X cells, or of the Y cell "signature," as described above and shown in Figure 2, in the case of the Y cells. The procedure for determining whether a cell is X or Y thus requires the measurement of a cluster of spatial filtering properties (cf. Shapley and Perry, 1986).

The different receptive field mechanisms of X and Y cells must contribute to their different roles in the cat's perception. Nonlinear subunits provide an excitatory signal to Y cells.
That is why there is a burst of impulse firing in a Y cell at each contrast reversal of a grating pattern, and why the Y cell's firing rate is increased when a pattern drifts across the retina. The widely dispersed array of subunits excites a Y cell whenever a pattern moves or changes at any locus within a wide area of the retina. Y cells therefore send an increased signal to the brain whenever a pattern is present, but they indicate in an imprecise manner the location of the pattern. X cells are accurate about location. X cells in the center of the cat retina can give a maximally modulated response when a grating pattern is moved from a null position by as little as 0.1 degree of visual angle, which is a change in position of about 20 microns on the retina (Shapley and Victor, 1986). In the process of pattern perception, Y cells could signal "object present" while X cells could be used for pattern recognition by their very precise responses to fine detail and the relative locations of borders.
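The subunit hypothesis described in this section can be made concrete with a toy simulation. This is a sketch only, under loudly stated assumptions: the point-like subunits, the half-wave rectification, the Gaussian pooling widths, and the spatial frequency are hypothetical choices for illustration, not the model fitted by Hochstein and Shapley or by Victor and Shapley. The toy cell adds a linear spatial sum to a pooled sum of rectified subunit signals, and the fundamental and second harmonic amplitudes of its response to a contrast reversing grating are read off with a Fourier transform.

```python
import numpy as np

N_SAMPLES = 512  # time samples in one record
N_CYCLES = 4     # stimulus cycles per record, so the fundamental is FFT bin 4

def y_cell_response(phase, k=0.5):
    """Response (arbitrary units) of a hypothetical Y cell to a contrast reversing grating."""
    t = np.arange(N_SAMPLES) / N_SAMPLES
    modulation = np.sin(2 * np.pi * N_CYCLES * t)   # M(ft): sinusoidal contrast reversal
    x = np.linspace(-6.0, 6.0, 1201)                # retinal position, degrees
    dx = x[1] - x[0]
    local = np.sin(2 * np.pi * k * x + phase)       # grating profile at each position
    # Linear path: one weighted sum over space, then the temporal modulation.
    linear = np.sum(np.exp(-(x / 0.6) ** 2) * local) * dx * modulation
    # Nonlinear path: each subunit half-wave rectifies its own signal before pooling.
    subunit_drive = local[:, None] * modulation[None, :]
    pooling = np.exp(-(x / 1.5) ** 2)[:, None]
    nonlinear = np.sum(pooling * np.maximum(subunit_drive, 0.0), axis=0) * dx
    return linear + nonlinear

def harmonic_amplitude(response, harmonic):
    """Amplitude of the given harmonic of the stimulus temporal frequency."""
    spectrum = np.fft.rfft(response) / len(response)
    return 2.0 * np.abs(spectrum[harmonic * N_CYCLES])

phases = np.linspace(0.0, np.pi, 5)  # spatial phases 0, pi/4, pi/2, 3pi/4, pi
fundamental = np.array([harmonic_amplitude(y_cell_response(p), 1) for p in phases])
second = np.array([harmonic_amplitude(y_cell_response(p), 2) for p in phases])
```

In this toy model the fundamental amplitude vanishes at the null phase and grows toward a quarter cycle away from it, while the second harmonic amplitude stays nearly constant across spatial phase because the rectified subunits are dispersed over many grating cycles. That is the qualitative pattern the text attributes to Y cells.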
Other physiological properties of X and Y cells

There are several other physiological and anatomical properties of cat X cells which differentiate them from cat Y cells, besides the spatial filtering characteristics stressed so far. One physiological trait is the more transient nature of a Y cell's response to a bright step of light on a dim background (Cleland et al., 1971). However, this "sustained-transient" distinction seems to be much more apparent at high contrast than low (Lennie, 1980; Shapley and Victor, 1981). Y cells respond better to large targets and to higher target velocities than X cells (Cleland et al., 1971; Stone and Fukuda, 1974); this is due, at least qualitatively, to the larger receptive field centers of Y cells.

The average conduction velocity of Y cell axons in the optic nerve and tract is considerably higher than that of X cell axons (Cleland et al., 1971; Fukada, 1971; Stone and Hoffmann, 1972). There is a wide distribution of conduction velocities within each class, but the two distributions are distinct. Several studies agree that the average conduction velocity for X cell axons between optic nerve and chiasm is about 18 m/sec (Fukada, 1971; So and Shapley, 1979). The estimated average velocity for Y cell axons is somewhat more variable because of measurement difficulty: values between 30 and 70 m/sec have been reported (Fukada, 1971; So and Shapley, 1979). These velocity differences cause rather small latency differences at the lateral geniculate nucleus between X and Y afferent input: about 1 to 2 msec, in comparison to a common visual latency for both X and Y cells of 30-40 msec (Shapley and Victor, 1981) in the light adapted state, and even longer in the dark adapted state.

"W-cells"

The "W cells" (Cleland and Levick, 1974a,b; Stone, 1983), which are all the cat ganglion cells not classifiable as X or Y, are probably composed of several classes of cells. Many of these cells have axons with a slow conduction velocity. Some, however, have axons with conduction velocities close to that of X cells. Evidence about spatial filtering in these ganglion cells is fragmentary, except for one class: the suppressed-by-contrast cells. These cells behave, crudely speaking, like off-center Y cells with inhibitory instead of excitatory subunits (Troy, Einstein, Schuurmans, Robson, and Enroth-Cugell, 1989). Suppressed-by-contrast cells are very sensitive to contrast. Many of the other "W cells" are quite unresponsive to contrast, however.

One particular class of rarely encountered ganglion cell in the cat is especially interesting for trans-species comparisons with macaque monkeys. This is the class of color-coded units discovered by Daw and Pearlman (1970). These cells react to visual stimuli in a manner similar to blue-on, yellow-off color-opponent cells in the monkey. In the cat the color-coded cells project to the C laminae of the LGN (Daw and Pearlman, 1970; Cleland and Levick, 1974b), and to superior colliculus. The corresponding cell type in the monkey projects only to the parvocellular layers of the LGN (Schiller and Malpeli, 1977, 1978).
Morphology of cat ganglion cells

The morphology of the dendritic trees of cat ganglion cells has been found to correlate well with the physiological classifications (Boycott and Wassle, 1974; Leventhal, 1982). The large ganglion cells denoted alpha cells are certainly Y cells. Probably most of the ganglion cells with medium sized cell bodies and smaller dendritic trees, denoted beta cells, are X cells. The variation of dendritic field size with retinal eccentricity of the alpha and beta cells (Boycott and Wassle, 1974) is just like the variation in receptive field center size of Y and X cells, although the correspondence of the receptive field center to dendritic field is not exact. The gamma cell class of Boycott and Wassle was recognized initially to be heterogeneous, and while it is clear that there is considerable diversity among this population, it is not known whether these cells form a continuum of morphological types or discrete subsets. They probably correspond to some of the ganglion cells usually referred to as "W cells."

Another anatomical property which correlates with physiological classification is the central destination of the cell's axon. The following picture of connectivity is derived from electrophysiological studies and horseradish peroxidase (HRP) labelling. X cells connect primarily with the A or A1 layer of the lateral geniculate nucleus, though a small fraction may connect to the C laminae of the LGN. Y cell axons typically have many collaterals and most if not all project to the A or A1, and C laminae of the LGN, as well as to the superior colliculus (Wassle and Illing, 1980; Illing and Wassle, 1981). There is also a massive Y cell projection to the medial interlaminar nucleus (MIN) (Stone, 1983). The central projections of "W cells" are very diverse, including the C laminae of the LGN, the superior colliculus, and the accessory optic nuclei in the midbrain (Stone, 1983). None project to where the X cells project, the A and A1 laminae of the LGN.
P (Parvocellular) and M (Magnocellular) pathways
Visual neurophysiological study of the macaque monkey is especially interesting to students of human vision because the visual performance of macaques is very similar to that of humans, and the anatomy of the retinocortical pathway is similar in macaque and human. The visual performance of cats and humans is similar on some psychophysical tasks but there are significant differences also. As shown in Figure 3, the contrast sensitivity function for sine gratings has a similar shape in human and cat but the human function is higher and shifted to the right, meaning that the human has a somewhat higher peak contrast sensitivity and higher spatial frequency resolution. The spatial frequency at the peak of the light-adapted cat contrast sensitivity function is about 0.6 c/deg while for the human it is about 3 c/deg: a five-fold smaller spatial scale in human than in cat. The monkey contrast sensitivity function peaks at a slightly lower spatial frequency than human, so that the ratio of space scales of monkey and cat may be closer to 1:4. Besides the spatial difference, the monkey, like the human, has the capability of color vision: fine wavelength discrimination and color categorization. While the cat may be trained to perform some crude wavelength discriminations, its color vision is much poorer than a primate's. These visual functional differences between cat and monkey should be kept in mind when comparing the visual properties of ganglion cells from the different species. The parallelism of functional pathways begins in the retina, so we begin our consideration of parallel processing in the monkey with retinal ganglion cells. There are clearly some similarities between X and Y ganglion cell classes in cat, and ganglion cell classes in monkey, but there is an ongoing debate about which monkey ganglion cells are most like cat X cells and which like Y cells.
There are three clear subdivisions of monkey ganglion cells. One class of cells has been called "tonic" (Gouras, 1968; Schiller and Malpeli, 1978; DeMonasterio, 1978a). These cells have very small receptive fields and are usually selective for the wavelength of the visual stimulus. Such a cell will give sustained responses to light when the wavelength is at the peak of the cell's spectral sensitivity curve. These cells respond phasically, however, to white light or other broad-band illumination (DeMonasterio, 1978a). They have a concentric center-surround organization, and the surround has a different action spectrum from that of the center, giving the cell color-opponent properties in response to stimuli that cover center and surround. Some "tonic" cells are blue-excitatory, yellow-inhibitory cells like the color-opponent ganglion cells in the cat retina which project to the C laminae of the cat's LGN. In the monkey, the "tonic" cells send axons only to the four dorsalmost parvocellular laminae of the LGN (Schiller and Malpeli, 1977, 1978; Kaplan and Shapley, 1986).
[Figure 3 plot: contrast sensitivity versus spatial frequency (c/deg), 0.1 to 100.0, with curves for cat and human.]
Figure 3. Human and feline contrast sensitivity compared. These are photopic contrast sensitivities to sinusoidal grating patterns. Redrawn from Pasternak and Merigan (1981).
Another major class of monkey ganglion cells is the group called "phasic" (Gouras, 1968). These ganglion cells have concentric center-surround receptive fields. They respond in a transient manner to a step of broad-band illumination, in this way resembling the "tonic" cells. The time course of their response to monochromatic or highly colored light has been little investigated, but may be transient or sustained. "Phasic" cells show little overt wavelength selectivity though recent work suggests that they may receive antagonistic signals from different cones (Shapley and Kaplan, 1989; Reid and Shapley, 1990) like their LGN targets, the Type IV cells of Wiesel and Hubel (1966; cf.
Derrington, Krauskopf, and Lennie, 1984). The axons of "phasic" ganglion cells project mainly to the Magnocellular layers of the LGN (Schiller and Malpeli, 1977; Kaplan and Shapley, 1986), though there is a small fraction of "phasic" cells which projects also to the superior colliculus. Because the tonic-phasic nomenclature puts too much emphasis on the initially discovered differences in the dynamics of response and not enough on the subsequently discovered differences in chromatic and spatial properties, Shapley and Perry (1986) referred to "phasic" cells as M cells because they project mainly to the Magnocellular layers of the LGN, and "tonic" cells as P cells because their only projection is to the Parvocellular layers in the LGN. A third catch-all class contains all those ganglion cells that are neither M nor P and has been referred to as the "rarely-encountered" class (DeMonasterio, 1978b). The cells in this group resemble some of the cells classified as "rarely-encountered" in the cat (Cleland and Levick, 1974b). None have been found to be wavelength selective (DeMonasterio, 1978b). This group provides the bulk of the retinal input to the midbrain, particularly the superior colliculus. Presumably the "rarely-encountered" class in the monkey actually is composed of several distinct classes (or subclasses) of ganglion cell, as is thought to be the case for the "W-cells" in the cat (Stone, 1983).
Functional significance of parallel channels in primate
The story about parallel channels for color and brightness acquired a new force in the attempts to explain the layering of the monkey's Lateral Geniculate Nucleus (LGN). For many years there was a mystery about the multilayered structure of the LGN of Old World primates, including humans (Walls, 1942). In the main body of the Old World primate's LGN there are six clearly segregated layers of cells. The four more dorsal layers are composed of small cells and are named the Parvocellular layers. The two more ventral layers, composed of larger neurons, are called Magnocellular layers. Recent work on functional connectivity and the visual function of single neurons has revealed that the different types of cell layers in the LGN receive afferent input from different types of retinal ganglion cells. The evidence on functional connectivity of retina to LGN comes from Leventhal, Rodieck and Dreher (1981) and Perry, Oehler, and Cowey (1984) who labeled axon terminals in specific LGN layers of the macaque monkey with Horseradish Peroxidase (HRP) and looked back in the retina to see which ganglion cells were labeled retrogradely. Direct electrophysiological evidence about retinogeniculate connectivity comes from Kaplan and Shapley (1986) who recorded excitatory synaptic potentials (from retinal ganglion cells) extracellularly in different LGN layers and who found that different types of retinal ganglion cell drove different LGN layers. For example, LGN cells that are excited by red light but inhibited by green light (so-called red-green color-opponent neurons) are only found in the Parvocellular layers. These "Red-Green Opponent" LGN cells receive excitatory synaptic input from "Red-Green Opponent" ganglion cells; "Red-Green Opponent" ganglion cells only provide direct excitatory input to Parvocellular LGN neurons of the "Red-Green Opponent" type. The specificity of ganglion cell types exactly
matches that of their LGN targets (Kaplan and Shapley, 1986; Shapley and Kaplan, 1989). Our direct evidence about this issue confirmed the earlier correlative results of DeValois, Abramov, and Jacobs (1966) and Wiesel and Hubel (1966) in the LGN, and Gouras (1968), DeMonasterio and Gouras (1975), and Schiller and Malpeli (1977) on retinal ganglion cells. As we will discuss in more detail, Parvocellular neurons are color opponent. This means that their responses, to stimuli which fill their entire receptive fields, change sign from excitatory to inhibitory contingent on the wavelength of the stimulating light (DeValois et al., 1966). The property of color opponency is conferred on them by their ganglion cell inputs, the P cells (Gouras, 1968; Malpeli and Schiller, 1977; Kaplan and Shapley, 1986). From the neuroanatomical work, one may infer that P cells are very numerous and densely packed, with small cell bodies and dendritic trees. Magnocellular neurons are generally thought to give the same sign of response to all wavelengths of light: this property is referred to as broad band spectral sensitivity (Gouras, 1968; Malpeli and Schiller, 1977). However, only some (about half) of the Magnocellular cells are truly broad band: the other Magnocellular neurons are color opponent by the above definition. These are the cells Wiesel and Hubel (1966) called Type IV, cells which have an excitatory receptive field center mechanism that is broad band, and an antagonistic inhibitory surround mechanism that is selectively sensitive to long wavelength red light. The properties of the Magnocellular neurons, both broad band and Type IV, are determined almost completely by their retinal ganglion cell inputs (Kaplan and Shapley, 1986). The HRP experiments of Leventhal et al. (1981) and Perry et al. (1984) showed that Magnocellular cells receive input from a class of retinal ganglion cells, the M cells, that are somewhat larger in cell body size and dendritic extent than P cells.
Spatial summation in primate ganglion cells
The first experiments on spatial summation in monkey ganglion cells produced results which were interpreted to mean that P cells were linear and therefore like X cells in the cat, while M cells were nonlinear and were like Y cells (DeMonasterio, 1978a). However, for various reasons I believe these interpretations were erroneous. The major problem with these experiments was that the stimulus contrast was very high, around 0.6. Even in cat X cells, one can elicit second harmonic responses when such contrasts are used (Hochstein and Shapley, 1976a). This is because the contrast gain of the neural networks that drive cat X cells, and monkey M cells, is very high. Such high gain networks must saturate and create response distortion at relatively low contrast. This brings up the main distinction between the monkey's M and P cells when examined as spatial and temporal filters: the gain of the M cells is much higher than for P cells. I will deal with this topic below. To return to experiments on the X/Y distinction in monkey, it is reasonable to test for linear signal summation at low contrasts where the cell's response is proportional to contrast. When this is done, as many as 80% of the M cells behave like X cells in response to grating contrast reversal. Their response amplitudes vary sinusoidally with
spatial phase, and their response is predominantly at the fundamental temporal modulation frequency of the stimulus (Kaplan and Shapley, 1982, 1986). In this way, many M cells resemble cat X cells. This result is consistent with the earlier finding that about 80% of Magnocellular neurons, the targets of M cells in the LGN, also are X-like in this respect (Shapley, Kaplan and Soodak, 1981; Kaplan and Shapley, 1982; Blakemore and Vital-Durand, 1986). A small fraction of M ganglion cells and their magnocellular target cells were found to behave like Y-cells; they had the same "Y-cell signature" as Y cells in the cat (Kaplan and Shapley, 1982, 1986). When tested for linear signal summation, almost all P ganglion cells are X-like as are their LGN targets, the parvocellular neurons. However, P cells are very unlike cat X-cells in their contrast gain and other visual characteristics.
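The contrast-reversal test described above can be illustrated with a small numerical sketch. Both model responses are assumptions made purely for illustration: an idealized linear (X-like) cell whose response amplitude follows the sine of the grating's spatial phase, and an idealized full-wave rectifying (Y-like) cell whose response is frequency doubled and independent of spatial phase. The function names are hypothetical and the models are not fits to any recorded neuron.

```python
import math

def fourier_amplitude(samples, harmonic):
    """Amplitude of one harmonic, given samples spanning exactly one stimulus period."""
    n = len(samples)
    c = sum(r * math.cos(2 * math.pi * harmonic * k / n) for k, r in enumerate(samples))
    s = sum(r * math.sin(2 * math.pi * harmonic * k / n) for k, r in enumerate(samples))
    return 2.0 * math.hypot(c, s) / n

def linear_cell(spatial_phase_deg, n=256):
    # Assumed X-like model: response at the stimulus frequency,
    # scaled by the sine of the grating's spatial phase.
    gain = math.sin(math.radians(spatial_phase_deg))
    return [gain * math.sin(2 * math.pi * k / n) for k in range(n)]

def rectifying_cell(spatial_phase_deg, n=256):
    # Assumed Y-like model: full-wave rectified response, so the output
    # repeats twice per stimulus period regardless of spatial phase.
    return [abs(math.sin(2 * math.pi * k / n)) for k in range(n)]

def harmonics(samples):
    """Return (fundamental amplitude, second-harmonic amplitude)."""
    return fourier_amplitude(samples, 1), fourier_amplitude(samples, 2)

if __name__ == "__main__":
    for phase in (0, 30, 60, 90):
        print(phase, harmonics(linear_cell(phase)), harmonics(rectifying_cell(phase)))
```

In this toy setting the linear model has a fundamental that varies sinusoidally with spatial phase and essentially no second harmonic, while the rectifying model has a phase-independent second harmonic and no fundamental. Real cells lie between these extremes, which is why the test is best run at low contrast.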
Conduction velocity of M and P axons
The conduction velocity of the axons of monkey ganglion cells, between optic chiasm and optic tract, has been measured with the following results. The average conduction velocity of M cells is 21 m/sec. The conduction velocity of P cells on the average is 13 m/sec. The distributions of velocity are broad and there is considerable overlap of the velocity distributions (Schiller and Malpeli, 1977). Comparing average velocities with the cat, one finds that no monkey ganglion cells have been found that have the very fast conduction speed of Y cells (often more than 50 m/sec). The M cell conduction velocity is close to that of cat X cells (18 m/sec). The P cell conduction velocity is significantly lower than that of cat X cells.
Trans-species comparisons
There are two main proposals for grouping monkey ganglion cells in a correspondence with cat ganglion cells. The original idea was that P cells were functionally similar to X cells and M cells were similar to Y cells. This idea originated from experimental results on Parvocellular and Magnocellular LGN neurons (Dreher, Fukada and Rodieck, 1976) and was then applied to their retinal inputs (Schiller and Malpeli, 1978; DeMonasterio, 1978a). This hypothesis was based on the following considerations: (1) the P cells and the X cells were the ganglion cells with the smallest receptive fields and dendritic trees, at each retinal locus in each retina; (2) the M cells and the Y cells had the axons which were the fastest for their respective species; (3) the response of P cells was more sustained than that of M cells, just as the response of X cells was more sustained than that of Y cells. In each case, the argument is based on relative properties between the two cell classes. A somewhat different proposal has been based on visual and spatial filtering characteristics (Shapley, Kaplan and Soodak, 1981; Kaplan and Shapley, 1982; Shapley and Perry, 1986). The proposal is that M cells and their Magnocellular targets actually are composed of two subgroups which correspond to X and Y cells. The more numerous MX variety projects to the Magno-X cells, while the less numerous MY type cell projects to the Magno-Y cells. Part of this proposal is that the
monkey P cell group has no exact functional equivalent in the cat, but is a hyperplasic enlargement of the color-coded class of cat ganglion cells that project to the C-laminae. This proposal is based on the following considerations: (1) a large majority of Magnocellular neurons and their M cell inputs are X-like in terms of spatial summation and spatial filtering; (2) a small fraction of Magnocellular neurons and M cells have "the Y-cell signature"; (3) all monkey ganglion cells have transient responses to white light and most have more or less sustained responses to monochromatic light; (4) the contrast gain of Magnocellular neurons and M cells is about ten times greater than that of Parvocellular neurons and P cells, and the contrast gain of M cells is comparable to that of X and Y cells in the cat (see below); (5) the P cells and their LGN targets, the Parvocellular neurons, are wavelength-selective while cat X cells are not; (6) most M cells make synaptic contacts only with Magnocellular neurons in the LGN, while almost all cat Y cells bifurcate and contact geniculate and superior colliculus neurons. The arguments for the second proposed trans-species comparison rest mainly on absolute comparisons of the visual capabilities of neurons from monkey and cat and detailed comparisons of neural connectivity. There are some more recent findings that support this second version of trans-species comparison. For example, the degree of rod input to near peripheral M cells seems much stronger than to nearby P cells (Purpura, Kaplan and Shapley, 1988); the cell type that most strongly resembles cat X cells in this respect is the MX type. Sustained inhibition or excitation from the receptive field surround is strong in P cells but weak or nonexistent in M ganglion cells, as in cat X cells (Shapley and Kaplan, 1990; Enroth-Cugell, Lennie, and Shapley, 1975).
Whether or not one can make assignments of functional equivalence between monkey and cat, we believe the weight of the evidence is against one proposed functional equivalence: monkey P cells and cat X cells. These two classes differ in the following important ways: the receptive field size distribution with eccentricity; the dendritic tree diameter's dependence on eccentricity; contrast gain; wavelength selectivity; conduction velocity; and degree of rod input. There is no visually significant way in which these two cell classes are similar.
Contrast gain in M and P pathways
Besides their spectral sensitivities, the other property that distinguishes Parvocellular from Magnocellular neurons is contrast gain. To make this clear I provide the technical definition of contrast, then proceed to define contrast gain. Contrast is used technically in vision research to mean the variation in the amount of light in a stimulus, normalized by the mean amount of light. For example, in a periodic grating pattern in which the peak amount of light is P and the least amount of light is T (for trough), contrast is defined as
C = (P - T)/(P + T).
This definition goes back to Rayleigh (1889) and Michelson (1927), and is equivalent to our earlier definition for sine grating patterns. Contrast is the stimulus variable that the retina responds to under photopic conditions (Robson, 1975; and many others reviewed in Shapley and Enroth-Cugell, 1984). It is thought that such response dependence on contrast evolved because the contrasts of reflecting objects are invariant with changes in illumination occasioned by shadows, weather, or the passage of the sun. The retina thus sends signals to the brain that are more closely linked to surface properties of reflecting objects than they are to variations in illumination.
Contrast gain is defined as the change in response of the neuron per unit change in contrast, in the limit as the contrast goes to zero. Contrast gain is thus the differential responsiveness of the neuron to contrast around the operating point of the mean illumination. The different contrast gains of Parvocellular and Magnocellular LGN neurons are illustrated in Figure 4 (Shapley and Kaplan, unpublished; compare with retinal ganglion cells in Kaplan and Shapley, 1986). As can be seen from the figure, the response as a function of contrast grows much more steeply for the Magnocellular neuron than for the Parvocellular, especially at low contrast near the behavioral detection threshold. This is a general finding. The ratio of the average contrast gains of the population of Magnocellular neurons to the population average of Parvocellular neurons is approximately eight under midphotopic conditions (Kaplan and Shapley, 1982; Hicks, Lee, and Vidyasagar, 1983; Derrington and Lennie, 1984). Subsequently, Dr. Ehud Kaplan and I showed that this contrast gain difference in LGN neurons is already set up in the retina. The retinal ganglion cells that innervated Magnocellular neurons had about eight times the contrast gain of ganglion cells that provided the excitatory drive for Parvocellular LGN neurons (Kaplan and Shapley, 1986).
[Figure 4 plot: response versus contrast, with contrast running from 0.00 to 1.00.]
Figure 4. Responses of macaque LGN neurons and a cat X ganglion cell as a function of contrast. One on-center Magnocellular neuron and one off-center (+M-L) Parvocellular neuron are shown, together with the data from an on-center cat X cell. Mean luminance was 60 cd/m². Responses were calculated as the best fitting Fourier component at 4 Hz, the temporal frequency of the drift. Spatial frequency was chosen to be optimal for each neuron.
We still do not know the mechanistic reason for the substantial differences in contrast gain for cells in the two pathways. There are various possible factors that may contribute. The receptive field centers of P cells are smaller than those of M cells, and if the local contrast gains from points in each field are equal, then the larger summing area of the M cells would lead to a higher contrast gain for an optimal sine grating pattern (see Enroth-Cugell and Robson, 1966). Though this factor must contribute something, it does not seem to account for all the difference between M and P. Then there is the possibility that in P cells, but not M cells, there are antagonistic interactions between cone types within the receptive field center. Though this may be the case in many neurons, it is possible to find some P cells in which the center is driven predominantly by one cone type only. Both these hypotheses are considered in the review by Kaplan, Lee, and Shapley (1990). There is still a puzzle here because neither mechanism mentioned above is sufficient to account for all the difference between M and P contrast gains. Whatever the complete explanation is, it must involve retinal mechanisms since the M and P differences in contrast gain begin in the retina.
Next, we must consider in more detail the responses of neurons in the P and M pathways to chromatic stimuli. This discussion requires a prior analysis of the three cone photoreceptors in the Old World primate retina, and the effect of the properties of the cones on chromatic responses.
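The definitions of contrast and contrast gain used in this section can be made concrete with a short sketch. The contrast formula is the one given in the text. The saturating response function and its parameter values are assumptions chosen only so that the two hypothetical cells differ in contrast gain by a factor of about eight; they are not measured data, and the function names are illustrative.

```python
def michelson_contrast(peak, trough):
    """C = (P - T) / (P + T) for a periodic pattern with peak P and trough T."""
    return (peak - trough) / (peak + trough)

def make_response(r_max, c_half):
    """Hypothetical saturating response R(C) = r_max * C / (C + c_half).

    This functional form is an assumption for illustration, not taken from the text.
    """
    return lambda contrast: r_max * contrast / (contrast + c_half)

def contrast_gain(response, step=1e-4):
    """Approximate the slope of response versus contrast as contrast goes to zero."""
    return (response(step) - response(0.0)) / step

# Illustrative parameter values only (arbitrary response units).
magno_like = make_response(r_max=60.0, c_half=0.10)
parvo_like = make_response(r_max=60.0, c_half=0.80)

if __name__ == "__main__":
    # Scaling the illumination tenfold leaves contrast unchanged.
    print(michelson_contrast(120.0, 80.0), michelson_contrast(12.0, 8.0))
    print(contrast_gain(magno_like) / contrast_gain(parvo_like))
```

The first comparison shows why contrast, unlike raw luminance, is unaffected by a uniform change in illumination. The second shows that two cells can share the same maximum response and still differ greatly in gain near zero contrast, which is the regime that matters near detection threshold.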
Three photoreceptors and spectral sensitivity
There are three cone photoreceptor types in human and macaque retinas. The spectral sensitivities of these photoreceptors have been determined for macaque retina by Baylor et al. (1987) and for human retina by Schnapf et al. (1987), using suction electrodes to measure cone photocurrent directly. These direct measurements of photoreceptor spectral sensitivities are in generally good agreement with microspectrophotometric measurements of cone absorption spectra (Bowmaker and Dartnall, 1980; Bowmaker, Dartnall, and Mollon, 1980). The photocurrent measurements agree even more closely with estimates of cone spectral sensitivity based on human psychophysics (Smith and Pokorny, 1975). The Smith and Pokorny fundamentals (estimated cone spectral sensitivities as measured at the retina after the light has been prefiltered by the lens) are three smooth functions of wavelength peaking at 440 nm (S cones), 530 nm (M cones) and 560 nm (L cones). Please note that the historically older nomenclature about cones denotes the Middle wavelength cones as M cones but this is unrelated to the designation of Magnocellular projecting ganglion cells as M cells.
Human sensitivity to light across the visible spectrum under photopic, daylight conditions is called the photopic luminosity function, denoted Vλ. It might be thought that the easiest way to determine Vλ would be to measure psychophysically the sensitivity for increments of light of different wavelength on a photopic background. However, the photopic luminosity function is not measured in this way, mainly because such measurements are variable between and within observers because of the complexity of the visual system (H. Sperling and Harwerth, 1971; King-Smith and Carden, 1976). Rather, the procedure known as heterochromatic flicker photometry (HFP) has been employed. Monochromatic light of a given wavelength is flickered against a white light at a frequency of 20 Hz or above, and the radiance of the monochromatic light is adjusted until the perception of flicker disappears or is minimized (Coblentz and Emerson, 1917). This technique exploits the fact that neural mechanisms that can respond to the color of the monochromatic light are not able to follow fast flicker. The photopic luminosity function has been measured more recently using contour distinctness (Wagner and Boynton, 1972) and minimal motion (Cavanagh, MacLeod, and Anstis, 1987) as response criteria. These measurements agree remarkably well with the luminosity function determined by flicker in the same subjects. The agreement is remarkable because these are such different spatiotemporal stimuli.
The luminance of a light source is its effectiveness in stimulating the visual neural mechanism that has as its spectral sensitivity the photopic luminosity function. Thus, the luminance of any light may be computed by multiplying its spectral radiance distribution, wavelength by wavelength, by the photopic luminosity function and summing the products from all the wavelengths. The spectral sensitivities of the M (530 nm) and L (560 nm) cones and the photopic luminosity function are graphed in Figure 5. The purpose of this graph is to show the degree of overlap of the two longer wavelength cones with the photopic luminosity function, and also to demonstrate the closeness of the luminosity function to the L cone sensitivity especially at longer wavelengths. This becomes significant in the consideration of cone contrasts in chromatic, equiluminant stimuli.
[Figure 5 plot: "M and L cones and the Photopic Luminosity Function", sensitivity versus wavelength (nm).]
Figure 5. Spectral sensitivity functions of the M and L cones, and the photopic luminosity function (labelled LUM). Data are redrawn from Smith and Pokorny (1975).
I will emphasize later the importance of variation in the photopic luminosity function, Vλ. The photopic luminosity curve graphed in Figure 5 is an average of curves from many subjects. There is substantial variation in the normal population in the peak wavelength and particularly in the long wavelength limb of the Vλ curve (Coblentz and Emerson, 1917; Crone, 1959). For example, some people who have normal color vision can have half a log unit less relative sensitivity to 620 nm light than the average observer (Coblentz and Emerson, 1917). There is variance also in the reported spectral sensitivity of cones (Baylor et al., 1987) and in the pigments' spectral absorption (Bowmaker et al., 1980).
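The luminance computation described in this section (multiply the spectral radiance by the luminosity function wavelength by wavelength, then sum) can be sketched as follows. The Gaussian curve used here is only a stand-in for Vλ: its peak and width are assumed values for illustration, and it should be replaced by a tabulated luminosity function for any real calculation. The spectra and function names are likewise hypothetical.

```python
import math

def stand_in_luminosity(wavelength_nm, peak_nm=555.0, width_nm=45.0):
    """Crude Gaussian stand-in for the photopic luminosity function.

    The peak and width are assumptions for illustration, not tabulated values.
    """
    return math.exp(-0.5 * ((wavelength_nm - peak_nm) / width_nm) ** 2)

def luminance(spectrum, sensitivity=stand_in_luminosity):
    """Sum of radiance times sensitivity over the sampled wavelengths.

    `spectrum` maps wavelength in nm to radiance in arbitrary units,
    so the result is a relative luminance.
    """
    return sum(radiance * sensitivity(wl) for wl, radiance in spectrum.items())

def equiluminant_scale(reference, other, sensitivity=stand_in_luminosity):
    """Factor by which `other` must be scaled to match the luminance of `reference`."""
    return luminance(reference, sensitivity) / luminance(other, sensitivity)

if __name__ == "__main__":
    green = {530.0: 1.0}   # hypothetical narrow-band green light
    red = {630.0: 1.0}     # hypothetical narrow-band red light
    scale = equiluminant_scale(green, red)
    print(scale, luminance({630.0: scale}), luminance(green))
```

Because the red light sits on the long wavelength limb of the curve, it needs more radiance than the green light to reach the same luminance. The same observation explains why individual differences in that limb shift a given observer's equiluminant point.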
Color exchange and equiluminance
Color exchange, or silent substitution (Estevez and Spekreijse, 1974, 1982), is a technique for identifying contributions from photoreceptors or other spectral response mechanisms. For any spectral sensitivity function, and two light sources with different spectral distributions within the band of the sensitivity function, one can perform a color exchange experiment that will provide a characteristic color balance for that particular spectral sensitivity. For example, if one chooses two monochromatic lights with wavelengths such that they are equally effective at stimulating the L (560 nm) cone, then temporal alternation between these two lights at equal quantum flux should cause no variation in the response of the L cone. The same argument works for the photopic luminosity function which presumably is the spectral sensitivity of a neural mechanism that receives additive inputs from M and L cones. Two lights that, when exchanged, produce no response from the luminance mechanism are called equiluminant. The results of a simulated color exchange experiment on cones and a broad band cell with a Vλ spectral sensitivity are illustrated in Figure 6. The calculations are based on the spectral sensitivities of the M and L cones and the photopic luminosity function as graphed in Figure 5. The spectral distributions of the light sources were those of the red and green phosphors on standard color television sets, designated P22 phosphors. The red phosphor is narrow band centered around 630 nm. The green phosphor is more broad band centered around 530 nm. Such colored lights have been used in many experiments on color vision (Derrington et al., 1984; DeValois and Switkes, 1983; Kaplan, Shapley, and Purpura, 1988; Livingstone and Hubel, 1987; Tootell, Silverman, Hamilton, DeValois, and Switkes, 1988). The experiment that is simulated is color exchange between the red (denoted capital R) and green (denoted capital G) phosphors.
I have scaled the x axis so that when the G/R ratio is 1.0, the green phosphor
has the same luminance as the red phosphor. When the luminance of the green phosphor is approximately 0.4 that of the red (G/R ratio 0.4), the response of the M (530 nm) cones is nulled. When the G/R ratio is about 1.2, the L (560 nm) cone response is nulled. Notice that the shape of each of these spectral mechanisms is similar: near the null the response vs. G/R ratio forms a V. This is based on the assumption of small signal linearity, a good assumption in the case of macaque P and M pathways (Kaplan and Shapley, 1982; Derrington et al., 1984; Blakemore and Vital-Durand, 1986).
[Figure 6 plot: response versus G/R ratio.]
Figure 6. Color exchange response functions for M and L cones and luminance. The predicted response of the cones to different G/R ratios was calculated from the cross product of the G and R phosphors with the spectral sensitivities of the M and L cones from Figure 5. Also plotted in this graph are response amplitudes of a Magnocellular neuron from a macaque monkey, stimulated by 1 c/deg drifting heterochromatic gratings (unpublished observations of Shapley and E. Kaplan). It is clear that this representative Magnocellular neuron's responses fit the responses predicted by the human photopic luminosity function.
One can prove that a spectral mechanism that sums the responses of M and L cones will have a null in a color exchange experiment at a G/R ratio between the nulls of the two cones. If the spectral sensitivity of the summing mechanism is K*L+M, where K is a number between zero and infinity, then when K approaches zero, the color exchange null approaches the M cone null from above. When K goes to infinity, the color exchange null approaches the L cone null, from below. The null of the luminosity curve between the cone nulls in Figure 6 is a case in point. For that curve, K is approximately 2. One must qualify the assertion to include the condition that the photoreceptor signals have the same time course, and that in the process of summation their time courses are unaffected. The existence of sharp V's in color exchange experiments on M ganglion cells and Magnocellular cells is reasonably good evidence that M and L cones have similar time courses under the conditions of those experiments (Lee, Martin, and Valberg, 1988; Kaplan et al., 1988; Shapley and Kaplan, 1989).
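The claim that a K*L+M mechanism nulls between the two cone nulls can be checked numerically under the small signal linearity assumed in the text. In the sketch below each cone signal is taken to be linear in the G/R ratio and to cross zero at the null quoted in the text (about 0.4 for M cones, about 1.2 for L cones). The relative cone gains are hypothetical: they were picked so that K = 2 gives a null at a G/R ratio of 1.0, and they are not taken from the source.

```python
M_NULL = 0.4        # G/R ratio that nulls the M cones (value quoted in the text)
L_NULL = 1.2        # G/R ratio that nulls the L cones (value quoted in the text)
M_GREEN_GAIN = 1.0  # hypothetical relative gain of M cones to the green phosphor
L_GREEN_GAIN = 1.5  # hypothetical; chosen so that K = 2 nulls at G/R = 1.0

def cone_signal(ratio, green_gain, null_ratio):
    """Signed cone modulation, linear in G/R and zero at the cone's null."""
    return green_gain * (ratio - null_ratio)

def summed_response(ratio, k):
    """Response amplitude of a mechanism with spectral sensitivity K*L + M."""
    l_signal = cone_signal(ratio, L_GREEN_GAIN, L_NULL)
    m_signal = cone_signal(ratio, M_GREEN_GAIN, M_NULL)
    return abs(k * l_signal + m_signal)

def summed_null(k):
    """G/R ratio at which K*L + M is nulled.

    Setting the summed signal to zero gives a weighted average of the two
    cone nulls with positive weights, so the result lies between them.
    """
    l_weight = k * L_GREEN_GAIN
    m_weight = M_GREEN_GAIN
    return (l_weight * L_NULL + m_weight * M_NULL) / (l_weight + m_weight)

if __name__ == "__main__":
    for k in (0.01, 0.5, 2.0, 10.0, 1000.0):
        print(k, summed_null(k))
```

As K shrinks the null slides toward the M cone null, and as K grows it slides toward the L cone null, so broad band spectral sensitivity alone does not fix where the null falls. The absolute value in `summed_response` is what produces the V shape around each null.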
In the work that we will discuss next, investigators often have applied a neurophysiological result on monkeys to human perception, and vice versa. This requires an assumption that the visual pathways in humans and monkeys function similarly. Support for this assumption comes mainly from the work of R.L. DeValois and his colleagues (DeValois, Morgan, Polson, and Hull, 1974; DeValois, Morgan, and Snodderly, 1974). They showed that for Old World monkeys, such as rhesus or cynomolgus monkeys which are the usual species studied in neurophysiological experiments on vision, detailed behavioral measurements of the spectral sensitivity function, wavelength discrimination function, and contrast sensitivity function resemble human performance. It is well known that there is a similarity in the neuroanatomy of the retinocortical pathway between humans and Old World monkeys. More recent evidence on similarities in detailed structure and layout of the retina in human and macaque monkeys strengthens the argument for functional similarity (Rodieck, 1988).
Responses of M and P neurons to equiluminant stimuli
One particular color exchange experiment has become crucial, namely measuring responses of P and M neurons to equiluminant color exchange. In their large and influential paper on perceptual effects of parallel processing in the visual cortex, Livingstone and Hubel (1987) assumed that because Magnocellular cells were broad band, their responses would be nulled at equiluminance. As the above discussion was aimed to show, this is a non sequitur. To repeat, there could be a whole family of broad band neurons in the visual pathway that summed signals from L and M cones with different weighting factors Ki, such that spectral sensitivity of the i-th mechanism was Ki*L+M. Each mechanism would have a null at a different point on the G/R axis. The striking thing about M cells and Magnocellular neurons is that, for stimuli that produce responses from the receptive field center mechanism, the position of the null on the color exchange axis is close to that predicted from the human photopic luminosity function, Vλ (Lee et al., 1988; Shapley and Kaplan, 1989; Kaplan et al., 1990). There is no more variability in the position of the color exchange null in the neurophysiological data than there is in psychophysical experiments on the luminosity function in humans (Crone, 1959) or in behavioral experiments on macaques (DeValois et al., 1974a). There are other experiments that indicate that, under stimulus conditions where the center of the receptive field is not the only response mechanism contributing to the response, M and Magnocellular neurons do not have a color exchange null at equiluminance. Lee et al. (1988) reported that large disks that stimulate center and surround have nulls away from equiluminance.
Shapley and Kaplan (1989) used heterochromatic sine gratings to study chromatic properties of receptive field mechanisms. Heterochromatic sine gratings are formed by producing a sine grating on, say, the red phosphor of a color monitor, and producing an identical sine grating on the green phosphor except for an exact 180 degree phase shift. Thus where the red phosphor has a bright red bar the green phosphor has a dark green bar, and vice versa. The sum of these two grating patterns in antiphase yields as a spatial pattern a red green, ergo heterochromatic, grating. Shapley and Kaplan (1989) reported that heterochromatic sine gratings of low spatial frequency may produce no color null in Magnocellular neurons. Derrington et al. (1984), using the technique of modulation in color space, found that many Magnocellular units exhibited properties expected of color opponent cells. Undoubtedly, all these results are related to the earlier work of Wiesel and Hubel (1966), who found that many Magnocellular neurons had a receptive field surround that was more red sensitive than the receptive field center. Such neurons could behave as color opponent cells to stimuli that covered both center and surround if the spectral sensitivities of center and surround were different enough. Similar M ganglion cells were reported by DeMonasterio and Schein (1980). Thus, in psychophysical experiments, if the stimulus is designed to tap the receptive field center of cells in the M pathway, it will elicit a spectral sensitivity function like Vλ. Such a stimulus will be nulled in a color exchange experiment at equiluminance. However, should other stimuli be detected by the M-Magnocellular pathway but not isolate the central receptive field mechanism, one might discover a color opponent mechanism driven by M cells. There is another result that indicates a failure of nulling at equiluminance in Magnocellular neurons. This is the second harmonic distortion discovered by Schiller and Colby (1983). 
In color exchange experiments with large area stimuli, these investigators often found strong frequency doubled responses. Such results were not reported by Derrington et al. (1984), who found frequency doubling rarely (20% of the time) in their experiments. Shapley and Kaplan (1989) reported that frequency doubling depended on the spatial frequency of the pattern used for color exchange. Center isolating stimuli elicited no frequency doubling, but it could be observed when spatial frequency was so low, less than 0.5 c/deg, that the receptive field surround could contribute to the M cell's response. This also could contribute to failure to achieve sharp psychophysical equiluminance with stimuli of large area or low spatial frequency, even with stimuli that isolated a perceptual mechanism driven only by the M pathway.
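The construction of a heterochromatic sine grating described above can be sketched in a few lines. This is an illustration only, not the stimulus code of any cited study. It assumes that the red and green profiles are expressed in luminance units, and the function name, sample count, and mean levels are hypothetical. Under that assumption, equal means and equal contrasts in antiphase give a pattern whose summed luminance is uniform, that is, an equiluminant grating.

```python
import math


def heterochromatic_grating(n_samples, cycles, mean_red, mean_green, contrast):
    """Return (red, green) profiles of a red/green sine grating in antiphase."""
    red, green = [], []
    for i in range(n_samples):
        phase = 2.0 * math.pi * cycles * i / n_samples
        # Sine grating on the red phosphor.
        red.append(mean_red * (1.0 + contrast * math.sin(phase)))
        # Identical grating on the green phosphor, shifted by exactly 180 degrees.
        green.append(mean_green * (1.0 + contrast * math.sin(phase + math.pi)))
    return red, green


# Hypothetical values: both phosphors have the same mean luminance.
red, green = heterochromatic_grating(
    n_samples=64, cycles=2, mean_red=10.0, mean_green=10.0, contrast=0.5
)

# Summed luminance at each position; constant when the means are matched.
luminance = [r + g for r, g in zip(red, green)]
print(min(luminance), max(luminance))
```

Where the red profile is at its brightest the green profile is at its darkest, so the pattern varies in color across space while the summed luminance stays flat.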
Chromatic opponency in P and M cells
The basis for wavelength selectivity in the visual pathway is antagonistic (excitatory vs. inhibitory) interactions between signals from different cone types. The simplest type of antagonism is subtraction. There is good evidence for subtractive interactions between M and L cones on P ganglion cells (DeMonasterio and Gouras, 1975; Zrenner and Gouras, 1983) and Parvocellular neurons (DeValois et al., 1966; Wiesel and Hubel, 1966; Derrington et al., 1984). The classical evidence is a change in sign of response with wavelength (DeValois et al., 1966). For
example, many P cells that receive opponent inputs from M and L cones have a sign change at a wavelength near 570 nm. The "blue excitatory" cells referred to earlier often have a change from excitation at short wavelengths to inhibition at long wavelengths at around 490 nm. These cells receive excitatory input from S cones and inhibitory input from some combination of M and L cones. The precise mapping of cone types to receptive field mechanisms is a problem not yet solved, though we have some interesting preliminary results on this problem. Wiesel and Hubel (1966) postulated that color opponent cells received excitatory (or inhibitory) input from one cone type in the receptive field center and antagonistic inputs from a complementary cone type in the receptive field surround. However, the detailed quantitative evidence that would be needed to support or to reject this hypothesis was not available then. One problem is spatially isolating center from surround, because receptive fields in the monkey's retina, and presumably in the human retina too, are quite small. Though Wiesel and Hubel's (1966) proposal may be true, there are a number of other possibilities. One alternative hypothesis is that there is mixed receptor input to the receptive field surround, and only or predominantly one cone input to the center of the receptive field (see Paulus and Kroger-Paulus, 1983; Kaplan et al., 1990). However, my own recent research with R. Clay Reid indicates that the chromatic opponent inputs to P cells are specifically wired to individual cone types, for example L+M-, and the receptive field surround is just the receptive field mechanism that receives weaker and more spatially diffuse input from one of the cone inputs (Reid and Shapley, 1990).
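The subtractive antagonism and the resulting sign change can be written compactly. The notation here is an illustrative sketch and does not come from the cited papers: L(λ), M(λ), and S(λ) stand for the cone spectral sensitivities, and the non-negative weights w are hypothetical symbols for the strength of each cone input.

```latex
% Red/green opponent P cell (L excitatory, M inhibitory):
R_{LM}(\lambda) = w_L\,L(\lambda) - w_M\,M(\lambda),
\qquad
R_{LM}(\lambda_0) = 0 \iff w_L\,L(\lambda_0) = w_M\,M(\lambda_0)

% "Blue excitatory" cell (S excitatory, some combination of M and L inhibitory):
R_{S}(\lambda) = w_S\,S(\lambda) - \bigl[\,w_L\,L(\lambda) + w_M\,M(\lambda)\,\bigr]
```

In this notation the sign change reported near 570 nm for many L/M opponent P cells is the wavelength at which the two weighted cone signals balance, and the change near 490 nm for blue excitatory cells is the corresponding zero crossing of the second expression.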
Comparison of achromatic and chromatic contrast sensitivity
The spatial characteristics of vision have been studied for many years by measuring the contrast sensitivity function for sinusoidal gratings (Campbell and Robson, 1968; DeValois, Morgan and Snodderly, 1974b, among many others). These have traditionally been achromatic measurements, and the contrast sensitivity has been taken to be the reciprocal of the luminance contrast at psychophysical threshold. More recently, luminance contrast sensitivity has been compared with the spatial frequency dependence of chromatic contrast sensitivity as measured with equiluminant heterochromatic grating patterns (van der Horst, de Weert, and Bouman, 1967; Kelly, 1983; Mullen, 1985). The luminance contrast sensitivity function is band pass, while the chromatic contrast sensitivity function is low pass and cuts off at a fairly low spatial frequency compared with luminance. In both P and M pathways, response to equiluminant heterochromatic gratings is best at the lowest spatial frequencies. P cells and Parvocellular neurons respond much better to equiluminant heterochromatic gratings of low spatial frequency because, under those conditions, the antagonistic center and surround become synergistic (DeValois and DeValois, 1975). However, Type IV M cells and their Magnocellular targets in LGN also become more sensitive at low spatial frequencies of heterochromatic gratings because of their color opponency.
The responses to middle and high spatial frequencies are better when luminance gratings rather than equiluminant gratings are used as stimuli. Thus, if the data were plotted as response vs. G/R ratio, one should expect a dip in response near equiluminance at middle to high spatial frequency. Such results were reported by Mullen (1985). It would be important to measure the equiluminant G/R ratio on the same subject with heterochromatic flicker photometry or minimal motion or minimally distinct border to see whether the same or different spectral mechanisms are at work in detecting the heterochromatic gratings. K.K. DeValois and Switkes (1983) and Switkes et al. (1988) have demonstrated that heterochromatic grating patterns are detected by spatial frequency channels like those involved in achromatic grating detection (Campbell and Robson, 1968; Graham, 1980). Thus, elevation of threshold for detecting an equiluminant grating is produced by pre-exposure to an equiluminant grating of the same spatial frequency, and less elevation of threshold is produced by more distant spatial frequencies. Moreover, color gratings mask and adapt color and luminance gratings but, as we will discuss below, luminance gratings may facilitate detection of color gratings. The work on spatial frequency channels in color throws a new light on receptive field models that have sought to explain chromatic and luminance spatial contrast sensitivity functions in terms of single channel receptive field models (Ingling and Martinez-Uriegas, 1983; Kelly, 1983). The chromatic contrast sensitivity function is an envelope of chromatic spatial frequency channels, just as the luminance contrast sensitivity function is thought to be an envelope of the well studied achromatic spatial frequency channels. Single channel models, though they may be of heuristic value in summarizing a body of data, must be only a first approximation to a true mechanistic model of these multi channel systems.
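The two definitions used in this section, contrast sensitivity as the reciprocal of threshold contrast and the overall sensitivity function as an envelope of narrower channels, can be sketched as follows. Everything numerical here is hypothetical: the log-Gaussian channel shape, the three channel parameter sets, and the function names are assumptions chosen only to make the idea concrete, and taking the maximum across channels is one simple way to form an envelope, not a claim about the actual combination rule.

```python
import math


def contrast_sensitivity(threshold_contrast):
    """Contrast sensitivity is the reciprocal of the contrast at threshold."""
    return 1.0 / threshold_contrast


def channel_sensitivity(freq, peak_freq, peak_sensitivity, bandwidth_octaves):
    """Sensitivity of one channel, assumed log-Gaussian in spatial frequency."""
    octaves_from_peak = math.log2(freq / peak_freq)
    return peak_sensitivity * math.exp(-0.5 * (octaves_from_peak / bandwidth_octaves) ** 2)


def envelope_sensitivity(freq, channels):
    """Overall sensitivity taken as the most sensitive channel at this frequency."""
    return max(channel_sensitivity(freq, *channel) for channel in channels)


# Hypothetical channels: (peak frequency in c/deg, peak sensitivity, bandwidth in octaves).
CHANNELS = [
    (0.5, 40.0, 0.7),
    (2.0, 100.0, 0.7),
    (8.0, 30.0, 0.7),
]

for freq in (0.25, 0.5, 1.0, 2.0, 4.0, 8.0, 16.0):
    print(freq, round(envelope_sensitivity(freq, CHANNELS), 1))
```

A single channel fitted to the envelope would reproduce the overall curve but could not account for frequency specific adaptation or masking, which is why the text treats single channel models as a first approximation.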
Possible neural substrates for contrast sensitivity
The M and P pathways must be the conduits for signals about detection of contrast. The high gain M system is well suited to handle detection of grating patterns with low to medium spatial frequencies (Shapley and Perry, 1986; Kaplan et al., 1990). The numerous P cells may be required to represent veridically the spatial waveform for grating patterns near the acuity limit (Lennie et al., 1989). The neural basis for photopic contrast sensitivity in the primate is still a controversial topic. Derrington and Lennie (1984) claimed that the contrast insensitive Parvocellular neurons might still support contrast sensitivity performance through the mechanism of "probability summation" among the numerous Parvocellular neurons. However, this argument presupposes that there is response independence among Parvocellular neurons, a prerequisite for probability summation. Furthermore, if probability summation is admitted as a mechanism, then it is not clear why Magnocellular neurons do not also contribute to detection by this mechanism. Probability summation can be viewed as fractional power law summation. Threshold will be lowered by some fractional power of the number of summing elements N. A reasonable estimate might be N^{-1/4}. If one uses the figure of ten times as many Parvocellular neurons as Magnocellular, one can calculate that the relative increase of Parvocellular neurons' contrast sensitivity compared
to Magnocellular contrast sensitivity caused by probability summation would be about 1.8. Since Magnocellular neurons are about ten times more sensitive than Parvocellular neurons, neuron by neuron, one would have to conclude that even with probability summation the Magnocellular neurons ought to be the neural system signalling contrast detection. Recent experiments on lesions of the P and M pathways indicate a larger role for the P pathway in photopic achromatic contrast detection. Merigan and Eskin (1986) found that contrast sensitivity was reduced in monkeys poisoned chronically with acrylamide. This toxic substance appeared to cause anatomical damage preferentially in the P pathway in LGN and retina, though its physiological effects on the P and M pathways have not been studied. Schiller, Logothetis and Charles (1990) placed ibotenic acid lesions in the Magnocellular and Parvocellular layers of macaque monkeys and then studied the lesioned animals' visual performance on several tasks. One of these was contrast detection of a checkerboard pattern. Contrast sensitivity in this task was reduced by Parvocellular but not Magnocellular lesions. This finding is unexpected based on our neurophysiological results and remains something of a mystery. One curious feature of the Schiller group's measurements is the low contrast sensitivity for the checkerboard pattern in control monkeys - the highest sensitivity was 0.1, about one log unit below the best contrast sensitivity for sine gratings. Furthermore, the control monkeys had highest contrast sensitivity at the lowest fundamental spatial frequency of the checkerboard, another unexpected finding. There needs to be more research on the relation of neurophysiology to detection thresholds in this system. Recent neurophysiological results by Purpura, Kaplan and Shapley (1988) indicate that the P cells become visually unresponsive to grating patterns when the mean luminance drops below 0.1 cd/m² at the rod/cone break. 
M cells become progressively less sensitive as mean luminance is reduced, but they are so much more sensitive in the light that they remain responsive into the scotopic range. We suggested that these results might mean that spatial vision under scotopic conditions would be dependent on M cell signals. Wiesel and Hubel (1966) and Gielen et al. (1982) reported rod driven responses in Parvocellular LGN cells under scotopic adapting conditions, an apparent contradiction to the results of Purpura et al. (1988). However, both these sets of authors reported that rod driven Parvocellular neurons were rarely encountered, and moreover, they did not test for spatial vision under scotopic conditions. In the Purpura et al. study, we did observe rod driven responses in P cells but only with very low spatial frequency gratings or diffuse light as spatial stimuli.
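Returning to the probability summation estimate made earlier in this section, the arithmetic can be checked directly. The sketch below takes the figures given in the text: a fourth root summation rule, ten times as many Parvocellular as Magnocellular neurons, and a tenfold per neuron sensitivity advantage for Magnocellular neurons. The function and variable names are illustrative, not part of any published model.

```python
def probability_summation_gain(cell_count_ratio, exponent=0.25):
    """Sensitivity gain from pooling, assuming threshold falls as N**(-exponent)."""
    return cell_count_ratio ** exponent


P_TO_M_CELL_RATIO = 10.0         # ten times as many Parvocellular neurons
M_TO_P_SENSITIVITY_RATIO = 10.0  # per neuron Magnocellular advantage

# Relative gain of the Parvocellular population from its greater numbers.
p_gain = probability_summation_gain(P_TO_M_CELL_RATIO)

# Magnocellular advantage that remains after allowing for that gain.
remaining_m_advantage = M_TO_P_SENSITIVITY_RATIO / p_gain

print(round(p_gain, 2), round(remaining_m_advantage, 2))
```

The gain comes out near 1.8, as stated, which leaves the Magnocellular population several times more sensitive even after pooling. That is the basis for the conclusion that probability summation alone does not hand contrast detection to the Parvocellular neurons.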
Cortical target areas for P and M signals
V1. There is indirect evidence that Magnocellular and Parvocellular signals are kept somewhat segregated within striate cortex, V1. Hawken and Parker (1984) and Hawken, Parker and Lund (1988) have shown that cortical neurons with high contrast gain, like Magnocellular neurons, can be found in layer IVc alpha of V1. Color opponent neurons
are located in layer IVc beta, and these are presumably the targets of the LGN afferents from Parvocellular cells. There are subdivisions within the upper layers of the cortex, layers II and III, that may be preferentially influenced by Magnocellular signals. All of layers II and III receive inputs from layer IVc beta, so presumably receive Parvocellular signals filtered through the cortical network. However, from experiments on labelling of active cells with 2 deoxyglucose, Tootell, Hamilton and Switkes (1988) found that there was weak but significant labelling of the cytochrome oxidase blobs in layers II and III of V1 cortex when stimuli of low achromatic contrast were used. Livingstone and Hubel (1984) showed that the cytochrome oxidase blobs contain cortical neurons broadly tuned for orientation. This finding may mean that Magnocellular and Parvocellular signals converge onto blob neurons. The cytochrome oxidase blobs have been found to form a network throughout macaque V1 (Horton, 1984; Livingstone and Hubel, 1984), and it has been hypothesized that they form a separate system for the analysis of color (Livingstone and Hubel, 1984, 1987). Many of the cells in the blobs are color selective. A test of this idea is whether cells in the inter blob regions of layers II, III of V1 are not color selective or are substantially less color selective than blob neurons. There are recent single unit data on this question from Lennie, Krauskopf and Sclar (1989), and the results indicate that color selectivity in blob cells is not that different from inter blob cells. Furthermore, Tootell, Silverman et al. (1988) used equiluminant color gratings to label layer II, III cells with 2 deoxyglucose and found labelled cells throughout the upper layers, though there was stronger labelling of the blobs with diffuse color patterns. These data are essentially consistent with the findings of Lennie et al. (1989).
V2. Using cytochrome oxidase as a marker, Tootell, Silverman, DeValois and Jacobs (1983) demonstrated stripe like structures in secondary visual cortex V2 in macaque monkeys. Subsequently, Shipp and Zeki (1985) and DeYoe and Van Essen (1985) have shown that distinct anatomical regions within primary visual cortex make characteristic connections with regions in macaque V2. Neurons in the blobs of V1 are connected to one of the sets of darker stripes in V2; neurons in the interblob regions of layers II, III are connected to stripe like regions of low cytochrome oxidase staining in V2. Livingstone and Hubel (1987), from their measurements in squirrel monkeys, also propose that layer IVb, which receives Magnocellular signals from layer IVc alpha, projects to the alternating dark cytochrome stripes in macaque V2. The functional consequence of this complex sequence of connections is that parallel functional pathways proceed from V1 to V2. Livingstone and Hubel (1987, 1988) have made a very detailed psychophysical linking proposition based on the anatomy and receptive field properties of neurons in V1 and V2. They propose that blob cells, connecting to one set of V2 stripes, constitute a system for color vision. The putative Magnocellular pathway from layer IVc alpha through layer IVb to the other set of dark V2 stripes is supposed to be important for responding to objects in depth. The interblob neurons in V1, connected
to pale stripes in V2, are supposed to be important for form vision, mainly because neurons located in pale stripes in V2 were found to be end stopped, i.e. more strongly responsive to corners and the ends of lines than to long contours (Hubel and Livingstone, 1987). While it is thought provoking, there are problems with the specifics of the Livingstone and Hubel story. One problem is that depth perception seems to depend on both chromatic and achromatic information: this is even indicated in the paper by Livingstone and Hubel (1987). Another problem is that "form" is a poorly defined concept. Certainly the shape of an object may be defined by motion in random dot cinematograms (Braddick, 1974). Even three dimensional shape may be perceived based on dot trajectories in cinematograms (Sperling, Dosher and Landy, 1988). So the assignment of motion perception to one pathway and shape perception to another pathway seems greatly oversimplified. A third problem is that, even at the level of detailed neuroanatomy, V1 compartments seem to receive convergent input from Parvocellular and Magnocellular signals. For example, Tootell, Hamilton and Switkes (1988) found that V1 blobs were active not only in response to chromatic equiluminant patterns, but also to patterns of low spatial frequency and low contrast - indicative of input from Magnocellular signals possibly relayed via layer IVc alpha. It is likely from the neuroanatomy that many interblob neurons also receive mixed Magnocellular and Parvocellular inputs. So the rigid segregation of function and simplicity of connectivity between V1 and V2 postulated by Livingstone and Hubel is not likely given the richness of cortical connectivity within V1.
Motion
Among the psychophysical proposals of Livingstone and Hubel (1987), the most robust idea is that Magnocellular signals form the basic excitatory drive of the motion pathway. Motion perception is greatly disturbed at equiluminance. Heterochromatic color gratings appear to move more slowly (Cavanagh, Tyler and Favreau, 1984; Livingstone and Hubel, 1987). Apparent motion is greatly reduced or abolished (Ramachandran and Gregory, 1978; Livingstone and Hubel, 1987). However, Livingstone and Hubel (1987) state that they observed reduction in apparent motion at a G/R ratio that was 20% less than the G/R ratio for equiluminance determined with flicker photometry. This is significant because it may indicate that contrast in a cone mechanism, or some other neural mechanism than the specific Vλ mechanism, is being selected in these experiments. There are many experiments on equiluminant vision that have been designed with random dot cinematograms (Ramachandran and Gregory, 1978) or random dot stereograms (Livingstone and Hubel, 1987). These may all be subject to artifacts as a result of chromatic aberration (Flitcroft, 1989). Chromatic aberration may affect spatial frequencies as low as 4 c/deg, and it certainly may affect experiments with random dot patterns, which will be broad band in spatial frequency. Cavanagh et al. (1987) used a minimum motion technique to estimate the cone inputs to the motion mechanism as well as to determine spatial and temporal tuning of the motion pathway. One of their chief findings was that b cones provide very little input to the
motion pathway. Furthermore, minimum motion and flicker photometry give virtually the same equiluminant point for a given pair of colored lights. This is strong evidence for a single pathway with a single spectral tuning curve, as would be the case if M signals were the front end for the motion signal. However, there is a motion response to equiluminant stimuli: the motion system just signals the wrong velocity. Furthermore, evidence from motion aftereffects (Cavanagh and Favreau, 1985; Mullen and Baker, 1985) indicates there may be some, albeit weaker, inputs from color opponent signals to the motion pathway. There are many sites along the visual pathway at which interactions may occur (see below) and where a Magnocellular signal might be modulated by Parvocellular signals before it reaches the site of motion perception. The evidence for Parvocellular inputs involves suprathreshold motion perception.
Interactions between M and P pathways
The independence of P and M pathways as they travel in parallel from the retina to cortex, and through visual cortex, is remarkable. However, there are several psychophysical experiments on facilitation of detection and on suppression of detection that indicate substantial coupling between chromatic and achromatic signals. First, there are the results of Switkes et al. (1988) on masking and facilitation of color by luminance, and luminance by color. The most interesting result in this paper is the facilitation of detection of equiluminant color patterns by luminance patterns, even if the latter were substantially suprathreshold. This suggests that one of the functions of the Magnocellular pathway might be to gate Parvocellular signals into the cortex. This concept would also make sense of Kelly's finding that equiluminant chromatic patterns suffer great losses in contrast sensitivity when stabilized on the retina (Kelly, 1983). It is well known that Parvocellular signals are sustained in time when the stimulus is a colored pattern (e.g., Schiller and Malpeli, 1978). Yet an image defined solely by color fades faster and more completely than a luminance pattern. Other studies that suggest a role for luminance signals in facilitating or gating chromatic signals are the investigations of the gap effect by Boynton, Hayhoe and MacLeod (1977) and by Eskew (1989). These studies show that luminance steps near the border of a colored test object may facilitate chromatic discrimination. The effect is only significant for colored stimuli that are defined by b cone modulation. Yet the effect does indicate the possibility of interaction between M and P pathways. The search for evidence about P-M interaction should prove as fruitful and challenging as the previous work on P-M parallelism and independence at lower levels of the visual pathway.
Conclusions
The retina contains many visual systems within it. In the cat, the X and Y and the many W classes project from the retina to diverse targets within the brain. The different spatially filtered versions of the world presented by these different neurons are obviously used for different visual functions. In the monkey, because of the extraordinary
importance of color vision, the segregation of function between different retinocortical channels is even more obvious than in the cat. The P pathway, from P ganglion cells through the Parvocellular layers of the LGN to primary visual cortex V1, carries signals about color and location. The M pathway carries signals about contrast and motion. These two pathways are kept separated up to visual cortex, but there are important interactions between the pathways in cortex that remain to be explored.
Acknowledgements: I would like to thank my colleagues Shaul Hochstein, Jonathan Victor, Ehud Kaplan, Jim Gordon, Keith Purpura, Norman Milkman, Clay Reid, Yuen Tat So, and Michael Hawken for their great help. Preparation of this article was partly supported by NIH grant EY 01472, and NSF grant BNS 870606, and by a grant from the Sloan Foundation.
References
Baylor, D.A., Nunn, B.J. and Schnapf, J.L. (1987). Spectral sensitivity of cones of the monkey Macaca fascicularis. Journal of Physiology, 390, 145-160.
Blakemore, C.B. and Vital Durand, F. (1986). Organization and post natal development of the monkey's lateral geniculate nucleus. Journal of Physiology, 380, 453-491.
Bowmaker, J.K. and Dartnall, H.J.A. (1980). Visual pigments of rods and cones in a human retina. Journal of Physiology, 298, 501-511.
Bowmaker, J.K., Dartnall, H.J.A. and Mollon, J.D. (1980). Microspectrophotometric demonstrations of four classes of photoreceptor in an Old World primate, Macaca fascicularis. Journal of Physiology, 298, 131-143.
Boycott, B.B. and Wassle, H. (1974). The morphological types of ganglion cells of the domestic cat's retina. Journal of Physiology, 240, 397-419.
Boynton, R.M., Hayhoe, M.M. and MacLeod, D.I.A. (1977). The gap effect: chromatic and achromatic visual discrimination as affected by field separation. Optica Acta, 24, 159-177.
Campbell, F.W. and Robson, J.G. (1968). Application of Fourier analysis to the visibility of gratings. Journal of Physiology, 197, 551-566.
Cavanagh, P. and Favreau, O.E. (1985). Color and luminance share a common motion pathway. Vision Research, 26, 1595-1601.
Cavanagh, P., Anstis, S.M. and MacLeod, D.I.A. (1987). Equiluminance: spatial and temporal factors and the contribution of blue sensitive cones. Journal of the Optical Society of America A, 4, 1428-1438.
Cavanagh, P., Tyler, C.W., and Favreau, O.E. (1984). Perceived velocity of moving chromatic gratings. Journal of the Optical Society of America A, 1, 893-899.
Cleland, B.G., Dubin, M.W. and Levick, W.R. (1971). Sustained and transient neurones in the cat's retina and lateral geniculate nucleus. Journal of Physiology, 228, 649-680.
Cleland, B.G. and Levick, W.R. (1974a). Brisk and sluggish concentrically organized ganglion cells in the cat's retina. Journal of Physiology, 240, 421-456.
Cleland, B.G. and Levick, W.R. (1974b). Properties of rarely encountered types of ganglion cells in the cat's retina and an overall classification. Journal of Physiology, 240, 457-492.
Coblentz, W.W. and Emerson, W.B. (1917). Relative sensibility of the average eye to light of different colors and some practical applications to radiation problems. Bulletin of Bureau of Standards, 14, 167-236.
Crone, R. (1959). Spectral sensitivity in color defective subjects and heterozygous carriers. American Journal of Ophthalmology, 48, 231-235.
Daw, N. and Pearlman, A.L. (1970). Journal of Physiology, 211, 125-137.
DeMonasterio, F.M. (1978a). Properties of concentrically organized X and Y ganglion cells of macaque retina. Journal of Neurophysiology, 41, 1394-1417.
DeMonasterio, F.M. (1978b). Properties of unusual ganglion cells of macaque retina. Journal of Neurophysiology, 41, 1435-1449.
DeMonasterio, F.M. and Gouras, P. (1975). Functional properties of ganglion cells in the rhesus monkey retina. Journal of Physiology, 251, 167-195.
DeMonasterio, F.M. and Schein, S.J. (1980). Protan like spectral sensitivity of foveal Y ganglion cells of the retina of macaque monkeys. Journal of Physiology, 299, 385-396.
Derrington, A.M. and Lennie, P. (1982). The influence of temporal frequency and adaptation level on receptive field organization of retinal ganglion cells in cat. Journal of Physiology, 333, 343-366.
Derrington, A.M. and Lennie, P. (1984). Spatial and temporal contrast sensitivities of neurones in the lateral geniculate nucleus of macaque. Journal of Physiology, 357, 219-240.
Derrington, A.M., Krauskopf, J., and Lennie, P. (1984). Chromatic mechanisms in lateral geniculate nucleus of macaque. Journal of Physiology, 357, 241-265.
DeValois, K.K. and Switkes, E. (1983). Simultaneous masking interactions between chromatic and luminance gratings. Journal of the Optical Society of America, 73, 11-18.
DeValois, R.L., Abramov, I. and Jacobs, G.H. (1966). Analysis of response patterns of LGN cells. Journal of the Optical Society of America, 56, 966-977.
DeValois, R.L., Morgan, H.C., Polson, M.C., Mead, W.R., and Hull, E.M. (1974a). Psychophysical studies of monkey vision. I. Macaque luminosity and color vision tests. Vision Research, 14, 53-67.
DeValois, R.L., Morgan, H.C., and Snodderly, D.M. (1974b). Psychophysical studies of monkey vision. III. Spatial luminance contrast sensitivity tests of macaque and human observers. Vision Research, 14, 75-81.
DeValois, R.L. and DeValois, K.K. (1975). Neural coding of color. In E.C. Carterette and M.P. Friedman (Eds.), Handbook of Perception: Seeing, vol. 5, p. 117-166.
DeValois, R.L., Snodderly, D.M., Yund, E.W., and Hepler, N.K. (1977). Responses of macaque lateral geniculate cells to luminance and color figures. Sensory Processes, 1, 244-259.
DeYoe, E.A. and Van Essen, D.C. (1985). Segregation of efferent connections and receptive field properties in visual area V2 of macaque. Nature, 317, 58-59.
Enroth-Cugell, C., Lennie, P. and Shapley, R. (1975). Surround contribution to light adaptation in cat retinal ganglion cells. Journal of Physiology, 247, 579-588.
Enroth-Cugell, C. and Robson, J.G. (1966). The contrast sensitivity of retinal ganglion cells of the cat. Journal of Physiology, 187, 517-552.
Enroth-Cugell, C. and Robson, J.G. (1984). Functional characteristics and diversity of cat retinal ganglion cells. Investigative Ophthalmology and Visual Science, 25, 250-267.
Eskew, R.T. (1989). The gap effect revisited: slow changes in chromatic sensitivity as affected by luminance and chromatic borders. Vision Research, 29, 717-729.
Estevez, O. and Spekreijse, H. (1974). A spectral compensation method for determining the flicker characteristics of the human colour mechanism. Vision Research, 14, 823-830.
Estevez, O. and Spekreijse, H. (1982). The "Silent Substitution" method in visual research. Vision Research, 22, 681-691.
Flitcroft, D.I. (1989). The interactions between chromatic aberration, defocus, and stimulus chromaticity: implications for visual physiology and colorimetry. Vision Research, 29, 349-360.
Fukada, Y. (1971). Receptive field organization of cat optic nerve fibers with special reference to conduction velocity. Vision Research, 11, 209-226.
Gielen, C.C.A.M., van Gisbergen, J.A.M., and Vendrik, A.J.H. (1982). Reconstruction of cone system contributions to responses of colour opponent neurones in monkey lateral geniculate. Biological Cybernetics, 44, 211-221.
Gordon, J. and Abramov, I. (1977). Color vision in the peripheral retina. II. Hue and saturation. Journal of the Optical Society of America, 67, 202-207.
Gouras, P. (1968). Identification of cone mechanisms in monkey retinal ganglion cells. Journal of Physiology, 199, 533-547.
Gouras, P. (1984). Color Vision. In N. Osborne and G. Chader (Eds.), Progress in Retinal Research, vol. 3, p. 227-262. Pergamon, Oxford.
Graham, N. (1980). Spatial frequency channels in human vision: detecting edges without edge detectors. In C.S. Harris (Ed.), Visual Coding and Adaptability. Lawrence Erlbaum, Hillsdale, New Jersey.
Gregory, R. (1977). Vision with isoluminant colour contrast. 1. A projection technique and observations. Perception, 6, 113-119.
Hawken, M.J. and Parker, A.J. (1984). Contrast sensitivity and orientation selectivity in lamina IV of the striate cortex of Old World monkeys. Experimental Brain Research, 54, 367-372.
Hicks, T.P., Lee, B.B., and Vidyasagar, T.R. (1983). The responses of cells in the macaque lateral geniculate nucleus to sinusoidal gratings. Journal of Physiology, 337, 183-200.
Hochstein, S. and Shapley, R. (1976a). Quantitative analysis of retinal ganglion cell classifications. Journal of Physiology, 262, 237-264.
Hochstein, S. and Shapley, R. (1976b). Linear and nonlinear spatial subunits in Y cat retinal ganglion cells. Journal of Physiology, 262, 265-284.
Hurvich, L. and Jameson, D. (1957). An opponent process theory of color vision. Psychological Review, 64, 384-404.
PARALLEL CORTICAL CHANNELS
Hawken, M.J., Parker, A.J., and Lund, J.S. (1988). Laminar organization and contrast sensitivity of direction selective cells in the striate cortex of the Old World monkey. Journal of Neuroscience, 8, 3541-3548.
van der Horst, G.J.C., de Weert, C.M.M. and Bouman, M.A. (1967). Transfer of spatial chromaticity contrast at threshold in the human eye. Journal of the Optical Society of America, 57, 1261-1266.
Horton, J.C. (1984). Cytochrome oxidase patterns: a new cytoarchitectonic feature of monkey striate. Philosophical Trans. of the Royal Society of London (Biol.), 304, 199-253.
Hubel, D.H. and Livingstone, M.S. (1987). Segregation of form, color, and stereopsis in primate area 18. Journal of Neuroscience, 7, 3378-3415.
Illing, R.B. and Wassle, H. (1981). The retinal projection to the thalamus in the cat: A quantitative investigation and comparison with the retinotectal pathways. Journal of Comparative Neurology, 202, 265-285.
Ingling, C.R. and Martinez Uriegas, E. (1983). Simple opponent receptive fields are asymmetrical: G cone centers predominate. Journal of the Optical Society of America, 73, 1527-1532.
Ingling, C.R. and Tsou, B.H.P. (1988). Spectral sensitivity for flicker and acuity criteria. Journal of the Optical Society of America A, 8, 1374-1378.
Kaplan, E. and Shapley, R. (1982). X and Y cells in the lateral geniculate nucleus of macaque monkeys. Journal of Physiology, 330, 125-143.
Kaplan, E. and Shapley, R. (1986). The primate retina contains two types of ganglion cells, with high and low contrast sensitivity. Proceedings of the National Academy of Science USA, 83, 2755-2757.
Kaplan, E., Shapley, R., and Purpura, K. (1988). Color and luminance contrast as tools for probing the organization of the primate retina. Neuroscience Research (suppl.), 2, 151-166.
Kaplan, E., Lee, B.B., and Shapley, R. (1990). New views of primate retinal function. In Osborne and Chader (Eds.), Progress in Retinal Research, vol. 9. Pergamon, Oxford.
Kelly, D. (1983). Spatiotemporal variation of chromatic and achromatic contrast thresholds. Journal of the Optical Society of America, 73, 742-750.
King Smith, P.E. and Carden, D. (1976). Luminance and opponent color contributions to visual detection and adaptation and to temporal and spatial integration. Journal of the Optical Society of America, 66, 709-717.
Lee, B.B., Martin, P.R. and Valberg, A. (1988). The physiological basis of heterochromatic flicker photometry demonstrated in the ganglion cells of the macaque retina. Journal of Physiology, 404, 323-347.
Lennie, P. (1980). Perceptual signs of parallel pathways. Philosophical Trans. Royal Society of London, 290, 23-37.
Lennie, P., Trevarthen, C., Waessle, H., and Van Essen, D. (1989). Parallel processing of visual information. In L. Spillman and J. Werner (Eds.), Visual Perception: The Neurophysiological Foundations. Academic, New York.
CHAPTER 1
Leventhal, A.G. (1982). Morphology and distribution of retinal ganglion cells projecting to different layers of the dorsal lateral geniculate nucleus in normal and Siamese cats. Journal of Neuroscience, 2, 1024-1042.
Leventhal, A.G., Rodieck, R.W. and Dreher, B. (1981). Retinal ganglion cell classes in the old world monkey: morphology and central projections. Science, 213, 1139-1142.
Livingstone, M.S. and Hubel, D.H. (1984). Anatomy and physiology of a color system in the primate visual cortex. Journal of Neuroscience, 4, 309-356.
Livingstone, M.S. and Hubel, D.H. (1987). Psychophysical evidence for separate channels for the perception of form, color, motion, and depth. Journal of Neuroscience, 7, 3416-3468.
Livingstone, M.S. and Hubel, D.H. (1988). Segregation of form, color, movement, and depth: anatomy, physiology, and perception. Science, 240, 740-749.
MacLeod, D.I.A. and Boynton, R.M. (1979). Chromaticity diagram showing cone excitation by stimuli of equal luminance. Journal of the Optical Society of America, 69, 1183-1186.
Merigan, W.H. and Eskin, T.A. (1986). Spatiotemporal vision of macaques with severe loss of P beta ganglion cells. Vision Research, 26, 1751-1761.
Michelson, A.A. (1927). Studies in Optics. University of Chicago Press, Chicago.
Mullen, K. (1985). The contrast sensitivity of human colour vision to red green and blue yellow chromatic gratings. Journal of Physiology, 359, 381-400.
Mullen, K.T. and Baker, C.L. (1985). A motion aftereffect from an isoluminant stimulus. Vision Research, 25, 685-688.
Pasternak, T. and Merigan, W. (1981). The luminance dependence of spatial vision in the cat. Vision Research, 21, 1333-1340.
Paulus, W. and Kroger Paulus, A. (1983). A new concept of retinal colour coding. Vision Research, 23, 529-540.
Perry, V.H., Oehler, R., and Cowey, A. (1984). Retinal ganglion cells that project to the dorsal lateral geniculate nucleus in the macaque monkey. Neuroscience, 12, 1101-1123.
Purpura, K., Kaplan, E., and Shapley, R.M. (1988). Background light and the contrast gain of primate P and M retinal ganglion cells. Proceedings of the National Academy of Science USA, 85, 4534-4537.
Ramachandran, V.S. and Gregory, R. (1978). Does colour provide an input to human motion perception? Nature, 275, 55-56.
Robson, J. G. (1975). Receptive fields: neural representation of the spatial and intensive attributes of the visual image. In E. C. Carterette and M. S. Friedman (Eds.), Seeing. Vol. 5 of Handbook of Perception. Academic Press, New York.
Rodieck, R.W. (1988). The Primate Retina. Comparative Primate Biology, 4, 203-278.
Rohaly, A.M. and Buchsbaum, G. (1988). Inference of global spatiochromatic mechanisms from contrast sensitivity functions. Journal of the Optical Society of America A, 5, 572-576.
Rohaly, A.M. and Buchsbaum, G. (1989). Global spatiochromatic mechanism accounting for luminance variations in contrast sensitivity functions. Journal of the Optical Society of America A, 6, 312-317.
Schiller, P.H. and Colby, C.L. (1983). The responses of single cells in the lateral geniculate nucleus of the rhesus monkey to color and luminance contrast. Vision Research, 23, 1631-1641.
Schiller, P.H. and Malpeli, J.G. (1977). Properties and tectal projections of monkey ganglion cells. Journal of Neurophysiology, 40, 428-445.
Schiller, P.H. and Malpeli, J.G. (1978). Functional specificity of lateral geniculate laminae in the rhesus monkey. Journal of Neurophysiology, 41, 788-797.
Schnapf, J.L., Kraft, T.W. and Baylor, D.A. (1987). Spectral sensitivity of human cone photoreceptors. Nature, 325, 439-441.
Shapley, R. and Enroth-Cugell, C. (1984). Visual adaptation and retinal gain controls. In N. Osborne and G. Chader (Eds.), Progress in Retinal Research, vol. 3, p. 263-346. Pergamon, Oxford.
Shapley, R. and Kaplan, E. (1989). Responses of magnocellular LGN neurons and M retinal ganglion cells to drifting heterochromatic gratings. Investigative Ophthalmology and Visual Science Supplement, 30, 323.
Shapley, R. and Perry, V.H. (1986). Cat and monkey retinal ganglion cells and their visual functional roles. Trends in Neuroscience, 9, 229-235.
Shapley, R., Kaplan, E. and Soodak, R. (1981). Spatial summation and contrast sensitivity of X and Y cells in the lateral geniculate nucleus of the macaque. Nature, 292, 543-545.
Shipp, S. and Zeki, S. (1985). Segregation of pathways leading from area V2 to areas V4 and V5 of macaque visual cortex. Nature, 316, 322-325.
Smith, V.C. and Pokorny, J. (1975). Spectral sensitivity of the foveal cone photopigments between 400 and 500 nm. Vision Research, 16, 161-172.
Sperling, H.G. and Harwerth, R.S. (1971). Red-green cone interactions in the increment-threshold spectral sensitivity of primates. Science, 172, 180-184.
Stromeyer, C.F., Cole, G.R., and Kronauer, R.E. (1987). Chromatic suppression of cone inputs to the luminance flicker mechanism. Vision Research, 27, 1113-1137.
Switkes, E., Bradley, A., and DeValois, K.K. (1988). Contrast dependence and mechanisms of masking interactions among chromatic and luminance gratings. Journal of the Optical Society of America A, 7, 1149-1162.
Tootell, R.B.H., Silverman, M.S., DeValois, R.L., and Jacobs, G.H. (1983). Functional organization of the second visual cortical area in primates. Science, 220, 737-739.
Tootell, R.B.H., Silverman, M.S., Hamilton, S.L., DeValois, R.L., and Switkes, E. (1988). Functional anatomy of macaque striate cortex III. Color. Journal of Neuroscience, 8, 1569-1593.
Tootell, R.B.H., Hamilton, S.L., and Switkes, E. (1988). Functional anatomy of macaque striate cortex IV. Contrast and magno-parvo streams. Journal of Neuroscience, 8, 1594-1609.
Troscianko, T. and Harris, J. (1988). Phase discrimination in chromatic compound gratings. Vision Research, 28, 1041-1049.
Wagner, G. and Boynton, R.M. (1972). Comparison of four methods of heterochromatic photometry. Journal of the Optical Society of America, 62, 1508-1515.
Wiesel, T.N. and Hubel, D.H. (1966). Spatial and chromatic interactions in the lateral geniculate body of the rhesus monkey. Journal of Neurophysiology, 29, 1115-1156.
Zrenner, E. and Gouras, P. (1983). Cone opponency in tonic ganglion cells and its variation with eccentricity in rhesus monkey retina. In J.D. Mollon and L.T. Sharpe (Eds.), Colour Vision, p. 211-224. Academic, London.
Applications of Parallel Processing in Vision, J. Brannan (Editor). © 1992 Elsevier Science Publishers B.V. All rights reserved.
Parallel Processing in Human Vision: History, Review, and Critique
BRUNO G. BREITMEYER
Introduction
We live and move about in a visual world composed of richly varied surfaces, objects and events that can be characterized along a relatively small set of distinct perceptual dimensions or attributes. It would be reasonable to expect efficient information gathering systems like our brains to have incorporated into their functional design distinct subsystems, each specialized to process one or a few of these limited dimensions and attributes. Hence, the system as a whole would comprise a set of parallel information processing channels. Recently, the existence of parvocellular (P) and magnocellular (M) streams of processing in the monkey visual system has provided a particularly useful and popular basis for models of parallel processing of form/color and depth/motion in primate vision (Livingstone and Hubel, 1987, 1988; DeYoe and Van Essen, 1988). However, the proposal that the visual system performs two broadly separable and parallel types of functions - one concerned with object recognition and identification (figure, form, color), the other with spatial and spatiotemporal relations (ground, position, depth, motion) - has been around for some time (Ingle, 1967; Held, 1968), and the neural and behavioral bases supporting these and related functional distinctions have been increasingly elaborated over the past 25 years (Schneider, 1967; Trevarthen, 1968; Weiskrantz, 1972; Humphrey, 1974; Breitmeyer and Ganz, 1976; Stone, Dreher and Leventhal, 1979; Ungerleider and Mishkin, 1982; Mishkin, Ungerleider and Macko, 1983; Ungerleider, 1985; Previc, 1990; Weisstein et al., this volume). In particular, recent anatomical and physiological studies of cortical pathways (Shipp and Zeki, 1985; Ungerleider, 1985; Van Essen, 1985; Schiller, 1986; Maunsell, 1987; Livingstone and Hubel, 1988; DeYoe and Van Essen, 1988; Desimone and Ungerleider, 1989; for a review, see also Shapley, this volume) as well as midbrain pathways (Goldberg and Robinson, 1978; Wurtz and Albano, 1980; Schiller, 1986) have increased our understanding of the later processing stages in higher brain centers participating in these functions.
CHAPTER 2
Over the same time period, related developments also have revealed parallel processing early along afferent visual pathways. As noted by Stone (1983), the concept of parallel processing at early visual levels has had a long history. For instance, the existence of separate rod and cone systems was established during the latter half of the 19th century. More recently, the existence of different chromatic and achromatic (luminance) channels (de Lange, 1958; Kelly and van Norren, 1977), of direction, orientation, and spatial-frequency selective mechanisms (Sekuler and Ganz, 1963; Campbell and Kulikowski, 1966; Pantle and Sekuler, 1968; Blakemore and Campbell, 1969), and of separate pathways for perception of luminance increments (brightness) and decrements (darkness) (Jung, 1961, 1973) also has been established psychophysically and, still more recently, anatomically (Schiller, 1982, 1984; Schiller, Sandell and Maunsell, 1986; Tootell et al., 1988a,c). While these and other developments already pointed to several forms of parallel processing of visual information, it was not until about two and a half decades ago that the discovery of X and Y ganglion cells in cat retina by Enroth-Cugell and Robson (1966) initiated, in the early and mid 1970s, concerted and extensive efforts among visual psychophysicists and perception psychologists to investigate in humans the existence and properties of analogues of these two neural pathways and their relation to the several other known types of distinct visual mechanisms. Hints of psychophysical analogues of these distinctions already existed at and prior to this time; however, they were not yet expressed within an explicitly articulated parallel-processing framework based on plausible neurophysiological substrates such as the X and Y pathways.
For instance, in the early 1950s Saucer (1954) hypothesized that the human visual system contains motion-processing analyzers or channels which have properties distinct from channels processing form and pattern detail. In the late 1960s Pantle and Sekuler (1969) empirically supported such a distinction. They demonstrated via selective adaptation techniques that the response of human visual mechanisms sensitive to direction of motion saturated at a low contrast of about 0.2, whereas the response of mechanisms sensitive to orientation of form (Campbell and Kulikowski, 1966) increased monotonically up to a maximal contrast of 1.0. The distinction was given further empirical support in the late 1960s by Robson (1966) and van Nes et al. (1967) and in the early 1970s by Tulunay-Keesey (1972). For instance, van Nes et al. (1967) reported separate form and flicker thresholds for drifting stimuli containing low spatial and high temporal frequencies. Similarly, Tulunay-Keesey (1972) showed that one can obtain separate thresholds for detecting the flicker component or else the pattern component of a line flashing on and off at varying temporal frequencies. Tulunay-Keesey (1972) found that flicker detectors were generally more sensitive than pattern detectors over the entire range of temporal frequencies she used (0.3 - 30.0 Hz). As we shall see below, the distinction between flicker/motion perception and form/pattern perception on the one hand, and between transient and sustained channels on the other, played a crucial role in the initial studies of parallel processing in human vision. Although it continues to play an important role to this day, it has met with criticism and calls for revision (Harris, 1980; Derrington and Henning, 1981; Green, 1981,
1984; Kelly and Burbeck, 1984) and has been augmented by additional distinctions based on visual latency, various types of masking, and more recent distinctions drawn with respect to the processing of color and texture as well as depth (Zeki, 1978; Cavanagh, Tyler and Favreau, 1984; Livingstone and Hubel, 1987, 1988; DeYoe and Van Essen, 1988; Cavanagh, 1989; Logothetis et al., in press; Schiller, Logothetis and Charles, in press; Schiller and Logothetis, in press). In the following we will discuss the early developments in studies of parallel visual processing in humans, their promises and problems, and their more recent and current trends.
Developments in the sustained/transient dual channel approach
Form/pattern and flicker/motion
Despite attendant problems to be discussed below, the sustained/transient terminology used by some physiologists (Cleland et al., 1971; Cleland and Levick, 1974; Bolz et al., 1982) was initially adopted broadly and extensively by psychophysicists investigating perceptual signs of underlying parallel pathways. Tolhurst (1973) and Kulikowski and Tolhurst (1973) were among the first investigators to adopt the sustained/transient distinction to describe pattern-sensitive and motion- or flicker-sensitive channels in human vision. Tolhurst (1973) compared thresholds for detecting stationary sinusoidal gratings to thresholds for detecting gratings which drifted or were temporally modulated in counterphase at a rate of 5 Hz. For spatial frequencies of 4 c/deg or less, Tolhurst (1973) found threshold sensitivities for temporally modulated gratings to be higher than sensitivities for stationary gratings; above 4 c/deg the two sensitivities were equal. Similar results were reported by Kulikowski and Tolhurst (1973) when comparing flicker to pattern threshold sensitivities for temporally counterphased gratings. Thus the pattern-sensitive sustained channels were characterized by preference for higher spatial frequencies and lack of preference for temporally modulated stimuli over stationary ones. Transient channels, in contrast, were characterized by preference for lower spatial frequencies and temporally modulated or moving stimuli. Moreover, Tolhurst (1973) and Kulikowski and Tolhurst (1973) drew explicit comparisons between these psychophysically defined sustained and transient channels and the X and Y classes of cells studied in cats.
Additional approaches and findings
Related approaches and interpretations have been adopted by several other investigators (Breitmeyer and Ganz, 1976; Legge, 1978; Green, 1984). The common feature in these investigations is the repeated finding of differences between the spatiotemporal response characteristics of the visual system at low and high spatial frequencies. Evidence indicates that, besides being more sensitive to rapid motion and flicker, the low spatial frequency transient channels, relative to the higher spatial frequency sustained ones, are also characterized by a higher sensitivity to abrupt as compared to gradual stimulus onset
(Breitmeyer and Julesz, 1975; Tolhurst, 1975a,b; Wilson, 1978), briefer temporal summation (Brown and Black, 1976; Breitmeyer and Ganz, 1977; Watson and Nachmias, 1977; Legge, 1978), broader orientational tuning (Burbeck and Kelly, 1981; Gorea, 1979; Kelly and Burbeck, 1987), and a greater susceptibility to spatiotemporal adaptation (Bowker and Tulunay-Keesey, 1983).
Visual latency
In addition, the transient and sustained channels also differ in response latency. Breitmeyer (1975) demonstrated that simple reaction time (RT) to the onset of briefly flashed gratings set at a 60% contrast increased monotonically with spatial frequency. RT was roughly 200 ms at a spatial frequency of 0.5 c/deg and increased to a value ranging between 300 and 350 ms at a spatial frequency of 11.0 c/deg. The monotonic increase in RT with spatial frequency, although somewhat attenuated, was maintained even when the gratings were equated for subjective contrast (Breitmeyer, 1975). Similar RT findings were reported by other investigators for grating onset (Lupp, Hauske and Wolf, 1976; Vassilev and Mitov, 1976; Parker, 1980; Breitmeyer, Levi and Harwerth, 1981a), offset (Parker, 1980; Breitmeyer et al., 1981a; Long and Gildea, 1981), as well as contrast reversal (Parker, 1980). In a somewhat different paradigm, Tolhurst (1975a) showed that the RT distribution to gratings just above contrast threshold differed as a function of spatial frequency. For a 0.2 c/deg grating the RT distribution was bimodal, with one mode corresponding to the onset and the other to the offset of the grating. According to Tolhurst (1975a), these modes reflected the probabilistically distributed activities of transient channels to grating on- and offset. For a higher, 3.5 c/deg grating activating predominantly sustained channels the RT distribution was unimodal and, moreover, of longer latency than the onset RTs to the low frequency grating. Similar findings have been reported by Breitmeyer et al. (1981a) as well as by Schwartz and Loop (1982, 1983) in their study of transient luminance and sustained color-opponent channels.
Visual masking
Various masking paradigms also have been employed to study the properties of, and interactions between, sustained and transient channels. Legge (1978) employed a technique of measuring contrast thresholds when the onsets and offsets of test gratings of variable spatial frequency and duration were transiently masked by the 20-ms presentation of a mask grating. Relative to the no-mask condition the effect of the mask was to increase threshold temporal summation for test gratings having spatial frequencies below but not above 3 c/deg. This was attributed to the transient channels' loss of sensitivity at low spatial frequencies when a transient mask is used; the brief mask failed to affect the higher spatial frequency sustained channels and thus did not alter their temporal summation properties (Legge, 1978). Related findings using uniform field flicker (UFF) masks (Breitmeyer et al., 1981a) and uniform flashes of light (Green, 1981, 1984) have been reported. For instance, Breitmeyer et al. (1981a) showed that, relative to a steady background, UFF increased flicker thresholds and on- and
offset RTs at low but not high spatial frequencies. Similarly, Green (1981) showed that transient masking at the onset and offset of a uniform conditioning flash (Crawford, 1947) was produced with low but not high spatial frequency test gratings. In both of these studies the transient masking produced by either the UFF or the uniform conditioning flash affected primarily the response of low spatial frequency transient channels while leaving that of the higher spatial frequency sustained channels unaltered. Several theories of visual masking and information processing based on the sustained/transient channel distinction have been proposed (Matin, 1975; Weisstein, Ozog and Szoc, 1975; Breitmeyer and Ganz, 1976). In their particular theory Breitmeyer and Ganz (1976) dealt not only with the masking phenomena discussed above, which tap effects occurring within one or the other of the two types of channels, but also with metacontrast masking (Stigler, 1910; Alpern, 1953; Weisstein, 1972), which indexes interactions between the two channels. According to the sustained/transient theory of masking, the inverted U-shaped function relating the magnitude of metacontrast to the onset asynchrony between the target stimulus and the following mask stimulus results from post-retinal inhibition of the target's long-latency sustained channels by the mask's short-latency transient channels. Psychophysical evidence for the reverse inhibition of transient channels by sustained ones has been obtained from studies of target disinhibition in metacontrast (Breitmeyer, 1978; Breitmeyer, Rudd and Dunn, 1981b), inhibition of the transient motion detecting channels by sustained pattern channels (von Gruenau, 1978; Banta and Breitmeyer, 1985), and asymmetric interference between the low and high spatial frequency components of compound gratings (Hughes, 1986).
Inhibitory interactions between X and Y cells, the neural analogs of sustained and transient channels, have been reported in the lateral geniculate nucleus (LGN) and cortical area 17 of cat (Hoffmann, Stone and Sherman, 1972; Singer and Bedworth, 1973; Singer, 1976; Tsumoto and Suzuki, 1976), although, as noted by Lennie (1980a), in monkey such interactions probably occur no earlier than visual cortex. A more extensive and detailed discussion of the roles and interactions of sustained and transient channels in visual masking and their neural analogues can be found in Breitmeyer (1984).
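The timing logic of the sustained/transient account of metacontrast can be made concrete with a small numerical sketch. The Python fragment below is only an illustration under stated assumptions and is not a model taken from Breitmeyer and Ganz (1976): the function names, the Gaussian form of the overlap, and the latency and spread values (120 ms, 60 ms, 30 ms) are all hypothetical. It assumes that masking is strongest when the mask's short-latency transient response coincides in time with the target's long-latency sustained response, which places the peak of the masking function at a stimulus onset asynchrony (SOA) equal to the latency difference.

```python
import math


def metacontrast_magnitude(soa_ms, sustained_latency_ms=120.0,
                           transient_latency_ms=60.0, spread_ms=30.0):
    """Toy masking strength (0..1) for a mask that follows the target by soa_ms.

    All parameter values are hypothetical and chosen only for illustration.
    """
    # Peak times of the two responses, measured from target onset.
    target_sustained_peak = sustained_latency_ms
    mask_transient_peak = soa_ms + transient_latency_ms
    # Assumption: inhibition is strongest when the two peaks coincide and
    # falls off as a Gaussian function of their temporal mismatch.
    mismatch = mask_transient_peak - target_sustained_peak
    return math.exp(-(mismatch ** 2) / (2.0 * spread_ms ** 2))


def best_soa(soas_ms):
    """Return the SOA (from the supplied list) that gives the strongest masking."""
    return max(soas_ms, key=metacontrast_magnitude)


soas = list(range(0, 201, 10))             # SOAs from 0 to 200 ms in 10 ms steps
curve = [metacontrast_magnitude(s) for s in soas]
peak_soa = best_soa(soas)                  # latency difference: 120 - 60 = 60 ms
```

With these hypothetical values the curve rises from an SOA of 0, peaks at 60 ms, and falls off again at longer asynchronies, i.e., it has the inverted U shape described above. Changing the assumed latency difference shifts the peak accordingly.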
Controversies and criticisms
The above findings and interpretations are controversial and have met with substantial criticism. In the last 10 years, several articles have appeared which have questioned the validity of the sustained/transient distinction in human vision. The disputes can be regarded as focusing on two broad and interrelated issues. One concerns the evidence and psychophysically defined properties used to establish the distinction between sustained and transient channels in humans (as well as subhuman organisms); the second concerns the link between the psychophysics and the neurophysiology of parallel pathways and the attendant problem of naming, identifying and classifying channels or pathways defined via either physiological or psychophysical methods.
Evidence for the psychophysical distinction between sustained and transient channels
Turning to the first issue, several investigators -- among them Lennie (1980b), Burbeck (1981), Derrington and Henning (1981), Panish, Swift and Smith (1983), Green (1984), and Kelly and Burbeck (1984) -- have argued that the evidence derived from studies such as Tolhurst's (1973) or Kulikowski and Tolhurst's (1973) provides neither a valid nor a consistent or clear set of criteria for distinguishing between transient and sustained channels. Their objections are based on methodological as well as definitional grounds.
Methodological controversies
The methodological objection is that Tolhurst (1973), Kulikowski and Tolhurst (1973) and a number of similar studies (Tulunay-Keesey, 1972; Breitmeyer and Julesz, 1975; King-Smith and Kulikowski, 1975; Tolhurst, 1975b; Breitmeyer et al., 1981a) used psychophysical methods which relied on separate subjective threshold criteria for flicker/motion detection and for form/pattern detection. When supposedly "criterion-free" or forced-choice methods are employed instead, differences between form/pattern and flicker/motion thresholds may be eliminated (Lennie, 1980b; Derrington and Henning, 1981) or, as in Burbeck's (1981) study, reversed at all but the lowest spatial frequencies. As a first response to this criticism one should note that no psychophysical threshold measuring procedure is truly criterion-free. Perhaps one can reduce the use of two or more subjective criteria to a single detection criterion by using a forced-choice or similar procedure; but the problem inherent in using some criterion is not eliminated. In particular, in Burbeck's (1981) study a test grating, temporally modulated in counterphase at threshold, was always compared to a reference stimulus which could be either a UFF of the same temporal frequency slightly above threshold or else a stationary grating of the same spatial frequency also slightly above threshold. This procedure hardly eliminates subjective criteria; rather, it forces the observer to substitute subjective or phenomenal criteria reflecting the experimenter's choices or standards of reference percepts for those of his/her own choosing. In addition, as noted by both Bowker and Tulunay-Keesey (1983) and Green (1984), Burbeck (1981) as well as Derrington and Henning (1981) overlook an alternative interpretation of their data.
Burbeck (1981) and Derrington and Henning (1981) argued that the channels most sensitive to low spatial frequency counterphase gratings not only responded to the temporal aspects of the gratings but also encoded their spatial orientation. However, since counterphase modulated gratings can be detected by mechanisms tuned to direction of motion (Levinson and Sekuler, 1975), the orientation discrimination could have been performed by such motion-selective as opposed to orientation-selective mechanisms. I have discussed additional concerns regarding Burbeck's (1981) criterion-free methods elsewhere (Breitmeyer, 1984). For now I would like to proceed to a second, more general consideration of the use of two separate
subjective criteria, such as flicker/motion and form/pattern, employed in the method of adjustment versus use of a single criterion, whatever it may be, presumably adopted in a forced-choice procedure. First, as shown by Pantle (1983), even when subjects are placed in a forced-choice paradigm, compelling evidence for a distinction between low spatial frequency transient channels and higher spatial frequency sustained channels can be obtained. Similarly, using "objective" forced-choice techniques, Watson and Robson (1981) obtained results consistent with the existence of two distinct sets of mechanisms. One of these mechanisms is selective for low, the other for high temporal frequencies. Watson and Robson (1981) believe these two mechanisms to correspond to the sustained and transient mechanisms investigated by Kulikowski and Tolhurst (1973), whose technique relied on observers shifting from one "subjective" threshold criterion to another. A second noteworthy issue is exemplified by Stone's (1983) more fundamental points made in his monograph entitled Parallel Processing in the Visual System. There Stone argues that the psychophysical data obtained with forced-choice methodology used by investigators such as Lennie (1980b) and, by implication, others (e.g., Derrington and Henning, 1981; Green, 1983) are of no direct relevance to the results obtained with adoption of two separate subjective criteria as in, say, Kulikowski and Tolhurst's (1973) study and, again by implication, other studies using similar methods (e.g., Breitmeyer and Julesz, 1975; Camisa, Blake and Lema, 1977; Essock and Lehmkuhle, 1982). Nor, Stone (1983) claims further, do data obtained with forced-choice methods provide alternative explanations to results obtained with the method of adjustment.
Rather, they reveal the obvious (but worth repeating) point that different methods yield different results, and more specifically, that the method of adjustment and forced-choice methods do not provide equivalent tests of visual performance. As noted by Essock and Lehmkuhle (1982), this is particularly evident when comparing the pattern task criterion of "spatial structure" (e.g., the discrimination of the distinct bars of a grating) used in the method of adjustment to, say, the two-alternative forced-choice "pattern" task criterion for detecting any spatial contrast on an otherwise uniform field. Merely calling both tasks by the same name does not eliminate the fact that the perceptual contents to which the observers attend in the two tasks are not equivalent. Stone (1983) uses the following example to illustrate this important point. In studies of "blind sight," tests relying on subjective, conscious experience render the patient quite blind, whereas forced-choice methods, particularly those relying on visually guided motor responses, reveal residual visual function of which the patient has no subjective, conscious awareness. Although based on a different experimental rationale and patient population, a similar case for the use of subjective criteria can be supported by Brussell et al.'s (1984) study of pattern and flicker sensitivity in normal subjects as compared to multiple sclerosis (MS) patients and by Regan and Neima's (1984) related study comparing visual performance in normal observers to patients with MS, glaucoma, and ocular hypertension. The point raised by such studies, although often forgotten or neglected, is not new. It has been made by Kahneman (1968) and by Breitmeyer and Ganz (1976) with regard to visual masking and by
Bridgeman et al. (1979) in their study of saccadic suppression. A shift of criterion can be effected not only via a quantitative shift (e.g., making the criterion more or less conservative) along a given perceptual dimension but also via the choice of criterion content, i.e., the choice of the qualitative informative aspect of a stimulus to which an observer attends or responds in a detection or discrimination task. The two types of criterion shifts should not be confused. The prohibition of the latter, qualitative shifts when forced-choice procedures are dogmatically employed may compel observers to use a single criterial perceptual dimension and thus eliminate measurable differences between tasks such as "flicker" versus "pattern" detection. While the logic of such a procedure can effectively force a disconfirmation of the existence of separate flicker/motion and form/pattern detectors, it also results in an unfortunate loss of useful information. It is interesting that a trained physiologist like Stone should see the importance for method as well as explanation of the use of subjective criteria based on conscious experience in psychophysical and perceptual investigations in addition to the putatively objective, forced-choice criteria. Along with him, I believe that rather than eliminating them in favor of a single criterion used in forced-choice methods it is far wiser to exploit and explore the richer information inherent in including subjective criteria tapping separate perceptual dimensions.
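The difference between a quantitative criterion shift and a change of criterion content can be illustrated with standard equal-variance Gaussian signal detection theory. The sketch below is an illustration under stated assumptions and is not the analysis used in any of the studies cited above; the function names and the numerical values (a sensitivity of 1.5 and cutoffs of 0.25 and 1.25) are hypothetical. It shows that moving the cutoff along a single decision axis changes the bias index c while leaving the sensitivity index d' untouched. A change of criterion content, by contrast, amounts to switching to a different decision axis and is not captured by this one-dimensional model.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # zero mean, unit variance


def rates_for(d_prime, cutoff):
    """Hit and false-alarm rates for an observer who says "yes" above cutoff.

    Assumption: noise trials are N(0, 1) and signal trials are N(d_prime, 1).
    """
    hit_rate = 1.0 - STD_NORMAL.cdf(cutoff - d_prime)
    false_alarm_rate = 1.0 - STD_NORMAL.cdf(cutoff)
    return hit_rate, false_alarm_rate


def dprime_and_criterion(hit_rate, false_alarm_rate):
    """Recover sensitivity (d') and bias (c) from hit and false-alarm rates."""
    z_hit = STD_NORMAL.inv_cdf(hit_rate)
    z_fa = STD_NORMAL.inv_cdf(false_alarm_rate)
    return z_hit - z_fa, -0.5 * (z_hit + z_fa)


# Same hypothetical observer (d' = 1.5) under a liberal and a conservative cutoff.
liberal = dprime_and_criterion(*rates_for(1.5, 0.25))
conservative = dprime_and_criterion(*rates_for(1.5, 1.25))
```

In both cases the recovered sensitivity equals the assumed value of 1.5, while the bias index moves from a negative (liberal) to a positive (conservative) value. This is the sense in which a purely quantitative shift leaves the underlying perceptual dimension unchanged.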
Controversies concerning the psychophysically defined distinctions between sustained and transient channels
However, in choosing subjective criteria I am not claiming that the properties originally used to distinguish psychophysically between sustained and transient channels are not in need of reconsideration or revision. On the contrary, despite aforementioned problems with their own methodological rationale, results of several studies -- among them Burbeck (1981), Derrington and Henning (1981), Green (1983, 1984), Kelly and Burbeck (1984), and Badcock and Sevdalis (1987) -- indicate that such reconsideration and revision may be in order. For instance, while noting that most psychophysical studies of spatiotemporal vision do not require a two-mechanism model, Kelly and Burbeck (1984) concede that the effects of the masking of gratings by uniform conditioning flashes (Stromeyer, Zeevi and Klein, 1979) or UFF (Breitmeyer et al., 1981a) cannot be easily predicted by the spatiotemporal threshold function based on a single mechanism. The implication is that recourse to a two-mechanism, sustained/transient approach may be required to account for these data. Hence, the use of masking with either UFF or a uniform conditioning flash to distinguish between sustained and transient channels may be especially critical. Recently, Badcock and Sevdalis (1987) have taken issue with the use of UFF masking to distinguish between transient and sustained channels on the grounds that prior studies using this technique, e.g., Breitmeyer et al. (1981a), introduced an artifact, since the effective contrast of the target grating to be detected flickered in synchrony with the flickering uniform field. After controlling for this artifact, Badcock and Sevdalis (1987) indeed obtained weaker UFF masking than previously reported. However, it was significantly present at spatial frequencies of 4 c/deg and lower, a finding consistent with
the existence of flicker sensitive, low spatial frequency transient channels. The sustained/transient channel distinction also survives Green's (1981, 1984) related analysis of masking by a uniform flash of light. A conditioning flash can produce two countervailing effects. On the one hand, it can produce transient masking at its on- and offset (Crawford, 1947); on the other, this effect is confounded with changes in contrast sensitivity accompanying changes of light adaptation level (Patel, 1966). Green's (1981, 1984) results and analysis show that while a uniform conditioning flash produces a facilitation effect on contrast sensitivity at both high and low spatial frequencies, it additionally produces a masking effect at the on- and offsets of the conditioning flash which is specific to the low spatial frequency transient system.
Another problematic issue is the psychophysical relationship and degree of correspondence between flicker/motion and form/pattern detection and separate transient and sustained channels, respectively. For instance, Kelly and Burbeck (1984) argue that although mechanisms for detection of low and high spatial frequencies must have transient and sustained temporal responses, respectively, it does not follow that there must exist two different underlying mechanisms. According to Kelly and Burbeck (1984), such a conclusion would follow only if the two mechanisms display spatiotemporal separability. However, there are problems with the assertion of this criterion. In specifying this criterion Kelly and Burbeck (1984) have posed the following conundrum. While some investigators have reported spatiotemporal decoupling in both X and Y cells in the lateral geniculate nucleus (Lehmkuhle et al., 1980) and visual cortex (Tolhurst and Movshon, 1975) of cat, a majority (Victor and Shapley, 1979; Lee et al., 1981; Derrington and Lennie, 1982, 1984; Enroth-Cugell et al., 1983; Troy, 1983; Dawis et al.,
1984) report evidence for spatiotemporal coupling in X and Y cells at retinal and geniculate levels. If the spatiotemporal threshold surface is determined at retinal levels as suggested by Kelly and Burbeck (1984) and if retinal (and geniculate) X and Y cells do not show spatiotemporal separability, then the temporal frequency response of their psychophysical analogues most likely also will depend on spatial frequency. Since the human spatiotemporal frequency response as measured by contrast sensitivity does show this type of interaction, Kelly and Burbeck (1984) may be in the enviable position of 1) asking proponents of separate sustained and transient channels to meet an impossible physiological and, hence, psychophysical criterion while 2) arguing that a single mechanism with properties akin to the retinal X cells investigated by Enroth-Cugell et al. (1983) is sufficient to account for the human spatiotemporal threshold response. Although a single-mechanism account may hold for normal observers, Brussell et al. (1984) have presented data comparing normal observers to MS patients which are hard to reconcile with the existence of a single mechanism determining spatiotemporal vision. The different results obtained by normal observers and MS patients indicate that different mechanisms for processing flicker and pattern exist and that MS patients are specifically deficient in the flicker sensitive mechanism. A comparison of normal observers to a variety of other ophthalmological patients similarly supports the existence of separate form/pattern and flicker/motion channels in human vision (Enoch, 1978; Regan and Neima, 1984; Silverman, Trick and Hart, 1990).
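The spatiotemporal separability criterion discussed above can be made concrete: a sensitivity surface is separable if it can be written as the product of a purely spatial and a purely temporal tuning function, so that the shape of the temporal response does not depend on spatial frequency. The sketch below is only an illustration of that definition. It uses a simple product-of-marginals check rather than any analysis from the studies cited, and the two small sensitivity tables are invented for the example.

```python
def separable_fit(surface):
    """Product-of-marginals approximation: fit[i][j] = row_sum[i] * col_sum[j] / total.

    For a strictly positive surface that really is a product g(fs) * h(ft),
    this reproduces the surface exactly. It is not a least-squares fit.
    """
    row_sums = [sum(row) for row in surface]
    col_sums = [sum(col) for col in zip(*surface)]
    total = sum(row_sums)
    return [[r * c / total for c in col_sums] for r in row_sums]


def separability_error(surface):
    """Largest relative deviation of a positive surface from its separable fit."""
    fit = separable_fit(surface)
    return max(
        abs(s - f) / f
        for surface_row, fit_row in zip(surface, fit)
        for s, f in zip(surface_row, fit_row)
    )


# Hypothetical sensitivities: rows are spatial frequencies, columns temporal frequencies.
spatial_tuning = (1.0, 3.0, 2.0)
temporal_tuning = (1.0, 2.0, 0.5)
separable = [[g * h for h in temporal_tuning] for g in spatial_tuning]

# Hypothetical coupled surface: temporal tuning changes shape with spatial frequency.
coupled = [
    [1.0, 2.0, 0.5],
    [3.0, 3.0, 3.0],
    [2.0, 1.0, 4.0],
]

if __name__ == "__main__":
    print("separable surface error:", separability_error(separable))
    print("coupled surface error:  ", separability_error(coupled))
```

An error near zero indicates that temporal tuning has the same shape at every spatial frequency; a large error indicates the kind of spatiotemporal coupling reported for most retinal and geniculate X and Y cells.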
Another problem with Kelly and Burbeck's (1984) account is that although it has the desirable quality of being parsimonious, it is complicated by Wilson's (1980) finding, based on measures of line-spread rather than contrast sensitivity functions, that the human transient mechanism displays spatiotemporal decoupling. As noted by Wilson (1980), this result indicates that the human spatiotemporal response surface may, at least in part, be determined at cortical levels where spatiotemporal separability holds (Tolhurst and Movshon, 1975) rather than at retinal levels as assumed by Kelly and Burbeck (1984). Kelly and Burbeck (1984) dismiss Wilson's (1980) finding as inconclusive. Nonetheless, they also point out that the issue of spatiotemporal decoupling within psychophysical and neural channels remains a moot point awaiting resolution. This point is underscored by Lee, Martin and Valberg (1989a), who found that activity of subcortical neurons does not correlate well with certain aspects of spatiotemporal vision and, on that basis, argued that central neural activity must additionally be involved.
Besides this problem remaining to be worked out, others also need to be addressed. Burbeck (1981), Derrington and Henning (1981) and Green (1983, 1984) present results indicating that the human visual system cannot be strictly partitioned into sustained (pattern) and transient (flicker/motion) analyzing channels. In particular, Burbeck (1981) and Derrington and Henning (1981) present results indicating that the low spatial frequency transient channels can do some pattern analysis and use these findings as critical evidence against the sustained-transient distinction. Above, we discussed problems with this interpretation of their results. However, even if it is correct, these investigators, as noted by Green (1984), may have misinterpreted previous authors, such as Kulikowski and Tolhurst (1973), Tolhurst (1973), and Breitmeyer and Ganz (1976),
in claiming that they denied the possibility of form analysis in transient channels. In fact, Breitmeyer and Ganz (1976) are quite explicit in claiming that the transient channels can perform a crude type of pattern analysis, a claim consistent not only with Watson and Robson's (1981) and King-Smith and Kulikowski's (1980) subsequent psychophysical results and interpretations but also with neurophysiological findings (Stone and Dreher, 1973; Lehmkuhle et al., 1980; Frascella and Lehmkuhle, 1984). This, of course, makes use of a stringent motion/pattern dichotomy at both the methodological and the explanatory levels impossible (King-Smith and Kulikowski, 1980; Essock and Lehmkuhle, 1982; Murray, MacCana and Kulikowski, 1983); and use of a less stringent dichotomy poses obvious problems if one relies exclusively on it (or any other single perceptual dichotomy) to draw distinctions between sustained and transient channels. A possibly more damaging problem for the sustained-transient channel distinction is Green's (1983, 1984) finding that the high spatial frequency sustained channels are also capable of discriminating flicker and motion at lower rates and velocities than the low spatial frequency transient channels. Although original versions of the sustained-transient distinction (Kulikowski and Tolhurst, 1973; Breitmeyer and Ganz, 1976), based on limited knowledge of underlying physiology, may have claimed that sustained channels respond only to stationary stimuli, more recent versions (Kulikowski, 1978; Murray et al., 1983;
Breitmeyer, 1984; Raymond & Darcangelo, 1990), taking into account new physiological findings of flicker and motion sensitivity in X as well as Y pathways of cat and monkey (Kulikowski, Bishop and Kato, 1977; Eckhorn and Poepel, 1981; Scobey, 1981; Cleland and Harding, 1983), have incorporated a sensitivity to low-velocity motion in sustained channels. This distinction between high velocity or high temporal frequency transient detectors and low velocity or low temporal frequency sustained detectors has been elaborated psychophysically by a number of investigators in the last decade (Burbeck and Kelly, 1981; Watson and Robson, 1981; Anderson and Burr, 1985; Ferrera and Wilson, 1985; Hess and Plant, 1985; Kelly and Burbeck, 1987). Among other things, these elaborations have revealed noticeable heterogeneity within both the transient and the sustained systems. In his model of sustained and transient vision, Legge (1978) proposed that the transient system consists of a single, low-pass spatial frequency channel whereas the sustained system consists of multiple band-pass, spatial frequency specific channels (Blakemore and Campbell, 1969) cumulatively spanning a frequency range from as low as 0.375 c/deg (see also Stromeyer et al., 1982) to the upper limit of spatial resolution. Wilson and Bergen (1979) and Wilson, McFarlane and Phillips (1983) subsequently were able to derive at least two transient mechanisms from observers' contrast sensitivity functions; and Ferrera and Wilson (1985) have extended the number of transient mechanisms to three. Two other aspects of Ferrera and Wilson's (1985) results are noteworthy. The three transient mechanisms are non-oriented and may correspond to the spatially broad-band transient mechanisms showing little or no orientation selectivity reported by Kelly and Burbeck (1987). Moreover, they possibly may also correspond to the three distinct sets of high temporal frequency, transient detectors reported by Watson and Robson (1981).
These detectors, as noted by Watson and Robson (1981), are remarkably poor at making spatial discriminations, which would be consistent with spatially broad-band and non-oriented response characteristics. Whereas Wilson and Bergen's (1979) data suggest the existence of only two, and Wilson et al.'s (1983) findings suggest the existence of at most four spatial frequency selective sustained channels, Watson and Robson's (1981) data indicate that as many as seven may exist.
The preceding discussions point out some of the problems and controversies that have arisen in attempts to specify psychophysically the response properties of sustained and transient channels. We can summarize the discussion by noting that it is difficult, if not impossible, to make an unequivocal psychophysical distinction between sustained and transient channels on the basis of any single criterial dimension such as flicker/motion, form/pattern, susceptibility to UFF masking, etc.; rather, a meta-analysis based on a variety of results obtained with the use of several different criterial measures seems to be more informative and telling. This view is similar to that offered by Rowe and Stone (1977) and Stone (1983) in their proposal that neuronal naming and classification be based on as many dimensions as possible. This and related issues will be discussed more fully below.
Links between the psychophysics and neurophysiology of sustained and transient channels
For now, I would like to turn to some of the equally difficult problems (see Teller, 1980, 1984) concerning links between the psychophysics and the neurophysiology of parallel pathways. My opinions here share much in common with those offered by Stone (1983). If one views the correspondences between neurophysiology and psychophysics as tentative hypotheses in need of testing and correcting, then the enterprise of drawing links between the two domains can be fruitful without the danger, as noted by Uttal (1971, 1981), of reducing psychological theory to physiological data. To establish links, one works with the main assumption that the visual neurophysiology of organisms like the cat or monkey can be related to human psychophysics. With regard to the sustained/transient approach, this assumption in turn carries with it two criteria that must be satisfied. First, one must find psychophysical indices of sustained and transient channel activities in these organisms which parallel similar indices found in humans; second, the psychophysics must indeed relate to the known physiology. The first criterion has been met by several convergent lines of investigation. It has been met in studies of normal, behaving cat (Blake and Casima, 1977) and monkey (Harwerth, Boltz and Smith, 1980) in which psychophysical indices used in human studies -- e.g., threshold sensitivity to flickering and stationary gratings, reaction time to near-threshold and suprathreshold gratings, temporal summation at threshold -- were employed. However, the latter criterion has been challenged by Lennie (1980b) and more recently by Troy (1983) and Frascella and Lehmkuhle (1984). On the basis of their physiological results, they argue that X and Y cells in cat do not subserve the distinct functions of form/pattern and flicker/motion detection, respectively.
As discussed above, one may need to revise the distinction to allow for a crude form of pattern analysis in Y cells and for some sensitivity to motion in X cells. However, Stone (1983) offers an alternative interpretation of these results. They were obtained from samples of cells, many of which fell outside the area centralis of cat. On the basis of his cat studies, Stone (1983; see also Cleland and Levick, 1974; Hochstein and Shapley, 1976) believes that X cells located in the area centralis will have a significantly poorer sensitivity at low spatial frequencies and high temporal frequencies than peripherally located ones. Hence, a comparison which includes X and Y cell responses outside the area centralis may not show differences that very likely exist for cells in the area centralis. Certainly comparison of responses from cells located in the area centralis would be more relevant for the psychophysics of cat since one would expect that, during training, cats typically learn to direct their gaze and, thus, their area centralis at the test stimuli (Blake and Casima, 1977). Moreover, if Wilson (1980) and Lee et al. (1989a) are correct in claiming that spatiotemporal vision requires cortical as opposed to the subcortical mechanisms suggested by Kelly and Burbeck (1984), then Lennie's (1980b), Troy's (1983) and Frascella and Lehmkuhle's (1984) findings would lose some of their critical impact since they are based on study of subcortical cells. In all fairness, it should be noted, however, that this loss of critical impact
would also apply to all the physiological studies of subcortical cells which in the past have been used to support the existence of separate flicker/motion and form/pattern channels in humans. Based on the above limited findings, the best we can say is that although the locus of neural substrates for detection is an important issue (Teller, 1980, 1984), it remains to be resolved.
The roles of the two pathways in cat psychophysics also have been investigated by studying the effects of pressure blocking of Y and X optic nerve fibers in cat (Burke, Burne and Martin, 1985; Burke, 1986; Burke et al., 1986, 1987). With selective degenerative loss of Y fiber activity, acuity is not impaired. Hence, in cat, acuity and the perception of pattern detail are not mediated by the Y pathway. The X pathway appears to be necessary for visual acuity tasks in cat (see Waessle, 1986) since additional pressure-block induced degeneration of a majority of X fibers reduces acuity substantially (Burke et al., 1987). A visual function which was compromised but not eliminated by selective Y fiber degeneration was the ability to discriminate fast motion. According to Burke (1986) and Burke et al. (1987), this indicates that while the Y pathway is superior in its ability to discriminate fast motion, the X pathway's ability to discriminate such motion nonetheless overlaps considerably with the Y pathway's. While showing that each pathway is specialized for particular visual functions, these and particularly the latter results support Lennie's (1980b) and Frascella and Lehmkuhle's (1984) claim that in cat there are no clear-cut distinctions or sharp restrictions of function between the two pathways.
The human visual system is more similar anatomically and physiologically to that of monkey than that of cat; and, as noted, the psychophysical performance of monkey provides evidence consistent with the existence of sustained and transient channels that parallels similar evidence found in humans (Harwerth et al., 1980). A problem, however, is that the visual systems of monkey and cat are sufficiently different that the properties used to define X and Y cells in cat, which typically have been used to draw parallels with sustained and transient vision in humans, may not be applicable to monkey (Shapley and Lennie, 1985). For example, several investigators (Dreher, Fukuda and Rodieck, 1976; Schiller and Malpeli, 1978; Hicks, Lee and Vidyasagar, 1983; Derrington and Lennie, 1984; Maunsell, 1987) report that the magnocellular (M) and parvocellular (P) cells in monkey LGN have predominantly transient and sustained response characteristics, respectively. Using response latency to electrical stimulation of the optic chiasm and absence or presence of a sustained response to standing contrast as classification criteria, Dreher et al. (1976) and Sherman et al. (1976) concluded that monkey M and P cells correspond to cat Y and X cells, respectively. However, when linearity of spatial summation is used to classify cells, not only are almost all P cells X-like but so are many M cells (Blakemore and Vital-Durand, 1981; Shapley, Kaplan and Soodak, 1981; Kaplan and Shapley, 1982; Marrocco, McClurkin and Young, 1982). More recent evidence suggests that many M cells may correspond to X cells, while most P cells do not (see Shapley, this volume). Regarding such classification problems, Rowe and Stone (1977) and Stone (1983) recommend on philosophical and methodological grounds against the use of a single or essentialistic (see Popper, 1962) criterion, such as linear spatial summation across
the receptive field, to identify cells, b u t rather advocate a multi-criterion classification system. With such a system, which relies on a number of physiological and anatomical criteria, the M and P cells of monkey are by and large transient and sustained (Schiller and Malpeli. 1978; Maunsell and Schiller. 1984). However, as with most analogies this one is imperfect, and it is further complicated by several findings which have questioned the equivalence of the originally proposed X/Y and sustained/ transient distinction. The temporal response characteristics of neurons are influenced by a number of stimulus variables (Shapley and Victor, 1978: Kaplan and Shapley, 1982) such as the wavelength composition of the stimulus (Marroccco, 1976: DeMonasterio, 1978a). the retinal eccentricity of neural receptive fields (Cleland and Levick. 1974; Cleland, 1983). and the state of adaptation (Zacks. 1975; Jakiela, Enroth-Cugell and Shapley. 1976; Saito and Fukuda. 1986).The effects of light or dark adaptation provide a particularly striking example as to why, according to Rowe and Stone (1977) and Stone (1983). a multi-criterion classification system should be employed. With dark adaptation Y cell responses become more sluggish or sustained until at scotopic levels one cannot distinguish between X and Y cells on the basis of absence or presence of a sustained response component (Zacks, 1975; Jakiela et al., 1976: Saito and Fukuda. 1986). However, as noted by Saito and Fukuda (1986). along with giving a sustained response at scotopic levels, Y cells also show linear spatial summation across the receptive field, a property which is absent a t photopic levels (Enroth-Cugell and Robson, 1966; Saito and Fukuda, 1986). Accordingly, not only do Y cells become sustained with dark adaptation but they also become X-like if one takes presence or absence of linear summation a s an absolute or essentialistic identification criterion. 
Besides raising obvious problems for cell classification, these results suggest a tighter link between absence or presence of a sustained response component on the one hand and linear spatial summation on the other than was previously believed (Zacks, 1975; Jakiela et al., 1976; Kaplan and Shapley, 1982). They also raise the possibility that other stimulus parameters such as wavelength composition which affect the level of response transience and sustainedness (Marrocco, 1976; DeMonasterio, 1978a) may correspondingly affect linear spatial summation. Visual latency is another temporal response characteristic whose use in distinguishing sustained X from transient Y pathways (Singer and Bedworth, 1973; Ikeda and Wright, 1975, 1976) has been criticized. In particular, using near-threshold stimuli, Lennie (1980b) was able to eliminate response latency differences between X and Y cells. However, as noted by Maunsell (1987), although neither visual response latency nor transience or sustainedness provides an absolutely reliable classification criterion, one would expect the most severe deviations from such a classification scheme for near-threshold stimuli and the clearest differences in response characteristics to emerge when suprathreshold stimuli are used. In particular, Maunsell (1987) reports that with clearly suprathreshold stimuli, M-pathway cells have a shorter visual response latency than P-pathway cells not only in LGN and cortical area V1 of macaque monkey but also at later stages of processing, e.g., when comparing transient cells in area MT and sustained cells in area V4.
Similar results, when suprathreshold stimuli are used, have been reported in extrastriate regions of owl monkey (Petersen, Miezin and Allman, 1988) as well as the retina (Bolz et al., 1982; Sestokas et al., 1987) and LGN (Sestokas and Lehmkuhle, 1986; Sestokas et al., 1987) of cat, although the latter results have been qualified by Troy and Lennie (1987). What these and the other findings discussed above point out is that since any single classification criterion may be unreliable, it is important to explore several stimulus dimensions extensively and to use many convergent physiological response criteria to classify cells. As noted previously, a similar multidimensional approach should be taken in psychophysically studying human parallel channels.
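The contrast between a single, essentialistic criterion and a multi-criterion classification can be sketched as a simple tally over independent verdicts. This is a toy illustration only: the criterion names are taken from the discussion above, but the tallying rule and the verdicts assigned to the two example cells are hypothetical and are not the procedure of Rowe and Stone (1977).

```python
def tally(verdicts):
    """Combine per-criterion verdicts ('X' or 'Y') into an overall label.

    Returns (label, n_x, n_y). A tie is reported as 'ambiguous' instead of
    being forced into one class. The rule itself is a hypothetical sketch.
    """
    n_x = sum(1 for v in verdicts.values() if v == "X")
    n_y = len(verdicts) - n_x
    if n_x == n_y:
        return "ambiguous", n_x, n_y
    return ("X-like" if n_x > n_y else "Y-like"), n_x, n_y


# Hypothetical verdicts for one Y cell measured at photopic levels.
photopic_y_cell = {
    "chiasm_latency": "Y",        # latency to electrical stimulation of the optic chiasm
    "sustained_response": "Y",    # no sustained response to standing contrast
    "linear_summation": "Y",      # nonlinear spatial summation
    "visual_latency": "Y",        # short visual response latency
}

# The same hypothetical cell after dark adaptation: two criteria now read X-like.
scotopic_y_cell = dict(photopic_y_cell, sustained_response="X", linear_summation="X")

if __name__ == "__main__":
    print("photopic:", tally(photopic_y_cell))
    print("scotopic:", tally(scotopic_y_cell))
    # A single-criterion rule based only on linear summation would relabel the cell:
    print("single criterion (scotopic):", scotopic_y_cell["linear_summation"])
```

In this sketch the dark-adapted cell is flagged as ambiguous rather than silently reclassified, whereas reliance on linear summation alone would call it an X cell.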
The role of color-opponent P and broad-band M pathways in vision
As Schiller and Malpeli's (1978) and Maunsell's (1987; Maunsell and Schiller, 1984) investigations indicate, it may be time to look increasingly to P and M pathways of monkey rather than X and Y pathways of cat for analogues of human sustained and transient channels. According to Livingstone and Hubel (1987, 1988), cells in the P and M pathways can be differentiated on the basis of color selectivity, contrast sensitivity, spatial resolution and temporal resolution. Specifically, single cell studies as well as investigations of the perceptual effects of selectively lesioning the P or M pathway in monkey have revealed the following response properties. The color-opponent P cells are selective in their response to wavelength while the broad-band M cells are not (De Valois et al., 1958; Wiesel and Hubel, 1966; Schiller and Malpeli, 1978). The P pathway, therefore, is involved in visual analysis and coding of color while the M pathway typically is not (Tootell et al., 1988b; Merigan, 1989; Schiller, Logothetis and Charles, 1990). However, although not considered to be color-coded, a large proportion of cells in the M pathway do show some color selectivity in that they have red-dominant receptive field surrounds (Wiesel and Hubel, 1966; De Monasterio, 1978a, b; De Monasterio and Schein, 1980; Derrington, Lennie and Krauskopf, 1984; Livingstone and Hubel, 1984; Marrocco, McClurkin and Young, 1988), which may be the basis for tonic suppression of their response by diffuse red light (Wiesel and Hubel, 1966; Dreher et al., 1976; Krueger, 1977; Schiller and Malpeli, 1978). P cells also differ from M cells in their contrast sensitivity, with M cells having lower contrast thresholds (Kaplan and Shapley, 1982; Hicks et al., 1983; Derrington and Lennie, 1984; Tootell, Hamilton and Switkes, 1988a). Moreover, although individual M cells can have a spatial resolution as high as that of P cells (Hicks et al.,
1983; Crook et al., 1988), the P pathway is crucial for higher spatial resolution whereas the M pathway is not (Merigan and Eskin, 1986; Tootell et al., 1988c; Merigan, 1989; Schiller and Logothetis, in press; Schiller et al., 1990). Besides these color and spatial response differences, P and M cells also show temporal response differences. In addition to having a longer latency and a sustained as compared to transient response, P cells are characterized by a poorer temporal resolution in that they prefer slow flicker or motion whereas M cells are crucial in visual analysis of fast flicker and motion (Merigan and Eskin, 1986; Schiller and Logothetis, in press; Schiller et al., 1990).
The geniculate P and M pathways, which begin at the retina with the B and A classes of ganglion cells, respectively (Leventhal, Rodieck and Dreher, 1981), branch into at least three identifiable cortical pathways which have different laminar and tangential distributions as revealed by patterns of staining for cytochrome oxidase and deoxyglucose (Tootell et al., 1983, 1988a,b,c). In cortical area V1, the P pathway splits into two anatomically distinct streams of processing, P-blob and P-interblob, which in turn project via the thin and pale stripes of area V2 to V4 and subsequently to inferotemporal cortex. The separate cortical M pathway originates in V1 and projects via V3 and the thick stripes of V2 to area MT and subsequently to the parietal cortex. Interaction exists between the cortical M and the two P pathways. In particular, V3 projects not only to MT but also to V4, and moreover, V4 is anatomically linked with both area MT and the parietal areas (DeYoe and Van Essen, 1988; Desimone and Ungerleider, 1989). These three cortical streams of processing and their interactions are discussed in greater detail in several reviews and play a prominent role in current models of visual perception (Cavanagh, 1987, 1989a,b; Livingstone and Hubel, 1987, 1988; DeYoe and Van Essen, 1988; Desimone and Ungerleider, 1989; Ramachandran, 1990).
While there is general agreement among the various theoretical models and empirical findings that the P and M pathways are closely tied to visual analysis of color and motion, respectively, there is disagreement regarding the specifics of this color/motion distinction as well as the roles of these pathways in the analysis of form and depth. To set the stage for discussion, the approach recently taken by Livingstone and Hubel (1987, 1988) will be outlined and compared to other approaches to visual perception also based on the M and P pathways. According to Livingstone and Hubel (1987, 1988),
information about depth and movement appears to be processed mainly by the M pathway whereas information about form and color is processed predominantly by the P system (for similar views, see Ramachandran, 1990). However, they also make the important point that while color and certain high-resolution aspects of form information are processed by the P-blob and P-interblob systems, respectively, other low-resolution and Gestalt-linking aspects of form information are processed by the M system. Although Livingstone and Hubel (1987, 1988) qualify their proposal for distinct M- and P-pathway functions, some investigators suggest that Livingstone and Hubel have proposed a stronger distinction by claiming that " ... the M-cell system alone can support virtually all aspects of vision except for color, and that it provides the exclusive basis not only for motion perception, but also for stereopsis, perception of the three-dimensionality of objects based on perspective and shading, and most of the Gestalt phenomena of 'linking operations'. According to this model, the only contribution of the P-cell system to perception are color and a two-fold increase in the resolution of simple achromatic patterns" (Desimone and Ungerleider, 1989, p. 278). Regardless of whether claims are made for a predominant or an exclusive contribution of a given pathway to a perceptual function, it seems that Livingstone and Hubel's (1987, 1988) proposals are in need of some clarification and revision. The need stems from two considerations. One deals with the anatomical, physiological and psychophysical evidence supporting common as well as distinct roles of
the M and P pathways in spatial and temporal vision; the other, with the use of isoluminant chromatic stimuli in humans (and monkeys) as a method of supposedly isolating P- from M-pathway functions in perception.
Evidence for shared and distinct spatial and temporal response properties of M and P pathways
Based on a review of anatomical, physiological and psychophysical studies, DeYoe and Van Essen (1988) propose the following scheme. Although the P-blob pathway performs only analysis of color and the M pathway dominates motion perception, the M pathway has no exclusive or dominant role in stereopsis as proposed by Livingstone and Hubel (1987, 1988). According to DeYoe and Van Essen (1988), the P-interblob system, besides supporting color and form vision, also plays a crucial role in stereopsis. Since the P system additionally plays a critical role in high spatial resolution (Livingstone and Hubel, 1987, 1988), one would in turn expect that its role in stereopsis is especially important for tasks requiring high resolution, as suggested by DeYoe and Van Essen (1988). These differential roles of the M and P systems as well as the role of the P system in high-resolution stereopsis have been corroborated by recent results reported by Schiller and co-workers (Schiller and Logothetis, in press; Schiller et al., 1990). These investigators looked at the effects of selectively lesioning the P- or the M-cell layers of the LGN on the disruption of visual capacities in monkeys. Their findings indicate that the P pathway is essential for the processing of not only color, texture and fine pattern but also fine stereopsis while the M pathway is crucial for the perception of fast flicker and motion. Coarse shape discrimination and stereopsis could be supported by either pathway. Moreover, the P system was found to support flicker and motion perception at low temporal frequencies, indicating that the M pathway, in addition to not dominating all aspects of stereopsis, does not entirely dominate all aspects of flicker or motion perception.
These spatiotemporal properties of the M and P pathways and their consequences for perception agree well with the updated psychophysical distinction between human transient and sustained channels mentioned above.
The roles of color and isoluminant stimuli in studies of parallel pathways
The distinction between the processing of luminance and color has had an important impact on current theoretical and methodological developments in the study of visual perception (Anstis and Cavanagh, 1983; Livingstone and Hubel, 1987, 1988; Cavanagh, 1989a,b; Cavanagh and Mather, 1989; Ramachandran, 1990). However, the claimed effects of luminance and color on responses in the M and P pathways, as well as the associated psychophysics in humans and monkeys, provide grounds for significant controversy. The above models and findings (e.g., DeYoe and Van Essen, 1988; Schiller et al., 1990) indicate that the color-opponent P pathway processes color information whereas the
broad-band M pathway, except for color opponency shown in the receptive field surrounds of mainly type IV M cells (Wiesel and Hubel, 1966; Dreher et al., 1976; Krueger, 1977; Schiller and Malpeli, 1978; Livingstone and Hubel, 1984), does not. In view of this, isoluminant stimuli, devoid of luminance variations and varying only in wavelength, should be processed by the P but not the M pathway. Since the M pathway presumably is color blind, the perceptual functions which it critically supports should be those functions lost or severely compromised when isoluminant stimuli are used. This rationale has been used by several investigators (Livingstone and Hubel, 1987, 1988; Ramachandran, 1990) to supposedly isolate perceptual functions attributable to the P pathway from those attributable to the M pathway.
There are several reasons for questioning the validity of such a rationale. Neither physiological nor psychophysical findings clearly support the use of isoluminant stimuli to distinguish M- from P-pathway function at early, precortical levels of visual analysis. Although some investigators report nulling of M-cell responses to isoluminant stimuli (Krueger, 1979; Hicks et al., 1983; Lee, Martin and Valberg, 1988, 1989b,c) and a maintenance of P-cell responses to the same stimuli (Hicks et al., 1983; Lee et al., 1989b), others (Schiller and Colby, 1983) find the converse, in that M cells could not be silenced at any heterochromatic luminance ratio while many P cells, particularly those lacking color selectivity, could be silenced (Logothetis et al., 1990). Even those investigators who report silencing of M cells at isoluminance report that this does not hold under all conditions (Krueger, 1979; Lee et al., 1988). As noted by Lee et al. (1989c), a nonlinearity occurring at or before the summation of medium- and long-wavelength cone inputs to the M cells could provide a basis for their responses to red-green isoluminant borders.
If responses to isoluminant stimuli occur in the M pathway at early levels of visual analysis, it would be reasonable also to expect such responses at later cortical levels. Indeed, several investigators have reported responses of direction-selective cells in area MT to moving stimuli made of isoluminant-color or relative-motion borders (Albright, 1987; Charles and Logothetis, 1989; Saito et al., in press). Even if a given M cell could be perfectly silenced at isoluminance, among M cells there is significant variation in the luminance ratios at which isoluminance is obtained (Schiller and Colby, 1983; Derrington et al., 1984; Logothetis et al., 1990). Hence, as noted by Cavanagh (1989b; Cavanagh and Anstis, 1986), in psychophysical studies no single luminance ratio would be expected to silence all cells and, thus, all activity in the M pathway. The residual perceptual abilities at isoluminance could therefore be attributed to such a weakened residual response in the M pathway. However, an alternative interpretation based on properties of cells in the P pathway is equally plausible. Schiller and Colby (1983) and Logothetis et al. (1990) report that many P cells also are unresponsive at isoluminance, indicating that impairment or compromise of visual capacities at isoluminance cannot be attributed to only one of the two pathways. In particular, Logothetis et al. (1990) show that high-resolution form perception in monkey, which presumably can be ascribed to the P (interblob) system (Livingstone and Hubel, 1987, 1988), is compromised at isoluminance, as are motion and depth perception, which presumably can be ascribed to the M pathway.
In humans similar ambiguities and inconsistencies exist regarding visual performance with isoluminant stimuli. For instance, while Lu and Fender (1972) and Gregory (1977) found that depth perception was absent in isoluminant random-dot stereograms, de Weert and Sadza (1983) found that observers could judge depth in such stereograms, although their ability to do so was impaired. On the assumption that the M pathway dominates stereopsis (Livingstone and Hubel, 1987, 1988), such residual abilities could be due to the aforementioned residual activity in the M pathway at isoluminance; however, as Schiller et al.'s (1990) findings suggest, they could also be due to the P pathway's contribution to the processing of random-dot stereograms. Similar considerations apply when isoluminant stimuli eliminate or impair the perceptions of shape from shading, of shadow as part of a spatially unbroken object or surface, of the related Gestalt-linking involved in the perception of subjective contours, and of static pictorial depth based on occlusion or perspective cues (Cavanagh, 1985, 1987, 1989b; Livingstone and Hubel, 1987, 1988; Cavanagh and Leclerc, 1989; Ingling and Grigsby, 1990; Ramachandran, 1990). The fact that the response of a substantial proportion of the P cells can be silenced or minimized at isoluminance (Logothetis et al., 1990) makes claims such as Livingstone and Hubel's (1987, 1988) about an exclusive or predominant role of the M pathway in these perceptual functions questionable (see also Cavanagh, 1989a; Ingling and Grigsby, 1990).
As another instance of ambiguities with the use of isoluminant stimuli, recall that the masking produced by the on- and offsets of a uniform luminance flash (Crawford, 1947) is found in the low spatial frequency transient channels but not the high spatial frequency sustained channels (Green, 1981, 1984). Since the transient M pathway supposedly does not respond well or at all to isoluminant wavelength or hue substitution,
suddenly substituting one background hue for another equiluminant one should produce little or no masking of a test flash at the onset of the hue substitution. Although this is true when the background wavelengths differ from each other only slightly, large wavelength differences produce a powerful transient masking effect (Glass and Sternheim, 1973). Similar results hold when target and mask stimuli consisting of hue substitutions against an isoluminant background are employed in metacontrast: when hue differences between stimuli and background are relatively small (e.g., orange on white), little or no metacontrast is obtained (Bowen, Pokorny and Cacciato, 1977); however, when the differences are large (red on green), metacontrast is obtained (Reeves, 1981; Breitmeyer, May and Williams, 1989; Breitmeyer, May and Scott, in preparation). If isoluminance eliminated or severely weakened the response in the transient M pathway, one should obtain little or no onset-transient and metacontrast masking. The presence of these two types of masking effects at isoluminance agrees with Schiller and Colby's (1983) and Derrington et al.'s (1984) findings that M cells are either not silenced at isoluminance or that no isoluminance value silences all M cells. Several lines of investigation have shown that the mechanisms underlying metacontrast may also contribute to motion perception in humans (Kahneman, 1967; Weisstein and Growney, 1969; Breitmeyer, Love and Wepman, 1974; Matin, 1975; Stoper and Banffy, 1977). If so,
then based on the above findings one would expect motion perception to exist at isoluminance. Although Ramachandran and Gregory (1978) reported an absence of motion perception in isoluminant random-dot cinematograms (RDCs; Julesz, 1971), subsequent investigations have shown that motion signals can be generated in such RDCs as well as in other isoluminant moving stimuli (Cavanagh, Tyler and Favreau, 1984; Cavanagh, Boeglin and Favreau, 1985; Cavanagh and Favreau, 1985; Derrington and Badcock, 1985; Sato, 1988; Cavanagh, 1989b; Cavanagh and Mather, 1989). Cavanagh and Favreau (1985) and Derrington and Badcock (1985) argue on the basis of these findings that a common motion pathway receiving convergent input from luminance and color patterns exists in the human visual system. This is a plausible notion since a significant proportion of simple and complex cells in monkey striate cortex receive convergent inputs from M and P layers of the LGN (Malpeli, Schiller and Colby, 1981). However, the notion of such a common motion pathway in human vision has been challenged recently by Gorea and Papathomas (1989a,b), who propose instead that one can psychophysically identify at least three distinct mechanisms, a luminance mechanism, a chromatic-plus-luminance mechanism, and a pure chromatic mechanism, each of which independently provides motion information.
Further evidence for modulation of motion perception by the addition of color to luminance contrast has been reported in a number of recent studies. For instance, Logothetis et al. (1990) showed that monkeys' perceptions of motion and depth were enhanced when color contrasts were added to luminance contrasts. While Cavanagh and Favreau (1985) report that adding chromatic contrast to a constant luminance contrast reduces its ability to generate or null motion aftereffects in human observers, other studies of humans (Ramachandran, Anstis and Rogers, 1987; Gorea and Papathomas, 1989b; Papathomas, Gorea and Julesz, 1989; Ramachandran,
1990) have shown that adding color to luminance contrast can resolve directional ambiguities in a number of directionally multistable apparent-motion (AM) displays. In addition, aspects of form such as orientation (Green, 1986; Gorea and Papathomas, 1989a,b; Mack et al., 1989) and texture (Ramachandran, Rao and Vidyasagar, 1973; Cavanagh, Arguin and von Gruenau, 1989) also can contribute to directional signals and disambiguation in AM displays. Rather than supporting the existence of a common motion pathway onto which form, color and motion information converge, as suggested by some investigators (Cavanagh and Favreau, 1985; Derrington and Badcock, 1985; Cavanagh, 1989b), these effects of color or form on motion, as alternatively proposed by Papathomas et al. (1989), could be due to interactions between the cortical M and P pathways (DeYoe and Van Essen, 1988; Desimone and Ungerleider, 1989). Such interactions may give rise to what DeYoe and Van Essen (1988) have termed a concealed or indirect contribution of color or form to, in this instance, motion perception or vice versa (Desimone and Ungerleider, 1989). They also could provide a basis for what Ramachandran (1987) terms "motion capture", a phenomenon in which moving, luminance-varying RDCs or illusory contours induce a sense of co-directional motion in a stationary isoluminant color border. In addition, they could explain the dependence of the perceived position of motion-segregated edges on the absence or presence of luminance-segregated edges (Anstis, 1989).
An additional property of M neurons, mentioned previously, is the red- or long-wavelength-dominant surround mechanism characterizing many of their receptive fields (Wiesel and Hubel, 1966; De Monasterio, 1978a,b; De Monasterio and Schein, 1980; Livingstone and Hubel, 1984; Marrocco et al., 1982). This property may be the basis for the tonic suppression of activity in M-pathway neurons produced selectively by diffuse red light (Dreher et al., 1976; Livingstone and Hubel, 1984; Van Essen, 1985). Along with several collaborators I have recently looked at the implications of these findings for human psychophysics. Since a red background selectively suppresses the transient M pathway, one would predict, among other things, that stimuli presented against a red as compared to an isoluminant green or neutral background should yield weaker metacontrast and motion as well as longer reaction times to stimulus onset. So far each of these predictions has been confirmed (Breitmeyer et al., 1989, in preparation; Breitmeyer and Williams, in press; Breier and Breitmeyer, in preparation). Of particular significance is the finding (Breitmeyer et al., 1989, in preparation) that stimuli consisting of green hue substitutions on an isoluminant red background yield weaker metacontrast and motion than red stimuli on green backgrounds. These asymmetries pose obvious problems for claims made by Kelly (1983, 1989) that a single pathway able to carry luminance as well as opponent-color information, such as the red-green X-cell channel proposed by Ingling and Martinez-Uriegas (1985), can fully account for spatiotemporal aspects of color vision. It is likely that further exploration of the suppressive effects of diffuse red light on M neurons will reveal additional properties of the M and P pathways and their respective contributions to spatial and temporal aspects of human vision.
Extensions of the parallel pathway approach
The control of visual orienting and attention
In their discussion of possible functional aspects of the Y pathway in cat, Waessle, Peichl and Boycott (1981) suggest that any change occurring in the visual environment would initially be signalled by the transient and fast-responding cells in that pathway. Hence, these cells could provide a system for triggering and directing visual orienting and attention. This is consistent with the projection of Y fibers to the superior colliculus (Hoffman, 1973) where, to use Schiller and Koerner's (1971) terminology, recipient Y cells could serve as "event detectors" signalling the events that trigger saccades and shifts of attention (Albano and Wurtz, 1981). In monkey, cells in the M pathway could serve as event detectors since, as shown by Schiller and Colby (1983), they are particularly well suited by their fast and transient response for detection of any spatially localized change. This role of transient channels was incorporated into Breitmeyer and Ganz's (1976) parallel channel model and has been corroborated by a number of psychophysical studies showing the prepotency of abrupt onsets in triggering and controlling spatial attentional shifts (Todd and Van Gelder, 1979; Jonides, 1981; Krumhansl, 1982; Yantis and Jonides,
1984, 1990; Jonides and Yantis, 1988; Yantis and Johnson, in press). Moreover, it is consistent with a number of clinical studies implicating the tectal midbrain areas in triggering and directing the movement of spatial attention (Singer, Zihl and Poeppel, 1977; Zihl and von Cramon, 1979; Posner, Cohen and Rafal, 1982). Deployment of eye movements and spatially selective attention during inspection of the visual environment also requires the posterior parietal cortex, a major recipient site of M-pathway projections (Mountcastle, 1978; Lynch, 1980; Robinson, Bushnell and Goldberg, 1980; Bushnell, Goldberg and Robinson, 1981). Clinical studies of humans have also revealed the crucial role of the posterior parietal cortex in disengaging attention from a current target location so that it can be free to shift to another target locus (Posner et al., 1984, 1987; Posner, 1988; Farah et al., 1989; Petersen, Robinson and Currie, 1989). Moreover, as suggested by Desimone and Ungerleider (1989) and Posner and Petersen (1990), the posterior parietal attentional system may effect spatial selectivity in ventral pattern recognition areas, such as V4 and IT (Desimone and Moran, 1985), via the interactive linkages between the dorsal M and the ventral P pathways. Posner and Petersen (1990) suggest that this particular interaction is communicated through the pulvinar of the thalamus (Petersen, Robinson and Morris, 1987), consistent with the finding that the pulvinar does modulate spatial selectivity of receptive fields in the ventral pattern recognition system (Gross, Bender and Rocha-Miranda, 1974).
The ventral/dorsal cortical streams of processing in object/spatial and far/near vision
Despite linkage between the M and P pathways (DeYoe and Van Essen, 1988; Desimone and Ungerleider, 1989), the two processing streams by and large take separate anatomical routes in visual cortex. The cortical M pathway originating in layer 4C-alpha of area V1 projects dorsally via layer 4B of V1 through area V3 and the thick stripes of area V2 to MT and additional areas in the superior temporal sulcus on its way to area PG in posterior parietal cortex. The cortical P pathway originating mainly in layer 4C-beta of area V1 projects ventrally along the P-blob and P-interblob routes via area V2 to area V4 on its way to area TE in inferotemporal cortex. Mishkin, Ungerleider and coworkers (Ungerleider and Mishkin, 1982; Mishkin et al., 1983; Ungerleider, 1985; Desimone and Ungerleider, 1989) propose that the anatomically and physiologically distinguishable dorsal and ventral streams of processing comprise two functionally distinct cortical systems. Based on an extensive review of research on the differential impairments to vision produced by selective lesions of the two pathways in monkey and of related visual impairments associated with damage to parietal and temporal cortex in humans, they argue that the dorsal and ventral pathways support various aspects of spatial perception and object perception, respectively. The proposal that the visual system performs separate functions of spatial perception and object recognition is not new (Ingle, 1967; Held, 1968). However, the work of Mishkin, Ungerleider and coworkers elaborates and extends the proposal by showing that the two visual functions, originally relegated to the tectum and cortex, respectively (Schneider, 1967; Trevarthen, 1968), are
additionally supported by distinct cortical pathways. More recent investigations of intact as well as brain-damaged humans not only support the existence of these two distinct cortical pathways (Zihl, Von Cramon and Mai, 1983; Hess, Baker and Zihl, 1989; Lueck et al., 1989; Vaina, 1989) but also indicate cross-linkage between them (Vaina, 1989).
A different proposal for the roles of the two pathways has recently been made by Previc (1990). Previc's (1990) proposal rests on a review of anatomical, physiological, behavioral and clinical data suggesting anatomical and functional differences between the upper and lower visual field representations in the visual system. In particular, it is argued that the lower visual field is biased toward perception of near or "peripersonal" space whereas the upper visual field is biased toward perception of far or "extrapersonal" space. Insofar as near space is involved with manipulative and consummatory behavior and far space is involved with exploratory and orienting behavior, this is something of a departure from other schemes, e.g., that of Trevarthen (1978), in which visual space is partitioned into central focal vision concerned with object recognition and consummatory behavior and ambient peripheral vision concerned with spatial exploration and orientation. By implication, since focal vision is concerned with object recognition, one would expect the ventral P pathway to play a predominant role in near vision whereas the dorsal M pathway would be more crucial in ambient far vision. In contrast to this scheme, Previc (1990) argues that the functional differences between near and far visual space instead are correlated with the disproportionate representations of the lower and upper visual hemifields in the dorsal M and ventral P divisions of the visual association areas. Indeed, the dorsal M pathway in monkey cortex does show a bias not only toward lower hemifield representation (Van Essen,
1985; Maunsell, 1987) but also for near or crossed disparities (Maunsell and Van Essen, 1983; Komatsu, Roy and Wurtz, 1988), which, as hypothesized by Maunsell and Van Essen (1987), may be related to control of hand movements during reaching. Binocular neurons in the ventral P pathway seem to be tuned to stimuli in the fixation plane, i.e., at zero disparity (Burkhalter and Van Essen, 1986). These findings are consistent with the behavioral strategies of primates scanning and fixating objects to be visually identified before being grasped or manipulated by hands reaching along a trajectory typically found in the lower visual field. Similarly, as proposed by Levick (1977) and Pettigrew and Dreher (1987), in the cat the bias of the cortical Y system for crossed disparities may render it particularly useful for analysis of near space whereas the X system is more useful for analysis of objects in the fixation plane. Human vision also shows a bias for crossed disparities in the lower visual field and for uncrossed disparities in the upper visual field (Breitmeyer, Julesz and Kropfl, 1975; Julesz, Breitmeyer and Kropfl, 1976). Such biases would be consistent with Previc's (1990) proposal that the lower and upper fields are in turn biased toward perception of near and far space. However, as noted by Breitmeyer (in press; Breitmeyer, Battaglia and Bridge, 1976), the differences between the disparity biases of the upper and lower hemifields could additionally correlate with locomotion on a horizontal ground plane.
Sustained and transient channels in reading and reading disability
Possible roles of sustained and transient channels in dynamic vision characterized by a variety of eye movements have been discussed by several investigators (Breitmeyer, 1980, 1984; Barlow, 1981). In particular, it has been proposed (Matin, 1974; Breitmeyer, 1980, 1984; Volkmann, 1986) that the inhibition which the transient channels exert on sustained channels in metacontrast and, analogously, which Y cells exert on X cells (Singer and Bedworth, 1973) serves as a mechanism of saccadic suppression. Saccadic suppression clears the retinotopically organized sustained channels between fixations so that the pattern information carried by these channels from a prior fixation does not carry over and mask the pattern information picked up by the same channels during the succeeding fixation. Hence, by inhibiting the sustained channels' response persistence to pattern stimulation from a prior fixation, saccadic suppression expedites the pick-up of information during foveal scanning of spatially extended patterns such as reading material. In addition to this function, Matin (1974) notes that saccadic suppression also prevents the perception of retinal image smear during saccadic eye movements and is important in maintaining constancy of visual direction and a stable visual world despite the continually changing retinal images while scanning.
Over the past decade a series of studies reported separately by Bill Lovegrove and Mary Williams and their coworkers (see Lovegrove, Martin and Slaghuis, 1986; Williams and LeCluyse, 1990; Williams and Lovegrove, this volume) indicates that about 70% of dyslexic or specifically reading disabled (SRD) subjects suffer from a deficit in transient channel activity.
Compared to normal subjects, SRDs have poorer temporal resolution as shown by lower flicker sensitivity and longer visual persistence to low spatial frequency stimuli (Lovegrove et al., 1986; Williams and LeCluyse, 1990) as well as poorer double-flash temporal order judgments (Williams and LeCluyse, 1990). More significantly, as expected from a transient channel deficit, SRDs also show a pronounced attenuation of metacontrast (Williams and LeCluyse, 1990). These findings clearly are relevant to our understanding of SRD. They suggest that saccadic suppression is substantially weaker in SRD than in normal subjects. Although data relying on direct experimental tests of this conjecture are needed, I would like to follow up some of its consequences. First, a weaker saccadic suppression would result in greater persistence of pattern activity generated during a prior fixation. Relative to similar activity generated by the following fixation, this would constitute a source of noise impeding or masking efficient pick-up of sequentially scanned information. In reading tasks, weaker saccadic suppression would thus contribute to a primary visual deficit in SRD subjects. Moreover, following Matin's (1974) reasoning, defective saccadic suppression also would lead to an interrelated set of secondary visual problems in SRD, including retinal image smear, a loss of visual direction constancy, and instability of the visual world. Such deficits have been reported to exist in about 60-70% of dyslexic subjects studied by Stein, Riddell and Fowler (1989), which is also the percentage of transient channel deficits reported to exist in SRDs by
Lovegrove et al. (1986). Further research is required to determine if this concordance is mere coincidence or reflects a common basis as conjectured here.
Applications of sustained and transient channels to the study of psychological abnormalities
Previously I noted that the distinction between sustained, form/pattern channels and transient, flicker/motion channels already has found application to the study of visual abnormalities in multiple sclerosis and other ophthalmological abnormalities such as open-angle glaucoma, ocular hypertension, and optic neuritis (Enoch, 1978; Brussell et al., 1984; Regan and Neima, 1984; Silverman et al., 1990; Ghilardi et al., this volume). In recent years a number of investigators also have reported deficits of visual information processing in schizophrenic and schizotypal subjects (Steronko and Woods, 1978; Saccuzzo and Schubert, 1981; Merritt and Balogh, 1990, in press; Balogh and Merritt, 1985, 1987; Nakano and Saccuzzo, 1985). Specifically, Merritt and Balogh (1990, in press) present and review evidence based on backward masking which is consistent with the hypothesis that schizotypal subjects are characterized by aberrant transient channel activity. However, although these preliminary findings are intriguing and suggestive, the extent to which and the manner in which transient activity in schizophrenics and schizotypics is abnormal must, as noted by Merritt and Balogh (1990), still be determined.
Summary and conclusions
Although several loosely related lines of research on parallel pathways in human vision can be traced back for decades, it was the discovery of X and Y cells in cat in the 1960s that stimulated an acceleration of psychophysical investigations specifically focused on defining the properties of analogous pathways in humans. Most of the early research was framed in the context of explicitly as well as implicitly articulated distinctions hypothesized to exist between sustained and transient channels. These distinctions were incorporated into models of spatiotemporal vision, visual information processing and visual masking. Although a significant portion of this research was subsequently criticized on theoretical and methodological grounds, the main tenets on which the distinctions were based have survived in revised form, incorporating some of the major theoretical criticisms and more recent empirical clarifications and elaborations. The revisions of the sustained-transient channel approach have been accompanied by a shift from drawing analogies with the X and Y pathways of cat to more recent analogies with the opponent-color P and broad-band M pathways in monkey. The latter development has broadened the psychophysically defined distinctions between parallel pathways in human vision by including, along with temporal and spatial response differences, differences of chromatic sensitivity. These distinctions in turn have been criticized on theoretical and methodological grounds and are in need of revision. Nonetheless, they have provided and, in revised form, can continue to provide a useful
framework for investigating a number of visual functions such as the perceptions of form, color, depth and motion in humans. They also are closely related to recent distinctions made between pathways for object recognition and spatial vision and similarly for visual function in near (peripersonal) and far (extrapersonal) space. Moreover, extensions of the parallel-channels approach help inform us about a number of other phenomena such as selective spatial attention and reading, and they provide a means of investigating a number of visual abnormalities associated with specific reading disability, ophthalmological disorders, and, possibly, also schizophrenia.
References
Albano, W. R. & Wurtz, R. H. (1981). The role of primate superior colliculus, pretectum and posterior-medial thalamus in visually guided eye movements. In A. F. Fuchs & W. Becker (Eds.), Progress in Oculomotor Research, pp. 153-160. Amsterdam: Elsevier.
Albright, T. D. (1987). Isoluminant motion processing in macaque visual area MT. Society of Neuroscience Abstracts, 13, 1626.
Alpern, M. (1953). Metacontrast. Journal of the Optical Society of America, 43, 648-657.
Anderson, S. J. & Burr, D. C. (1985). Spatial and temporal selectivity in the human motion detection system. Vision Research, 25, 1147-1154.
Anstis, S. (1989). Kinetic edges become displaced, segregated, and invisible. In D. M.-K. Lam & C. D. Gilbert (Eds.), Neural Mechanisms of Visual Perception: From Single Cells to Perception, pp. 247-260. Houston: Gulf Publishing.
Anstis, S. M. & Cavanagh, P. (1983). A minimum motion technique for judging equiluminance. In J. D. Mollon & L. T. Sharpe (Eds.), Colour Vision: Physiology and Psychophysics, pp. 156-166. London: Academic Press.
Badcock, D. R. & Sevdalis, E. (1987). Masking by uniform-field flicker: Some practical problems. Perception, 16, 641-647.
Balogh, D. W. & Merritt, R. D. (1985). Susceptibility to type A pattern masking among hypothetically psychosis-prone college students. Journal of Abnormal Psychology, 94, 377-383.
Balogh, D. W. & Merritt, R. D. (1987). Visual masking and the schizophrenia spectrum: Interfacing clinical and experimental methods. Schizophrenia Bulletin, 13, 679-698.
Banta, A. R. & Breitmeyer, B. G. (1985). Stationary patterns suppress the perception of stroboscopic motion. Vision Research, 25, 1501-1505.
Barlow, H. B. (1981). Critical limiting factors in the design of the eye and visual cortex. Proceedings of the Royal Society, London, 212 B, 1-34.
Blake, R. & Camisa, J. M. (1977). Temporal aspects of spatial vision in the cat. Experimental Brain Research, 28, 325-333.
Blakemore, C. & Campbell, F. W. (1969). On the existence of neurones in the human visual system selectively sensitive to orientation and size of retinal image. Journal of Physiology, 203, 237-260.
Blakemore, C. & Vital-Durand, F. (1981). Distribution of X- and Y-cells in monkey's lateral geniculate nucleus. Journal of Physiology, 320, 17-18P.
Bolz, J., Rosner. G. & Waessle, H. (1982).Response latency of brisk-sustained (XI and brisk-transient (Y) cells in the cat retina. Journal of Physiology, 328, 171 - 190. Bowen. R. W., Pokorny. J. & Cacciato. D. (1977).Metacontrast masking depends on luminance transients. Vision Research, 17. 971-975. Bowker. D. 0. & Tulunay-Keesey, U . (1983). Sensitivity to countermodulating gratings following spatiotemporal adaptation. Journal of the Optical Society of America, 73. 427-435. Breier. J. & Breitmeyer, B. G. (in preparation). Effects of isoluminant-background color on visual reaction time and bistable motion. Breitmeyer, B. G. (1975).Simple reaction time as a measure of the temporal response properties of transient and sustained channels. Vision Research, 15. 1411-1412. Breitmeyer. B. G. (1978).Disinhibition of metacontrast masking of Vernier acuity targets: Sustained channels inhibit transient channels. Vision Research, 18, 1401-1405. Breitmeyer, B. G. (1980).Unmasking visual masking: A look at the 'why' behind the veil of the 'how'. Psychological Review, 87. 52-69. Breitmeyer, B. G. (1984).Visual Masking: An Integrated Approach. New York: Oxford University Press. Breitmeyer. B. G. (1990).Ups and downs of the visual field: 'Manipulation and locomotion. Behavioral and Brain Sciences, 13,544545. Breitmeyer. B., Battaglia, F. & Bridge, J. (1976).Existence and implications of a tilted binocular disparity space. Perception, 6. 161 - 164. Breitmeyer. B. G. & Ganz. L. (1976): Implications of sustained and transient channels for theories of visual pattern masking, saccadic suppression, and information processing. Psychological Review , 83. 1-36. Breitmeyer, B. G. & Ganz. L. (1977).Temporal studies with flashed gratings: Inferences about human transient and sustained channels. Vision Research, 17. 861-865. Breitmeyer. B. G. & Julesz, B. (1975). The role of on and off transients in determining the psychophysical spatial frequency response. Vision Research, 15,41 1-415. 
Breitmeyer, B. G., Julesz, B. & Kropfl, W. (1975). Dynamic random-dot stereograms reveal up-down anisotropy and left-right isotropy between cortical hemifields. Science, 187, 269-270.
Breitmeyer, B. G., Levi, D. M. & Harwerth, R. S. (1981a). Flicker-masking in spatial vision. Vision Research, 21, 1377-1385.
Breitmeyer, B. G., Love, R. & Wepman, B. (1974). Contour suppression during stroboscopic motion and metacontrast. Vision Research, 14, 1451-1456.
Breitmeyer, B. G., May, J. G. & Williams, M. C. (1989). Asymmetries in metacontrast and motion with red/green isoluminant stimuli. Paper presented at the annual meeting of the Psychonomic Society, Atlanta, Georgia, November 17-19.
Breitmeyer, B. G., May, J. G. & Scott, S. (in preparation). Metacontrast and motion reveal asymmetries at red/green isoluminance.
Breitmeyer, B. G., Rudd, M. & Dunn, K. (1981b). Spatial and temporal parameters of metacontrast disinhibition. Journal of Experimental Psychology: Human Perception and Performance, 7, 770-779.
Breitmeyer, B. G. & Williams, M. C. (in press). Effects of isoluminant-background color on metacontrast and stroboscopic motion: Interactions between sustained (P) and transient (M) channels. Vision Research.
Bridgeman, B., Lewis, S., Heit, G. & Nagle, M. (1979). Relation between cognitive and motor-oriented systems on visual position perception. Journal of Experimental Psychology: Human Perception and Performance, 5, 692-700.
Brown, J. L. & Black, J. E. (1975). Critical duration for resolution of acuity targets. Vision Research, 15, 309-315.
Brussell, E. M., White, C. W., Mustillo, P. & Overbury, O. (1984). Inferences about mechanisms that mediate pattern and flicker sensitivity. Perception & Psychophysics, 35, 301-304.
Burbeck, C. (1981). Criterion-free pattern and flicker thresholds. Journal of the Optical Society of America, 71, 1343-1350.
Burbeck, C. & Kelly, D. H. (1981). Contrast gain measurements and the transient/sustained dichotomy. Journal of the Optical Society of America, 71, 1335-1342.
Burkhalter, A. & Van Essen, D. C. (1986). Processing of color, form and disparity information in visual areas VP and V2 of ventral extrastriate cortex in the macaque monkey. Journal of Neuroscience, 6, 2327-2351.
Burke, W. (1986). The function of optic nerve fibre groups in the cat studied by means of selective block. In J. D. Pettigrew, K. J. Sanderson & W. R. Levick (Eds.) Visual Neuroscience, pp. 97-110. Cambridge, England: Cambridge University Press.
Burke, W., Burne, J. A. & Martin, P. R. (1985). Selective block of Y optic nerve fibres in the cat and the occurrence of inhibition in the lateral geniculate nucleus. Journal of Physiology, 364, 81-92.
Burke, W., Cottee, L. J., Garvey, J., Kumarasinghe, R. & Kyriacou, C. (1986). Selective degeneration of optic nerve fibres in the cat produced by a pressure block. Journal of Physiology, 376, 461-476.
Burke, W., Cottee, L. J., Hamilton, K., Kerr, L., Kyriacou, C. & Milosavljevic, M. (1987). Function of the Y optic nerve fibres in the cat: Do they contribute to acuity and the ability to discriminate fast motion? Journal of Physiology, 392, 35-50.
Bushnell, M. C., Goldberg, M. E. & Robinson, D. L. (1981). Behavioral enhancement of visual responses in monkey cerebral cortex. I. Modulation in posterior parietal cortex related to selective visual attention. Journal of Neurophysiology, 46, 755-772.
Campbell, F. W. & Kulikowski, J. J. (1966). Orientation selectivity of the human visual system. Journal of Physiology, 187, 437-445.
Casima, J. M., Blake, R. & Lema, S. (1977). The effects of temporal modulation on the oblique effect in humans. Perception, 6, 165-171.
Cavanagh, P. (1985). Subjective contours signalled by luminance, vetoed by motion or depth. Bulletin of the Psychonomic Society, 23, 273.
Cavanagh, P. (1987). Reconstructing the third dimension: Interactions between color, texture, motion, binocular disparity and shape. Computer Vision, Graphics and Image Processing, 37, 171-195.
Cavanagh, P. (1989a). Multiple analyses of orientation in the visual system. In D. M.-K. Lam and C. D. Gilbert (Eds.) Neural Mechanisms of Visual Perception: From Single Cells to Perception, pp. 261-279. Houston: Gulf Publishers.
Cavanagh, P. (1989b). Pathways in early vision. In Z. Pylyshyn (Ed.) Computational Processes in Human Vision: An Interdisciplinary Perspective, pp. 254-289. Norwood, New Jersey: Ablex.
Cavanagh, P. & Anstis, S. (1986). The contribution of color to motion in normal and color-deficient observers. Investigative Ophthalmology and Visual Science (Suppl.), 27, 291.
Cavanagh, P., Arguin, M. & von Gruenau, M. (1989). Interattribute apparent motion. Vision Research, 29, 1197-1204.
Cavanagh, P., Boeglin, J. & Favreau, O. E. (1985). Perception of motion in equiluminous kinematograms. Perception, 14, 151-162.
Cavanagh, P. & Favreau, O. E. (1985). Color and luminance share a common motion pathway. Vision Research, 25, 1595-1601.
Cavanagh, P. & Leclerc, Y. G. (1989). Shape from shadows. Journal of Experimental Psychology: Human Perception and Performance, 15, 3-27.
Cavanagh, P. & Mather, G. (1989). Motion: The long and short of it. Spatial Vision, 4, 103-129.
Cavanagh, P., Tyler, C. W. & Favreau, O. E. (1984). Perceived velocity of moving chromatic gratings. Journal of the Optical Society of America, A1, 893-899.
Charles, E. R. & Logothetis, N. K. (1989). The responses of middle temporal (MT) neurons to isoluminant stimuli. Investigative Ophthalmology and Visual Science (Suppl.), 30, 427.
Cleland, B. G. (1983). Sensitivity to stationary flashing spots of the brisk classes of ganglion cells in the cat retina. Journal of Physiology, 345, 15-26.
Cleland, B. G., Dubin, M. W. & Levick, W. R. (1971). Sustained and transient neurones in the cat's retina and lateral geniculate nucleus. Journal of Physiology, 217, 473-496.
Cleland, B. G. & Harding, T. H. (1983). Response to the velocity of moving visual stimuli of the brisk classes of ganglion cells in the cat retina. Journal of Physiology, 345, 47-63.
Cleland, B. G. & Levick, W. R. (1974). Brisk and sluggish concentrically organized ganglion cells in the cat's retina. Journal of Physiology, 240, 421-456.
Crawford, B. H. (1947). Visual adaptation in relation to brief conditioning stimuli. Proceedings of the Royal Society, London, 129B, 94-106.
Crook, J. M., Lange-Malecki, B., Lee, B. B. & Valberg, A. (1988). Visual resolution of macaque retinal ganglion cells. Journal of Physiology, 396, 205-224.
Dawis, S., Shapley, R., Kaplan, E. & Tranchina, D. (1984). The receptive field organization of X-cells in the cat: Spatiotemporal coupling and asymmetry. Vision Research, 24, 549-564.
de Lange, H. (1958). Research into the dynamic nature of the human fovea-cortex systems with intermittent and modulated light. II. Phase shifts in brightness and delay in color perception. Journal of the Optical Society of America, 48, 784-789.
De Monasterio, F. M. (1978a). Properties of concentrically organized X and Y ganglion cells in macaque retina. Journal of Neurophysiology, 41, 1394-1417.
De Monasterio, F. M. (1978b). Center and surround mechanisms of opponent-color X and Y ganglion cells of retina of macaques. Journal of Neurophysiology, 41, 1418-1434.
De Monasterio, F. M. & Schein, S. J. (1980). Protan-like spectral sensitivity of foveal Y ganglion cells of the retina of macaque monkeys. Journal of Physiology, 299, 385-396.
Derrington, A. M. & Badcock, D. R. (1985). The low level motion system has both chromatic and luminance inputs. Vision Research, 25, 1879-1884.
Derrington, A. M. & Henning, G. B. (1981). Pattern discrimination with flickering stimuli. Vision Research, 21, 597-602.
Derrington, A. M., Krauskopf, J. & Lennie, P. (1984). Chromatic mechanisms in lateral geniculate nucleus of macaque. Journal of Physiology, 357, 241-265.
Derrington, A. M. & Lennie, P. (1982). The influence of temporal frequency and adaptation level on receptive field organization of retinal ganglion cells in cat. Journal of Physiology, 333, 343-366.
Derrington, A. M. & Lennie, P. (1984). Spatial and temporal contrast sensitivities of neurones in lateral geniculate nucleus of macaque. Journal of Physiology, 367, 219-240.
Desimone, R. & Ungerleider, L. (1989). Neural mechanisms of visual processing in monkeys. In F. Boller & J. Grafman (Eds.) Handbook of Neuropsychology, Vol. 2, pp. 267-299. Amsterdam: Elsevier.
De Valois, R. L., Smith, C. J., Kanoly, A. & Kitai, S. T. (1958). Electric responses of primate visual system: I. Different layers of macaque lateral geniculate nucleus. Journal of Comparative and Physiological Psychology, 61, 662-668.
de Weert, C. M. M. & Sadza, K. J. (1983). New data concerning the contribution of colour differences to stereopsis. In J. D. Mollon & L. T. Sharpe (Eds.) Colour Vision: Physiology and Psychophysics, pp. 553-562. London: Academic Press.
DeYoe, E. A. & Van Essen, D. C. (1988). Concurrent processing streams in monkey visual cortex. Trends in Neuroscience, 11, 219-226.
Dreher, B., Fukuda, Y. & Rodieck, R. W. (1976). Identification, classification and anatomical segregation of cells with X-like and Y-like properties in the lateral geniculate nucleus of old-world primates. Journal of Physiology, 258, 433-452.
Eckhorn, R. & Poepel, B. (1981). Responses of cat retinal ganglion cells to the random motion of a spot stimulus. Vision Research, 21, 435-443.
Enoch, J. M. (1978). Quantitative layer-by-layer perimetry. Investigative Ophthalmology & Visual Science, 17, 205-257.
Enroth-Cugell, C. & Robson, J. G. (1966). The contrast sensitivity of retinal ganglion cells of the cat. Journal of Physiology, 187, 517-552.
Enroth-Cugell, C., Robson, J. G., Schweitzer-Tong, D. E. & Watson, A. B. (1983). Spatiotemporal interactions in cat retinal ganglion cells showing linear spatial summation. Journal of Physiology, 341, 279-307.
Essock, E. A. & Lehmkuhle, S. (1982). The oblique effects of pattern and flicker sensitivity: Implications for mixed physiological input. Perception, 11, 441-455.
Farah, M. J., Wong, A. B., Monheit, M. A. & Morrow, L. A. (1989). Parietal lobe mechanisms of spatial attention: Modality-specific or supramodal. Neuropsychologia, 27, 461-470.
Ferrera, V. P. & Wilson, H. R. (1985). Spatial frequency tuning of transient non-oriented units. Vision Research, 25, 67-72.
Frascella, J. & Lehmkuhle, S. (1984). An electrophysiological assessment of X and Y cells as pattern and flicker detectors in the dorsal lateral geniculate nucleus of the cat. Experimental Brain Research, 55, 117-126.
Ghilardi, M. F., Onofrj, M. & Brannan, J. R. (1991). How can the concept of parallel channels aid clinical diagnosis? In J. R. Brannan (Ed.) Applications of Parallel Processing in Vision. Amsterdam: Elsevier.
Glass, R. A. & Sternheim, C. E. (1973). Visual sensitivity in the presence of alternating monochromatic fields of light. Vision Research, 13, 689-699.
Goldberg, M. E. & Robinson, D. L. (1978). Visual system: Superior colliculus. In R. B. Masterton (Ed.) Handbook of Behavioral Biology, pp. 119-164. New York: Plenum.
Gorea, A. (1979). Directional and nondirectional coding of a spatio-temporal modulated stimulus. Vision Research, 19, 545-549.
Gorea, A. & Papathomas, T. V. (1989). Form and surface attributes in motion perception studied with a new class of stimuli: A basic asymmetry. Bell Laboratories Technical Memorandum.
Gorea, A. & Papathomas, T. V. (1989). Motion processing by chromatic and achromatic visual pathways. Journal of the Optical Society of America, A6, 590-602.
Green, M. (1981). Spatial frequency effects in masking by light. Vision Research, 21, 861-866.
Green, M. (1984). Masking by light and the sustained-transient dichotomy. Perception & Psychophysics, 35, 519-535.
Green, M. (1986). What determines correspondence strength in apparent motion? Vision Research, 26, 596-607.
Gregory, R. L. (1977). Vision with isoluminant colour contrast: 1. A projection technique and observations. Perception, 6, 113-119.
Gross, C. G., Bender, D. B. & Rocha-Miranda, C. E. (1974). Inferotemporal cortex: A single-unit analysis. In F. O. Schmitt & F. G. Worden (Eds.) The Neurosciences Third Study Program, pp. 229-238. Cambridge, Massachusetts: MIT Press.
Harris, M. G. (1980). Velocity specificity of the flicker to pattern sensitivity ratio in human vision. Vision Research, 20, 687-691.
Harwerth, R. S., Boltz, R. L. & Smith, E. L. (1980). Psychophysical evidence for sustained and transient channels in the monkey visual system. Vision Research, 20, 15-22.
Held, R. (1968). Dissociation of visual functions by deprivation and rearrangement. Psychologische Forschung, 31, 338-348.
Hess, R. F., Baker, C. L. Jr. & Zihl, J. (1989). The "motion-blind" patient: Low-level spatial and temporal filters. Journal of Neuroscience, 9, 1628-1640.
Hess, R. F. & Plant, G. T. (1985). Temporal frequency discrimination in human vision: Evidence for an additional mechanism in the low spatial and high temporal frequency region. Vision Research, 25, 1493-1500.
Hicks, T. P., Lee, B. B. & Vidyasagar, T. R. (1983). The responses of cells in macaque lateral geniculate nucleus to sinusoidal gratings. Journal of Physiology, 337, 183-200.
Hochstein, S. & Shapley, R. M. (1976). Quantitative analysis of retinal ganglion cell classifications. Journal of Physiology, 262, 237-264.
Hoffmann, K.-P., Stone, J. & Sherman, S. M. (1972). Relay of receptive field properties in the dorsal lateral geniculate nucleus of the cat. Journal of Neurophysiology, 35, 518-531.
Hughes, H. C. (1986). Asymmetric interference between components of suprathreshold compound gratings. Perception & Psychophysics, 40, 241-250.
Humphrey, N. K. (1974). Vision in a monkey without striate cortex: A case study. Perception, 3, 241-255.
Ingle, D. (1967). Two visual mechanisms underlying the behavior of fish. Psychologische Forschung, 31, 44-51.
Ingling, C. R. Jr. & Grigsby, S. S. (1990). Perceptual correlates of magnocellular and parvocellular channels: Seeing form and depth in afterimages. Vision Research, 30, 823-828.
Ingling, C. R. & Martinez-Uriegas, E. (1985). The spatiotemporal properties of the r-g X-cell channel. Vision Research, 25, 33-38.
Jakiela, H. G., Enroth-Cugell, C. & Shapley, R. (1976). Adaptation and dynamics in X-cells and Y-cells of the cat retina. Experimental Brain Research, 24, 335-342.
Jonides, J. (1981). Voluntary vs. automatic control over the mind's eye's movement. In J. B. Long & A. D. Baddeley (Eds.) Attention and Performance IX, pp. 187-203. Hillsdale, New Jersey: Erlbaum.
Jonides, J. & Yantis, S. (1988). Uniqueness of abrupt visual onset as an attention-capturing property. Perception & Psychophysics, 43, 346-354.
Julesz, B. (1971). Foundations of Cyclopean Perception. Chicago: University of Chicago Press.
Julesz, B., Breitmeyer, B. & Kropfl, W. (1976). Binocular-disparity-dependent upper-lower hemifield anisotropy and left-right isotropy as revealed by dynamic random-dot stereograms. Perception, 5, 129-141.
Jung, R. (1961). Korrelationen von Neuronentaetigkeit und Sehen. In R. Jung & H. H. Kornhuber (Eds.) Neurophysiologie und Psychophysik des visuellen Systems, pp. 410-435. Berlin: Springer.
Jung, R. (1973). Visual perception and neurophysiology. In R. Jung (Ed.) Handbook of Sensory Physiology, Vol. VII/3A. Central Processing of the Visual System, pp. 1-152. Berlin: Springer.
Kahneman, D. (1967). An onset-onset law for one case of apparent motion and metacontrast. Perception & Psychophysics, 2, 577-584.
Kahneman, D. (1968). Method, findings, and theory in studies of visual masking. Psychological Bulletin, 70, 404-425.
Kaplan, E. & Shapley, R. M. (1982). X and Y cells in the lateral geniculate nucleus of macaque monkeys. Journal of Physiology, 330, 125-143.
Kelly, D. H. (1983). Spatiotemporal variation of chromatic and achromatic contrast thresholds. Journal of the Optical Society of America, 73, 742-750.
Kelly, D. H. (1989). Spatial and temporal interactions in color vision. Journal of Imaging Technology, 15, 82-89.
Kelly, D. H. & Burbeck, C. A. (1984). Critical problems in spatial vision. CRC Critical Reviews in Biomedical Engineering, 10, 125-177.
Kelly, D. H. & Burbeck, C. A. (1987). Further evidence for a broadband, isotropic mechanism sensitive to high-velocity stimuli. Vision Research, 27, 1527-1537.
Kelly, D. H. & van Norren, D. (1977). Two-band model of heterochromatic flicker. Journal of the Optical Society of America, 67, 1081-1091.
King-Smith, P. E. & Kulikowski, J. J. (1975). Pattern and flicker detection analyzed by subthreshold summation. Journal of Physiology, 249, 519-548.
King-Smith, P. E. & Kulikowski, J. J. (1980). Pattern and movement detection in a patient lacking sustained vision. Journal of Physiology, 300, 60P.
Komatsu, J., Roy, J. P. & Wurtz, R. H. (1988). Binocular disparity sensitivity of cells in area MST of the monkey. Society for Neuroscience Abstracts, 14, 202.
Krueger, J. (1977). Stimulus dependent color specificity of monkey lateral geniculate neurones. Experimental Brain Research, 30, 297-311.
Krueger, J. (1979). Responses to wavelength contrast in the afferent visual systems of the cat and the rhesus monkey. Vision Research, 19, 1351-1358.
Krumhansl, C. L. (1982). Abrupt changes in visual stimulation enhance processing of form and location information. Perception & Psychophysics, 32, 511-523.
Kulikowski, J. J. (1978). Spatial resolution for the detection of pattern and movement (real and apparent). Vision Research, 18, 237-238.
Kulikowski, J. J., Bishop, P. O. & Kato, H. (1977). Sustained and transient responses by cat striate cells to stationary flashing light and dark bars. Brain Research, 170, 362-367.
Kulikowski, J. J. & Tolhurst, D. J. (1973). Psychophysical evidence for sustained and transient detectors in human vision. Journal of Physiology, 232, 149-162.
Lee, B. B., Elepfandt, A. & Virsu, V. (1981). Phase of responses to moving sinusoidal gratings in cells of cat retina and lateral geniculate nucleus. Journal of Neurophysiology, 45, 807-817.
Lee, B. B., Martin, P. R. & Valberg, A. (1988). The physiological basis of heterochromatic flicker photometry demonstrated in the ganglion cells of the macaque retina. Journal of Physiology, 404, 323-347.
Lee, B. B., Martin, P. R. & Valberg, A. (1989a). Sensitivity of macaque retinal ganglion cells to chromatic and luminance flicker. Journal of Physiology, 414, 223-243.
Lee, B. B., Martin, P. R. & Valberg, A. (1989b). Amplitude and phase of responses of macaque retinal ganglion cells to flickering stimuli. Journal of Physiology, 414, 245-263.
Lee, B. B., Martin, P. R. & Valberg, A. (1989c). Nonlinear summation of M- and L-cone inputs to phasic retinal ganglion cells of the macaque. Journal of Neuroscience, 9, 1433-1442.
Legge, G. M. (1978). Sustained and transient mechanisms in human vision: Temporal and spatial properties. Vision Research, 18, 69-81.
Lehmkuhle, S., Kratz, K. E., Mangel, S. C. & Sherman, S. M. (1980). Spatial and temporal sensitivity of X- and Y-cells in dorsal lateral geniculate nucleus of the cat. Journal of Neurophysiology, 43, 520-541.
Lennie, P. (1980a). Parallel visual pathways: A review. Vision Research, 20, 561-594.
Lennie, P. (1980b). Perceptual signs of parallel pathways. Philosophical Transactions of the Royal Society, London, 290B, 23-37.
Leventhal, A. G., Rodieck, R. W. & Dreher, B. (1981). Retinal ganglion cell classes in the Old World monkey: Morphology and central projections. Science, 213, 1139-1142.
Levick, W. R. (1977). Participation of brisk-transient retinal ganglion cells in binocular vision -- an hypothesis. Proceedings of the Australian Physiological and Pharmacological Society, 8, 9-16.
Levinson, E. & Sekuler, R. (1975). The independence of channels in human vision selective for direction of movement. Journal of Physiology, 250, 347-366.
Livingstone, M. S. & Hubel, D. H. (1984). Anatomy and physiology of a color system in the primate visual cortex. Journal of Neuroscience, 4, 309-356.
Livingstone, M. S. & Hubel, D. H. (1987). Psychophysical evidence for separate channels for the perception of form, color, movement, and depth. Journal of Neuroscience, 7, 3416-3468.
Livingstone, M. & Hubel, D. (1988). Segregation of form, color, movement, and depth: Anatomy, physiology, and perception. Science, 240, 740-749.
Logothetis, N. K., Schiller, P. H., Charles, E. R. & Hulbert, A. C. (1990). Perceptual deficits and the activity of the color-opponent and broad-band pathways at isoluminance. Science, 247, 214-217.
Long, G. M. & Gildea, T. J. (1981). Latency for the perceived offset of brief target gratings. Vision Research, 21, 1395-1399.
Lovegrove, W., Martin, F. & Slaghuis, W. (1986). A theoretical and experimental case for a visual deficit in specific reading disability. Cognitive Neuropsychology, 3, 225-267.
Lu, C. & Fender, D. H. (1972). The interaction of colour and luminance in stereoscopic vision. Investigative Ophthalmology, 11, 482-489.
Lueck, C. J., Zeki, S., Friston, K. J., Deiber, M.-P., Cope, P., Cunningham, V. J., Lammertsma, A. A., Kennard, C. & Frackowiack, R. S. J. (1989). The colour centre in the cerebral cortex of man. Nature, 340, 386-389.
Lupp, U., Hauske, G. & Wolf, W. (1976). Perceptual latency to sinusoidal gratings. Vision Research, 16, 969-972.
Lynch, J. C. (1980). The functional organization of posterior parietal association cortex. Brain and Behavioral Sciences, 2, 485-499.
Mack, A., Klein, L., Hill, J. & Palumbo, D. (1989). Apparent motion: Evidence of the influence of shape, slant, and size on the correspondence process. Perception & Psychophysics, 46, 201-206.
Malpeli, J. G., Schiller, P. H. & Colby, C. L. (1981). Response properties of single cells in monkey striate cortex during reversible inactivation of individual lateral geniculate laminae. Journal of Neurophysiology, 46, 1102-1119.
Marrocco, R. T. (1976). Sustained and transient cells in monkey lateral geniculate nucleus: Conduction velocities and response properties. Journal of Neurophysiology, 40, 840-853.
Marrocco, R. T., McClurkin, J. W. & Young, R. A. (1982). Spatial summation and conduction latency classification of cells of the lateral geniculate nucleus of macaques. Journal of Neuroscience, 2, 1275-1291.
Matin, E. (1974). Saccadic suppression: A review and analysis. Psychological Bulletin, 81, 899-917.
Matin, E. (1975). The two-transient (masking) paradigm. Psychological Review, 82, 451-461.
Maunsell, J. H. R. (1987). Physiological evidence for two visual subsystems. In L. M. Vaina (Ed.) Matters of Intelligence: Conceptual Structures in Cognitive Neuroscience, pp. 59-87. Dordrecht: Reidel.
Maunsell, J. H. R. & Schiller, P. H. (1984). Evidence for the segregation of parvo- and magnocellular channels in the visual cortex of macaque monkey. Neuroscience Abstracts, 10, 520.
Maunsell, J. H. R. & Van Essen, D. C. (1983). Functional properties of neurons in middle temporal visual area of the macaque monkey. II. Binocular interactions and the sensitivity to binocular disparity. Journal of Neurophysiology, 49, 1148-1167.
Maunsell, J. H. R. & Van Essen, D. C. (1987). The topographic organization of the middle temporal visual area in the macaque monkey: Representational biases and the relationship to callosal connections and myeloarchitectonic boundaries. Journal of Comparative Neurology, 266, 535-555.
Merigan, W. H. (1989). Chromatic and achromatic vision of macaques: Role of the P pathway. Journal of Neuroscience, 9, 776-783.
Merigan, W. H. & Eskin, T. A. (1986). Spatio-temporal vision of macaques with severe loss of Pβ retinal ganglion cells. Vision Research, 26, 1751-1761.
Merritt, R. D. & Balogh, D. W. (1990). Backward masking as a function of spatial frequency: A comparison of MMPI-identified schizotypics and control subjects. Journal of Nervous and Mental Disease, 178, 186-193.
Merritt, R. D. & Balogh, D. W. (in press). Backward masking spatial frequency effects among hypothetically schizotypal individuals. Schizophrenia Bulletin.
Mishkin, M., Ungerleider, L. G. & Macko, K. A. (1983). Object vision and spatial vision: Two cortical pathways. Trends in Neuroscience, 6, 414-417.
Mountcastle, V. B. (1978). Brain mechanisms for directed attention. Journal of the Royal Society of Medicine, 71, 14-27.
Murray, I., MacCana, F. & Kulikowski, J. J. (1983). Contribution of two movement detecting mechanisms to central and peripheral vision. Vision Research, 23, 151-159.
Nakano, K. & Saccuzzo, D. P. (1985). Schizotaxia, information processing and the MMPI 2-7-8 code type. British Journal of Clinical Psychology, 24, 217-218.
Panish, S. C., Swift, D. J. & Smith, R. A. (1983). Two-criterion threshold techniques: Evidence for separate spatial and temporal mechanisms? Vision Research, 23, 1519-1525.
Pantle, A. J. (1983). Temporal determinants of spatial sine-wave masking. Vision Research, 23, 749-757.
Pantle, A. J. & Sekuler, R. W. (1968). Size detecting mechanisms in human vision. Science, 162, 1146-1148.
Pantle, A. & Sekuler, R. (1969). Contrast response of human visual mechanisms sensitive to orientation and direction of motion. Vision Research, 9, 397-406.
Papathomas, T. V., Gorea, A. & Julesz, B. (1989). Color does resolve ambiguities in apparent motion perception. Bell Laboratories Technical Memorandum.
Parker, D. M. (1980). Simple reaction times to onset, offset and contrast reversal of sinusoidal grating stimuli. Perception & Psychophysics, 28, 365-368.
Patel, A. S. (1966). Spatial resolution in the human visual system: Effect of mean retinal illuminance. Journal of the Optical Society of America, 56, 689-694.
Petersen, S. E., Miezin, F. M. & Allman, J. M. (1988). Transient and sustained responses in four extrastriate visual areas of the owl monkey. Experimental Brain Research, 70, 55-60.
Petersen, S. E., Robinson, D. L. & Currie, J. N. (1989). Influences of lesions of parietal cortex on visual spatial attention in humans. Experimental Brain Research, 76, 267-280.
Petersen, S. E., Robinson, D. L. & Morris, J. D. (1987). Contributions of the pulvinar to visual spatial attention. Neuropsychologia, 25, 97-105.
Pettigrew, J. D. & Dreher, B. (1987). Parallel processing of binocular disparity in the cat's retinogeniculocortical pathways. Proceedings of the Royal Society, London, 232B, 297-321.
Popper, K. (1962). The Open Society and Its Enemies, Vol. 1. London: Routledge & Kegan Paul.
Posner, M. I. (1988). Structures and functions of selective attention. In T. Boll & B. Bryant (Eds.) Master Lectures in Clinical Neuropsychology, pp. 173-202. Washington, D.C.: American Psychological Association.
Posner, M. I., Cohen, Y. & Rafal, R. D. (1982). Neural systems control of spatial orienting. Philosophical Transactions of the Royal Society, London, 298B, 187-198.
Posner, M. I. & Petersen, S. E. (1990). The attention system of the human brain. Annual Review of Neuroscience, 13, 25-42.
Posner, M. I., Walker, J. A., Friedrich, F. J. & Rafal, R. D. (1984). Effects of parietal injury on covert orienting of attention. Journal of Neuroscience, 4, 1863-1874.
Posner, M. I., Walker, J. A., Friedrich, F. A. & Rafal, R. D. (1987). How do the parietal lobes direct covert attention? Neuropsychologia, 25, 135-145.
Previc, F. H. (1990). Functional specialization in the lower and upper visual fields in humans: Its ecological origins and neurophysiological implications. Behavioral and Brain Sciences, 13, 519-541.
Ramachandran, V. S. (1987). Interaction between colour and motion in human vision. Nature, 328, 645-647.
Ramachandran, V. S. (1990). Visual perception in people and machines. In A. Blake & T. Troscianko (Eds.) AI and the Eye, pp. 21-77. New York: Wiley.
Ramachandran, V. S., Anstis, S. M. & Rogers, D. (1987). Correspondence strength in apparent motion. Investigative Ophthalmology and Visual Science (Suppl.), 28, 299.
Ramachandran, V. S. & Gregory, R. L. (1978). Does colour provide an input to human motion perception? Nature, 275, 55-56.
Ramachandran, V. S., Rao, V. M. & Vidyasagar, T. R. (1973). Apparent motion with subjective contours. Vision Research, 13, 1399-1401.
Raymond, J. E. & Darcangelo, S. M. (1990). The effect of local luminance contrast on induced motion. Vision Research, 30, 751-756.
Reeves, A. (1981). Metacontrast in hue substitution. Vision Research, 21, 907-912.
Regan, D. & Neima, D. (1984). Balance between pattern and flicker sensitivities in the visual fields of ophthalmological patients. British Journal of Ophthalmology, 68, 310-315.
Robinson, D. L., Bushnell, M. C. & Goldberg, M. E. (1980). Role of posterior parietal cortex in selective visual attention. In A. F. Fuchs & W. Becker (Eds.) Progress in Oculomotor Research, pp. 203-210. Amsterdam: Elsevier.
Robson, J. (1966). Spatial and temporal contrast sensitivity functions of the eye. Journal of the Optical Society of America, 56, 1141-1142.
Rodieck, R. W. (1979). Visual pathways. Annual Review of Neuroscience, 2, 193-225.
Rowe, M. H. & Stone, J. (1977). Naming of neurones: Classification and naming of cat retinal ganglion cells. Brain, Behavior and Evolution, 14, 185-216.
Saccuzzo, D. P. & Schubert, D. L. (1981). Backward masking as a measure of slow processing in schizophrenia spectrum disorders. Journal of Abnormal Psychology, 90, 305-312.
Saito, H.-A. & Fukuda, Y. (1986). Gain control mechanisms in X- and Y-type retinal ganglion cells of the cat. Vision Research, 26, 391-408.
Saito, H., Tanaka, K., Isono, H., Yasuda, M. & Mikami, A. (in press). Directionally selective response of cells in the middle temporal area (MT) of the macaque monkey to the movement of equiluminous opponent color stimuli. Experimental Brain Research.
Sato, T. (1988). Direction discrimination and pattern segregation with isoluminant chromatic random-dot patterns. Investigative Ophthalmology and Visual Science (Suppl.), 29, 449.
Saucer, R. T. (1954). Processes of motion perception. Science, 120, 806-807.
Schiller, P. H. (1982). Central connections of the ON and OFF pathways. Nature, 297, 580-583.
Schiller, P. H. (1984). The connections of the retinal on and off pathways to the lateral geniculate nucleus of the monkey. Vision Research, 24, 923-932.
Schiller, P. H. (1986). The central visual system. Vision Research, 26, 1351-1386.
Schiller, P. H. & Colby, C. L. (1983). The responses of single cells in the lateral geniculate nucleus of the rhesus monkey to color and luminance contrast. Vision Research, 23, 1631-1641.
Schiller, P. H. & Koerner, F. (1971). Discharge characteristics of single units in superior colliculus of alert rhesus monkey. Journal of Neurophysiology, 35, 920-936.
Schiller, P. H. & Logothetis, N. K. (in press). The color-opponent and broad-band channels of the primate visual system. Trends in Neuroscience.
Schiller, P. H., Logothetis, N. K. & Charles, E. R. (1990). Functions of the color-opponent and broad-band channels of the visual system. Nature, 343, 68-70.
Schiller, P. H. & Malpeli, J. G. (1978). Functional specificity of lateral geniculate nucleus laminae of the rhesus monkey. Journal of Neurophysiology, 41, 788-797.
Schiller, P. H., Sandell, J. H. & Maunsell, J. H. R. (1986). Functions of the ON and OFF channels of the visual system. Nature, 322, 824-825.
Schneider. G. E. (1967).Contrasting visuomotor functions of tectum and cortex in the golden hamster. Psychologische Forschung, 3 1, 52-62. Schwartz. S. H. & Loop. M. S. (1982).Evidence for transient luminance and quasi-sustained color mechanisms. Vision Research, 22. 445-447. Schwartz. S. H. & Loop, M. S. (1983).Differences in temporal appearance associated with activity in the chromatic and achromatic systems. Perception & Psychophysks. 33. 388-390. Scobey, R. P. (1981).Movement sensitivity of retinal ganglion cells in monkey. Vision Research, 21, 181-190. Sekuler. R. W. & Ganz, L. (1963).Aftereffect of seen motion with a stabilized retinal image. Science, 139. 419-420. Sestokas, A. K. & Lehmkuhle, S. (1986).Visual response latency of Xand Y-cells in the dorsal lateral geniculate nucleus of the cat. V i s m Research, 26. 1041-1054. Sestokas, A. K.. Lehmkuhle. S. & Kratz. K. E. (1987).Visual latency of ganglion X- and Y-cells: A comparison with geniculate X- and Y-cells. Vision Research, 27, 1399-1408. Shapley. R. M. (1991).Parallel retinocortical channels: X and Y and P and M. In J. R. Brannan (Ed.). Applications of Parallel Processing in Vision. Amsterdam: Elsevier. Shapley. R.. Kaplan, E. & Soodak,R. (1981).Spatial summation and contrast sensitivity of X and Y cells in the lateral geniculate nucleus of the macaque. Nature, 292. 543-545. Shapley, R. & Lennie, P. (1985).Spatial frequency analysis in the visual system. Annual Review of Neuroscience, 8. 547-583. Shapley. R. & Victor, J. D. (1978).The effect of contrast on the transfer properties of cat retinal ganglion cells. Journal of Physiology. 286, 275-298. Sherman, M. S. (1985).Functional organization of the W-. X-, and Y-cell pathways in the cat: A review and hypothesis. In J. M. Sprague & A. N. Epstein (Eds.) Progress in Psychobiology and Physiological Psychology, Vol. 11, pp.233-324.New York: Academic Press. Sherman, S. M., Wilson, J. R., Kaas. J. H. & Webb, S . V. (1976). 
X- and Y-cells in the dorsal lateral geniculate nucleus of the owl monkey (Aotus trivirgatus). Science, 192, 475-477.
Shipp, S. & Zeki, S. (1985). Segregation of pathways leading from area V2 to areas V4 and V5 of macaque monkey visual cortex. Nature, 315, 322-325.
Silverman, S. E., Trick, G. L. & Hart, W. M. Jr. (1990). Motion perception is abnormal in primary open-angle glaucoma and ocular hypertension. Investigative Ophthalmology & Visual Science, 31, 722-729.
Singer, W. (1976). Temporal aspects of subcortical contrast processing. Neuroscience Research Program Bulletin, 15, 358-369.
Singer, W. & Bedworth, N. (1973). Inhibitory interaction between X and Y units in cat lateral geniculate nucleus. Brain Research, 49, 291-307.
Singer, W., Zihl, J. & Poeppel, E. (1977). Subcortical control of visual thresholds in humans: Evidence for modality specific and retinotopically organized mechanisms of selective attention. Experimental Brain Research, 29, 173-190.
Stein, J., Riddell, P. & Fowler, S. (1989). Disordered right hemisphere
function in developmental dyslexia. In C. Von Euler, I. Lundberg & G. Lennerstrand (Eds.) Brain and Reading, pp. 139-157. New York: Stockton Press.
Steronko, R. J. & Woods, D. J. (1978). Impairment in early stages of visual information processing in nonpsychotic schizotypic individuals. Journal of Abnormal Psychology, 87, 481-490.
Stigler, R. (1910). Chronophotische Studien ueber den Umgebungskontrast. Pflueger's Archiv der gesamten Physiologie, 135, 365-435.
Stone, J. (1983). Parallel Processing in the Visual System. New York: Plenum.
Stone, J. & Dreher, B. (1973). Projection of X- and Y-cells of the cat's lateral geniculate nucleus to areas 17 and 18 of visual cortex. Journal of Neurophysiology, 36, 551-567.
Stone, J., Dreher, B. & Leventhal, A. G. (1979). Hierarchical and parallel mechanisms in the organization of the visual cortex. Brain Research Review, 1, 345-394.
Stoper, A. E. & Banffy, S. (1977). Relation of split apparent motion to metacontrast. Journal of Experimental Psychology: Human Perception and Performance, 3, 211-227.
Stromeyer, C., Klein, S., Dawson, B. & Spillmann, L. (1982). Low spatial-frequency channels in human vision: Adaptation and masking. Vision Research, 22, 225-234.
Stromeyer, C., Zeevi, Y. & Klein, S. (1979). Response of visual mechanisms to stimulus onsets and offsets. Journal of the Optical Society of America, 69, 1350-1354.
Teller, D. Y. (1980). Locus questions in visual science. In C. S. Harris (Ed.) Visual Coding and Adaptability, pp. 151-176. Hillsdale, New Jersey: Erlbaum.
Teller, D. Y. (1984). Linking propositions. Vision Research, 24, 1233-1246.
Todd, J. T. & Van Gelder, P. (1979). Implications of a transient-sustained dichotomy for the measurement of human performance. Journal of Experimental Psychology: Human Perception and Performance, 5, 625-638.
Tolhurst, D. J. (1973). Separate channels for the analysis of the shape and movement of a moving stimulus. Journal of Physiology, 231, 385-402.
Tolhurst, D. J.
(1975a). Reaction times in the detection of gratings by human observers: A probabilistic mechanism. Vision Research, 15, 1143-1149.
Tolhurst, D. J. (1975b). Sustained and transient channels in human vision. Vision Research, 15, 1151-1155.
Tolhurst, D. J. & Movshon, J. A. (1975). Spatial and temporal contrast sensitivity of striate cortical neurones. Nature, 257, 674-675.
Tootell, R. B. H., Hamilton, S. L. & Switkes, E. (1988b). Functional anatomy of macaque striate cortex. IV. Contrast and magno-parvo streams. Journal of Neuroscience, 8, 1594-1609.
Tootell, R. B. H., Silverman, M. S. & De Valois, R. L. (1983). Topography of cytochrome oxidase patterns in extrastriate cortex of the owl monkey. Society for Neuroscience Abstracts, 7, 356.
Tootell, R. B. H., Silverman, M. S., Hamilton, S. L., De Valois, R. L. & Switkes, E. (1988a). Functional anatomy of macaque striate cortex. III. Color. Journal of Neuroscience, 8, 1569-1593.
Tootell, R. B. H., Silverman, M. S., Hamilton, S. L., Switkes, E. & De Valois, R. L. (1988c). Functional anatomy of macaque striate cortex. V. Spatial frequency. Journal of Neuroscience, 8, 1610-1624.
Trevarthen, C. B. (1968). Two mechanisms of vision in primates. Psychologische Forschung, 31, 299-337.
Trevarthen, C. B. (1978). Manipulative strategies of baboons and origins of cerebral asymmetry. In M. Kinsbourne (Ed.) Asymmetrical Function of the Brain, pp. 329-391. Cambridge, England: Cambridge University Press.
Troy, J. B. (1983). Spatio-temporal interaction in neurones of the cat's dorsal lateral geniculate nucleus. Journal of Physiology, 344, 419-432.
Troy, J. B. & Lennie, P. (1987). Detection latencies of X and Y type cells in the cat's dorsal lateral geniculate nucleus. Experimental Brain Research, 65, 703-706.
Tsumoto, T. & Suzuki, D. A. (1976). Effects of frontal eye field stimulation upon activities of the lateral geniculate body of the cat. Experimental Brain Research, 25, 291-306.
Tulunay-Keesey, U. (1972). Flicker and pattern detection: A comparison of thresholds. Journal of the Optical Society of America, 62, 446-448.
Ungerleider, L. G. (1985). The corticocortical pathways for object recognition and spatial perception. In C. Chagas, R. Gattas & C. Gross (Eds.) Pattern Recognition Mechanisms, pp. 21-37. Vatican City: Pontifical Academy of Sciences.
Ungerleider, L. G. & Mishkin, M. (1982). Two cortical visual systems. In D. J. Ingle, M. A. Goodale & R. J. W. Mansfield (Eds.) Analysis of Visual Behavior, pp. 549-586. Cambridge, Massachusetts: MIT Press.
Uttal, W. R. (1971). The psychobiologically silly season-or-what happens when neurophysiological data become psychological theories. Journal of General Psychology, 84, 151-166.
Uttal, W. R. (1981). A Taxonomy of Visual Processes. Hillsdale, New Jersey: Erlbaum.
Vaina, L. M. (1989). Selective impairment of visual motion interpretation following lesions of the right occipito-parietal area in humans.
Biological Cybernetics, 61, 347-359.
Van Essen, D. C. (1985). Functional organization of primate visual cortex. In A. Peters & E. G. Jones (Eds.) Cerebral Cortex, Vol. 3, pp. 259-329. New York: Plenum.
van Nes, F. L., Koenderink, J. J., Nas, H. & Bouman, M. A. (1967). Spatio-temporal modulation transfer function in the human eye. Journal of the Optical Society of America, 57, 1082-1088.
Vassilov, A. & Mitov, D. (1976). Perception time and spatial frequency. Vision Research, 16, 86-92.
Victor, J. D. & Shapley, R. M. (1979). Receptive field mechanisms of cat X and Y retinal ganglion cells. Journal of General Physiology, 74, 275-298.
Volkmann, F. C. (1986). Human visual suppression. Vision Research, 26, 1401-1416.
von Gruenau, M. W. (1978). Interaction between sustained and transient channels: Form inhibits motion in the human visual system. Vision Research, 18, 197-201.
Waessle, H. (1986). Sampling of visual space by retinal ganglion cells. In J. D. Pettigrew, K. J. Sanderson & W. R. Levick (Eds.) Visual
Neuroscience, pp. 19-32. Cambridge, England: Cambridge University Press.
Waessle, H., Peichl, L. & Boycott, B. B. (1981). Morphology and topography of on- and off-alpha cells in the cat retina. Proceedings of the Royal Society, London, 212B, 157-175.
Watson, A. B. & Robson, J. G. (1981). Discrimination at threshold: Labelled detectors in human vision. Vision Research, 21, 1115-1122.
Weiskrantz, L. (1972). Behavioral analysis of the monkey's visual system. Proceedings of the Royal Society, London, 182B, 427-455.
Weisstein, N. (1972). Metacontrast. In D. Jameson & L. M. Hurvich (Eds.) Handbook of Sensory Physiology, Vol. 7/4, Visual Psychophysics, pp. 233-272. New York: Springer.
Weisstein, N., Maguire, W. & Brannan, J. R. (1990). M and P pathways and the perception of figure and ground. In J. R. Brannan (Ed.), Applications of Parallel Processing in Vision. Amsterdam: Elsevier.
Weisstein, N. & Growney, R. (1969). Apparent movement and metacontrast: A note on Kahneman's formulation. Perception & Psychophysics, 6, 321-328.
Weisstein, N., Ozog, G. & Szoc, R. (1975). A comparison and elaboration of two models of metacontrast. Psychological Review, 82, 325-343.
Wiesel, T. N. & Hubel, D. H. (1966). Spatial and chromatic interactions in the lateral geniculate body of the rhesus monkey. Journal of Neurophysiology, 29, 1115-1156.
Williams, M. C. & Lovegrove, W. (1990). Temporal processing deficits in specific reading disability. In J. R. Brannan (Ed.), Applications of Parallel Processing in Vision. Amsterdam: Elsevier.
Williams, M. C. & LeCluyse, K. (1990). The perceptual consequences of a temporal processing deficit in reading disabled children. Journal of the American Optometric Association, 61, 111-121.
Wilson, H. R. (1978). Quantitative characterization of two types of line spread functions near the fovea. Vision Research, 18, 971-982.
Wilson, H. R. (1980). Spatiotemporal characterization of a transient mechanism in the human visual system.
Vision Research, 20, 443-452.
Wilson, H. R. & Bergen, J. R. (1979). A four mechanism model for threshold spatial vision. Vision Research, 19, 19-32.
Wilson, H. R., McFarlane, D. K. & Phillips, G. C. (1983). Spatial frequency tuning of orientation selective units estimated by oblique masking. Vision Research, 23, 873-882.
Wurtz, R. H. & Albano, J. E. (1980). Visual-motor functions of the primate superior colliculus. Annual Review of Neuroscience, 3, 189-226.
Yantis, S. & Johnson, D. N. (in press). Mechanisms of attentional priority. Journal of Experimental Psychology: Human Perception and Performance.
Yantis, S. & Jonides, J. (1984). Abrupt visual onsets and selective attention: Evidence from visual search. Journal of Experimental Psychology: Human Perception and Performance, 10, 601-621.
Yantis, S. & Jonides, J. (1990). Abrupt visual onsets and selective attention: Voluntary versus automatic allocation. Journal of Experimental Psychology: Human Perception and Performance, 16, 121-134.
Zacks, J. L. (1975). Changes in response of X and Y type cat retinal
ganglion cells produced by changes in background illumination. Paper presented at the annual meeting of the Association for Research in Vision and Ophthalmology, Sarasota, Florida, April.
Zeki, S. M. (1978). Functional specialization in the visual cortex of the rhesus monkey. Nature, 274, 423-428.
Zihl, J. & von Cramon, D. (1979). The contribution of the 'second' visual system to directed visual attention in man. Brain, 102, 835-856.
Zihl, J., von Cramon, D. & Mai, N. (1983). Selective disturbance of movement vision after bilateral brain damage. Brain, 106, 313-340.
Parallel Processing and Visual Development
Applications of Parallel Processing in Vision, J. Brannan (Editor). © 1992 Elsevier Science Publishers B.V. All rights reserved.
Parallel Processes in Human Visual Development
ADRIANA FIORENTINI
Introduction
It has long been known that the human visual system is largely immature at birth, but until some twenty years ago not much was known about the visual functional properties of the newborn, or about the rate at which the visual system develops in the early period of infant life. The introduction of behavioral and electrophysiological techniques that could successfully be applied to study infant visual capacities has considerably increased our knowledge of the visual improvement that occurs after birth. A number of papers have appeared that review recent achievements in this field (for instance Aslin, 1987; Atkinson and Braddick, in press; Banks and Dannemiller, 1987; Gwiazda et al., 1989a; Teller and Bornstein, 1987).
Only recently, however, has it become apparent that the time course of visual development may be quite different for different aspects of vision, even during the first year of life. This fact may reflect the different rates of maturation of classes of neurons which process various aspects of visual information in parallel. Unfortunately it is often impossible to assign the result of a human developmental study, either psychophysical or electrophysiological, to the maturation of a specific neural structure. In addition one has to take into account that visual development occurs both serially, at various peripheral and central levels, and in parallel for various visual functions. In some cases the factors that limit infant vision are imposed in the eye by the physical and anatomical properties of the photoreceptors, but further constraints derive from immaturities of the neural structures in the retina and/or in the brain.
In this chapter some recent findings are reviewed which provide the opportunity to speculate about the development of parallel neural pathways in the human visual system at both peripheral and central levels. Some inferences will be made regarding the possible contribution of the two major neural streams of the primate visual system, the parvocellular (P) and magnocellular (M) pathways, to some properties of infant vision and of its improvement in the early life period.
CHAPTER 3
Structural development
The infant visual system undergoes profound structural changes in the early postnatal period. Some developmental modification continues during childhood (see Hickey and Peduzzi, 1987, for a recent review). Apart from an obvious increase in the size of the eyeball, there is in the retina a long maturation process, mainly in the macular region (Yuodelis and Hendrickson, 1986). The cones increase dramatically in length after birth and become increasingly thinner and more closely packed in the very center of the macula. At the same time the ganglion cells, which at birth are still present in front of the receptors even in the center of the retina, migrate to occupy more eccentric positions, allowing the foveal pit to take its adult shape. This process takes a few years to complete. Similar modifications have been reported to occur in the retina of macaque monkeys (Hendrickson and Kupfer, 1976).
The optic nerve fibers, almost completely unmyelinated at birth, acquire a myelin sheath that increases progressively in width. This process proceeds from the orbital portion of the nerve towards the eye. Almost all fibers are myelinated by 7 months of age, but the width of the myelin sheath continues to increase thereafter, especially during the first two years of life (Magoon and Robb, 1981).
The Lateral Geniculate Nucleus (LGN) is structurally adultlike at birth, clearly differentiated into six layers, with two magnocellular ventral layers and four parvocellular dorsal layers (Hickey and Guillery, 1979). The cell bodies of the newborn are considerably smaller than those of the adult, both in the ventral and dorsal layers, and about two years are required to complete the process of cell body growth in the human LGN (Hickey, 1977). The rate of maturation is different in the parvo and magnocellular layers: while cells in the parvocellular layers (P cells) approach adult size by the end of the sixth month post term, those of the magnocellular layers (M cells) do not reach a comparable development until the end of the first year. However, the dendritic morphology of LGN cells seems to be mature by the end of the ninth month, both in the parvo and magnocellular layers (Garey and de Courten, 1983).
Not much is known about the structural development of the human visual cortex in infancy. Synaptic density increases considerably from around two months after term until around eight months, when it reaches its maximum. Thereafter the number of synapses decreases, to stabilize at about eleven years (Garey and de Courten, 1983). A postnatal growth of dendritic branching has been described in layers 3 and 5 of the striate cortex, with layer 5 neurons maturing earlier (within 5 months from birth) than layer 3 neurons (Becker et al., 1984). The latter take about two years to complete maturation. There is also some evidence that intracortical horizontal connections between columns develop mainly after birth (Burkhalter and Bernardo, 1989).
Functionally, one of the more important modifications known to occur in the striate cortex of infant monkeys is the segregation of monocular LGN inputs to layer 4C, which underlies the formation of ocular dominance columns. At birth the inputs from the two eyes overlap extensively in layer 4C, while in the adult macaque
monkey there is almost complete segregation (Hubel et al., 1977). Ocular dominance columns are known to be present in the human visual cortex, where they form a pattern similar to that of the monkey, although the single columns are considerably wider in the human adult than in the macaque (Horton and Hedley-Whyte, 1984). There is some evidence that ocular dominance columns in the human cortex form during the first six months of life. The columns have been found to be well formed in the cortex of 6 month old infants, but only poorly defined in the brain of a 4 month old infant (Hickey and Peduzzi, 1987; Horton and Hedley-Whyte, 1984).
Spatial characteristics: central vision
The spatial characteristics of the adult visual system are best described by the contrast sensitivity function (CSF), which relates contrast sensitivity, i.e., the reciprocal of the contrast threshold for resolving sinusoidal gratings, to stimulus spatial frequency (Campbell and Robson, 1968). Contrast thresholds can be evaluated psychophysically or can be extrapolated from visual evoked potentials (VEP) (Campbell and Maffei, 1970).
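The extrapolation of a contrast threshold from VEP amplitudes can be illustrated with a minimal Python sketch. It assumes that VEP amplitude grows linearly with log contrast near threshold; the contrast values, amplitudes and function names are hypothetical and serve only to show the arithmetic. A straight line is fitted to amplitude against log contrast, the contrast at which the line reaches zero amplitude is taken as the threshold, and its reciprocal is the contrast sensitivity.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def extrapolated_threshold(contrasts, amplitudes):
    """Contrast at which the fitted amplitude-vs-log-contrast line reaches zero."""
    log_c = [math.log10(c) for c in contrasts]
    slope, intercept = fit_line(log_c, amplitudes)
    return 10 ** (-intercept / slope)

# Hypothetical data (assumption): amplitude in microvolts rises by 2 per log
# unit of contrast and would reach zero at a contrast of 0.025.
contrasts = [0.05, 0.10, 0.20, 0.40]
amplitudes = [2.0 * (math.log10(c) - math.log10(0.025)) for c in contrasts]

threshold = extrapolated_threshold(contrasts, amplitudes)
sensitivity = 1.0 / threshold  # contrast sensitivity = 1 / contrast threshold
print(f"threshold = {threshold:.4f}, sensitivity = {sensitivity:.1f}")
```

Repeating the fit at each spatial frequency would give one point of a CSF per frequency; with real recordings the choice of the amplitude range entering the fit matters, and this sketch does not address it.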
[Figure 1, panels A and B. Horizontal axis: spatial frequency (c/deg).]
Figure 1. A: Contrast sensitivity functions obtained from VEP responses to contrast-reversed sinusoidal gratings in one infant at three ages (in months) and in one adult subject (from Pirchio et al., 1978). Mean luminance: 7 cd/m2; square-wave contrast reversal: 8 Hz. B: Average contrast sensitivity curves of infants of different ages obtained behaviourally with the preferential looking technique. Stimulus: stationary gratings, mean luminance 55 cd/m2.
Both psychophysical and electrophysiological methods have been applied to study the contrast sensitivity of newborns and its development in the early life period (see Banks and Dannemiller, 1987, and Mohn and van Hof-van Duin, in press, for recent reviews). The two methods agree in showing that contrast sensitivity is poor in very young infants, and restricted to a band of low spatial frequencies
(Figure 1, A and B) (Atkinson et al., 1977; Banks and Salapatek, 1978; Pirchio et al., 1978; Norcia et al., 1988, 1990). During the first six months of life there is a rapid increase in contrast sensitivity, especially at medium to high spatial frequencies, with a related improvement in visual acuity and an increase in the optimal spatial frequency (Figure 1, A). Contrast sensitivity at low spatial frequencies (below 1 c/deg) remains relatively unchanged with age. Similar findings have been obtained behaviorally in infant monkeys (Boothe et al., 1980 and 1988), although the time scale of development is different between the two species: one postnatal week for the monkey corresponds to about 4 weeks for the human infant.
In spite of the qualitative agreement between the behavioral and the VEP findings on age related changes in the CSF, there are large quantitative differences among the available sets of data. In particular, VEP contrast sensitivities evaluated recently by Norcia et al. (1988, 1990) with the swept-contrast technique (Norcia et al., 1985) are much higher than those obtained by others in VEP experiments (Pirchio et al., 1978; Morrone and Burr, 1986; Atkinson and Braddick, 1989) or using behavioral techniques (Atkinson et al., 1974, 1977; Banks and Salapatek, 1978). It is not clear why the swept-contrast technique yields peak contrast sensitivities that consistently exceed those obtained with other methods: psychophysical thresholds coincide with VEP thresholds obtained in the same infant with the Campbell and Maffei (1970) extrapolation technique (Atkinson and Braddick, 1989). One possibility is that the function relating VEP amplitude to log contrast in infants is composed of two regression lines of different slopes and that the swept-contrast technique extrapolates to the lower threshold.
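A toy numerical illustration of this possibility is sketched below in Python. All values and names are hypothetical: the amplitude function is built from two straight limbs in log contrast, a shallow one that reaches zero at 1% contrast and a steep one that reaches zero at 5% contrast. Extrapolating each limb separately gives two different thresholds, and a method that follows the shallow limb would report the lower one.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def zero_crossing(contrasts, amplitudes):
    """Contrast at which the least-squares line through the points reaches zero."""
    slope, intercept = fit_line([math.log10(c) for c in contrasts], amplitudes)
    return 10 ** (-intercept / slope)

def amplitude(contrast):
    """Hypothetical two-limb response (microvolts) as a function of contrast."""
    x = math.log10(contrast)
    shallow = 1.0 * (x - math.log10(0.01))  # low-contrast limb, zero at 1%
    steep = 4.0 * (x - math.log10(0.05))    # high-contrast limb, zero at 5%
    return max(shallow, steep)

low_contrasts = [0.02, 0.04, 0.08]   # these points fall on the shallow limb
high_contrasts = [0.10, 0.20, 0.40]  # these points fall on the steep limb

low_threshold = zero_crossing(low_contrasts, [amplitude(c) for c in low_contrasts])
high_threshold = zero_crossing(high_contrasts, [amplitude(c) for c in high_contrasts])
print(f"shallow limb: {low_threshold:.3f}, steep limb: {high_threshold:.3f}")
```

In this constructed example the two extrapolations differ by a factor of five in threshold, and hence in sensitivity, although both are derived from the same response function. Whether infant VEP data actually behave in this way is the open question raised in the text.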
The two regression lines could represent the activity of two populations of neurons with different contrast sensitivity and contrast gain, as described in the monkey (see Kaplan et al., 1990, for review).
Other discrepancies can be ascribed to differences in the temporal properties of the stimuli: stationary stimuli were used in two behavioral experiments (Atkinson et al., 1977; Banks and Salapatek, 1978) and contrast reversal stimuli in the VEP experiments. A temporal modulation of stimulus contrast can facilitate the detection of low spatial frequency stimuli (thus reducing or eliminating the low-frequency fall off), but impair the detectability of high spatial frequencies. If the same temporally modulated stimuli are used in the same infant, the contrast thresholds evaluated behaviorally and with the VEP contrast extrapolation technique coincide (Atkinson and Braddick, 1989).
Differences in optimal contrast sensitivity can also be due to differences in the mean luminance of the stimuli employed in different experiments: optimal contrast sensitivity can be expected to increase in proportion to the square root of mean luminance in the photopic range, and it does so in the adult. The high contrast sensitivities reported by Norcia et al. (1990) were obtained at luminances exceeding 200 cd/m2, while most of the previous experiments employed luminances of 10 cd/m2 or less.
Early behavioral studies on human infants (Atkinson et al., 1977; Banks and Salapatek, 1978) seemed also to indicate that the CSF of one month olds is a low-pass function and that the low-frequency fall off
typical of adult CSFs shows up later and becomes steeper with age. The adult CSF is thought to result from the sum of several detecting mechanisms with narrower tuning curves and different preferred spatial frequencies (see Braddick et al., 1978, for review). The change in the shape of the infant CSF with age has accordingly been ascribed to the progressive maturation of detectors tuned to higher and higher spatial frequencies. This would be accompanied by an increase in contrast sensitivity of the low spatial frequency detectors already present at an early age and possibly by a change from low-pass to band-pass tuning properties.
The low-pass shape of the neonatal CSF, however, seems not to be a firmly established fact, at least for stationary stimuli. [For temporally modulated stimuli, as typically employed in VEP experiments, the adult CSF has little or no low-frequency decline (Robson, 1966).] Note that the infant curves of Figure 1A represent VEP amplitudes normalized at peak contrast sensitivity. No low-frequency fall off has been found in CSF curves obtained by Norcia et al. (1988, 1990) from VEP extrapolated thresholds (see Figure 4, squares). Movshon and Kiorpes (1988) have reanalyzed contrast sensitivity data of human and monkey infants for stationary stimuli and have argued that the reported change in the shape of the CSF with age is probably an artifact due to group-averaging. By separately analyzing the CSFs of single subjects of the same age they come to the conclusion that the data can be fitted by a function of constant shape at each age, and that in order to fit the data at different ages it is sufficient to shift the function horizontally and vertically in a log-log plot. If so, it would be unnecessary to invoke the differential development of mechanisms tuned to different spatial frequencies.
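What a constant-shape function shifted in a log-log plot implies can be shown with a short Python sketch. The template used here (a parabola in log-log coordinates) and all parameter values are assumptions chosen for illustration only; they are not the function fitted by Movshon and Kiorpes. A horizontal shift changes the peak spatial frequency, a vertical shift changes the peak sensitivity, and the relative fall-off around the peak is the same at every age.

```python
import math

def template(log_f):
    """Hypothetical fixed CSF shape: log10 sensitivity relative to the peak,
    as a function of log10 frequency relative to the peak frequency."""
    return -1.5 * log_f ** 2

def csf(freq, peak_freq, peak_sens):
    """Template shifted horizontally (peak_freq) and vertically (peak_sens)."""
    return peak_sens * 10 ** template(math.log10(freq / peak_freq))

# Hypothetical parameters for a younger and an older infant (assumptions).
younger = dict(peak_freq=0.5, peak_sens=10.0)
older = dict(peak_freq=2.0, peak_sens=60.0)

# One octave above its own peak, each curve falls by the same factor.
drop_younger = csf(2 * younger["peak_freq"], **younger) / younger["peak_sens"]
drop_older = csf(2 * older["peak_freq"], **older) / older["peak_sens"]
print(f"relative sensitivity one octave above peak: "
      f"{drop_younger:.3f} (younger), {drop_older:.3f} (older)")
```

Under this description only two numbers per age, the horizontal and the vertical shift, are needed to account for development, which is why no differential maturation of separately tuned mechanisms has to be assumed.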
The development would consist of a scale change brought about primarily by the increase in focal length of the eye and by the change in the spacing of the foveal cones, accompanied by an increase in sensitivity. A similar hypothesis has been advanced by Wilson (1988), who assumes that the development of cortical inhibition also plays a role in sharpening the spatial frequency tuning of single detectors.
During infant development, retinal and cortical acuity appear to have a common limiting factor. This is shown by the data reported in Figure 2. The acuities reported in this figure were evaluated from pattern electroretinograms (PERG) and pattern VEPs recorded simultaneously in infants two to six months old. PERG acuities (Figure 2, open symbols) improve with age in parallel with the improvement of acuities extrapolated from VEPs (Figure 2, closed symbols) (Fiorentini et al., 1984). It is generally accepted that limits to visual acuity in infants are mainly imposed by the maturational state of the fovea, and in particular of the foveal cones (Banks and Bennett, 1988; Wilson, 1988; Brown et al., 1987), although there is no agreement about whether the size and spacing of cones is the only limiting factor. The disagreement derives from slightly different hypotheses on the quantum efficiency of cones in the infant retina. It is difficult at present to resolve this controversial point, because few anatomical data are available on infant human retinae. Therefore one has to consider the possibility that postreceptoral factors also contribute to limit visual acuity, either in the
retina or in the brain, or both. The data of Figure 2 indicate that postretinal developmental processes possibly involved in the improvement of spatial resolution proceed at the same rate as retinal development during the first six months of life.
[Figure 2. Horizontal axis: age (months).]
Figure 2. Infant acuities estimated from pattern ERG (open symbols) and VEP (closed symbols) in nine infants. Different symbols represent different subjects. Stimulus: sinusoidal gratings, mean luminance 50 cd/m2, contrast 50%, square-wave contrast reversal: 6 Hz.
Pattern-reversal VEPs are believed to reflect the activity of cortical neurons. In adults, the amplitude of the potentials in response to gratings depends upon the orientation of the grating: it is larger for vertical and horizontal gratings than for oblique gratings (Maffei and Campbell, 1970). This oblique effect is also present in infants from 3 months after birth (Sokol et al., 1987). Since retinal and geniculate neurons of monkeys are not selective for orientation, it is generally assumed that orientational effects in human visual responses indicate cortical processes. Thus, it seems very likely that at three months of age, and possibly before, pattern-reversal VEPs reflect at least in part the activity of cortical neurons and not merely the LGN input to the visual cortex. If so, then the similar trend in the improvement of acuity evaluated from pattern ERG and VEP indicates that the acuity of cortical neurons does not lag behind the retinal development of acuity. This is consistent with findings on the monkey reported next.
In the monkey LGN, the "acuity" of single cells in the foveal representation is low at birth, not exceeding 5 c/deg. During the first
year of life there is a gradual increase in the spatial resolution of foveal LGN cells (both in the parvo and magnocellular layers) until the mean "acuity" and the "acuity" of the best cells reach adult values of about 30 c/deg (Blakemore and Vital-Durand, 1986). The changes in spatial resolution of LGN cells seem to be related to a progressive decrease in the size of receptive field centers. Interestingly, the spatial resolution of cells in the monkey striate cortex also improves with age, and the improvement in the cortex parallels the improvement in the LGN (Blakemore and Vital-Durand, 1983). Behavioral visual acuity of infant monkeys (Teller et al., 1978; Boothe et al., 1988) increases at a slightly lower rate than the acuity of the best cells during the first three months from birth (Jacobs and Blakemore, 1988). Thus in the infant monkey, as in human infants, the development of cortical acuity seems not to lag behind the improvement proceeding at a more peripheral stage in the visual pathway. No data are available so far on the functional development of the retinal ganglion cells of infant monkeys.
It has to be noted that in the LGN of the adult monkey, those M cells that show linear spatial summation seem to have, on average, a spatial resolution similar to that of P cells, although cells with the highest acuities (exceeding 25 c/deg) may be more numerous in parvocellular than in magnocellular layers (Blakemore and Vital-Durand, 1986). In the parafoveal region of the monkey retina, ganglion cells with sustained response properties seem not to exceed in acuity the resolution of phasic cells (Crook et al., 1988), which project almost exclusively to the magnocellular layers of the LGN. On the other hand, experiments on monkeys with selective degeneration of P cells suggest that the integrity of the P pathway is crucial for reaching a normal behavioral acuity (Merigan and Eskin, 1986; Merigan, 1989; Schiller et al., 1990).
In the newborn monkey the limits to optimal acuity are largely imposed by peripheral factors. The full development of foveal acuity takes a relatively long time in the monkey, as in humans. If the P pathway is mainly responsible for visual acuity in adult monkeys, it is reasonable to assume that, insofar as the improvement in acuity reflects changes in the retino-cortical pathway, these changes should eventually involve the P system.
As to the contrast sensitivity of P and M cells in infant monkeys, the data available so far in the literature (Blakemore and Hawken, 1985) indicate that in the LGN the peak contrast sensitivity of the most sensitive P cells approaches adult values even in the neonate. The best M cells are more sensitive than the best P cells, but the difference is less marked than in the adult. Thus, cells in the magnocellular layers must undergo a relatively greater increase in contrast sensitivity during development than cells in the parvocellular layers. These findings might have a bearing on human visual development, as will be discussed at the end of this chapter.
Spatial characteristics: eccentric vision
Static and kinetic perimetry show that the visual field of the young infant is small, compared with the adult. In the human newborn the orienting reaction to an object introduced in its peripheral visual
field is restricted horizontally to within 20 - 30 deg from the fixation point and the vertical visual field is even narrower. The s u e of the visual field remains practically unchanged during the first two months of life (Schwartz et al.. 1987). then increases rapidly to approach adult levels by the end of the first year (see Mohn and Van Hof-Van Duin. in press). Morphologically, the extrafoveal retina of the newborn is relatively more mature than the fovea and its development seems to be complete by the end of the first year of life (Abramov et a1.,1982; Drucker and Hendrickson, 1989). There is unequivocal evidence, however, that peripheral spatial resolution improves after birth (Spinelli et al.. 1983; Sireteanu et al.. 1984; Sireteanu et al.. 19881, rapidly during the first 3 - 4 months and then more slowly. By 3 months of age, but not earlier, acuity is better in the temporal than in the nasal visual field at 20 deg eccentricity (Courage and Adams. 1990) as it is in adults (Rovamo and Virsu, 1979). In the LGN of the adult monkey, spatial resolution of P and M cell declines with eccentricity (Blakemore and Vital-Durand. 1986).At each eccentricity the mean resolution of X-type P and M cells (those that show linear spatial summation) are similar, while Y-type M cells (with non-linear spatial summation) have lower resolution (Blakemore and Vital-Durand. 1986). It seems therefore that the acuity of single neurons depends more on the functional properties of their receptive field (linear vs non-linear summation) than on the P - M classification. On the other hand one has to consider that the retinal ganglion cells that project to the parvocellular LGN layers (defined morphologically as P-beta cells) form the large majority of ganglion cells, while the cells that project to the magnocellular layers (P-alpha cells), are only lW?o of the total population (Perry and Cowey, 1985). 
Thus the sampling density of P cells largely exceeds that of M cells, and this may assign a predominant role in pattern resolution to the P system. In the newborn monkey, resolution of LGN cells varies little with eccentricity. The subsequent improvement in resolution with age is prominent in the foveal and parafoveal LGN region, but small at larger eccentricities (Blakemore and Vital-Durand, 1986). This compares well with the larger increase in visual acuity for central vision than for peripheral vision in human infants during the first year (see above).
Temporal characteristics
The temporal characteristics of infant vision have not been extensively investigated. Regal (1981) evaluated behaviorally the critical fusion frequency of infants by a forced-choice preferential looking (FPL) method (Teller, 1979). The stimulus was a uniform field square-wave modulated in luminance at various temporal frequencies, to be discriminated from a non-modulated field of the same mean luminance. The critical fusion frequency (highest discriminable frequency of modulation) was found to increase with age after birth and to reach adult values within three months of age. This is an interesting finding, since in the monkey the sensitivity for fast flickering lights seems to be subserved by the M system (Schiller et al., 1990). More important for vision in a natural environment is the sensitivity to a temporal modulation of contrast in pattern stimuli.
There has been so far no systematic study of the development of spatio-temporal contrast sensitivity in infants. Some preliminary reports indicate that contrast sensitivity for a fixed spatial frequency is highly dependent upon the temporal frequency of modulation of the pattern contrast. The temporal contrast sensitivity function of young infants, however, differs from the adult function, having both lower peak sensitivity and lower optimal frequency. Moreover, the low-frequency fall-off in sensitivity characteristic of adult functions does not appear until 3-4 months of age (Hartmann and Banks, 1984; Swanson and Birch, 1989). Again, it is of interest to investigate whether the temporal characteristics of the neonatal visual system are constrained mainly at a retinal or at a higher level. Some information can be obtained from simultaneous recording of the pattern ERG and VEP (Fiorentini and Trimarchi, 1989). Temporal resolution evaluated from the PERG for gratings of low spatial frequency (0.5 c/deg), sinusoidally reversed in contrast, improves with age between 2 and 5 months of age, as it does for the pattern VEP. The function relating PERG amplitude to temporal frequency of contrast reversal is practically low-pass at 6 weeks of age and tends to become more band pass between 2 and 5 months from birth. The same is true for the temporal tuning function of the pattern VEP (Moskowitz and Sokol, 1980). [It has to be noted that these functions do not describe contrast sensitivity, but the dependence of response amplitude on temporal frequency for a constant stimulus contrast.] At each age there is a tendency for the PERG to peak at a higher frequency and to have a higher temporal resolution than the pattern VEP (Fiorentini and Trimarchi, 1989), as occurs for the adult (Plant, Hess and Thomas, 1986).
Thus the development of temporal frequency characteristics for contrast reversal seems to be constrained by postretinal limiting factors, in addition to the limits imposed by retinal immaturity. If we had better knowledge of the complete spatio-temporal CSF in infants and of its changes with age, it would be possible to compare the development of sensitivity for temporal contrast modulation with the development of visual acuity. This might be relevant to the question of possible differential development of P and M pathways. In view of the findings obtained from behaving monkeys with selective destruction of P-beta ganglion cells (Merigan and Eskin, 1986; Merigan, 1989) and also on the basis of electrophysiological properties of P and M cells, there seems to be general agreement that contrast sensitivity for temporally modulated patterns of low spatial frequencies is subserved by M cells, while spatial resolution tasks are mediated by P cells (Kaplan et al., 1990; Lennie et al., 1989). Unfortunately the developmental data about temporal frequency characteristics available so far are still incomplete. The data obtained from VEP experiments, which typically employ gratings reversed in contrast at 5-8 Hz, indicate that contrast sensitivity for low spatial frequencies matures quite early compared with spatial resolution (Pirchio et al., 1978; Norcia et al., 1990). It would be of interest to know how these findings compare with the development of contrast sensitivity at low spatial frequencies for stationary patterns, but the data on psychophysical CSF available so far cover only the earliest postnatal months (Atkinson et al., 1977;
Banks and Salapatek, 1978) or a much later range of preschool ages (Beazley et al., 1980; Atkinson et al., 1981). At what age the CSF for stationary stimuli is fully developed is still unknown. Several studies have been devoted to the development of VEP responses to transient contrast reversal. In the adult, these transient VEPs have a rather complex waveform, with a main positive deflection that peaks with a delay of about 100-110 ms with respect to stimulus reversal. In the newborn infant, the waveform is much simpler and the positive wave peaks with a much longer delay (around 250 ms or more). The peak latency shortens rapidly after birth and for checkerboard patterns with large checks it levels off at adult values towards the end of the first year. For small checks the peak latency decreases at a lower rate and takes longer to reach adult values (Moskowitz and Sokol, 1980).
[Figure 3 plot: peak latency against age in weeks (0 to 28, plus adult); curves labeled P-VEP and P-ERG.]
Figure 3. Peak latency of the pattern VEP (solid symbols) and ERG (open and stippled symbols) as a function of age. Different symbols represent different subjects. Stimulus: sinusoidal grating, 0.5 c/deg, contrast 50%, square-wave reversed in contrast at 1 Hz, mean luminance 50 cd/m2.

The long latencies observed in the transient VEPs of young infants are likely to reflect the sluggish response properties of neonatal visual neurons (Blakemore and Vital-Durand, 1986) in addition to the slower conduction velocities of visual nerve fibers. Comparison
with the PERG in response to transient contrast reversal of low spatial frequency gratings points to both of these factors. The peak latencies of the neonatal ERG (Fulton and Hansen, 1982, 1989) and PERG (Fiorentini and Trimarchi, 1989) are also much longer than the adult's, suggesting that the neonatal photoreceptors and retinal neurons are also sluggish. However, PERG latency decreases during early infancy (Figure 3, open and gray symbols) and approaches adult values earlier than the VEP latency (Figure 3, closed symbols) (Fiorentini and Trimarchi, 1989). This suggests that retinal circuitry develops more rapidly than cortical circuitry. Latencies of the responses of single neurons to visual stimulation have been measured in the LGN of macaque monkeys of various ages from birth to adulthood (Blakemore and Vital-Durand, 1986). The latency of the responses is much longer in newborn monkeys than in adults. It extends up to 150 ms and even the shortest latencies are longer than the longest latencies of adult LGN
[Figure 4 plot: contrast sensitivity against spatial frequency (cy/deg); panel A: adult, panel B: 10 weeks.]

Figure 4. Contrast sensitivity at three luminance levels estimated from VEPs recorded from adults (A) and infants 10 weeks old (B). Data obtained at 0.06 and 6 cd/m2 from one adult and one infant subject have been replotted from Fiorentini et al. (1980). Stimulus: sinusoidal gratings square-wave reversed in contrast at 8 Hz. The data at the highest luminance are means of five adults and ten 10-week-old infants. Stimulus: sinusoidal gratings reversed in contrast at 6 Hz.
cells. Then latencies decrease rapidly and consistently and approach adult levels around 70 days of age. There is a tendency for cells in the magnocellular layers to have shorter latencies than parvocellular cells at all ages. This fact probably reflects the higher conduction velocities of P-alpha cell axons.
Scotopic vision
The properties of scotopic vision of young infants indicate that the rods, although not yet morphologically mature (Drucker and Hendrickson, 1989), are functional in the human neonate. The scotopic spectral sensitivity function of 1- and 3-month-old infants practically coincides with the adult function and the absolute sensitivity is only 1.7-2 log units below adult sensitivity at 4 weeks and 0.7-1 log unit at 3 months (Powers et al., 1981; Hansen and Fulton, 1987). So far, there has been only one study of infant contrast sensitivity at low luminance levels (Fiorentini et al., 1980). These VEP data (Figure 4B, filled circles) indicate that at 10 weeks of age, contrast sensitivity for sinusoidal gratings of low luminance (0.06 cd/m2) reversed in contrast at 8 Hz is lower than the adult sensitivity (Figure 4A, filled circles) and the same is true for acuity. Psychophysical data also show that at 2 months visual acuity is lower than adult acuity at all luminance levels (Brown et al., 1987). The difference in contrast sensitivity between infants and adults, however, is rather small (a factor of 2-3) and adult values are reached within 4 months from birth (Fiorentini et al., 1980). Figure 4 also compares CSFs obtained in adults and 10-week-old infants at a low photopic (open circles) and a high photopic level (closed squares) with the low-luminance CSF. Interestingly, the optimal contrast sensitivity of adults increases in proportion to the square root of mean luminance. For infants, the contrast sensitivities of the two extreme sets of data (obtained in different laboratories) are also in agreement with the square-root law, while the data for the intermediate luminance deviate consistently from this law. This point is of interest and will be reconsidered in the Discussion. Summation properties of the infant scotopic visual system are also different from those of the adult, both in space and time.
Area summation is about 12 times the adult's at 4 weeks and 4 times the adult's at 11 weeks (Hamer and Schmeck, 1984). Temporal summation also extends over much longer stimulus durations in 10-week-old infants than in adults (Hansen and Fulton, 1990) and the temporal summation function is very shallow in young infants, suggesting that the inhibitory components of the temporal response function are delayed or less pronounced than in the adult (Fulton, 1988). In conclusion, receptoral and preneural factors seem insufficient to explain the immaturity of scotopic vision at birth and its subsequent development. Most of the developmental processes are likely to be due to changes occurring in visual structures central to the photoreceptors.
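The log-unit differences and the square-root law quoted in this section can be made concrete with a little arithmetic. The Python sketch below is illustrative only and is not taken from the studies cited: the function names are hypothetical, and the only inputs are figures quoted above (sensitivity deficits of 1.7-2 and 0.7-1 log units, and mean luminances of 0.06 and 6 cd/m2). It converts log units to linear factors and computes the sensitivity ratio that a strict square-root law would predict.

```python
import math


def log_units_to_factor(log_units: float) -> float:
    """Convert a difference in log10 units to a linear ratio."""
    return 10.0 ** log_units


def sqrt_law_ratio(lum_high: float, lum_low: float) -> float:
    """Sensitivity ratio predicted if contrast sensitivity is
    proportional to the square root of mean luminance."""
    return math.sqrt(lum_high / lum_low)


# Absolute scotopic sensitivity deficits quoted in the text (log units).
deficit_4_weeks = (log_units_to_factor(1.7), log_units_to_factor(2.0))
deficit_3_months = (log_units_to_factor(0.7), log_units_to_factor(1.0))

# Two of the mean luminances quoted for Figure 4 (cd/m2).
predicted_ratio = sqrt_law_ratio(6.0, 0.06)

print(f"4 weeks: {deficit_4_weeks[0]:.0f}- to {deficit_4_weeks[1]:.0f}-fold below adult")
print(f"3 months: {deficit_3_months[0]:.0f}- to {deficit_3_months[1]:.0f}-fold below adult")
print(f"Square-root law, 0.06 -> 6 cd/m2: {predicted_ratio:.1f}-fold gain")
```

On this reading, a deficit of 1.7-2 log units corresponds to roughly a 50- to 100-fold difference in sensitivity, and a 100-fold increase in mean luminance would raise contrast sensitivity about 10-fold under a strict square-root law.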
Spatial frequency selectivity
In the adult monkey, a large proportion of cells in the LGN and in the striate visual cortex have band pass spatial frequency
characteristics (Kaplan and Shapley, 1982; Derrington and Lennie, 1984; Blakemore and Vital-Durand, 1986; Poggio et al., 1977; DeValois et al., 1982; Foster et al., 1985). In human subjects, psychophysical and electrophysiological evidence indicates that visual detectors are selective to limited bands of spatial frequencies (see Braddick et al., 1978, for review). There are few data that describe the development of spatial frequency tuning of single cells in the monkey visual system. Blakemore and Vital-Durand (1986) report various examples of response curves
[Figure 5 plot: VEP amplitude against mask spatial frequency (c/deg); panels for adult, 1 1/2 months and 3 1/2 months.]

Figure 5. Spatial frequency channels: effects on the amplitude of the VEP in response to a sinusoidal grating of constant spatial frequency (arrow) and moderate contrast (15-20%), square-wave reversed in contrast at 7 Hz, in the presence of a masking grating of high contrast, reversed in contrast at 6 or 9 Hz. Mean luminance: 6 cd/m2. A: adult data for 4 different spatial frequencies of the test stimulus. B: data from an infant 1 1/2 months old. C: data from an infant 3 1/2 months old. Different symbols indicate different experimental sessions.
of LGN cells as a function of stimulus spatial frequency, recorded from infant monkeys at different times from birth. In very young animals the responses of single cells peak at relatively low spatial frequencies and the high-frequency cut-off does not exceed 5 c/deg. However, the response functions show a clear low-frequency attenuation. Thus at least some LGN neurons have band pass spatial frequency characteristics even in the newborn monkey. The tuning then sharpens with age and the optimal spatial frequency as well as the cut-off of the best resolving cells move toward higher spatial frequencies. These findings refer to P cells. No data are available for the tuning characteristics of M cells in the infant monkey. Very little is known of the tuning characteristics of cortical neurons of infant monkeys. There seems to be some indication that in very young animals the spatial frequency tuning characteristics of cortical neurons have little low-frequency attenuation (Blakemore and Vital-Durand, 1983). The same is true in young kittens (Derrington and Fuchs, 1981). There have been two attempts to find evidence for spatial frequency channels in human babies, both using a masking procedure. One study (Fiorentini et al., 1983) reported spatial frequency selective effects of masking on the amplitude of VEPs in response to sinusoidal gratings of fixed spatial frequency, reversed in contrast at 7 Hz. The masking grating had a variable spatial frequency, either lower or higher than the test grating, and was reversed in contrast at a slightly different temporal rate. The amplitude of the VEP in response to the test stimulus was reduced in the presence of the masking stimulus by an amount that depended upon the difference in spatial frequency between the two stimuli (Figure 5). The second study (Banks, Stephen and Hartmann,
1985) applied the psychophysical preferential looking technique to investigate the effects of a narrow-band noise masker on the detectability of sinusoidal gratings of three different spatial frequencies. The two studies agree in showing that spatial frequency selectivity is present in infants 3 months old. The bandwidth of tuning at 1 c/deg at this age (Figure 5C) is comparable to that of adult tuning for higher spatial frequencies (Figure 5A). This finding can be understood in terms of the different spatial scales in the infant and adult foveae (Wilson, 1988). For younger infants, there is disagreement between the electrophysiological and the psychophysical studies. While in the former the data from one infant 6 weeks old show band pass tuning at 0.3 c/deg (Figure 5B), in the latter the average results of five 6-week-old infants indicate low pass tuning. Whether this discrepancy is due to the small sample tested, to group averaging or to methodological differences remains to be investigated. One possible reason could be found in the different temporal properties of the test stimuli used in the two experiments. Possibly, band pass spatial frequency tuning may become manifest at an earlier age with temporally modulated than with stationary stimuli because of different developmental rates of mechanisms with different temporal response properties. It should also be noted that, because of the contrast gain of the visual system (see for instance Figure 8), the effects of a masking stimulus on the contrast threshold (FPL experiment) may be expected to be considerably smaller than the effects on the response to a stimulus of suprathreshold contrast (VEP experiment).
Vernier acuity
Vernier acuity (the ability to detect the misalignment of two abutting lines or gratings) is a type of hyperacuity. In foveal vision, adult vernier thresholds can be an order of magnitude better than thresholds for grating resolution. In peripheral vision, on the contrary, vernier acuity drops much more steeply than grating acuity with increasing eccentricity (Westheimer, 1982; Levi et al., 1985). Bearing in mind the very small foveal thresholds for hyperacuity tasks in adults, one may be surprised to learn that in young infants vernier acuity evaluated behaviorally is lower than grating acuity (Shimojo and Held, 1987). This situation reverses rapidly, however, because vernier acuity develops at a higher rate than grating acuity (Figure 6A). Already at 3 to 4 months of age vernier acuity exceeds grating acuity (see Gwiazda et al., 1989a, for review). A difference in the rate of increase of grating acuity and vernier acuity can be expected merely on the basis of preneural factors, in particular of the quantum efficiency of the photoreceptors (Geisler, 1989; Banks and Bennet, 1988). That this is not the whole story, however, is suggested by two interesting facts about the development of vernier acuity. First, there is a sex difference in the rate of improvement of vernier acuity. Between 3 and 5 months females are better in vernier acuity than males (Held et al., 1984). No sex difference is observed for the development of grating acuity. Second, vernier acuity continues to improve in children up to 7 years of age (Figure 6B, squares), while grating acuity levels off much earlier (Figure 6B, circles). Apparently, the development of vernier acuity requires the maturation of structures or the development of processes beyond those responsible for the age-related increase in grating acuity. Perhaps all these factors mature simultaneously during an early life period, so that grating acuity and vernier acuity appear to have the same limiting factors.
A differential time course in the development of vernier acuity and grating acuity has been found also in infant monkeys (Kiorpes and Movshon, 1989). These findings parallel those in human infants, apart from the different time scale.
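The comparison between grating acuity and vernier acuity is easier to follow when both are expressed as visual angles. The Python sketch below is an illustration under stated assumptions, not a result from the studies cited: it uses the common convention that one stripe of a grating of f c/deg subtends 1/(2f) deg, the 30 c/deg grating acuity is a hypothetical round value chosen only for the example, and "an order of magnitude" is read as a factor of 10. The function name is likewise hypothetical.

```python
def stripe_width_arcmin(cycles_per_deg: float) -> float:
    """Width of one stripe (half a period) of a grating, in arcmin."""
    return 60.0 / (2.0 * cycles_per_deg)


# Hypothetical adult grating acuity, chosen only for illustration.
adult_grating_acuity = 30.0  # c/deg

# Finest resolvable stripe width, converted from arcmin to arcsec.
grating_threshold_arcsec = stripe_width_arcmin(adult_grating_acuity) * 60.0

# Vernier threshold if it is ten times finer than the grating threshold.
vernier_threshold_arcsec = grating_threshold_arcsec / 10.0

print(f"Grating threshold: {grating_threshold_arcsec:.0f} arcsec")
print(f"Vernier threshold: {vernier_threshold_arcsec:.0f} arcsec")
```

Under these assumptions the grating threshold is 60 arcsec and the corresponding vernier threshold is 6 arcsec, which shows why vernier acuity is classed as a hyperacuity.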
Binocular function and stereoacuity
Unlike most spatial acuities, which develop gradually after birth, stereopsis seems to emerge abruptly between 3 and 4 months of age. After this sudden onset, stereoacuity increases very rapidly during the next few weeks, to reach thresholds as low as 60 arcsec around six months of age (see van Sluyters et al., 1989, for review). Several years seem then to be required for stereoacuity to match adult values (Gwiazda et al., 1989a). Simultaneously with the onset of stereopsis there is evidence for the onset of another binocular function: infants start to prefer binocularly fusible stimuli to stimuli that in the adult produce binocular rivalry (such as vertical stripes in one eye and horizontal stripes in the other) (Shimojo et al., 1986; Gwiazda et al., 1989b). Other forms
[Figure 6 plot: acuity against age (months in panel A); curves labeled VERNIER, STEREO and GRATING.]

Figure 6. Development of grating acuity, stereoacuity and vernier acuity in infants (A) (top panel) and children (B) (bottom panel). Vernier acuity and stereoacuity for some older infants (A), older children and adults (B) were limited by the maximum resolution of the display. Copyright 1989, Canadian Psychology Association. Reprinted by permission.
of binocular function have been investigated in infants, for instance the preference for random-dot stimuli correlated in the two eyes with respect to non-correlated stimuli. There is some controversy on whether this preference appears concomitantly with the onset of stereopsis (Smith et al., 1988) or appears earlier (Einzeman et al., 1989).
As for vernier acuity, there is a sex difference for the development of stereoacuity and of fusion preference (Bauer et al., 1986; Gwiazda et al., 1989a). Females show evidence for stereopsis and for fusion preference around 9-10 weeks, while males do not do so before 12-13 weeks. The sudden appearance of stereopsis and fusion around 3 months of age has been suggested to reflect the process of segregation of monocular inputs to layer IV in the striate cortex (Held, 1985). There is some evidence that this should occur between 4 and 6 months in human infants (Hickey and Peduzzi, 1987). It seems rather unlikely, however, that the segregation process is confined to a very brief period of time. Possibly segregation of monocular inputs is a necessary prerequisite for binocular stereopsis, but other factors are involved.
[Figure 7 traces; panel label: DF, age 7 weeks.]

Figure 7. VEP responses to orientation reversal of a grating pattern in a 7-week-old infant (b). The lower trace represents the timing of stimulus reversals. The upper trace (a) represents for comparison VEPs recorded in response to appearance of a pattern. Reprinted by permission from Nature, Vol. 320, p. 618. Copyright (C) 1986 Macmillan Magazines Ltd.
Discrimination of orientation
In the newborn monkey, cells in the visual cortex show a considerable degree of orientation specificity and a system of orientation columns is already established at birth (Wiesel and Hubel, 1974). Since in higher mammals specificity for orientation is a property of cortical neurons that is not shared by neurons at lower levels of the visual system, it is of interest to know whether the human visual system also shows some kind of sensitivity for orientation of lines or contours. Two lines of research have been followed to investigate orientation discrimination in infants. Behavioral experiments based on the habituation paradigm (Maurer and Martello, 1980) provide evidence that human neonates can discriminate square wave gratings oriented at 90 deg from each other (Slater et al., 1988). The selectivity for orientation, however, is probably rather poor at birth. In the adult it is possible to evaluate the width of orientation channels using a masking procedure (Campbell and Kulikowski, 1966). Experiments applying the masking technique to babies of various ages suggest that tuning for orientation is very poor at one month of age, but that it improves between 2 and 4 months and remains constant thereafter (Held et al., 1989). On the whole, these behavioral experiments indicate that orientation selectivity is at least to some degree innate and that it probably reaches adult values much earlier than other visual functions. Somewhat different findings have been obtained following another line, namely by recording VEPs in response to patterns that periodically change in orientation (Braddick et al., 1986). Responses correlated with 90 deg shifts in orientation of the stimulus grating, occurring 8 times per second (Figure 7), could be recorded in infants 6 weeks old, but not in younger infants. It appears, however, that the age of onset of orientation-specific VEPs depends on the temporal rate of orientation reversals.
For reversals occurring 3 times per second, VEP responses could be obtained in infants 3 weeks old, earlier than for an 8 Hz rate of reversal (Braddick et al., 1989). This was confirmed using the habituation paradigm: at one month of age infants are sensitive to 90 deg shifts of orientation if these occur at a rate of 3 Hz, but not at a rate of 8 Hz. These findings seem to reconcile the apparent controversy between behavioral and electrophysiological studies of orientation discrimination in very young infants. Probably the VEP responses to a change in stimulus orientation reveal the relative immaturity of the temporal characteristics of orientation-selective neural mechanisms. Both sets of results are consistent with the presence of some form of orientation discrimination very early in life. And both provide evidence that orientation selectivity improves rapidly after birth, approaching adult values by 3-4 months of age (Atkinson et al., 1988; Held et al., 1989). It is possible, however, that the psychophysical experiment, based on masking effects, reveals the development of inhibitory interactions that are not necessarily involved in the VEP experiment.
Motion Perception
Motion perception has been studied in infants both psychophysically and electrophysiologically. Psychophysical studies using the preferential looking procedure indicate that infants of 3 months of age and older show a preference for a moving over a stationary pattern, provided the pattern velocity exceeds a threshold (see Dannemiller and Freedland, 1989). In 3- to 4-month-old infants the minimum velocity threshold for drifting gratings of low spatial frequency is of the order of 3-5 deg/s (Aslin, 1988; Dannemiller and Freedland, 1989). Infants of 2 months or younger either do not show any preference for a moving stimulus (Dannemiller and Freedland, 1989) or have very poor sensitivity to motion, at least at low velocities (Kaufmann et al., 1985). VEP studies agree with the behavioral findings indicating a relatively late development of motion sensitivity. A VEP response to motion can be obtained by reversing at a fixed temporal frequency the direction of motion of a random dot pattern (which also jumps incoherently at and between reversals at a high temporal frequency). Responses time-locked to the reversals of motion direction, and not to the intervening jumps, are considered to reflect the activity of mechanisms sensitive to the direction of motion, and not simply to pattern change (Wattam-Bell, 1987). In adults, motion-specific VEPs are recordable in a large range of stimulus velocities (5 to 30 deg/s), and peak around 15 deg/s (Wattam-Bell, in press). In infants younger than 10 weeks, no motion-specific responses are recordable, even at low velocities (5 deg/s), although responses to the pattern jumps are clearly present. Motion-specific VEPs emerge around 10 weeks of age at low velocities, but still later for stimuli of higher velocity. The highest velocity at which a motion-specific VEP is obtained increases with age (Wattam-Bell, in press). Oculomotor responses are also rather immature in very young infants.
For instance, smooth pursuit can be observed around 10 weeks of age for linear motion of a target at low velocities (Shea and Aslin, 1988) but not at higher velocities, where it is replaced by a series of saccadic eye movements (Aslin, 1987). Optokinetic responses are immature at birth: monocular optokinetic nystagmus (OKN) can be elicited by stimuli moving in the temporal-nasal direction, but not in the opposite direction (Atkinson, 1979). It is not until after 3 months of age that the monocular OKN can be driven in either direction. The immaturity of smooth pursuit can be explained at least in part by the lack or immaturity of motion perception. The OKN asymmetry has been ascribed to the lack of appropriate cortical inputs to the motor centers responsible for the optokinetic response (Atkinson, 1984; van Hof-van Duin, 1978). In conclusion, both sensory and oculomotor responses to moving stimuli seem to be immature at birth and to emerge somewhat later in comparison with other visual responses. If a longer age span is considered, however, it appears that motion-specific responses may complete their maturation years in advance of some pattern-specific responses (De Vries et al., 1989).
Inhibitory interactions
Inhibition plays a crucial role in shaping the response of single visual neurons and in controlling the interplay of stimulus-evoked activity in different neurons. Signs of inhibitory phenomena in the intact visual system of adult human subjects are found for instance in subthreshold interactions, in masking phenomena and in the low-frequency cut-off of the CSF. In cats, surround inhibition in retinal and LGN receptive fields is present, but weak, at birth and it develops gradually during the early postnatal period (Hamasaki and Flynn, 1977; Rusoff and Dubin, 1977; Berardi and Morrone, 1984). Inhibition must be present at least to some degree in the LGN neurons of the neonatal monkey, because their spatial frequency tuning characteristics are band pass in shape (Blakemore and Vital-Durand, 1986). Apparently, this type of inhibition is less mature in the neonatal visual cortex (Blakemore and Vital-Durand, 1983).
[Figure 8 plot: VEP amplitude against contrast of test stimulus (0.01 to 0.3); panel labels include 3.6 months and 10 months.]
Figure 8. Development of cross-orientation inhibition in one infant at three different ages. VEPs in response to contrast-reversal of a sinusoidal grating of low spatial frequency are plotted against the stimulus contrast (circles). The other symbols indicate VEPs in response to the same stimulus in the presence of a masking grating reversed in contrast at a different temporal frequency and either parallel (squares) or orthogonal (triangles) to the test grating. Note that at 4 months the parallel mask, but not the orthogonal mask, attenuates significantly the VEP, indicating orientation selectivity at that age. At 10 months, both the parallel and the orthogonal masks affect the VEP amplitudes, though in different ways, as occurs in the adult. Reprinted by permission from Nature, Vol. 321, p. 235. Copyright (C) 1986 Macmillan Magazines Ltd.

In human infants we have seen that some types of inhibitory phenomena are present at an early age. For instance, it is possible to suppress the response to a grating of a certain spatial frequency by
a mask of a different spatial frequency, but the masking effect is less strong than in the adult (Fiorentini et al., 1983; Banks et al., 1985). The CSF of young infants was reported to be low pass in shape, but probably this is an artifact due to not having used sufficiently low spatial frequencies. Effects revealed by VEPs in response to complex visual stimuli, attributed in the adult to lateral inhibitory interactions, seem to be present in 8-week-old infants (Sokol, Zemon and Moskowitz, personal communication). There is evidence, however, that more subtle inhibitory effects, such as those that occur between orthogonal gratings at suprathreshold contrasts, do not emerge until six to eight months of age (Figure 8) (Morrone and Burr, 1986). This phenomenon, known as "cross-orientation inhibition", is present in single cells of the cat visual cortex (Morrone et al., 1982) and is believed to reflect GABA-mediated interactions among cells tuned to different orientations (Morrone, Burr and Speed, 1987). Also VEPs evoked by windmill-dartboard stimuli, which in the adult have been attributed to short-range lateral interactions (Zemon and Ratliff, 1982), do not appear earlier than 5 months from birth, and are still very immature at this age (Moskowitz and Sokol, 1989). The development of interactions between orthogonal stimuli and other types of complex stimulus interactions probably requires the refinement and progressive selectivity of horizontal connections in the visual cortex, like those observed in visual cortical areas of the monkey (see Gilbert, 1985, for a review). These have been found to develop after birth (Burkhalter and Bernardo, 1989) and may rely upon the development of dendritic trees in the upper cortical layers (Becker et al., 1984) as well as upon the progressive selectivity of intracortical synaptic connections, likely to start around the 8th month of age (Garey and de Courten, 1983).
Color vision
The development of color vision in infants has been recently reviewed by Teller and Bornstein (1987). They conclude their overview of all the relevant literature with a few important established facts. First, photopic and scotopic spectral sensitivities are mature within the first or second postnatal month. Second, infants in the second month of life have trichromatic color vision, since they can do both Rayleigh discriminations (and therefore are neither protanopes nor deuteranopes) and tritan discriminations (and therefore must have a third type of cones, sensitive to short wavelengths). The three cone types with highest sensitivity in the long (L), medium (M) and short (S) range of wavelengths, respectively, are likely to have spectral absorption properties not dissimilar to adult photopigments. Third, infants 4 months old categorize wavelengths of the visible spectrum much in the same way as adults. Infant categorization is based on a habituation-dishabituation paradigm and grouping occurs for four ranges of wavelengths corresponding to the spectral regions that adults categorize as blue, green, yellow and red, by hue naming (Bornstein et al., 1976; Boynton and Gordon, 1965). However, neonates and very young infants (3 weeks old) seem to have an immature S cone system (Varner et al., 1985; Adams et al.,
1986). Moreover, 3-week-old infants fail to discriminate monochromatic lights (at mesopic luminances) that are discriminated by 7-week-olds (Clavadetscher et al., 1988). Thus the ability to discriminate colors, at least under mesopic conditions, emerges between 3 and 7 weeks after birth. Color vision therefore seems to develop early in infants and to have the main characteristics of trichromatic vision. This general statement is supported by further experimentation, which has also
[Figure 9 plot not reproduced. Horizontal axis: test wavelength (nm).]
Figure 9. Detection thresholds for various monochromatic lights on a monochromatic (580 nm) adapting background, obtained with the preferential looking procedure in 3-month-old infants (circles) and with the yes-no procedure in adult subjects (triangles). The 8 deg circular stimulus was either sharply focused (closed symbols) or blurred (open symbols). The arrow indicates the adapting wavelength.
uncovered other important aspects of color vision. Brown and Teller (1989) report an interesting experiment in which the spectral sensitivity of 3-month-old infants was evaluated at five different wavelengths in the range 540-650 nm. In the middle of this range the spectral sensitivity curve presents a notch, as in adults, that reflects the non-additivity of the responses of L and M cones (Figure 9). These findings are consistent with a color-opponent model. They therefore provide evidence that at 3 months after birth color-opponent mechanisms can be functional (with the caveat that infant color opponency has not been proven identical to adult opponency). Heterochromatic flicker photometry, a means of evaluating photopic spectral sensitivity, is usually performed with uniform field illumination. Anstis and Cavanagh (1983) have devised a technique for matching the luminances of gratings of different colors. The gratings are presented in a temporal sequence that produces apparent motion in one or the other direction, according to their relative luminances. In adults this method yields spectral sensitivity curves equivalent to those evaluated with flicker photometry. Its usefulness for infant testing comes from the fact that patterns that are not isoluminant produce optokinetic nystagmus (OKN) in one direction or the other, while at isoluminance the pattern appears practically stationary and no OKN is elicited. Thus isoluminance can be determined by observing the presence and direction of the optokinetic eye movements of infants. Application of this method (Maurer et al., 1989) and of a variant of it (Teller and Lindsey, 1989) has confirmed that the relative spectral sensitivities of 1- to 3-month-old infants are remarkably similar to those of adults. It is important to recall, however, that the conditions of the motion-nulling OKN technique are likely to reveal the properties of a photopic mechanism of the peripheral retina.
Previous disagreement about a difference between infant and adult spectral sensitivity curves in the short-wavelength region of the spectrum may derive from methodological differences. Another possible reason is the lower density of the ocular media of infants compared with adults. Hansen and Fulton (1989) measured absolute sensitivity at the short-wavelength end of the spectrum (401 nm) as well as at 561 nm, in 10-week-old infants. Comparison of the sensitivities at the two wavelengths in infants and adults showed that infants are relatively more sensitive than adults at 401 nm. Since human rhodopsin absorbs these two wavelengths equally well, the difference between infants and adults has to be ascribed to a higher optical density of the adult eye at 401 nm. Most of this difference is probably due to the lens and is likely to decrease progressively with age. In conclusion, there is convergent evidence that the photopic spectral sensitivity of young infants is very similar to that of the adult and that in the second month after birth three types of cones are active and some color discriminations are possible. This does not at all mean that color vision is completely developed at this age. For instance, infant color discriminations require large color contrast and also stimuli of large size (Packer et al., 1984; Adams et al., in press). This suggests that the spatial characteristics of the infant color system are different from those of the adult.
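The reasoning behind the comparison at 401 nm and 561 nm can be sketched numerically. What follows is only an illustrative sketch, not the analysis of Hansen and Fulton: it assumes that absorption by the ocular media subtracts from log sensitivity, that the receptoral sensitivity ratio between the two wavelengths is the same in infants and adults (as the rhodopsin argument above implies), and that 561 nm can serve as the reference wavelength. The function name and all sensitivity values are hypothetical.

```python
import math

def media_density_difference(infant_401, infant_561, adult_401, adult_561):
    """Adult-minus-infant ocular media density at 401 nm, in log10 units,
    taken relative to 561 nm. Inputs are sensitivities (1/threshold)
    expressed in any common unit."""
    infant_log_ratio = math.log10(infant_401 / infant_561)
    adult_log_ratio = math.log10(adult_401 / adult_561)
    # If the receptors absorb both wavelengths equally well, whatever
    # remains of the difference is attributed to the ocular media.
    return infant_log_ratio - adult_log_ratio

# Hypothetical sensitivities, chosen only to illustrate the arithmetic.
density_diff = media_density_difference(
    infant_401=80.0, infant_561=100.0,
    adult_401=20.0, adult_561=100.0,
)
print(f"extra adult density at 401 nm: {density_diff:.2f} log units")
```

With these made-up numbers the infant is four times more sensitive than the adult at 401 nm relative to 561 nm, which corresponds to about 0.6 log units of additional adult media density.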
Spatial and temporal characteristics of color contrast sensitivity
Visual potentials evoked by chromatic stimuli have offered a means to investigate the development of color contrast sensitivity in infants, in both the spatial and the temporal frequency domain (Morrone et al., 1989). The stimulus was a periodic pattern obtained by superimposing two sinusoidal gratings of the same spatial frequency at crossed orientations. It appeared as a plaid pattern with very blurred contours. Two such patterns were generated by the red and the green guns of a TV monitor, and were either presented separately or superimposed with a 180 deg spatial phase shift to form a red-green plaid. The relative luminances of the red and green components could be varied at will, from 100% red to 100% green, while keeping constant the total mean luminance and the equal contrasts of the two components. If the proportion of red to green is varied continuously, there must be a value of this ratio at which the red and green components are matched in brightness. This is the so-called isoluminant point.
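The construction of the red-green plaid can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the stimulus code of the original study: the two gratings are assumed to be orthogonal (horizontal and vertical), luminances are assumed to add linearly, and the function names are hypothetical. The default mean luminance (16 cd/m2) and contrast (30%) are taken from the adult condition described in the caption of Figure 10.

```python
import math

def plaid(x_deg, y_deg, sf_cpd):
    """Sum of two orthogonal sinusoidal gratings, scaled to [-1, 1]."""
    phase_x = 2 * math.pi * sf_cpd * x_deg
    phase_y = 2 * math.pi * sf_cpd * y_deg
    return 0.5 * (math.sin(phase_x) + math.sin(phase_y))

def red_green_luminance(x_deg, y_deg, ratio_red,
                        mean_lum=16.0, contrast=0.3, sf_cpd=1.0):
    """Luminance of the red and green components at one point.
    ratio_red is the red mean luminance divided by the total."""
    p = plaid(x_deg, y_deg, sf_cpd)
    red = ratio_red * mean_lum * (1 + contrast * p)
    # The minus sign implements the 180 deg spatial phase shift.
    green = (1 - ratio_red) * mean_lum * (1 - contrast * p)
    return red, green

def luminance_contrast(ratio_red, contrast=0.3):
    """Contrast of the summed luminance pattern; zero at ratio 0.5."""
    return contrast * abs(2 * ratio_red - 1)

for ratio in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(ratio, round(luminance_contrast(ratio), 3))
```

In this idealized sketch the summed luminance is uniform at a ratio of exactly 0.5. For a real observer the isoluminant ratio depends on individual spectral sensitivity, so it has to be found empirically, for example from the minimum of the VEP amplitude or by flicker photometry.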
[Figure 10 plots not reproduced. Horizontal axis: ratio of red to total luminance.]
Figure 10. VEP amplitude as a function of the ratio of red-to-total mean luminance in the stimulus pattern. A: adult subject, 1 c/deg stimulus, mean luminance: 16 cd/m2, contrast: 30%, square-wave reversal: 7.5 Hz. B and C: infant PAB, at two ages. Stimulus spatial frequency: 0.1 c/deg, mean luminance: 16 cd/m2, contrast: 90%, square-wave reversal rate: 2 Hz (B), 3 Hz (C).
In adults, plaid patterns reversed in contrast (at a temporal frequency of, say, 5-6 Hz) evoke a second harmonic VEP response for any value of the ratio of the red luminance to the total luminance. At relatively low contrasts, the VEP amplitude has a minimum for the ratio corresponding to the isoluminant point, as determined by flicker photometry (Figure 10, A). As is done with the
pattern-reversal VEPs in response to luminance contrast, it is also possible to evaluate a contrast threshold by extrapolation of VEP amplitude against chromatic (isoluminant) contrast. This threshold coincides with the psychophysical contrast threshold. Not surprisingly, therefore, application of this method to adults yields contrast sensitivity curves for chromatic contrast that are very similar in shape to those obtained psychophysically (Mullen, 1985). This method has been applied recently to investigate the development of chromatic contrast sensitivity in infants (Morrone et al., 1989). Chromatic VEP responses to patterns of very low spatial frequency (0.1 c/deg) reversed in contrast at a low temporal frequency do not emerge prior to 5 to 7 weeks of age (Figure 10, B and C). Responses at higher spatial and temporal frequencies have a later onset. Chromatic contrast sensitivity increases progressively with age between 2 and 6 months and approaches adult values earlier than does contrast sensitivity to luminance-modulated patterns of the same low spatial and temporal frequencies (Figure 11, top). The same is true for chromatic acuity, i.e. the highest spatial frequency at which a chromatic (isoluminant) VEP can be obtained: once chromatic responses have emerged, it increases more rapidly than VEP acuity for isochromatic, luminance-modulated gratings (Figure 11, bottom). Sensitivity for chromatic contrast with isoluminant patterns is relatively low because of the broad and largely overlapping action spectra of the photopigments. This fact, together with the immaturity of photoreceptors in the infant retina, has been considered to be the main factor responsible for the difference between contrast sensitivity for luminance contrast and for color contrast in young infants (Banks and Bennett, 1988).
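The extrapolation used to estimate a VEP contrast threshold can be illustrated as follows. This is a hedged sketch rather than the procedure of any particular laboratory: it assumes that VEP amplitude grows linearly with the logarithm of contrast near threshold, fits a least-squares line, and takes the contrast at which the line crosses zero amplitude as the threshold. The function name and the data points are hypothetical.

```python
import math

def extrapolated_threshold(contrasts, amplitudes):
    """Contrast at which a straight line fitted to amplitude versus
    log10(contrast) reaches zero amplitude."""
    xs = [math.log10(c) for c in contrasts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(amplitudes) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(xs, amplitudes))
    slope = sxy / sxx
    if slope <= 0:
        raise ValueError("amplitude does not increase with contrast")
    intercept = mean_y - slope * mean_x
    return 10 ** (-intercept / slope)

# Hypothetical amplitudes (microvolts) lying on a line that reaches
# zero at 2% contrast.
contrasts = [0.05, 0.1, 0.2, 0.4]
amplitudes = [4.0 * math.log10(c / 0.02) for c in contrasts]

threshold = extrapolated_threshold(contrasts, amplitudes)
sensitivity = 1.0 / threshold
print(f"threshold = {threshold:.3f}, sensitivity = {sensitivity:.1f}")
```

Contrast sensitivity is the reciprocal of the extrapolated threshold. With real recordings only the points above the noise level would normally enter the fit, a selection step that this sketch leaves out.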
However, other factors must play a role in the development of infant chromatic contrast sensitivity, since the rate of increase in sensitivity with age is different for chromatic and luminance contrast. It is likely that mechanisms responsible for encoding and processing color information at post-receptoral levels, e.g. color-opponent receptive fields, are immature at birth and develop later, at least in part independently from the developmental processes involved in the increase of luminance contrast sensitivity with age. A recent report seems to disagree with this conclusion, however. Using the VEP swept-contrast technique (Norcia et al., 1985) for evaluating contrast sensitivity, Allen et al. (1990) report that the contrast sensitivity for isoluminant stimuli stands in the same ratio to the sensitivity for luminance contrast in young infants as in adults, suggesting that the reduced infant sensitivity results entirely from preneural factors. Further investigation seems to be required to resolve this controversy.
Discussion
Not all possible aspects of human visual development have been covered in this chapter. For instance, visual attention, recognition of complex shapes and other cognitive aspects of vision have not been considered. Still, the picture that emerges from the previous sections is rather complex. The reader will be easily convinced that, as anticipated in the introduction, there is usually little ground for ascribing this or that aspect of human visual development to the maturation of one or
[Figure 11 plots not reproduced. Horizontal axis: age (weeks).]
Figure 11. Contrast sensitivity for a low spatial frequency (A) and acuity (B), evaluated from VEPs in response to red-green isoluminant gratings (open symbols) or to isochromatic (red-black or green-black) gratings (closed symbols) reversed in contrast at a low temporal frequency, plotted against age, for a group of infants 5 to 30 weeks old. Different symbols indicate different infants. Each point was obtained by extrapolation to zero amplitude of VEP amplitudes plotted against stimulus contrast (top) or spatial frequency (bottom). Points below the unit contrast sensitivity in the top graph indicate infants and ages at which no significant VEP could be obtained with isoluminant stimuli, although at the same age VEPs in response to luminance-contrast reversals were clearly present.
another neuronal population. It is clear, however, that different visual capabilities may emerge at different times after birth and that different visual functions may show different developmental trends. There is also convergent evidence from morphological and behavioral findings that postnatal visual development in human infants proceeds both serially, at successive stages of the visual pathways, and in parallel, along different neural streams. There have been recent proposals to differentiate an early phase of visual development, dominated by retino-tectal structures, from a later phase, when cortical functions become predominant and the visual activity mediated by the cortex starts to control the subcortical visuomotor functions (see Atkinson, 1984). This two-stage developmental process accounts for some facts, such as the separate emergence of the two opposite directions of OKN, but it comes up against the fact that newborns can discriminate patterns of different orientations. This behavioral performance has to rely upon neural mechanisms that respond selectively to different orientations. Orientational selectivity is a property of cortical neurons that is not shared, at least in non-human primates, by tectal neurons. Thus it has to be recognized that the visual cortex is active to some degree even in the newborn, though possibly very immature. Before discussing the possible functional effects of maturation at a cortical level, let us consider those aspects of vision that may be determined primarily by the development of structures peripheral to the visual cortex. This is the case, for instance, for visual acuity and spatial contrast sensitivity. There seems to be general agreement that contrast sensitivity for temporally modulated patterns of low spatial frequency is mediated by the M retino-cortical pathway, while color contrast sensitivity is subserved by cells in the P pathway (Merigan, 1989; Kaplan et al., 1990; Schiller et al., 1990; Mollon, 1990).
Scotopic pattern detection may involve primarily the activity of the M stream (Kaplan et al., 1990). Now, the contrast sensitivity for temporally modulated patterns of spatial frequency less than 1 c/deg develops early (Atkinson et al., 1974; Harris et al., 1976; Pirchio et al., 1978; Norcia et al., 1988, 1990), and at low luminances contrast sensitivity matches adult values earlier than at higher luminances (Fiorentini et al., 1980). This may be ascribed to an early functional maturation of M cells. On the other hand, if we consider how the contrast sensitivities for luminance and color contrast increase with age (Figure 11), it is tempting to jump to the conclusion that the development of the P system starts somewhat later but then proceeds faster than that of the M system. While the findings of Figure 11 are not inconsistent with this interpretation, other factors have to be taken into account. First, as mentioned in the previous section, the delayed onset of responses to color contrast compared with luminance contrast (Morrone et al., 1990) may be a consequence of the spectral properties of the photoreceptors: the largely overlapping action spectra of the L and M cones impose a limit on the maximum attainable color contrast. Thus, what might appear to be a delayed emergence of the neural system mediating color contrast sensitivity can largely be accounted for by preneural factors (Banks and Bennett, 1988). The different slopes of the two developmental curves of Figure 11, on the contrary, cannot be
accounted for by preneural factors, because these should affect the sensitivities for luminance and color contrast by the same amount. Secondly, a word of caution is needed about the interpretation of contrast-reversal VEPs in terms of separate contributions from the P and M systems. The steady-state VEP responses to contrast reversal, of either luminance contrast or color contrast, contain only the even harmonics of the contrast modulation and therefore represent non-linear components of the response. Second harmonic non-linearities have been observed in the responses of monkey M ganglion cells to flickering red-green uniform lights matched in luminance and have been ascribed to non-linear interactions between the L- and M-cone inputs to the non-color-opponent M cells (Lee et al., 1989). Before more is known about the neural origin of contrast-reversal VEPs, it would be premature to ascribe the isoluminant contrast-reversal VEPs exclusively to the P system. Some promising results for the differentiation of pure color-contrast VEPs from responses to luminance contrast are being obtained from patterns modulated in contrast in the on-off mode at low temporal frequencies (Fiorentini et al., 1990). The preliminary findings of these experiments are consistent with those obtained with pattern-reversal isoluminant VEPs, indicating a differential development of color- and luminance-contrast responses. As to contrast sensitivity at medium and high spatial frequencies, there is no general agreement whether in the adult this is subserved mainly by the M or the P system. It is in this range that the infant data obtained under different experimental conditions differ most (see for example Figs. 1 and 4).
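The statement that a response containing only even harmonics must be non-linear can be checked with a toy model. The sketch below is an illustration only and makes no claim about the actual generators of the VEP: it assumes a sinusoidal contrast modulation and uses full-wave rectification as a stand-in for any mechanism that responds identically to the two polarities of the pattern. All names are hypothetical.

```python
import math

def harmonic_amplitude(signal, harmonic):
    """Amplitude of one harmonic of a signal sampled over exactly
    one stimulus cycle (direct discrete Fourier sum)."""
    n = len(signal)
    re = sum(v * math.cos(2 * math.pi * harmonic * k / n)
             for k, v in enumerate(signal))
    im = sum(v * math.sin(2 * math.pi * harmonic * k / n)
             for k, v in enumerate(signal))
    return 2 * math.hypot(re, im) / n

n_samples = 256
# One cycle of contrast modulation; the sign codes pattern polarity.
modulation = [math.sin(2 * math.pi * k / n_samples)
              for k in range(n_samples)]

linear_response = modulation                       # follows the stimulus
rectified_response = [abs(m) for m in modulation]  # polarity-blind

for h in (1, 2, 3, 4):
    print(h,
          round(harmonic_amplitude(linear_response, h), 4),
          round(harmonic_amplitude(rectified_response, h), 4))
```

The linear response has energy only at the fundamental, whereas the polarity-blind response repeats twice per stimulus cycle and therefore has energy only at the even harmonics. This is why a pure second-harmonic VEP cannot by itself identify which pathway produced it.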
In particular, VEP contrast sensitivities obtained with the swept-contrast technique are much higher than those obtained in a number of different laboratories with the extrapolation method of Campbell and Maffei (1970). The latter are generally in good agreement with each other. Whether this is a peculiarity of the swept-contrast technique or is due to the much higher luminance employed (see Figure 4) remains to be clarified. It is noteworthy, however, that peak contrast sensitivity at low photopic luminance (around 10 at 10 weeks, according to most published data) deviates from the square-root law, unlike the contrast sensitivities found at low mesopic and high photopic luminances, which stand approximately in the same ratio as the square roots of the respective luminances. One may speculate that contrast sensitivity results from the combined activity of different neuronal populations with different sensitivities, and that different experiments may reveal preferentially the contribution of one or the other of these populations. In summary, the data on the infant CSF suggest that the M system, considered to be responsible for contrast detection of temporally modulated patterns of low spatial frequency, develops early, possibly during the first few months of life. This may appear difficult to reconcile with the fact that in the monkey, cells of the magnocellular LGN layers have to undergo a greater increase in contrast sensitivity during development than the cells of the parvocellular layers (Blakemore and Hawken, 1985). On the other hand, anatomical data on the plasticity of segregation of monocular inputs to the monkey striate cortex indicate that the postnatal period in which plastic changes can be induced in layer IVC, by reversal of monocular deprivation, is shorter
for the magnocellular inputs to layer IVC-alpha than for the parvocellular inputs to layer IVC-beta (LeVay et al., 1980). This is consistent with a shorter developmental period of the magnocellular, compared with the parvocellular, pathway. Visual acuity takes much longer to develop fully than contrast sensitivity at lower spatial frequencies, and the major constraint on infant visual acuity is probably the immaturity of the retina, and in particular of the fovea. Whether at each age the resulting acuity reflects primarily the properties of the P system, as seems to be the case for the adult monkey, or whether the M system also contributes to spatial resolution in infancy requires further investigation. The visual functions considered so far define the lowest values of luminance and color contrast, in the spatial and temporal frequency domain, below which no vision is possible. Probably these threshold characteristics result from constraints imposed already at the input to the visual cortex and/or at the earliest stages of cortical processing. Further processing, however, is required for the perception of suprathreshold stimuli and for guiding visuomotor responses. Various visual functions reviewed in the previous sections imply cortical processing, e.g. orientation discrimination, motion perception, stereopsis and, possibly in part, vernier acuity. Orientation discrimination, at least in a crude form, is present at birth and rapidly becomes more selective. Discrimination of moving from stationary patterns, VEP responses to reversal of motion direction, and smooth oculomotor pursuit have a later onset. In the monkey, processing of motion information relevant both for the perception of moving objects and for the control of smooth pursuit seems to proceed primarily along the cortical stream leading from V1 to MT (Newsome et al., 1985; Livingstone and Hubel, 1988; Newsome and Paré, 1988; Schiller et al., 1990), which receives its major input from the M pathway.
Accordingly, one might interpret the onset of motion responses during the third and fourth month of life as a sign of the emergence of cortical activity along this route. Orientation and color information required for the perception of other stimulus attributes are likely to be processed primarily along the cortical stream from V1 through V2 and V4 to IT (Mishkin et al., 1983; Livingstone and Hubel, 1988; Merigan, 1989; Schiller et al., 1990). The early presence of orientation selectivity and of color discrimination in infants might indicate an early functionality of this route, but we do not know how long it takes to reach its full potential. There may be various developmental stages, as suggested by the fact that the orientation and spatial frequency channels, which provide the basic machinery for a multiscale analysis of form, are present shortly after birth, while more complex stimulus interactions emerge later. Judging from the latest findings on stereo- and vernier-acuity in children, the development of some cortical processes may cover a period of several years, outlasting the maturation of more peripheral stages. In conclusion, there are indications that visual development proceeds at different rates and/or emerges at different ages for visual functions that are likely to be mediated by different neural streams between the retina and the primary visual cortex. As to the functions that imply further cortical processing, we are just beginning to acquire some knowledge of their age of onset and duration of development.
Although there are indications that these may differ for different visual capacities, an interpretation in terms of different intracortical pathways must await future research.
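As a numerical aside on the square-root law invoked in this Discussion when contrast sensitivities at different luminances are compared, the following sketch shows what the law predicts. It is an illustration only: it assumes that sensitivity scales exactly with the square root of mean luminance, and the function names and all numbers are hypothetical rather than taken from the studies cited.

```python
import math

def predicted_sensitivity(ref_sensitivity, ref_luminance, target_luminance):
    """Contrast sensitivity expected at target_luminance if sensitivity
    scales with the square root of mean luminance."""
    return ref_sensitivity * math.sqrt(target_luminance / ref_luminance)

def deviation_factor(observed, ref_sensitivity, ref_luminance, target_luminance):
    """Observed sensitivity divided by the square-root-law prediction;
    a value of 1.0 means the law holds exactly."""
    predicted = predicted_sensitivity(ref_sensitivity, ref_luminance,
                                      target_luminance)
    return observed / predicted

# Hypothetical example: sensitivity 4 at 1 cd/m2, tested again at 100 cd/m2.
predicted = predicted_sensitivity(4.0, 1.0, 100.0)
factor = deviation_factor(20.0, 4.0, 1.0, 100.0)
print(predicted, factor)
```

A hundredfold increase in luminance predicts a tenfold increase in sensitivity. A deviation factor well below or above 1.0, like the hypothetical 0.5 here, is the kind of departure from the law described for the low photopic data.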
References
Abramov, I., Gordon, J., Hendrickson, A., Hainline, L., Dobson, V. and LaBossiere, E. (1982). The retina of the newborn human infant. Science, 217, 265-267.
Adams, R.J., Maurer, D. and Davis, M. (1986). Newborn's discrimination of chromatic from achromatic stimuli. Journal of Experimental Child Psychology, 41, 267-281.
Allen, D., Banks, M.S., Norcia, A.M. and Shannon, L. (1990). Human infants' VEP responses to isoluminant stimuli. Investigative Ophthalmology and Visual Science, Suppl., 31, 10.
Aslin, R.N. (1987). Motor Aspects of Visual Development in Infancy. In P. Salapatek and L. Cohen (Eds.), Handbook of Infant Perception, Vol. 1: From Sensation to Perception, pp. 43-113. Academic Press, Orlando, U.S.A.
Aslin, R.N., Shea, S.L. and Gallipeau, J.M. (1988). Motion threshold in 3-months-old infants. Investigative Ophthalmology and Visual Science, 29, 26.
Atkinson, J. (1979). Development of optokinetic nystagmus in the human infant and monkey infant: an analog of development in kittens. In R.D. Freeman (Ed.), Developmental Neurobiology of Vision, pp. 277-287. Plenum Press, New York, USA.
Atkinson, J. (1984). Human visual development over the first six months of life: a review and a hypothesis. Human Neurobiology, 3, 61-74.
Atkinson, J., Braddick, O. and Braddick, F. (1974). Acuity and contrast sensitivity in infant vision. Nature, 247, 403-404.
Atkinson, J., Braddick, O. and Moar, K. (1977). Development of contrast sensitivity over the first 3 months of life in the human infant. Vision Research, 17, 1037-1044.
Atkinson, J., French, J. and Braddick, O. (1981). Contrast sensitivity function of preschool children. British Journal of Ophthalmology, 65, 525-529.
Atkinson, J., Hood, B., Wattam-Bell, J., Anker, S. and Tricklebank, J. (1988). Development of orientation discrimination in infancy. Perception, 17, 587-595.
Atkinson, J. and Braddick, O.J. (1990). The developmental course of cortical processing streams in the human infant. In C. Blakemore (Ed.),
Vision: Coding and Efficiency, pp. 247-253. Cambridge University Press, Cambridge, U.K.
Banks, M.S. and Bennett, P.J. (1988). Optical and photoreceptor immaturities limit the spatial and chromatic vision of human neonates. Journal of the Optical Society of America, A5, 2059-2079.
Banks, M.S. and Dannemiller, J.L. (1987). Infant visual psychophysics. In P. Salapatek and L.B. Cohen (Eds.), Handbook of Infant Perception, pp. 115-184. Academic, New York.
Banks, M.S. and Salapatek, P. (1978). Acuity and contrast sensitivity in 1-, 2- and 3-month-old human infants. Investigative Ophthalmology and Visual Science, 17, 361-365.
Banks, M.S., Stephens, B.R. and Hartmann, E.E. (1985). The development of basic mechanisms of pattern vision: spatial frequency channels. Journal of Experimental Child Psychology, 40, 501-527.
Bauer, J., Shimojo, S., Gwiazda, J. and Held, R. (1986). Sex difference in the development of binocularity in human infants. Investigative Ophthalmology and Visual Science, 27, 265.
Beazley, L.D., Illingworth, A.J. and Greer, D.V. (1980). Contrast sensitivity in children and adults. British Journal of Ophthalmology, 64, 863-866.
Becker, L.E., Armstrong, D.L., Chan, F. and Wood, M.M. (1984). Dendritic development in human occipital cortical neurons. Developmental Brain Research, 13, 117-124.
Berardi, N. and Morrone, M.C. (1984). Development of gamma-aminobutyric acid mediated inhibition of X cells of the cat lateral geniculate nucleus. Journal of Physiology, 357, 525-537.
Blakemore, C. and Hawken, M. (1985). Contrast sensitivity of neurones in the lateral geniculate nucleus of the neonatal monkey. Journal of Physiology, 369, 37P.
Blakemore, C. and Vital-Durand, F. (1983). Development of contrast sensitivity by neurones in monkey striate cortex. Journal of Physiology, 334, 18-19P.
Blakemore, C. and Vital-Durand, F. (1986). Organization and post-natal development of the monkey's Lateral Geniculate Nucleus. Journal of Physiology, 380, 453-491.
Boothe, R.G., Williams, R.A., Kiorpes, L. and Teller, D.Y. (1980). Development of contrast sensitivity in infant Macaca nemestrina monkeys. Science, N.Y., 208, 1290-1292.
Boothe, R.G., Kiorpes, L., Williams, R.A. and Teller, D.Y. (1988). Operant measurements of contrast sensitivity in infant macaque monkeys during normal development. Vision Research, 28, 387-396.
Bornstein, M.H., Kessen, W. and Weiskopf, S. (1976). Color vision and hue categorization in young human infants. Journal of Experimental Psychology: Human Perception and Performance, 2, 115-129.
Boynton, R.M. and Gordon, J. (1965).
Bezold-Bruecke hue shift measured by color-naming technique. Journal of the Optical Society of America, 55, 78-86.
Braddick, O.J., Atkinson, J., Wattam-Bell, J. and Hood, B. (1989). Characteristics of orientation selective mechanisms in early infancy. Investigative Ophthalmology and Visual Science, 30, 313.
Braddick, O., Campbell, F.W. and Atkinson, J. (1978). Channels in Vision. In R. Held, H.W. Leibowitz and H.L. Teuber (Eds.), Handbook of Sensory Physiology, Vol. VIII: Perception, pp. 1-38. Springer-Verlag, Berlin.
Braddick, O.J., Wattam-Bell, J. and Atkinson, J. (1986). Orientation-specific responses develop in early infancy. Nature, 320, 617-619.
Brown, A.M., Dobson, V. and Maier, J. (1987). Visual acuity of human infants at scotopic, mesopic and photopic luminances. Vision Research, 27, 1845-1858.
Brown, A.M. and Teller, D.Y. (1989). Chromatic opponency in 3-month-old human infants. Vision Research, 29, 37-45.
Burkhalter, A. and Bernardo, K.L. (1989). Development of local connections in human visual cortex. Society for Neuroscience Abstracts, 15, 2.
Campbell, F.W. and Kulikowski, J. (1966). Orientational selectivity of the human visual system. Journal of Physiology, 187, 435-445.
Campbell, F.W. and Maffei, L. (1970). Electrophysiological evidence for the existence of orientation and size detectors in the human visual system. Journal of Physiology, 207, 635-652.
Campbell, F.W. and Robson, J.G. (1968). On the application of Fourier analysis to the visibility of gratings. Journal of Physiology, 197, 551-556.
Clavadetscher, J.E., Brown, A.M., Ankrum, C. and Teller, D.Y. (1988). Spectral sensitivity and chromatic discriminations in 3- and 7-week-old human infants. Journal of the Optical Society of America, A5, 2093-2105.
Courage, M.L. and Adams, R.J. (1990). The early development of visual acuity in the binocular and monocular visual field. Infant Behavior and Development.
Crook, J.M., Lange-Malecki, B., Lee, B.B. and Valberg, A. (1988). Visual resolution of macaque retinal ganglion cells. Journal of Physiology, 396, 205-224.
Dannemiller, J.L. and Freedland, R.L. (1989). The detection of slow stimulus movement in 2- to 5-month-olds. Journal of Experimental Child Psychology, 47, 337-355.
Derrington, A.M. and Fuchs, A. (1981). The development of spatial-frequency selectivity in kitten striate cortex. Journal of Physiology, 316, 1-10.
Derrington, A.M. and Lennie, P. (1984). Spatial and temporal contrast sensitivities of neurones in lateral geniculate nucleus of macaque. Journal of Physiology, 357, 219-240.
DeValois, R.L., Albrecht, D.G. and Thorell, L.G. (1982). Spatial frequency selectivity of cells in Macaque visual cortex. Vision Research, 22, 545-559.
De Vries, M., Van Dijk, B. and Spekreijse, H. (1989). Motion onset-offset VEPs in children. Electroencephalography and Clinical Neurophysiology, 74, 81-87.
Drucker, D.N. and Hendrickson, A.E. (1989). The morphological development of extrafoveal human retina. Investigative Ophthalmology and Visual Science, Suppl., 30, 226.
Eisenman, B.S. and McCulloch, D. (1989).
Development of binocular vision in infants. Investigative Ophthalmology and Visual Science, 30, 313.
Fiorentini, A., Pirchio, M. and Spinelli, D. (1980). Scotopic contrast sensitivity in infants evaluated by evoked potentials. Investigative Ophthalmology and Visual Science, 19, 950-955.
Fiorentini, A., Pirchio, M. and Spinelli, D. (1983). Electrophysiological evidence for spatial frequency selective mechanisms in adults and infants. Vision Research, 23, 119-127.
Fiorentini, A., Pirchio, M. and Sandini, G. (1984). Development of retinal acuity in infants evaluated with pattern-electroretinogram. Human Neurobiology, 3, 93-95.
Fiorentini, A. and Trimarchi, C. (1989). Temporal properties of pattern electroretinograms in infants. Perception, 18, 491-492.
Fiorentini, A., Burr, D.C. and Morrone, M.C. (1990). Spatial and temporal characteristics of colour vision: VEP and psychophysical
measurements. In A. Valberg (Ed.), Advances in Understanding Visual Processes. Plenum Press.
Foster, K.H., Gaska, J.P., Nagler, M. and Pollen, D.A. (1985). Spatial and temporal frequency selectivity of neurones in visual cortical areas V1 and V2 of the macaque monkey. Journal of Physiology, 365, 331-363.
Fulton, A.B. (1988). The development of scotopic retinal function in human infants. Documenta Ophthalmologica, 69, 101-109.
Fulton, A.B. and Hansen, R.M. (1982). Background adaptation in human infants. Documenta Ophthalmologica Proceedings Series, 31, 191-197.
Fulton, A.B. and Hansen, R.M. (1989). Development of scotopic ERG OP's in human infants. Investigative Ophthalmology and Visual Science, 30, 314.
Garey, L.J. and de Courten, C. (1983). Structural development of the Lateral Geniculate Nucleus and visual cortex in monkey and man. Behavioral Brain Research, 10, 3-13.
Geisler, W.S. (1989). Sequential ideal-observer analysis of visual discriminations. Psychological Review, 96, 267-314.
Gilbert, C.D. (1985). Horizontal integration in the neocortex. Trends in Neuroscience, 8, 160-165.
Gwiazda, J., Bauer, J. and Held, R. (1989a). From visual acuity to hyperacuity: a 10-year update. Canadian Journal of Psychology, 43, 109-120.
Gwiazda, J., Bauer, J. and Held, R. (1989b). Binocular function in human infants: correlation of stereoptic and fusion-rivalry discriminations. Journal of Pediatric Ophthalmology & Strabismus, 26, 128-132.
Hamasaki, D.I. and Flynn, J.T. (1977). Physiological properties of retinal ganglion cells of three week old kittens. Vision Research, 17, 275-284.
Hamer, R.D. and Schneck, M.E. (1984). Spatial summation in dark-adapted human infants. Vision Research, 24, 77-85.
Hansen, R.M. and Fulton, A.B. (1987). Scotopic spectral sensitivity of human infants. Investigative Ophthalmology and Visual Science, Suppl., 28, 4.
Hansen, R.M. and Fulton, A.B. (1989). Psychophysical estimates of ocular media density of human infants. Vision Research, 29, 687-690.
Hansen, R.M. and Fulton, A.B. (1990). Effect of flash duration on scotopic thresholds of human infants. Investigative Ophthalmology and Visual Science, Suppl.. 31, 8. Harris. L.. Atkinson, J. and Braddick. 0. (1976). Visual contrast sensitivity of a 6-months old measured by the evoked potential. Nature, 264. 570-571. Hartmann. E.E. and Banks, M.S. (1984). Development of temporal contrast sensitivity in human infants. Investigative Ophthalmology and Visual Science, Suppl.. 25. 220. Held, R. (1985). Binocular Vision: Behavioural and Neural Development In J. Mehler and R. Fox (Eds.). Neonate Cognition: Beyond the Blooming Buzzing Confusion, pp. 37-44. Lawrence Herlbaum. Hillsdale, NJ. Held, R., Shimojo. S. and Gwiazda. J. (1984). Gender differences in the early development of human visual resolution. Investigative Ophthalmology and Visual Science, 25. 220.
114
CHAPTER 3
Held, R.. Yoshida, H., Gwiazda. J. and Bauer, J. (1989). Development of orientation selectivity measured by a masking procedure. Inuestigatfue Ophthalmology and Visual Science, 30. 312. Hendrickson, A. and Kupfer, C. (1976). The histogenesis of the fovea in the macaque monkey. Inuestigatiue Ophthalmology. 15. 746-756. Hickey. T.L. (1977). Postnatal development of the human Lateral Geniculate Nucleus: Relationship to a critical period for the visual system. Science, 198, 836-838. Hickey T.L. and Guillery R.W. (1979). Variability of laminar patterns in the human Lateral Geniculate Nucleus. Journal of Comparative Neurology, 183. 221-246. Hickey, T.L. and Peduzzi, J.D. (1987). Structure and Development of the Visual System. In P. Salapatek and L. Cohen (Eds.). Handbook of Infant Perception, Vol.1: From Sensation to Perception. pp. 1-42. Academic Press, Orlando, U.S.A.. Horton, J.C. and Hedley-Whyte. E.T. (1984). Mapping of cytochrome oxidase patches and ocular dominance columns in human visual cortex. Philisophical Trans. Royal Society of London B. 304, 255-272. Hubel. D.H., Wiesel. T.N. and LeVay. S. (1977). Plasticity of ocular dominance columns in monkey striate cortex. Philisophical Trans. Royal So~ietyOf London B, 278, 377-409. Jacobs, D.S. and Blakemore, C. (1988). Factors limiting the postnatal development of visual acuity in the monkey. Vision Research, 28. 947-958. Kaplan, E. and Shapley, R.M. (1982). X and Y cells in the lateral geniculate nucleus of macaque monkeys. J o m a l of Physiology, 330, 191- 198. Kaplan. E.. Lee.B.B. and Shapley, R.M. (1990). New views in primate retinal function. Progress in Retinal Research 9. 273- 337. Kaufmann F.. Stucki, M. and Kaufmann-Hayoz. R. (1985).Development of infants' sensitivity for slow and rapid motions. Infant Behavior and Deuelopment, 8, 89-98. Kiorpes. L. and Movshon, J.A. (1989). Differential development of two visual functions in primates. Proceedings of the National Academy of Science, USA, 86. 8998-9001. 
Lee, B.B., Martin, P.R. and Valberg, A. (1989). Nonlinear summation of M- and L-cone inputs to phasic retinal ganglion cells of the macaque. Journal of Neuroscience. 9. 1433-1442. Lennie. P., Trevarthen. C., Van Essen. D. and Waessle, H. (1989). Parallel Processing of Visual Information. In L. Spillmann and J.S. Werner (Eds.). Visual Perception: the Neurophysiological Foundations pp. 103-128. Academic Press, San Diego. USA. LeVay. S.. Wiesel, T.N. and Hubel, D.H. (1980). The development of ocular dominance columns in normal and visually deprived monkeys. Journal of Comparative Neurology. 191. 1-51. Levi. D.M.. Klein, S.A. & Aitsebaomo. A.P. (1985). Vernier acuity, crowding and cortical magnification. Vision Research 25. 963-977. Livingstone. M.S. and Hubel. D.H. (1988). Segregation of form, color, movement and depth: anatomy, physiology and perception. Science, 240, 740-749. Maffei. L. and Campbell, F.W. (1970). Neurophysiological localization of the vertical and horizontal visual coordinates in man. Science, 167. 386-387.
HUMAN VISUAL DEVELOPMENT
115
Magoon, E.H. and Robb, M. (1981).Development of myelin in human optic nerve. Archives of Ophthalmology, 99,655-659. Maurer, D. and Martello, M. (1980).The discrimination of orientation by young infants. Vision Research, 20,201-204. Maurer, D. Lewis, T.L.. Cavanagh, P. & Anstis, S. (1989).A new test of luminous efficiency for babies. Inuestigatfue Ophthalmology and Visual Science, 30. 297-304. Merigan W.H. and Eskin T.A. (19861. Spatio-temporal vision of macaques with severe loss of P-beta retinal ganglion cells. Vision Research, 28, 1751-1761. Merigan. W.H. (1989).Chromatic and achromatic vision of macaques: role of the P pathway. Journal of hTeuroscience,9,776-783. Mishkin. M.,Ungerleider. L.G. and Macko. K.A. (1983).Object vision and spatial vision: two cortical pathways. Trends in Neuroscience 8 , 414-417. Mohn. G. & van Hof. J. (1990)Development of spatial vision. In D.M. Regan (Ed.),Vision and Visual Dysfunction. Vol 10B. MacMillan. London. Mollon, J.D. (1990).The club-sandwich mystery. Nature. 343. 16-17. Morrone, M.C. & Burr, D.C. (1986).Evidence for the existence and development of visual inhibition in humans. Nature, 321, 235-237. Morrone, M.C., Burr, D.C. & Maffei, L. (1982).Functional significance of cross-orientational inhibition: part I, Neurophysiological evidence. Proceedings of the Royal Society of London, B216, 335-354. Morrone, M.C.. Burr, D.C. & Speed, H.D. (1987).Cross-orientation inhibition in cat is GABA mediated. Experimental Brain Research, 67. 635-644. Morrone. M.C.. Burr, D.C. & Fiorentini. A. (1989).Development of chromatic visual-evoked-potentials. Perception, 18. 491. Morrone. M.C., Burr, D.C. and Fiorentini, A. (1990).Development of infant contrast sensitivity and acuity to chromatic stimuli. Manuscript submitted for publication. Moskowitz, A. & Sokol. S. (1980).Spatial and temporal interactions of pattern-evoked cortical potentials in human infants. Vfsion Research, 20,699-707. Moskowitz. A. and Sokol. S. 
(1989).Development of lateral interactions in the infant visual system. Investigative Ophthalmology and Visual Science, Suppl., 30, 312. Movshon. J.A. and Kiorpes, L. (1988).Analykis of the development of spatial contrast sensitivity in monkey and human infants. Journal of the Optical Society of America A. 5. 2166-2172. Mullen. K.T. (1985).The contrast sensitivity of human colour vision to red-green and blue-yellow chromatic gratings. Journal of Physiology, 359. 381-400. Newsome. W.T. and Pare', E.B. (1988).A selective impairment of motion perception following lesions of the middle temporal visual area (MT). Journal of Neuroscience, 8 , 2201-2211. Newsome, W.T., Wurtz, R.H.. Duersteler, M.R. and Mikami, A. (1985). Deficits in visual motion processing following ibotenic acid lesions of the middle temporal visual area of the macaque monkey. Journal of Neuroscience, 5 , 825-840. Norcia. A.M., Clarke, M. and Tyler, C.W. (1985).Digital filtering and robust regression techniques for estimating sensory thresholds from the evoked potential. IEEE Engineering Medical Biology, 4. 26-32.
116
CHAPTER 3
Norcia, A.M., Tyler, C.W. and Hamer, R.D. (1988). High contrast sensitivity in the young human infant. Investigative Ophthalmology and Visual Science, 29, 44-49. Norcia, A.M., Tyler. C.W. and Hamer, R.D. (1990). Development of contrast sensitivity in the human infant. Vision Research. Packer, 0.. Hartmann. E.E. & Teller, D.Y. (1984). Infant colour vision: the effect of test field size on Rayleigh discriminations. Vision Research, 24. 1260-1984. Perry, V.H. & Cowey, A. (1985). The ganglion cell and cone distributions in the monkey's retina: Implications for central magnification factors. Vision Research, 25. 1795-1810. Pirchio, M.. SpinellLD.. Fiorentini, A. & Maffei, L. (1978). Infant contrast sensitivity evaluated by evoked potentials. Brain Research, 141. 179-184. Plant, G.T.. Hess. R.F. and Thomas, S. (1986). The pattern evoked electroretinogram in optic neuritis: a combined psychophysical and electrophysiological study. Brain, 109, 469- 490. Poggio. G.F., Doty, R.W. and Talbot. W.H. (1977). Foveal striate cortex of behaving monkey: single-neuron responses to square-wave gratings during fixation of gaze. Journal of Neurophysiology, 40, 1369-1391. Powers, M.K.. Schneck, M. & Teller, D.Y. (1981). Spectral sensitivity of human infants at absolute visual threshold. Vision Research, 2 1, 1005-1016. Regal D.M. (1981).Development of critical flicker frequency in human infants. Vision Research 21, 549-555. Robson. J.G. (1966).Spatial and temporal contrast sensitivity function of the visual system. Journal of the Optical Society of America, 56. 1141-1142. Rovamo. J. & Virsu. V. (1979). An estimation and application of the cortical magnification factor. Experimental Brain Research, 37, 495510. Rusoff, A.C. and Dubin. M.W. (1977). Development of receptive field properties of retinal ganglion cells in kittens. Journal of Neurophysiology. 40, 1188-1198. Schiller, P.H.. Logothetis, N.K. and Charles, E.R. (1990). 
Functions of the colour-opponent and broad-band channels of the visual system. Nature, 343, 68-70. Schwartz. T.L., Dobson, V.. Sandstrom. D.J. and van Hof-van Duin. J. (1987). Kinetic perimetry assessment of binocular visual field shape and size in young infants. Vision Research, 27. 2163- 2175. Shea. S.L. and Aslin. R.N. (1988). Oculomotor responses to step- ramp targets by young human infants. Investigative Ophthalmology and Visual Science, Suppl.. 29, 165. Shimojo. S . and Held, R. (1987). Vernier acuity is less than grating acuity in 2- and 3-month-olds. Vision Research 27, 77-86. Shimojo. S.. Bauer, J . , O'Connell, K.M. and Held, R. (1986). Pre-stereoptic binocular vision in infants. Vision Research, 2 6 , 501-5 10. Sireteanu. R., Fr0nius.M. and Constantinescu, D.H. (1988). The development of peripheral visual acuity in human infants: binocular summation and naso-temporal asymmetry. I n v e s t i g a t i v e Ophthalmology and Visual Science, Suppl., 29, 75.
HUMAN VISUAL DEVELOPMENT
117
Sireteanu, R., Kellerer. R. and Boergen. K.P. (1984). The development of peripheral visual acuity in human infants. A preliminary study. Human Neurobiology, 3. 81-85. Slater. A.. Morrison, V. and Somers, M. (1988). Orientation discrimination and cortical function in the human newborn. Perception, 17,597-602. Smith, J.. Atkinson, J, Braddick, O.J. and Wattam-Bell, J. (1988). Development of sensitivity to binocular correlation and disparity in infancy. Perception. 17. 395-396. Sokol, S.,Moskovitz. A. and Hansen. V. (1987). Electrophysiological evidence for the oblique effect in human infants. I n v e s t i g a t i v e Ophthalmology and Visual Science, 2 8 . 731-735. Spinelli. D.. Pirchio. M. and Sandini. G. (1983).Visual acuity in the young infant is highest in a small retinal area. Vision Research, 23. 1133-1136. Swanson. W.H. and Birch, E.E. (1989). Dependence of spatial contrast sensitivity on temporal frequency. Investigative Ophthalmology and Visual Science, Suppl., 30, 31 1. Teller, D. (1979). The forced-choice preferential looking procedure: A psychophysical technique for use with human infants. Infant Behauiour and Development, 2 , 135-153. Teller, D.Y. and Bornstein, M.H. (1987). Infant color vision and color perception. In P. Salapatek & L.B. Cohen (Eds.). Handbook of Infant Perception. Vol.1 : From Sensation to Perception, pp. 185-236. Academic, New York. Teller, D.Y. and Lindsey, D.T. (1989). Motion nulls for white versus isochromatic gratings in infants and adults. Journal of the Optical Society of America, 6A. 1945-1954. Teller. D.Y.. Regal, D.M., Videen, T.O. and Pulos, E. (1978). Development of visual acuity in infant monkeys (Macaca nemestrina) during the early postnatal weeks. Vision Research, 18.561-566. Van Hof - Van Duin. J. (1978). Direction preference of optokinetic responses in monocularly tested normal kittens and in light deprived cats. Archiues of Italian Biology, 116, 471-477. Van Sluyters. R.C., Atkinson. J.. Banks, M.S.. Hoffman, K.P. 
and Shatz. C. (1989). The Development of Vision and Visual Perception. In L. Spillmann and J.S. Werner (Eds.), Visual Perception: The Neurophysiological Foundations, pp. 349-379. Academic Press, San Diego, CA. Varner. D.,Cook. J.E.. Schneck. M.E.. McDonald, M. and Teller, D.Y. (1985). Tritan discriminations by 1- and 2-month-old human infants. Vision Research, 25, 82 1-832. Wattam-Bell, J. (1987). Motion-specific VEPs in adults and infants. Perception, 16.231-232. Wattam-Bell, J. ( 1990). Development of motion-specific cortical responses in infancy. Vision Research Westheimer. G . (1982). The spatial grain of the perifoveal visual field. Vision Research, 2 2 , 157-162. Wiesel, T.N. and Hubel, D.H. (1974). Ordered arrangement of orientation columns in monkeys lacking visual experience. Journal of Comparative Neurology, 158. 307-318. Wilson H.R. ( 1988). Development of spatiotemporal mechanisms in infant vision. Vision Research, 2 8 . 61 1-628.
118
CHAPTER 3
Yuodelis C. and Hendrickson, A. (1986). A qualitative and quantitative analysis of the human fovea during development. Vision Research, 26, 847-855. Zemon. V. & Ratliff. F. (1982). Visual evoked potentials: Evidence for lateral interactions. Proceedings of the National Academy of Science, 79, 5723-5725.
Applications of Parallel Processing in Vision
J. Brannan (Editor)
© 1992 Elsevier Science Publishers B.V. All rights reserved
Changes in Temporal Visual Processing in Normal Aging
JULIE R. BRANNAN
Introduction

Many aspects of visual processing change in association with normal aging. Acuity, spatial contrast sensitivity, color vision, dark adaptation, oculomotor function, binocular vision, and visual fields are often affected (see Owsley and Sloane, 1990, for a recent review), but perhaps the most pervasive change involves temporal sensitivity. Over a wide variety of tasks, older subjects generally find it difficult to detect rapid changes in a visual stimulus. This chapter will describe recent reports regarding temporal processing changes in older adults. The controversies over whether these changes are primarily due to optical or neural factors, and whether aging primarily affects one of two parallel systems, will also be discussed.

Some of the earliest research on the effects of aging on temporal processing involved measurements of critical flicker fusion (CFF), the threshold temporal frequency at which a flickering light no longer appears to flicker. Several reports reveal that older adults consistently have lower CFFs than younger adults (Misiak, 1947; Coppinger, 1955; McFarland, Warren, and Karis, 1958; Huntington and Simonsen, 1965), suggesting that temporal resolution decreases with age. Some of this reduction can be attributed to yellowing of the lens and reduction in pupillary diameter with age (Kline and Schieber, 1982), but neural changes may also contribute to the decline in CFF sensitivity (Elliott, Whitaker, and MacVeigh, 1990).

Another common change in the aging visual system, closely related to CFF threshold, is that older adults require more time between stimuli to detect their temporal order (McFarland, Warren, and Karis, 1958; Eriksen, Hamlin, and Breitmeyer, 1970; Kline and Orme-Rogers, 1978; Kline and Schieber, 1980). It has been suggested that age-related changes in temporal processing are due to an increase in stimulus persistence (Axelrod, Thompson, and Cohen, 1963; for a review, see Kline and Schieber, 1982). Under this hypothesis, an overall slowing of the aging nervous system results in more time being necessary for recovery from a visual stimulus. If sufficient time is not provided, the first stimulus blurs, or persists, into the second. The results of many experiments support
such a hypothesis. For example, Eriksen et al. (1970) required observers to detect the location of the gap in a Landolt-C. In spite of their lower light sensitivity, older adults' discrimination was equivalent to that of younger adults if they were given sufficient time. This suggests that older adults may have compensated for lower overall sensitivity with longer integration (persistence) of the stimulus.

While the stimulus persistence hypothesis may provide a convenient descriptive framework for temporal changes with age, it does not speculate on the cause of these changes. In the next section, a complementary hypothesis will be discussed which does suggest neural processes, specifically changes in one visual pathway, underlying age-related decrements in temporal processing.
The "transient deficit" hypothesis

There is considerable evidence that we process many aspects of visual information in parallel, along two separate pathways (see Weisstein, Ozog, and Szoc, 1975; Breitmeyer and Ganz, 1976; Breitmeyer, this volume). Perhaps related to neuroanatomical dichotomies seen in ganglion cells of the cat (Enroth-Cugell and Robson, 1966) and cellular layers of the lateral geniculate nucleus in the monkey (Livingstone and Hubel, 1987; for a review see Shapley, this volume), this functional division of labor observed psychophysically in humans has been called the "transient/sustained" dichotomy. Under this heuristic division, the transient aspect of visual processing responds to stimuli with abrupt onsets and offsets, is optimally sensitive to low spatial and high temporal frequencies and to contrasts near threshold, and produces a quick, rapidly decaying response. Because of these unique processing characteristics, the transient system might be best suited for holistic, global processing, the perception of motion, and the localization of objects in space. Conversely, the sustained aspect of visual processing responds in a more prolonged manner to stationary or slowly moving stimuli, higher spatial and lower temporal frequencies, and moderate to high contrasts. Its response characteristics make it optimally suited for analytic, featural processing, involving the perception of pattern information and fine detail. In general, stimuli presented in the fovea tend to activate the sustained system, with the transient system becoming more active as stimuli move into peripheral vision.

Kline and Schieber (1981) proposed that differential aging of the transient and sustained systems might account for changes in visual perception with age. Specifically, they suggested that a selective transient loss is consistent with the loss of temporal resolution with age. This idea has been followed up by many researchers (e.g., Sturr, Kelly, Kobus, and Taub, 1982; Sturr, Church, and Taub, 1985; Sturr, Church, Nuding, Van Orden, and Taub, 1986; Kline, 1987; Sturr, Van Orden, and Taub, 1987; Sturr, Church, and Taub, 1988; Elliott et al., 1990), with various conclusions. For a selective transient loss hypothesis to be correct, older adults should show consistent losses in sensitivity at high temporal and low spatial frequencies. As discussed previously, in general older adults do lose temporal resolution with age (although the use of certain paradigms results in little or no loss of temporal sensitivity with age,
e.g., Sturr et al., 1988). This temporal decline could be consistent with a "transient deficit" hypothesis. On the other hand, although there is one report of age-related contrast sensitivity losses at low and medium spatial frequencies (Sekuler, Hutman, and Owsley, 1980), most researchers have found losses at higher spatial frequencies (e.g., Owsley, Sekuler, and Siemsen, 1983; Morrison and MacGrath, 1985; Owsley, Gardiner, Sekuler, and Lieberman, 1985; Elliott, 1987; Crassini, Brown, and Bowman, 1988; Elliott et al., 1990) or a generalized loss at all spatial frequencies (Ross, Clarke, and Bron, 1985; Sloane, Owsley, and Alvarez, 1988). It is difficult to relate spatial processing changes with age specifically to a neural "transient" loss as suggested originally by Kline and Schieber (1981).

Some reports attribute spatial and/or temporal losses with age to changes in the optics of the eye such as senile miosis and increased lenticular light scatter (Owsley et al., 1983; Sturr et al., 1988), but others have reduced or compensated for these optical changes and suggest that neural factors play a primary role (Morrison and MacGrath, 1985; Owsley et al., 1985; Elliott, 1987; Brannan et al., 1988a, 1988b; Sloane, Owsley, and Jackson, 1988; Elliott, Whitaker, and Thompson, 1989; Elliott et al., 1990). Elliott et al. (1990) have even suggested that aging may involve a selective sustained channel loss, based on Weale's (1975) theory of random cell death. Under this hypothesis, all cell types die randomly, but the loss of those types with a one-to-one relationship between the retina and visual cortex (presumably those in the sustained system) would disrupt visual perception more than the loss of those with a many-to-one (transient) relationship. Although there is some consensus on the nature of temporal changes in aging vision, so far there is no acceptable theoretical framework to account for these changes.
Recent experimental findings and a quantitative model of temporal processing changes with age

To address many of the lingering questions regarding temporal processing changes with age, we (Brannan, Sekuler, Phillips, and Chan, 1988a; Brannan, Sekuler, and Phillips, 1988b) designed experiments to provide a quantitative framework for describing age-related changes in visual temporal processing. The temporal properties of the visual system can be investigated by measuring threshold contrast sensitivity for two or more temporal pulses separated by varying intervals (Ikeda, 1965; Rashbass, 1970). Using a variant of this approach, Bergen and Wilson (1985) demonstrated that the detectability of a trio of pulses can be accounted for by a model comprising linear filters together with nonlinear temporal probability summation. As the spatial properties of their pulsed stimuli varied, Bergen and Wilson found a covariation in the temporal impulse response function. For example, a biphasic impulse response function was needed to account for results with pulses of low spatial frequency, but a monophasic function was adequate for high spatial frequency pulses. This well-established procedure seemed promising as an instrument for probing age-related changes in vision. For any linear system, the impulse response function provides a complete characterization. Thus, if one could successfully describe the impulse
response function, one could predict the system's response to any other temporal probe. Although the human visual system is inherently non-linear, it is possible to minimize non-linear effects by using a threshold detection paradigm (reducing contrast non-linearities) and stimuli of very short duration (lessening the contribution of temporal probability summation).

It seems reasonable, a priori, that the temporal impulse response function that underlies detection of a trio of near-threshold pulses should also control the perception of suprathreshold pulses, as manifest, for example, in studies of stimulus persistence. We decided to assess age-related changes in temporal processing in two different procedures: measuring near-threshold behavior by means of a three-pulse paradigm, and measuring suprathreshold responses in terms of persistence. Additionally, we wished to see whether age-related changes in these two measures could be accounted for by the Bergen and Wilson model.

In the first experiment (Brannan et al., 1988b), the contrast sensitivities for three temporal pulses were obtained for a group of younger and older adults. Four older (three females, one male; mean age 74.1 years) and four younger (three females, one male; mean age 24.7 years) adults participated as subjects. Older adults had been screened for ocular disease during a thorough ophthalmological exam, and had normal or corrected-to-normal visual acuity (mean 20/25, range 20/20 to 20/30). Acuity for all younger adults was 20/25 or better. All subjects were naive to the purpose of this experiment, although they had all previously participated in visual perception experiments unrelated to this one.

Stimuli were spatial patterns whose one-dimensional luminance profiles were defined by the difference of Gaussians (hereafter, DOG):

$$\mathrm{DOG}(x, \sigma) = 3\exp\left(-x^{2}/\sigma^{2}\right) - 2\exp\left(-x^{2}/(2.25\,\sigma^{2})\right),$$

where x is position along the horizontal, and σ is the DOG's space constant.
The advantage in using localized aperiodic spatial patterns such as DOGs is that they are simultaneously well-localized in the space domain and band limited in the spatial-frequency domain. For all conditions, contrast sensitivities were measured for two sizes of DOGs, the narrower one having a peak spatial frequency of 12 c/deg and the broader one, 4 c/deg. The DOG patterns were temporally modulated by three equally spaced rectangular pulses, each 16.7 msec in duration. The amplitude (contrast) of the first and last pulses of the trio was 0.375 times that of the middle pulse. Contrast was defined by

$$\mathrm{Contrast} = (L_{\mathrm{peak}} - L_{\mathrm{mean}})/L_{\mathrm{mean}}.$$

Threshold measurements were made as a function of the delay between the pulses (the interstimulus interval; hereafter, ISI). Patterns were generated on the monochrome display of a Macintosh II computer. The mean luminance of the display was 22.1 cd/m².
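As a rough illustration of the stimulus definitions above, the following Python sketch evaluates the DOG profile, builds the amplitude sequence of the three-pulse trio, and checks the contrast definition. The function names, the one-frame-per-pulse representation, and the space constant value are assumptions made for illustration; they are not taken from the original stimulus software.

```python
import math

MEAN_LUMINANCE = 22.1  # cd/m^2, mean display luminance reported in the text


def dog(x, sigma):
    """DOG(x, sigma) = 3 exp(-x^2/sigma^2) - 2 exp(-x^2/(2.25 sigma^2))."""
    return (3.0 * math.exp(-x ** 2 / sigma ** 2)
            - 2.0 * math.exp(-x ** 2 / (2.25 * sigma ** 2)))


def three_pulse_amplitudes(isi_frames, side=0.375):
    """Relative contrast per 16.7 msec frame: side, gap, middle, gap, side.

    Assumes (hypothetically) that each pulse lasts exactly one frame and
    that the ISI is a whole number of blank frames.
    """
    gap = [0.0] * isi_frames
    return [side] + gap + [1.0] + gap + [side]


def luminance(x, sigma, contrast, amplitude=1.0):
    """Luminance at position x for a DOG pattern of the given peak contrast."""
    return MEAN_LUMINANCE * (1.0 + contrast * amplitude * dog(x, sigma))


def contrast_of(l_peak, l_mean=MEAN_LUMINANCE):
    """Contrast = (L_peak - L_mean) / L_mean."""
    return (l_peak - l_mean) / l_mean


sigma = 0.1  # space constant in degrees; an arbitrary illustrative value
peak = luminance(0.0, sigma, contrast=0.80)  # DOG(0) = 1, so x = 0 is the peak
print(contrast_of(peak))                     # recovers the nominal contrast
print(three_pulse_amplitudes(isi_frames=2))
```

Because DOG(0) = 3 - 2 = 1, the nominal contrast of a pattern is simply the contrast at its center, which is why the sketch evaluates the peak at x = 0.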
Subjects sat 65 cm from the computer's monitor (17 cm x 25 cm) in a darkened room. A chin rest provided a comfortable rest for their heads, keeping the subjects' eyes even with the center of the display. Subjects viewed the display binocularly. Contrast thresholds were measured using a single interval forced choice procedure with a computer-controlled version of a randomized single staircase.

[Figure 1 plot; horizontal axis: Interstimulus Interval (milliseconds)]
Figure 1. Mean contrast sensitivity as a function of ISI. The left panel shows results for young subjects and the right panel shows results for old observers. Within each panel, filled symbols represent sensitivities for 4 c/deg, while open symbols are for 12 c/deg data. Arrows denote sensitivities to single pulses of 4 c/deg (filled arrow heads) or 12 c/deg (open arrow heads). Solid lines show the fits of the model to the data (see text for details).

Within a single block of trials only one spatial frequency of DOG was used. For each trial, the computer chose at random one of seven ISIs, ranging from 0 to 100.4 msec in 16.7 msec increments. The subject initiated each trial with a key press. The DOG pattern was then presented at this ISI at a preassigned contrast. The initial contrast of the middle DOG in all patterns was 80%. Following the presentation, the subject pressed a key to indicate whether the pattern was "seen" or "not seen." If the response was "seen," the contrast was reduced by 1 dB for subsequent presentations at this ISI; otherwise it was increased by the same amount. Within a single block of trials each ISI was presented 40 times. For each ISI, we computed the mean of the contrasts recorded for the last ten reversals of response (from "seen" to "not seen" or vice versa). The reciprocal of this mean contrast defined the contrast sensitivity. In addition, we measured the contrast sensitivity for each DOG modulated by just a single pulse of 16.7 msec duration.

Figure 1 shows the average contrast sensitivities for younger and older subjects. In each panel, sensitivity is plotted against ISI. The dark squares represent sensitivities for 4 c/deg, while the open squares are for 12 c/deg data. The solid straight lines denote the single pulse sensitivity for the two spatial frequencies of DOGs. With 4 c/deg, younger subjects are more sensitive than older subjects at all ISIs measured.
Relative to the sensitivity of the single pulse data (solid line), both young and old subjects show facilitation at short ISIs followed
by inhibition at ISIs beginning at about 30 msec. The magnitude of the deviation from the single pulse sensitivity is more pronounced for the younger subjects. For the 12 c/deg DOG, younger subjects are again more sensitive than older subjects at all ISIs. At this spatial frequency only facilitation relative to the single pulse sensitivity is apparent. Contrast sensitivity as a function of ISI is thus monophasic for the smaller DOG stimuli.

To fit the data we used a form of temporal impulse response function proposed by Bergen and Wilson (1985):

$$H(t) = A\,(t/\tau)^{n}\exp(-t/\tau)\left[\frac{1}{n!} - \frac{B\,(t/\tau)^{k}}{(n+k)!}\right],$$

where A is the overall gain (amplitude), B is the area of the negative lobe (inhibition), 1/τ is the width of the positive lobe (facilitation), n is the steepness at onset and near the zero crossing, and k defines the shape of the negative lobe. The effects of temporal probability summation are taken into account by the Quick (1974) formulation:

$$S = \left(\int |R(t')|^{p}\,dt'\right)^{1/p}, \qquad \text{where} \qquad R(t) = \int L(t')\,H(t - t')\,dt'$$

and L(t') is the stimulus. The model fits to young and older subjects are shown as dashed lines in Figures 1a and 1b. Parameters were chosen so that the model curves fell within one standard deviation of each data point. The parameters for each curve are tabulated in Table 1.
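The model just described can be sketched numerically. The Python below evaluates H(t), convolves it with a three-pulse stimulus to obtain R(t), and pools the result with the Quick formula. It is a minimal sketch under stated assumptions rather than the fitting code used in the study: the pooling exponent p is not given in the text (the value 4 is a placeholder), τ is taken to be in msec, n and k are treated as integers, and the function names and integration step are invented for illustration. The parameter values are those listed in Table 1.

```python
import math

PULSE_MS = 16.7        # pulse duration from the text
DT = PULSE_MS / 10.0   # integration step in msec (assumption: 10 samples per pulse)

# Table 1 parameters; tau assumed to be in msec, n and k assumed to be integers.
YOUNG_4CPD = dict(A=100, B=0.8, tau=9.5, n=4, k=1)
OLD_4CPD = dict(A=67, B=0.5, tau=9.5, n=4, k=1)
YOUNG_12CPD = dict(A=25, B=0.0, tau=12.5, n=4, k=0)


def impulse_response(t, A, B, tau, n, k):
    """H(t) = A (t/tau)^n exp(-t/tau) [1/n! - B (t/tau)^k / (n+k)!], t >= 0."""
    if t < 0:
        return 0.0
    u = t / tau
    return A * u ** n * math.exp(-u) * (
        1.0 / math.factorial(n) - B * u ** k / math.factorial(n + k))


def three_pulse_stimulus(isi_steps, side=0.375, total_steps=300):
    """L(t): three 10-sample pulses (side, middle, side) separated by isi_steps."""
    stim = [0.0] * total_steps
    period = 10 + isi_steps
    for index, amp in enumerate((side, 1.0, side)):
        for i in range(index * period, index * period + 10):
            stim[i] = amp
    return stim


def response(stim, params, kernel_steps=200):
    """R(t): discrete approximation of the convolution of L with H."""
    kernel = [impulse_response(j * DT, **params) for j in range(kernel_steps)]
    out = [0.0] * len(stim)
    for i, s in enumerate(stim):
        if s == 0.0:
            continue
        for j, h in enumerate(kernel):
            if i + j < len(out):
                out[i + j] += s * h * DT
    return out


def pooled_response(resp, p=4.0):
    """Quick pooling: S = (integral of |R|^p dt)^(1/p); p = 4 is a placeholder."""
    return (sum(abs(r) ** p for r in resp) * DT) ** (1.0 / p)


for frames in range(7):  # ISIs of 0 to 6 frames of 16.7 msec
    stim = three_pulse_stimulus(isi_steps=10 * frames)
    print(frames, pooled_response(response(stim, YOUNG_4CPD)))
```

With this form of H(t), each bracketed term integrates to one over u = t/τ, so the total area under H is Aτ(1 - B); this is consistent with reading B as the relative area of the negative lobe. How the pooled value S maps onto the plotted sensitivities depends on details not given in this section.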
The data demonstrate that there are age-related changes in the sensitivity to three temporal pulses. It is possible to explain these changes in terms of the impulse response function proposed by Bergen and Wilson (1985). For the broad DOG (peak frequency, 4 c/deg), the parameters used to fit the young data are the same as those found in Bergen and Wilson (1985), except for the amplitude, A. To account for the changes in sensitivity with age, it was necessary to decrease the overall amplitude, A, by a factor of about 1.5 (from 100 to 67; Table 1). However, simple attenuation of the impulse response was not sufficient to account for the changes. It was also necessary to decrease B, the area of the inhibitory lobe, by a factor of 1.6. This suggests that there is a differential loss of inhibition as well as an overall decrease in temporal sensitivity with age.

The contrast sensitivity data for the narrow DOG (12 c/deg) are monophasic and can be fit without an inhibitory component (B = 0 in the temporal impulse response function). For the young data, parameters were the same as those found by Bergen (1981), except for amplitude, A. To model the older adults' data, the amplitude, A, was decreased by a factor of 2. To maintain the temporal extent of the facilitory effect in the face of this decrease in amplitude it was necessary to decrease τ by a factor of 2. This suggests that temporal facilitation decreases with age, but its duration does not.
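For reference, the staircase rule used to collect the thresholds in this first experiment can be sketched as follows. This is a simplified illustration, not the original program: it runs one staircase for a single ISI (the random interleaving of the seven ISIs is omitted), it assumes that a 1 dB step corresponds to a factor of 10^(1/20) in contrast, it records the contrast on the trial where the response changed as the reversal contrast, and the deterministic observer with a fixed threshold is purely hypothetical.

```python
def run_staircase(observer_sees, start_contrast=0.80, step_db=1.0,
                  n_trials=40, n_reversals=10):
    """Return contrast sensitivity estimated from a 1-up/1-down staircase.

    observer_sees: function taking a contrast and returning True ("seen")
    or False ("not seen"). The dB-to-ratio convention (20 log10) is an
    assumption; the text does not state it.
    """
    factor = 10.0 ** (step_db / 20.0)
    contrast = start_contrast
    previous = None
    reversal_contrasts = []
    for _ in range(n_trials):
        seen = observer_sees(contrast)
        if previous is not None and seen != previous:
            reversal_contrasts.append(contrast)  # response changed: a reversal
        previous = seen
        # "seen" lowers the contrast by one step, "not seen" raises it
        contrast = contrast / factor if seen else min(1.0, contrast * factor)
    if not reversal_contrasts:
        raise ValueError("no reversals occurred; threshold cannot be estimated")
    last = reversal_contrasts[-n_reversals:]
    mean_contrast = sum(last) / len(last)
    return 1.0 / mean_contrast  # sensitivity is the reciprocal of mean contrast


def ideal_observer(contrast):
    """Hypothetical observer who always sees contrasts of 0.10 or more."""
    return contrast >= 0.10


print(run_staircase(ideal_observer))
```

With a real observer the responses near threshold are probabilistic, so the reversal contrasts scatter more widely than in this deterministic example.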
In a second experiment (Brannan et al., 1988a), we measured the stimulus persistence of two suprathreshold pulses in younger and older observers. For various temporal separations between the pulses, we determined whether the observers perceived the two-pulse presentation to be a single continuous pulse or two distinct pulses.
Table 1.

             4 c/deg             12 c/deg
          YOUNG     OLD       YOUNG     OLD
A          100       67         25       12
B          0.8      0.5          0        0
τ          9.5      9.5       12.5      6.7
n          4.0      4.0        4.0      4.0
k          1.0      1.0          0        0
Sixteen older (10 females, 6 males; mean age 73.2 years) and 16 younger adults (11 females, 5 males; mean age 22.7 years) participated as subjects. All eight subjects from Experiment One were included in Experiment Two. Older adults had been screened for ocular disease, and had normal or corrected-to-normal visual acuity (mean 20/27, range 20/20 to 20/30). Acuity for all younger observers was 20/25 or better. Subjects were naive to the purpose of this experiment, although 10 of the older and 7 of the younger subjects had previously participated in other, unrelated visual perception experiments.

Again, stimuli were spatially localized DOG patterns. Four different DOG stimuli were used. The peak spatial frequencies for the different sizes were 1, 4, 8, and 12 c/deg. The DOGs were temporally modulated by a pair of 16.7 msec rectangular pulses, separated by intervals (ISI) of 0 to 100.4 msec. Both pulses were of equal amplitude (contrast) and were presented well above threshold, at 0.2 contrast. At each ISI, we measured the proportion of trials in which the two pulses were perceived as a single continuous flash as well as the proportion in which they were seen as two distinct flashes.

Data were collected using the method of constant stimuli. For every subject, four blocks of trials were run, one for each of the peak spatial frequencies: 1, 4, 8, and 12 c/deg. For every spatial frequency, each ISI was presented randomly 20 times. Seven ISIs were used, ranging from 0 to 100.4 msec in 16.7 msec increments. For an ISI of 0, the stimulus consisted of a single, uninterrupted pulse of 33.4 msec duration (two times 16.7 msec). After every presentation, observers reported whether the two-pulse presentation appeared continuous or not. To assess the effect of retinal illuminance, all younger subjects were also run wearing 0.7 neutral density filters. This reduction in
CHAPTER 4
126
retinal illuminance is equivalent to the most severe estimates of loss due to senile miosis (Said and Weale, 1959). Subjects viewed the display binocularly. YOUNGER SUBJECTS: 1 cldeg
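The trial schedule described above can be sketched in a few lines of Python. This is only an illustration, not the authors' software: the function names, the `seed` argument, and the rounding to 0.1 msec are assumptions made for the example, and the transmittance helper simply applies the standard definition of optical density (transmittance = 10 raised to minus the density).

```python
import random

FRAME_MS = 16.7      # pulse duration and ISI step, from the text
N_ISIS = 7           # seven ISIs per block, from the text
REPS_PER_ISI = 20    # each ISI presented 20 times per block, from the text


def isi_levels(step_ms=FRAME_MS, n_levels=N_ISIS):
    """ISIs from 0 upward in equal steps, rounded to 0.1 msec (assumed rounding)."""
    return [round(i * step_ms, 1) for i in range(n_levels)]


def make_block(seed=None):
    """Randomly ordered ISIs for one spatial-frequency block (hypothetical helper)."""
    trials = isi_levels() * REPS_PER_ISI
    random.Random(seed).shuffle(trials)
    return trials


def nd_transmittance(density):
    """Fraction of light passed by a neutral density filter of the given density."""
    return 10.0 ** (-density)


if __name__ == "__main__":
    print(isi_levels())
    print(len(make_block(seed=0)), "trials per block")
    print(round(nd_transmittance(0.7), 3))
```

Six steps of 16.7 msec above zero give a longest ISI of 100.2 msec, and a 0.7 neutral density filter passes roughly one fifth of the incident light.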
[Figure 2 panels: younger subjects at 1, 4, 8, and 12 c/deg. Horizontal axis: interstimulus interval (msec). Vertical axis: percent "continuous" judgments.]
Figure 2.
Mean percent "continuous" judgments as a function of interstimulus interval for younger adults. Panels show data for 1, 4, 8, and 12 c/deg.

Data for stimulus persistence of two pulses for younger and older observers are shown in Figures 2 and 3. For each spatial frequency, the proportion of two-pulse presentations that were reported as continuous is plotted against ISI. For younger observers (Figure 2), spatial frequency had little effect on the psychometric functions measured. However, for older observers (Figure 3), increasing the spatial frequency shifted the psychometric functions to the right, particularly at 12 c/deg. The rightward shift indicates that as the spatial frequency increases, older observers require longer ISIs to detect the gap separating the two pulses.

Figure 4 summarizes the persistence results. A threshold ISI (in msec) for two-pulse persistence is plotted against the peak spatial frequency of the DOG stimuli used. To estimate the threshold for each spatial frequency, a least squares regression was fit to each set of data in Figures 2 and 3. The threshold value was defined as the ISI at which subjects reported the two pulses as continuous 75% of the time. The psychometric functions measured for younger adults wearing 0.7 neutral density filters were also fit. It is evident that the decrease in retinal illumination produced by these filters had no significant effect on two-pulse stimulus persistence for younger observers. These data capture the main findings of this experiment: 1) older subjects require longer ISIs than do young subjects before seeing the pair of pulses as not continuous; 2) this difference is largest at 12 c/deg; and 3) reducing retinal illuminance does not noticeably change younger subjects' persistence thresholds.
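The threshold estimate described above can be illustrated with a short Python sketch. The text says only that a least squares regression was fit, so the straight-line form used here is an assumption, as are the function names; the data points in the example are invented for illustration and are not the chapter's results.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def threshold_isi(isis_ms, pct_continuous, criterion_pct=75.0):
    """ISI at which the fitted line crosses the criterion percentage."""
    slope, intercept = fit_line(isis_ms, pct_continuous)
    if slope == 0:
        raise ValueError("flat fit: the criterion is never crossed")
    return (criterion_pct - intercept) / slope


if __name__ == "__main__":
    isis = [0.0, 16.7, 33.4, 50.1, 66.8, 83.5, 100.2]
    made_up_pct = [100, 95, 80, 55, 30, 10, 0]  # invented values, illustration only
    print(round(threshold_isi(isis, made_up_pct), 1), "msec")
```

A rightward shift of the psychometric function, as reported for older observers at 12 c/deg, appears in this sketch as a larger threshold ISI.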
[Figure 3 panels: older subjects at 1, 4, 8, and 12 c/deg. Horizontal axis: interstimulus interval (msec). Vertical axis: percent "continuous" judgments.]
Figure 3. Mean percent "continuous" judgments as a function of interstimulus interval for older adults. Panels show data for 1, 4, 8, and 12 c/deg.

Differences in the psychometric functions for older and younger observers cannot be ascribed to differences in the criterion used by the two groups. Compare responses for an ISI of 0 msec, where the stimulus should be continuous, and an ISI of 100.2 msec, where the two stimuli should be distinct. In response to the continuous pulse (ISI = 0), all observers reported "continuous" virtually all the time. At an ISI of 100.2 msec, two pulses were consistently reported by all observers, regardless of age.
[Figure 4: horizontal axis, spatial frequency (c/deg); vertical axis, threshold ISI (msec).]
Figure 4. Threshold ISI (msec) as a function of target center spatial frequency. Data are shown for older and younger observers, and for younger observers studied while viewing the display through neutral density filters.

We wanted to see how well the data of this experiment could be accounted for using the same model that we had used for Brannan et al. (1988b). To do so, though, we had to give the model an explicit criterion for distinguishing between a single pulse and two pulses. As before, the temporal response of the model is given by R(t). It seems reasonable to assume that to detect two pulses rather than a single long combined pulse, there must be some criterion decrement, Δ, in the response R(t) over a sufficient time interval between the responses to the first and second pulses. We have chosen to assume that this criterion decrement follows Weber law behavior [the defining equations are not legible in this copy]. The decrement is expressed in terms of two quantities: the maximum response to the first pulse, pooled over some time interval T, and the minimum response that occurs between the two pulses, pooled over the same interval T. Weber law behavior was used as a first approximation because of the abundant evidence of its presence in a variety of discrimination tasks in visual processing.

To complete the model, it is necessary to relate values of Δ to the percentages of "continuous" responses in the stimulus persistence task. This was done using the psychometric function

$$\Psi(\Delta) = 2^{-(b\Delta)^{k}}$$
The steps for the complete model are schematized in Figure 5.

[Figure 5 box labels: temporal impulse function; two-pulse stimulus; convolved response; pooled response decrement; psychometric function.]
Figure 5. A schematic of the steps taken to derive the model of temporal processing in aging vision.
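The sequence of steps in Figure 5 can be sketched in Python as follows. This is a rough illustration under several stated assumptions, not the authors' implementation. The impulse response is a stand-in monophasic gamma-shaped function (the chapter's own function, with its inhibitory lobe, is not reproduced here). The pooling is taken to be a power mean of |R| with exponent q over a window T. The Weber-law decrement is taken relative to the pooled minimum. The psychometric function is read as 2 raised to -(bΔ)^k. The names `impulse_response`, `pulse_train`, `convolve`, `pooled`, `decrement`, and `proportion_continuous`, and the 0.5 msec simulation step, are all hypothetical.

```python
import math

DT = 0.5          # simulation step in msec (assumption)
PULSE_MS = 16.7   # pulse duration, from the text


def impulse_response(tau=9.5, n=4, length_ms=250.0):
    """Stand-in monophasic impulse response (hypothetical form, no inhibition)."""
    samples = []
    for i in range(int(length_ms / DT)):
        x = i * DT / tau
        samples.append(x ** n * math.exp(-x) / math.factorial(n))
    return samples


def pulse_train(onsets_ms, total_ms):
    """Unit-amplitude rectangular pulses starting at the given onsets."""
    stim = [0.0] * int(total_ms / DT)
    for onset in onsets_ms:
        start = int(round(onset / DT))
        stop = int(round((onset + PULSE_MS) / DT))
        for i in range(start, min(stop, len(stim))):
            stim[i] = 1.0
    return stim


def convolve(stim, kernel):
    """Discrete convolution giving the model response R(t)."""
    out = [0.0] * len(stim)
    for i, s in enumerate(stim):
        if s == 0.0:
            continue
        for j, h in enumerate(kernel):
            if i + j >= len(out):
                break
            out[i + j] += s * h * DT
    return out


def pooled(response, window_ms=30.0, q=4):
    """Power-mean pooling of |R| over a sliding window of length T (assumed form)."""
    w = int(round(window_ms / DT))
    powered = [abs(r) ** q for r in response]
    return [(sum(powered[i:i + w]) / w) ** (1.0 / q)
            for i in range(len(powered) - w + 1)]


def decrement(isi_ms, window_ms=30.0, q=4):
    """Assumed Weber-law decrement: (R_max - R_min) / R_min on pooled responses."""
    total_ms = 2 * PULSE_MS + isi_ms + 300.0
    kernel = impulse_response()
    first = pooled(convolve(pulse_train([0.0], total_ms), kernel), window_ms, q)
    both = pooled(convolve(pulse_train([0.0, PULSE_MS + isi_ms], total_ms), kernel),
                  window_ms, q)
    t1 = first.index(max(first))                     # peak of first-pulse response
    t2 = t1 + int(round((PULSE_MS + isi_ms) / DT))   # same point for second pulse
    r_max = both[t1]
    r_min = min(both[t1:t2 + 1])
    return max(0.0, (r_max - r_min) / r_min)


def proportion_continuous(delta, b=0.91, k=3.8):
    """Assumed psychometric function: 2 ** -((b * delta) ** k)."""
    return 2.0 ** (-((b * delta) ** k))


if __name__ == "__main__":
    for isi in [0.0, 16.7, 33.4, 50.1, 66.8, 83.5, 100.2]:
        print(isi, round(proportion_continuous(decrement(isi)), 3))
```

With these stand-in choices, an ISI of 0 produces no dip between the pulses, so the predicted proportion of "continuous" reports stays near 1, while the longest ISI produces a deep dip and a proportion near 0. Substituting an impulse response with an inhibitory lobe would require deciding how negative responses enter the pooling, which this sketch does not address.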
Figure 6 shows the fit of the model to the averaged persistence data of younger and older subjects at 4 and 12 c/deg. The data have been averaged only across subjects who also participated in Brannan et al. (1988b). The parameters of the temporal impulse response function used for these fits were determined from the threshold temporal three-pulse data. The fits shown in Figure 6 are for a temporal pooling interval T = 30 msec and q = 4. However, the fit does not critically depend on either of these parameters: reasonable fits to the results can be obtained with T ranging from 20 to 50 msec and q ranging from 2 to 6. Parameters b = .91 and k = 3.8 were determined by a least-mean-square fit of the model to the data. The cross-correlation of the fit is .94. Thus, the changes in the temporal impulse response function determined by the three-pulse data can account for the suprathreshold stimulus persistence data.
[Figure 6 panels: Young 4 c/deg, Old 4 c/deg, Young 12 c/deg, Old 12 c/deg. Horizontal axis: interstimulus interval (milliseconds), 0 to 100.2 in steps of 16.7. Vertical axis: 0 to 1.0.]
Figure 6. The fit of the model to the averaged persistence data of young (left panels) and old (right panels) observers at 4 and 12 c/deg. Note that data are for observers who participated in both experiments.

To summarize, our data revealed substantial age-related changes in temporal processing as measured by the three-pulse data. These changes can be fit by modifications of parameters in the temporal impulse response function. In addition, stimulus persistence also changes with age, and does so in a manner that is predictable from changes in the impulse response function.
Conclusions

The nature of temporal deficits in normal visual aging has been debated for over three decades. Recently, Sturr et al. (1988) reported that older adults do not differ from younger adults in temporal summation. Sturr et al. suggested that previous reports of large temporal losses could be due in part to four factors: 1) the use of suprathreshold measures; 2) cognitive effort required by the tasks used; 3) less than rigorous screening for ocular disease in the elderly; or 4) differences in retinal illuminance between young and old due to senile miosis. However, recent reports (Sloane et al., 1988a, 1988b; Mayer et al., 1988) suggest that optical effects due to senile miosis cannot explain age-related differences in spatio-temporal processing. Moreover, Tulanay-Keesey et al. (1988) have suggested that there is neural decline in spatial and temporal processing channels with age, but that the substrate for spatio-temporal interaction remains relatively intact with increasing age. Tulanay-Keesey et al. further propose that age-related changes in visual processing occur only at threshold, not at suprathreshold levels.

Recently, Schieber, Hiris, White, Williams, and Brannan (1990) compared oscillatory motion detection thresholds for a small dot displaced vertically at 8 Hz with thresholds for the detection of correlated motion in a random dot cinematogram. Interestingly, all older adults had higher thresholds than younger adults for the oscillating dot stimulus, but only older females had significantly higher thresholds for the cinematogram. When younger adults were run on the same tasks with optical blur of up to 2 diopters introduced, their performance did not decline. This provides converging evidence that optical factors cannot completely explain age-related changes in temporal processing.

The results of our investigation support the idea that there is a consistent slowing of temporal processing in normal visual aging.
We found that older adults were less sensitive to both threshold three-pulse and suprathreshold two-pulse temporal stimuli. This was true despite using a task presumed to require little cognitive effort (Brannan et al., 1988b) and a suprathreshold measure (Brannan et al., 1988a). In addition, when retinal illuminance in the younger subjects was reduced to approximate that due to senile miosis, persistence threshold was not increased. Therefore, even when the four factors suggested by Sturr et al. (1988) are taken into account, there is still evidence for a significant decrement in temporal processing ability with age.

These age-related changes in threshold and suprathreshold temporal tasks can be accounted for quantitatively. The threshold task determined contrast sensitivity to three temporal pulses. The sensitivity changes with age can be explained by a change in the temporal impulse response function. The suprathreshold task measured temporal persistence. Age-related changes in the temporal impulse response function determined by the threshold measurements of Brannan et al. (1988b) could account for these suprathreshold temporal processing changes with age.

Our results suggest that the decrease in temporal response with age is due not only to an overall sensitivity loss, but also to a selective loss of inhibition. To detect a change between two stimuli, the visual system must register excitation from the first stimulus. This must then be followed by a decrease in excitation before any additional excitation can be processed from the second stimulus. If older adults do have less inhibition in temporal visual processing, they would find it more difficult to register a second stimulus. More time would be needed between stimuli before an older person would have enough fading of the initial excitation to detect additional excitation brought on by another stimulus. This need for additional time between stimuli is exactly what numerous studies have reported over the years (McFarland, Warren, and Karis, 1958; Eriksen, Hamlin, and Breitmeyer, 1970; Kline and Orme-Rogers, 1978) and what we have found. Thus, our model suggests a quantitative explanation for increased stimulus persistence with age.

Interaction between inhibitory and excitatory lobes of our model suggests a role for cooperative parallel processing in age-related changes in temporal resolution. Currently, however, a theory based on selective loss of one neurophysiological parallel channel over another appears to be too simplistic to account for the complicated processes underlying visual aging.
References

Axelrod, S., Thompson, L.W., and Cohen, L.D. (1968). Effects of senescence on the temporal resolution of somesthetic stimuli presented to one hand or both. Journal of Gerontology, 23, 191-195.
Bergen, J.R. (1981). A quantitative model of human spatiotemporal vision at threshold. Unpublished doctoral dissertation, The University of Chicago.
Bergen, J.R., and Wilson, H.R. (1985). Prediction of flicker sensitivities from temporal three-pulse data. Vision Research, 25, 577-582.
Brannan, J.R., Phillips, G., Chan, C., and Sekuler, R. (1988a). Stimulus persistence in young and older adults: The effects of reduced retinal illumination. Investigative Ophthalmology and Visual Science, 29 (suppl.), 432.
Brannan, J.R., Phillips, G., and Sekuler, R. (1988b). Temporal processing in young and older observers. Presentation at the annual meeting of the Psychonomic Society, Chicago, Illinois, November.
Breitmeyer, B.G. (1991). Parallel processing in human vision: History, critique, and review. In J.R. Brannan (Ed.), Applications of parallel processing in vision. Amsterdam: Elsevier.
Breitmeyer, B.G., and Ganz, L. (1976). Implications of sustained and transient channels for theories of visual pattern masking, saccadic suppression, and information processing. Psychological Review, 83, 1-36.
Coppinger, N.W. (1955). The relationship between critical flicker frequency and chronological age for varying levels of stimulus brightness. Journal of Gerontology, 10, 48-52.
Crassini, B., Brown, B., and Bowman, K. (1988). Age-related changes in contrast sensitivity in central and peripheral retina. Perception, 17, 315-332.
Elliott, D. (1987). Contrast sensitivity decline with ageing: a neural or optical phenomenon? Ophthalmic and Physiological Optics, 7, 415-419.
Elliott, D., Whitaker, D., and MacVeigh, D. (1990). Neural contribution to spatiotemporal contrast sensitivity decline in healthy ageing eyes. Vision Research, 30, 541-547.
Elliott, D., Whitaker, D., and Thompson, P. (1989). Use of displacement threshold hyperacuity to isolate the neural component of senile vision loss. Applied Optics, 28, 1914-1918.
Enroth-Cugell, C., and Robson, J.G. (1966). The contrast sensitivity of retinal ganglion cells of the cat. Journal of Physiology, 187, 517-552.
Eriksen, C.W., Hamlin, R.M., and Breitmeyer, R.G. (1970). Temporal factors in visual perception related to aging. Perception and Psychophysics, 7, 354-356.
Huntington, J.M., and Simonsen, E. (1965). Critical flicker fusion frequency as a function of exposure time in two different age groups. Journal of Gerontology, 20, 527-529.
Ikeda, M. (1965). Temporal summation of positive and negative flashes in the visual system. Journal of the Optical Society of America, 55, 1527-1534.
Kline, D.W. (1987). Ageing and the spatiotemporal discrimination performance of the visual system. Eye, 1, 323-329.
Kline, D.W., and Orme-Rogers, C. (1978). Examination of stimulus persistence as a basis for superior visual identification performance among older adults. Journal of Gerontology, 33, 76-81.
Kline, D.W., and Schieber, F. (1981). Visual aging: A transient/sustained shift? Perception and Psychophysics, 29, 181-182.
Kline, D.W., and Schieber, F.J. (1982). Visual persistence and temporal processing. In R. Sekuler, D. Kline, and K. Dismukes (Eds.), Aging and Human Visual Function. New York: Alan R. Liss, Inc.
Livingstone, M.S., and Hubel, D.H. (1987). Psychophysical evidence for separate channels for the perception of form, color, movement, and depth. Journal of Neurophysiology, 7, 3416-3468.
Mayer, M.J., Kim, C.B.Y., Svingos, A., and Glucs, A. (1988). Foveal flicker sensitivity in healthy aging eyes. I. Compensating for pupil variation. Journal of the Optical Society of America A, 5, 2201-2209.
McFarland, R.A., Warren, A.B., and Karis, C. (1958). Alterations in critical flicker frequency as a function of age and light:dark ratio. Journal of Experimental Psychology, 56, 529-538.
Misiak, H. (1947). Age and sex differences in critical flicker frequency. Journal of Experimental Psychology, 37, 318-332.
Morrison, J.D., and MacGrath, C. (1985). Assessment of the optical contributions to the age-related loss of contrast sensitivity. Quarterly Journal of Experimental Physiology, 70, 249-269.
Owsley, C., Gardner, T., Sekuler, R., and Lieberman, H. (1985). Role of the crystalline lens in the spatial vision loss of the elderly. Investigative Ophthalmology and Visual Science, 26, 1165-1170.
Owsley, C., Sekuler, R., and Siemsen, D. (1983). Contrast sensitivity throughout adulthood. Vision Research, 23, 689-699.
Owsley, C., and Sloane, M. (1990). Vision and aging. In F. Boller and J. Grafman (Eds.), Handbook of Neuropsychology (Vol. 4). Amsterdam: Elsevier Science Publishers.
Quick, R.F. (1974). A vector magnitude model of contrast detection. Kybernetik, 15, 65-67.
Rashbass, C. (1970). The visibility of transient changes in luminance. Journal of Physiology, 210, 165-186.
Ross, J.E., Clarke, D.D., and Bron, A.J. (1985). Effect of age on contrast sensitivity function: uniocular and binocular findings. British Journal of Ophthalmology, 69, 51-56.
Said, F.S., and Weale, R.A. (1959). The variation with age of the spectral sensitivity of the living crystalline lens. Gerontologia, 3, 213-231.
Schieber, F., Hiris, E., White, J., Williams, M., and Brannan, J. (1990). Assessing age-differences in motion perception using simple oscillatory displacement versus random dot cinematography. Investigative Ophthalmology and Visual Science (suppl.), 355.
Sekuler, R., Hutman, L.P., and Owsley, C. (1980). Human aging and spatial vision. Science, 209, 1255-1256.
Shapley, R. (1991). Parallel retinocortical channels: X and Y and P and M. In J.R. Brannan (Ed.), Applications of Parallel Processing in Vision. Amsterdam: Elsevier Science Publishers.
Sloane, M.E., Owsley, C., and Alvarez, S. (1988a). Aging, senile miosis, and spatial contrast sensitivity at low luminance. Vision Research, 28, 1235-1246.
Sloane, M.E., Owsley, C., and Alvarez, S. (1988b). Aging, and luminance-adaptation effects on spatial contrast sensitivity. Journal of the Optical Society of America A, 6, 2181-2190.
Sloane, M.E., Owsley, C., and Jackson, C.A. (1988). Aging and luminance-adaptation effects of spatial contrast sensitivity. Journal of the Optical Society of America A, 5, 2181-2190.
Sturr, J.F., Church, K.L., Nuding, S.C., Van Orden, K., and Taub, H.A. (1986). Older observers have attenuated increment thresholds upon transient backgrounds. Journal of Gerontology, 41, 743-747.
Sturr, J.F., Church, K.L., and Taub, H.A. (1985). Early light adaptation in young, middle-aged, and older observers. Perception and Psychophysics, 37, 455-458.
Sturr, J.F., Church, K.L., and Taub, H.A. (1988). Temporal summation functions for detection of sine-wave gratings in young and older adults. Vision Research, 28, 1247-1253.
Sturr, J.F., Kelly, S.A., Kobus, D.A., and Taub, H.A. (1982). Age-dependent magnitude and time course of early light adaptation. Perception and Psychophysics, 31, 402-404.
Sturr, J.F., Van Orden, K., and Taub, H.A. (1987). Selective attenuation in brightness for brief stimuli and at low intensities supports age-related transient channel loss. Experimental Aging Research, 13, 145-149.
Tulanay-Keesey, V., Ver Hoeve, J.N., and Terkla-McCrane, C. (1988). Threshold and suprathreshold spatiotemporal response throughout adulthood. Journal of the Optical Society of America A, 5, 2191-2200.
Weale, R.A. (1975). Senile changes in visual acuity. Transactions of the Ophthalmological Society, U.K., 95, 36-38.
Weisstein, N., Ozog, G., and Szoc, R. (1975). A comparison and evaluation of two models of metacontrast. Psychological Review, 82, 325-343.
Parallel Processing in Higher-Order Perception
M and P Pathways and the Perception of Figure and Ground

NAOMI WEISSTEIN, WILLIAM MAGUIRE, and JULIE R. BRANNAN
Introduction

A compelling idea in perceptual psychology is that figure and ground are processed in functionally different ways by the visual system (Rubin, 1922; Koffka, 1935; Calis and Leeuwenberg, 1981; Breitmeyer and Ganz, 1976; Julesz, 1978; Weisstein and Wong, 1986, 1987). Figure 1 shows a circle divided into a series of pie-shaped sectors. Every other sector contains a fine-grained texture. The sectors are easily grouped into one of two ambiguous configurations: a Maltese cross consisting of the fine-grained textured sectors on a textureless background, or alternatively a Maltese cross consisting of the textureless sectors on a fine-grained background. The two organizations can alternate. As they alternate, one can observe that the texture disappears when the textured sectors become ground and the other cross is seen. Such simple demonstrations inspire the idea that figure and ground perception involve two different kinds of visual coding.

We have researched figure/ground phenomena for the past decade, particularly focusing our efforts on how the spatial and temporal frequency composition of an image region contributes to the segmentation of that region into figure or ground, as well as how the segmentation of a region into figure or ground influences its spatial and temporal frequency sensitivity. This work indicates that the high spatial and low temporal frequency domain of visual information is strongly associated with figural perception, while the low spatial and high temporal frequency domain is strongly associated with ground perception. Exploring the spatial and temporal frequency basis of this dichotomy, we have found that a sine wave grating that is of a higher spatial or lower temporal frequency than an adjacent grating will appear to float in front of the adjacent grating.
Recently we have also established that sine wave gratings that are colored red will float in front of gratings of the same or similar spatial frequency, a result which we describe in this paper and relate to current theories of magnocellular (M) and parvocellular (P) processing streams. We present a model where figure/ground relations emerge from antagonistic interactions of M and P pathways responsive to luminance contrast over overlapping portions of the spatiotemporal frequency continuum.

Much recent interest in early visual processing has centered around the magnocellular (M) and parvocellular (P) pathways, which remain distinct and relatively independent going through LGN, V1, and visual associative cortex. These independent sets of neurons have been characterized as having different sensitivities to visual information (Livingstone and Hubel, 1987, 1988). In particular, the magnocellular pathway has been regarded as relatively "color blind," although there is partial suppression of magnocellular response by diffuse red light (Dreher, Fukada, and Rodieck, 1976; Livingstone and Hubel, 1984), while there are great numbers of color opponent cells in the parvocellular pathways. The two pathways also differ in spatial and temporal frequency sensitivity, with parvocellular neurons generally responding to higher spatial and lower temporal frequencies than magnocellular neurons (Derrington and Lennie, 1984; Tootell et al., 1988a).
Figure 1. Maltese cross pattern consisting of black and fine grating sectors. When the black sectors are organized into figure, the background texture becomes indistinct.

A number of theorists have pointed out that figure and ground regions have different appearances, with the figure characterized by distinct form and fine detail (Koffka, 1935; Julesz, 1978; and see below). Most intriguing for theories of figure and ground perception are reported correlations between the M pathway and perceptions of motion, depth, and clear figure/ground segmentation (Livingstone and Hubel, 1987, 1988; Cavanagh, Tyler, and Favreau, 1984), while the smaller receptive field sizes and color opponency in the P pathway suggest a role in the analysis of form, fine detail, and color (but see Cavanagh, 1989, and Logothetis et al., 1990, for the role of the parvocellular pathway in depth analysis). In this paper we pursue the idea that these functional distinctions establish a relationship between ground perception and M pathways and between figure perception and P pathways.

A consideration of previous work, and of new work discussed in the next section, leads to a model of figure and ground perception. We suggest that both M and P pathways are sensitive to luminance contrast throughout most of the temporal and spatial frequency range, overlapping considerably in the stimuli to which they respond, but that each pathway is most sensitive to a different part of the spatio-temporal frequency spectrum. The M pathway is most responsive to low spatial and high temporal frequencies, while the P pathway is most sensitive to low temporal frequencies throughout the spatial range. The basic tenet of the model is that the magnocellular pathway produces a ground-biased signal and that activity in the parvocellular pathway biases perception of a region towards figure. Where both M and P pathways respond to a stimulus, the perception is the result of a subtraction, parvocellular response minus magnocellular response, with more positive responses associated more strongly with figural perception. This means that figure/ground coding is relative in these pathways, so that when the figure/ground relations between two regions are computed, the relatively more positive regional response will bias a perception of figure, whatever the sign or absolute size of the regional figure and ground responses. We briefly review the notion of different types of processing associated with figure and ground perception below, and present a summary of empirical work that leads us to the model described above.
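The subtraction rule described above can be made concrete with a small Python sketch. It is an illustration only: the function names, the representation of a region as a (P response, M response) pair, and the numerical values are hypothetical, and nothing here models how the pathway responses themselves are computed.

```python
def figure_bias(p_response, m_response):
    """Signed regional signal: parvocellular response minus magnocellular response."""
    return p_response - m_response


def assign_figure(region_a, region_b):
    """Compare two regions, each given as a (P response, M response) pair.

    Returns "a" or "b" for the region biased toward figure, or "ambiguous"
    when the two signed signals are equal.
    """
    bias_a = figure_bias(*region_a)
    bias_b = figure_bias(*region_b)
    if bias_a > bias_b:
        return "a"
    if bias_b > bias_a:
        return "b"
    return "ambiguous"


if __name__ == "__main__":
    # Both regions have a negative (ground-biased) signal, yet region "a"
    # is relatively more positive and so is biased toward figure.
    print(assign_figure((0.25, 1.0), (0.25, 1.5)))
```

The example reflects the point that the coding is relative: only the ordering of the two regional signals matters, not their sign or absolute size.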
Early work

The Gestalt theorists recognized the fundamental importance of figure-ground perception in human vision and pioneered the early phenomenological studies of figure and ground. They pointed out that figure and ground regions have different perceptual properties. Figural regions are "richer and more differentiated" than ground regions, and have a "thing-like" character, while ground regions appear to "extend behind the figure" (Koffka, 1935; Rubin, 1922). The Gestalt theorists also postulated that figure and ground perception involved functionally different neural processes. Many of the ideas of the Gestalt psychologists have been abandoned in modern perceptual psychology, particularly their theories of brain function (see Hochberg, 1972), but the idea of two distinct processes underlying figure and ground perception continues to influence modern theory.

Julesz (1978) proposed that figure and ground perception involves two different types of image analysis. Ground analysis involves the rapid detection and organization of the scene at the resolution of "blobs," while figure analysis takes place more slowly and involves the analysis of fine detail. This concept of two systems differing in sensitivity and function, both involved in the initial analysis of the pattern, has also been central to the thinking of a number of others (Weisstein, 1968, 1972; Weisstein, Ozog, and Szoc, 1975; Kulikowski and Tolhurst, 1973; Breitmeyer and Ganz, 1976; Alwitt, 1981). Central to many of these models is the idea that the analysis of figural regions often involves scrutiny of details and attention, while the perception of background involves the detection of global shape (see also Henning, Hertz, and Broadbent, 1975).
The direct spatial frequency connection

High spatial frequencies appear to play a dominant role in edge perception and the resolution of details in an image (Carpenter and Ganz, 1972; Broadbent, 1977; Julesz, 1978; Ginsburg, 1982; Norman and Ehrlich, 1987; Shulman and Wilson, 1987; but see also Westheimer and McKee, 1980, for evidence that low spatial frequencies must be present for good stereoacuity). On the other hand, low spatial frequency information may be sufficient in an initial rough scan of a scene when details are not required (Henning, Hertz, and Broadbent, 1975; Breitmeyer and Ganz, 1976; Broadbent, 1977; Marr and Poggio, 1979; Ginsburg, 1982; Shulman and Wilson, 1987).
Figure 2. Rubin faces/vase reversible figures in which the regions are defined by high and low spatial frequencies.
There appears, then, to be a link between high spatial frequency analysis and figural perception, and between blob or low spatial frequency analysis and ground perception. About ten years ago, Wong and Weisstein set out to see whether a more direct link could be established between low and high spatial frequencies and blob-ground, edge-figure perception. We have used ambiguous displays such as Rubin's familiar faces/vase drawing (see Figure 2) to explore the effects of a region being figure or ground on spatial sensitivity, independent of the context's stimulus characteristics. Our findings are that sharp targets (with high spatial frequencies present) are detected better in the region of the ambiguous pictures seen as figure, regardless of which physical region that is. Conversely, blurred targets (with energy primarily in lower spatial frequencies) are detected better in the region perceived as background (Wong and Weisstein, 1983).

More recently, we have obtained contrast sensitivity functions for Gaussian modulated sine-wave patches in figure and ground regions in a related design. The observer adjusted the contrast of a patch located in the center of the Rubin figure until it was just visible. The spatial frequency of the target ranged from 1 c/deg to 16 c/deg. In one block of trials, the observer made the adjustment only when the central region was perceived as figure; in another block of trials, the adjustment was made only when the central region was perceived as ground. The contrast sensitivity for gratings in the figure region was shifted toward relatively greater sensitivity at the higher spatial frequencies, while that for gratings when the region was perceived as ground was shifted toward lower spatial frequencies (see Figure 2). These findings seem consistent with the general theoretical overview of blob-type ground and edge-type figure processing.
Seeking a more specific formalization of this theoretical overview, we turn to models of the distribution of channels in spatio-temporal frequency space based on the huge amount of detection data for flickering and stationary luminance contrast sine wave gratings gathered over the last twenty years (Graham, 1989). As Grossberg (1987a,b) and others have pointed out, the perception of such gratings involves multiple computations on the image and cannot be assumed to be isomorphic with the response of spatial and temporal frequency channels in early visual processing. Nonetheless, these channels must be the building blocks for such perceptions, and the evidence from these detection experiments seems a good place to start.

In the following pages we present three models of figure/ground processing. These models are based upon putative sets of channels in the visual system that have been explored using threshold detection and discrimination methods (Watson, 1986; Graham, 1989). We start by asking whether our data, and other data which we shall present later, can be explained by looking at what we call the spatial channels of the visual system. These are the channels that are narrowly tuned to spatial frequencies and quite broadly tuned to temporal frequencies, with lower spatial frequency channels tuned to slightly higher temporal frequencies. These channels are also known as sustained channels in previous literature and are closely associated with the P pathway. The spatial model of figure and ground processing will be seen to fail when the effects of temporal frequency and directional motion on figure and ground perception are considered. We will then consider a model where channels tuned to temporal frequencies and sensitive to direction of motion are the sole mechanism of figure and ground perception. We shall call these temporal channels. They have also been known as transient channels in earlier literature, and we believe they are closely related to the M pathway. These temporal channels alone are also inadequate to explain all the data. We finally consider a model where input from both the spatial and temporal channels is antagonistically combined. This model does a good job of explaining our data.

The putative spatial and temporal channels have characteristics that naturally lead to an interpretation in terms of the parvocellular (P) and magnocellular (M) pathways of the primate visual system. We finally consider this interpretation and some new predictions and data generated by it. Before turning to these models, however, we wish to define some terms.
Some definitions
In this paper we will frequently describe analyzers, channels and pathways. By analyzers we mean psychophysically identified mechanisms which are sensitive to a limited range of values on a specific stimulus dimension, and respond to a stimulus in a local region of the retinal image. Physiologically, the equivalent level of analysis is at the level of individual neurons and their receptive fields. Channel refers to a set of otherwise identical analyzers that differ in their responses along one or more dimensions. The variant dimension might typically be spatial position, in which case the channel would be a spatial distribution of identical analyzers. One might also speak of a channel responsive to vertical orientation which contained analyzers with identical orientation tuning, but different size and/or spatial position tuning. A pathway is a group of channels which share response characteristics on a limited set of dimensions. Thus we might speak of a motion pathway, consisting of channels sensitive to different speeds and directions of motion, which in turn are composed of spatially distributed sets of analyzers, tuned to particular speeds and directions. The term pathway is used as well to refer to the physiological structures underlying the psychophysically defined pathway. We can speak of an analyzer or channel as being labeled (Graham, 1989; Watson, 1986; Treisman, 1986). With a labeled analyzer, mechanisms higher in the processing hierarchy have enough information about their input that, even at the analyzer's threshold of responding, stimuli may be identified and discriminated from other patterns. An example of labelling is spatial position. A light flashed at one location is discriminable at threshold from an identical light flashed at another spatial location. Thus the analyzers responsive at threshold are labeled for spatial position.
For a labeled analyzer or channel, we would also like to know the aspects of the response profile that are most important in information terms. A channel response might be best characterized by the peak responding analyzer, or by the difference between the peak responding and lowest (trough) responding analyzer, the peak to trough response, or some other pattern (Graham, 1989). We will consider our channels in terms of peak or peak-to-trough response.
A spatial model of figure and ground perception
A schematic explanation of how two channels, one tuned to lower spatial frequencies and somewhat higher temporal frequencies than the other, would respond to a flickering sinewave grating patch, Gaussian modulated in both space and time, is given in Figure 3.
[Figure 3 graphic not reproduced; the temporal panels are plotted against time.]
Figure 3. (1) (Top) Spatial luminance distribution and (bottom) temporal luminance distribution at point x of a contrast reversal flickering sine wave grating. (2) Hypothetical spatial impulse response of two spatial channels. (3) Hypothetical temporal impulse response of these channels. (4) Time averaged response of these channels to the stimulus in panel 1.
The top row of panel 1 shows the luminance profile of a gaussian modulated sinewave grating patch, while in the bottom row, the temporal luminance profile of the spot marked x on the patch is given.
In panel 2 the spatial sensitivity profiles of the analyzers tuned to lower spatial frequencies (top row) and higher ones (bottom row) are shown. The analyzers are centered at a retinal location corresponding to x on the patch. For this contrast and luminance the stimulus does not produce maximal response in either analyzer, since it is a little too small for the receptive field of the lower spatial frequency analyzer and a little too big for the receptive field of the higher one. An equal but opposite response occurs for the analyzers whose retinal location corresponds to the spot marked y on the patch. The integrated spatial peak-to-trough response from each channel is hence about the same. This is illustrated in panel 3 by the height of the vertical arrows as a percentage of the maximum peak-to-trough response in each channel for that contrast and luminance. But now consider the different temporal response of the two analyzers to the integrated spatial response. The top row in panel 4 shows the fast biphasic response of the low spatial frequency analyzer. The bottom row of panel 4 shows the slower monophasic temporal impulse response of the high spatial frequency one. Superimposed on each temporal impulse response is a temporal luminance profile of the flickering grating patch shown in the bottom row of panel 1. It is a little harder to figure out what the temporal response of these two analyzers is, because the answer requires a convolution of the impulse response with the stimulus rather than a simple correlation as in the spatial case, but the results are straightforward enough. The fast biphasic response follows the stimulus, rising and falling as the stimulus does, while the slower monophasic response barely gets started by the time the stimulus has fallen below its mean luminance level.
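The temporal comparison just described can be sketched numerically. The following is a minimal illustration, not the computation behind Figure 3: the kernel shapes (cascaded low-pass filters), the time constants, the 8 Hz flicker rate and the Gaussian window width are all assumed values chosen only to make the qualitative point, and the function names are hypothetical.

```python
import numpy as np

DT = 0.001  # time step in seconds (assumed)

def gamma_kernel(tau, n=3, duration=0.5):
    """Monophasic low-pass impulse response, scaled to unit area (illustrative)."""
    t = np.arange(0.0, duration, DT)
    k = (t / tau) ** (n - 1) * np.exp(-t / tau)
    return k / k.sum()

def biphasic_kernel(tau, n=3, duration=0.5):
    """Fast positive lobe minus a slower negative lobe of equal area."""
    return gamma_kernel(tau, n, duration) - gamma_kernel(2.0 * tau, n, duration)

def flicker_stimulus(freq_hz, total=4.0, sigma=0.5):
    """Contrast at one point of a flickering patch under a Gaussian time window."""
    t = np.arange(0.0, total, DT)
    window = np.exp(-0.5 * ((t - total / 2.0) / sigma) ** 2)
    return window * np.sin(2.0 * np.pi * freq_hz * t)

def peak_to_trough(stimulus, kernel):
    """Convolve the stimulus with a causal kernel; return max minus min of the output."""
    response = np.convolve(stimulus, kernel)[: stimulus.size]
    return float(response.max() - response.min())

fast = biphasic_kernel(tau=0.008)  # stands in for the low spatial frequency analyzer
slow = gamma_kernel(tau=0.030)     # stands in for the high spatial frequency analyzer
stim = flicker_stimulus(freq_hz=8.0)
print("fast biphasic:", peak_to_trough(stim, fast))
print("slow monophasic:", peak_to_trough(stim, slow))
```

With these assumed parameters the biphasic kernel yields the larger peak-to-trough output at 8 Hz, while at a slow flicker rate such as 1 Hz the ordering reverses. Unlike the figure, the sketch does not normalize each output by the channel's own maximum response.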
The results of the convolution are shown in panel 5 and the peak of each function is shown by the height of the arrows in panel 6 as a function of maximum peak-to-trough difference of each channel for that contrast and luminance. Clearly the channel with the lower spatial frequency response and the higher temporal frequency response has the larger amplitude. If these were the only two channels in the visual system, a rule which chose the largest peak-to-trough amplitude among channels would choose the lower spatial frequency one. In general where a set of spatial analyzers differ in their temporal responses such that analyzers tuned to lower spatial frequencies are tuned to higher temporal frequencies, this type of response shift should occur. The number of spatial channels that are found psychophysically to be independent near threshold is about five or six when sensitivity to stationary gratings is measured along the spatial frequency axis. It is important to note that psychophysical methods define a minimum number of channels, however. Specifically they do not exclude the possibility of a near continuum of channels corresponding to the near continuous distribution of receptive field sizes, orientations, temporal response characteristics, etc. Consistent with the above analysis of spatiotemporal channels, we can build a spatial model of figure and ground perception based upon a set of analyzers narrowly tuned to spatial frequency and broadly tuned to temporal frequency. Such a model is shown in Figure 4 (see Graham, 1989). Each ellipse represents a different channel. The outline of the ellipse represents iso-response at half the peak sensitivity. Peak sensitivity at the center of the ellipse forms a series of
points of decreasing temporal frequency with increasing spatial frequency. This is the configuration of a set of spatial channels maximally sensitive at very low to low temporal frequencies specialized for the analysis of spatial structure.
"
[Figure 4 graphic not reproduced; horizontal axis: log spatial frequency.]
Figure 4. A set of spatial channels derived from empirical studies of threshold detection and identification of sine wave stimuli. (After Graham, 1989.)
Assume that the spatial analyzers are arranged as shown in the above figure. Assume further that they are labeled not only with respect to spatial frequency but also with respect to "figureness." The channels tuned to the lowest spatial frequencies signal that the region from which the response originates is ground. The channels tuned to the highest spatial frequencies signal that the region of origin is figure. The channels in between generate increasingly weaker ground responses as their most sensitive spatial frequency nears the middle, then
increasingly stronger figure responses as we move towards high spatial frequency tuning. In two adjacent locations that stimulate the same analyzer, the region which elicits the greater response will be seen as figure if the peak responding analyzer is tuned to high spatial frequencies, and ground if the peak responding analyzer is tuned to low spatial frequencies. Where different regions stimulate different analyzers, we will assume that the region's figure/ground response is determined by the most vigorously responding channel. The overall appearance of the display will be determined by the relative figure/ground labels of the regions modulated by the relative response strength of the channels being compared. With these assumptions we might explain our spatial frequency, contrast sensitivity, ground shift results in the following way. Since fluctuations in the ambiguous figure are not consequent to changes in the stimulus configuration, which is unchanging, these changes in figure/ground organization are understood to be due to time varying activity in the spatially separated regions. When fluctuating activity in the lower spatial frequency analyzers is relatively great, the region will be seen as ground, and spatial sensitivity will be momentarily enhanced for lower spatial frequencies. When the activity in the high spatial frequency analyzers is greater, the region will be seen as figure, and spatial sensitivity will shift to higher spatial frequencies. An assumption in this model and in the models to follow is that perceptual relationships like figure and ground, three dimensional structure, occlusion, and transparency are not only computed from the initial output of channels in early visual processing. These relations are represented by sustained ongoing changes in the activity and sensitivity of the channels for the duration of the percept.
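The labeling rule of the spatial model can be made concrete with a small sketch. Everything numeric here is an assumption introduced for illustration: the six channel peak frequencies, the linear spacing of the figureness labels from -1 (ground) to +1 (figure), and the Gaussian tuning on a log frequency axis. Only the rule itself (the most vigorously responding channel determines a region's figure/ground signal) comes from the text.

```python
import math

# Hypothetical peak spatial frequencies (c/deg) for six channels, low to high.
CHANNEL_PEAKS = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
# Assumed labels: -1 is the strongest ground signal, +1 the strongest figure signal.
CHANNEL_LABELS = [-1.0 + 2.0 * i / (len(CHANNEL_PEAKS) - 1)
                  for i in range(len(CHANNEL_PEAKS))]

def channel_response(peak, stimulus_sf, contrast=1.0, bandwidth_octaves=1.0):
    """Gaussian tuning on a log2 spatial frequency axis (an assumed tuning shape)."""
    distance = math.log2(stimulus_sf / peak)
    return contrast * math.exp(-0.5 * (distance / bandwidth_octaves) ** 2)

def region_signal(stimulus_sf, contrast=1.0):
    """Label of the most vigorously responding channel, scaled by its response."""
    responses = [channel_response(p, stimulus_sf, contrast) for p in CHANNEL_PEAKS]
    best = max(range(len(responses)), key=responses.__getitem__)
    return CHANNEL_LABELS[best] * responses[best]

def figure_region(sf_a, sf_b):
    """Return 'a' or 'b' for the region with the larger figure signal."""
    return "a" if region_signal(sf_a) > region_signal(sf_b) else "b"

print(figure_region(1.0, 8.0))
```

Under these assumptions the region filled with the higher spatial frequency carries the larger figure signal. Spontaneous reversals would require adding time varying noise to the channel responses, which this sketch omits.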
The representation of stable higher order perceptual relations by units that appear as elementary analyzers in most psychophysical contexts has been a theme of work that we have done for many years. We have found spatial frequency specific adaptation in a retinal area corresponding to an untextured portion of the visual field, if that portion is perceived to be part of a three dimensional object which occludes the grating (Weisstein, 1970; Weisstein, 1973; Weisstein and Maguire, 1978). We have found that a line is detected better if it is part of a two dimensional representation of a three dimensional object than if it is presented alone or in a flatter, less coherent context (Weisstein and Harris, 1974; Williams and Weisstein, 1978). Extensive discussion of related findings by ourselves and others (n.b. Nakayama et al., 1989; Shimojo et al., 1988; Shimojo and Nakayama, 1990) is found in Maguire et al. (1990). A model of the representation of perceptual relations by early visual processing is elaborated in Maguire and Weisstein (1991). We consider below a number of the experiments that we have performed in the past few years, and consider the application of the spatial model described above to those data. We believe that while explaining a number of our results, the spatial model is not adequate to explain the full range of figure and ground effects, nor is the alternative temporal channels model which follows it. Finally we consider a model which we do feel is adequate, which combines information from spatial and temporal channels.
Spatial determinants of figure-ground perception
We have in numerous experiments found that the spatial frequency composition of a region contributes to the perception of that region as figure or ground. Using a variety of configurations that produce figure/ground ambiguity (the Rubin faces/vase picture as in Figure 2, interposed Maltese crosses in a circular pattern as in Figure 1, a bipartite field, a center-surround configuration, and a diagonal/triangles ambiguous figure), sinewave gratings of different spatial frequencies are used to fill the regions with texture. In this way differences in spatial frequency define the regions (see Figure 2). Figure-ground stability as a function of spatial frequency difference is measured by the percentage of time one of the regions is perceived as figure. We found that the region filled with the lower spatial frequency was perceived predominantly as background, for all these configurations (Klymenko and Weisstein, 1986; Brown and Weisstein, 1988; Wong and Weisstein, 1989). As the octave separation between the regions increased, the percentage of time the higher spatial frequency region was seen as figure increased.
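For readers less used to octave units, the separation between two gratings is the base 2 logarithm of their frequency ratio. The short sketch below applies this to grating pairs that appear in this chapter; the function name is our own.

```python
import math

def octave_separation(sf_low, sf_high):
    """Number of octaves between two spatial frequencies given in c/deg."""
    return math.log2(sf_high / sf_low)

for low, high in [(0.5, 1.0), (1.0, 4.0), (1.0, 8.0)]:
    print(f"{low} vs {high} c/deg: {octave_separation(low, high):.2f} octaves")
```

A doubling of spatial frequency is one octave, so the 1 and 4 c/deg pair is separated by two octaves and the 1 and 8 c/deg pair by three.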
Temporal determinants of figure-ground perception
Several lines of evidence suggest that flickering regions of an ambiguous region are predominantly perceived as background, while adjacent nonflickering regions are perceived as figures (Wong and Weisstein, 1984, 1985, 1987; Meyer and Doherty, 1987). This is true whether the regions are outlined by contours or merely defined by temporal changes. Using on/off flicker (where the spatial pattern is replaced by a spatially uniform field of the same space averaged luminance on each half cycle), this "flicker-induced ground" effect was optimal when the flickering frequencies were between 6 and 8 Hz. Maximum perceived depth segregation between the flicker and nonflickering regions also occurred at these rates of flicker. At lower (1.4 Hz) and higher (12.5 Hz) rates of flicker, regions maintained their segregation, but the dominance of a region as figure or ground and the depth segregation between the flickering and nonflickering regions diminished. Klymenko et al. (1989) and Klymenko and Weisstein (1989) explored whether flicker induced orderly figure/ground perception throughout the spatial frequency domain. The display was spatially uniform, consisting of a single spatial frequency. Regions were defined by differences in temporal frequency. In different conditions the spatially uniform texture was of different spatial frequencies. An ambiguous pattern consisting of a rightward and leftward leaning Maltese cross (see Figure 1) positioned so that they perfectly filled a circular area was defined by the temporal frequency differences. There were four flicker rates (0, 3.75, 7.5, and 15 Hz), of which all combinations were tested, and four spatial frequencies (0.5, 1, 4, and 8 c/deg). In the two experiments, the waveform of the flicker differed, as did the ambiguous pattern (square wave on-off, Maltese cross, Klymenko et al., 1989; contrast reversal, bipartite field, Klymenko and Weisstein, 1989).
The general result was that the cross with the higher temporal frequency was perceived primarily as background, regardless
of the spatial frequency of the whole circular area. The effect of temporal frequency difference was greater for high spatial frequency patterns than for low for the square-wave contrast reversal flicker. In a final experiment, Klymenko et al. (1989) tested displays where low spatial, low temporal frequency patterns, and high spatial, high temporal frequency patterns were compared; neither type of pattern dominated the figure response.
The role of depth in figure-ground perception
In our figure/ground studies, a concomitant depth effect is also observed. Figure regions are always perceptually localized in front of ground regions. In fact, spatial frequency induced depth between the figure and ground regions can be perceived despite the presence of contradictory stereoscopic depth cues. If the magnitude of the contradictory binocular disparity cue is increased, however, a point is reached at which spatial frequency induced depth is cancelled (Brown and Weisstein, 1988; Wong and Weisstein, 1989). Brown and Weisstein (1988) assessed the amount of depth induced by spatial frequency differences in this way. Crossed disparity was added to one or both regions of a pattern containing sinewave gratings differing in spatial frequency. The display consisted of rectangular areas filled with sinewave gratings. The regions of higher spatial frequency were perceptually localized in front of the lower spatial frequency regions (see also Schorr and Howarth, 1986; Frisby and Mayhew, 1978). Again, the effect was dependent on the relative spatial frequency difference between the regions. When the spatial frequency difference between regions was greater than 1.32 octaves, the higher spatial frequency region tended to be seen as foreground regardless of the disparity imposed on the regions; i.e., spatial frequency difference dominated binocular disparity as a cue to depth. Using the same configuration, we then instructed observers to cancel the depth induced by spatial frequency differences between the regions by adjusting the disparity of the image so that all regions within the display lay on the same depth plane. Observers consistently placed the regions filled with lower spatial frequency closer in stereo depth than the relatively higher spatial frequency areas. Similar trends were observed when the gratings were placed out of phase.
As a control, the procedure was repeated using square wave gratings (which contain many very high spatial frequency components in addition to their fundamental frequency). Although depth was occasionally observed, neither region was reliably placed in front of the other. When stereo depth is supplied to cancel spatial frequency induced depth, the display becomes bistable with frequent figure-ground reversals. If regions of equal spatial frequency are induced into a particular figure-ground interpretation by manipulation of binocular disparity, the stability of this configuration, its resistance to reversal, is determined by the magnitude of the relative disparity differences between the regions. We examined the joint roles of flicker and perceived depth on the perception of figure and ground by stereoscopically cancelling the depth induced by the flickering region (Wong and Weisstein, 1987b).
The figure-ground context was an ambiguous figure that could either be seen as a diagonal stripe or a pair of triangles. There were no contours to define these regions, which were simply composed of random dots. The regions were defined by homogeneous flicker rates and/or binocular disparities. The percentage of time a flickering region was perceived as ground was measured for four temporal frequencies (1.4, 6.3, 8.3, and 12.5 Hz) and compared when disparity differences were absent and perceived depth present, or when depth differences were stereoscopically cancelled. Results indicated that temporal frequency induced depth differences between the two regions could be cancelled stereoscopically, analogously to spatial frequency effects. Together these data show that spatial and temporal frequency effects on figure and ground organization function much like binocular disparity differences. The resulting percept, with simultaneous appearance of a particular figure-ground organization, with particular depth relationships, with texture elements of particular sizes, is the result of a global computation that integrates spatial frequency, temporal frequency and binocular disparity information.
The effects of unidirectional motion on figure-ground organization
The analysis of image motion and velocity fields can yield valuable information about how the visual system processes change. Image motion can be a powerful segmentation cue (Johansson, 1976). Wong and Weisstein (1987) investigated how the velocity of moving fields affected figure/ground perception. We used a display consisting of a center and surround region filled with 1 c/deg sinewave gratings. One region of the display was always stationary while the other region moved. The observer was instructed to monitor figure-ground perception in the way described in our previous experiments. We found that as the velocity increased, the moving grating was seen as ground more often than the stationary one. The effect increased with velocity up to 8 degrees per second and remained high at the highest velocities tested (32 degrees per second). Since at high velocities observers reported blur or streaking in the images, and at very high velocities the stimulus would be indiscriminable from a stationary field of uniform luminance, motion response in figure-ground perception must have a high velocity limit which we did not test. Wong and Weisstein (1989) compared absolute and relative motion between a center-surround display of moving gratings as predictors of relative figure-ground segregation. They discovered that the fastest moving grating was generally perceived as ground, but the magnitude of this effect was influenced by whether the center or surround moved in same or opposite directions. This sensitivity to relative motion implies that the mechanisms computing figure and ground in these experiments are direction selective.
Problems with the spatial model and a consideration of the temporal model
A number of the results discussed above would appear to be difficult to explain using the spatial model. At all spatial frequencies
there was a monotonic effect of temporal frequency on figure segmentation. The effect was more pronounced at high spatial frequencies than low. It would appear that the spatial model would predict different effects of temporal frequency at high and low spatial frequencies. At high spatial frequencies, the effect of raising temporal frequency would be to reduce responding, reducing the figure response and thus producing the monotonic effects we have noted. If all the spatial channels are tuned to relatively low temporal frequencies, then at low spatial frequencies, a fall-off in channel response with increased temporal frequency should produce a reduced ground response, and an effect opposite to what we observed. If the low spatial frequency channels respond vigorously at higher temporal frequencies, the model implies a single set of spatio-temporal frequency analyzers. Even characterizing the spatial channels in this way is problematic for the following reason. The effects of temporal frequency at the middle spatial frequencies, where channels signal neither figure nor ground very strongly, should be less than at high and low spatial frequencies. We also have more recent research that is relevant to the model. Recent research indicates that as the contrast of low frequency gratings is raised, they are more likely to be perceived as figure, but if the low spatial frequency channel carries a ground signal, and responds monotonically to stimulus contrast, the opposite would be expected. Additionally, as mentioned, figure/ground relations with moving gratings show sensitivity to stimulus direction. This implies the channels underlying figure/ground segmentation are directionally selective. Such directional selectivity is generally associated with the temporal channels. We consider a temporal channel model briefly below.
Figure 5 presents the outline of a set of directionally selective temporal channels that appear to be the underlying mechanisms uncovered by experiments that examine subthreshold summation of flickering patterns and discrimination of pattern temporal frequency (see Graham, 1989; Mandler and Makous, 1984). This set of channels can be considered as a pathway particularly sensitive to information about stimulus motion. Individual channels appear to be labeled for the direction of stimulus motion (see Watson, 1986; Graham, 1989). The M pathway is considered to be rich in directionally selective analyzers, and particularly sensitive to stimulus motion (Livingstone and Hubel, 1987). It exhibits relative sensitivity to higher temporal and lower spatial frequency information. (The description of the M pathway also roughly corresponds to the descriptions of transient channels in previous research; see Breitmeyer, this volume; Kulikowski and Tolhurst, 1973.) This pathway has been characterized as playing a critical role in depth and motion perception. Relative depth and motion are powerful segmentation cues, and our own research has established a strong relationship between binocular disparity and spatial and temporal frequency in determining figure/ground organization. It has been suggested that the M pathway and not the P pathway is most critical in the segmentation of regions into figure and ground (Livingstone and Hubel, 1987, 1988). We consider a straightforward model of figure/ground organization in which temporal channels play an exclusive role. The model assumes all information from all temporal channels is summed.
When a region stimulates the temporal pathway strongly that region is more likely than adjacent regions not as strongly stimulated to take on the characteristics of ground. Thus where two regions are ambiguous in their figure/ground relationships the region with the greatest ground response will form the ground. In this view the quick responding temporal channels accomplish the initial segmentation of the image. This in turn permits attention to be directed to regions defined as figure where detailed analysis of figural qualities can be accomplished by the slower responding spatial channels.
[Figure 5 graphic not reproduced; horizontal axis: log spatial frequency.]
Figure 5. A set of temporal channels derived from empirical studies of threshold detection and identification of moving and flickering sine wave stimuli. (After Graham, 1989.)
The model permits us to make a straightforward prediction. When we measure the spatio-temporal tuning of figure-ground perception, we should simply arrive at a surface that represents the
envelope of the temporal channels illustrated in Figure 5. One proviso should be made to this claim. Figure 5 is based upon threshold measurements of sensitivity. The stimuli used in figure/ground experiments are generally well above threshold contrast. Therefore we expect the actual surface generated to cover larger areas of the spatio-temporal plane than the figure suggests. The spatial frequency dependence of figure and ground is consistent with the model presented above. Since the envelope of the spatial frequency sensitivity of the temporal frequency sensitive channels is low pass, one might expect the effects of spatial frequency to be roughly monotonic, as we have found. It is worth noting, however, that we have found good figure-ground segregation between gratings of 0.5 and 1 cycle per degree (Klymenko and Weisstein, 1986), in the region where the temporal envelope is relatively flat. If we consider the data reviewed above, the effects of spatial and temporal frequency are generally what one might expect, but there are certain details that appear troublesome for a model that relies exclusively on temporal channels. Klymenko et al. (1989) and Klymenko and Weisstein (1989) found that there were differences in figure/ground segmentation between square wave on/off and contrast reversal flicker. As the temporal frequency difference between two regions increased, the magnitude of the effect increased with contrast reversal flicker. Temporal frequency differences with on-off flicker showed leveling of the effect at the lower spatial frequencies, at the highest flicker rate (15 Hz). Wong and Weisstein (1984), using random dot fields and on-off flicker, found peak figure/ground effects at around 7.5 Hz. These differences have also been demonstrated with stereoscopic cancellation. With on/off flicker, maximal depth separation between regions (measured by the disparity needed to cancel the depth effect) is achieved when the background flickers at 7.5 Hz.
With contrast reversal, maximal effects are achieved at 15 Hz, the highest temporal frequency tested (Klymenko, Weisstein, and Maguire, 1990). The asymptotic behavior of the on-off flicker at around 7.5 Hz might be what a model of the temporal channels would predict, given that this value represents peak sensitivity to sine wave flicker. This difference in asymptote for contrast reversal and on/off flicker has important implications for the mechanism of figure/ground segmentation. Kulikowski and Tolhurst (1973) found that threshold flicker sensitivity for the two types of flicker did not diverge at high temporal frequencies, but that pattern sensitivity threshold did, with much greater sensitivity to spatial structure for on/off flicker at high temporal frequencies. This can be readily understood if we conceptualize on/off flicker as a stimulus composed of a stationary spatial frequency to which a contrast reversing grating of the same spatial frequency has been added. This in turn implies that the channels most responsive to spatial structure may play a role in figure/ground perception; specifically, stimulation of pattern sensitive pathways in a region may mitigate the ground response produced by stimulation of the temporal channels. In other experiments, Klymenko and Weisstein held temporal frequency constant within conditions while spatial frequency was varied. In one condition, the two crosses were filled with sinewave gratings of 1 and 4 c/deg; in the other condition, they were filled with
gratings of 1 and 8 c/deg. For each of these two spatial frequency conditions, there were four temporal frequencies at which the entire pattern flickered (0, 3.75, 7.5, and 15 Hz). Once again, the type of flicker and type of pattern differed between the two experiments. Results indicated that when flicker was absent, the low spatial frequency region was seen predominately as background, consistent with Klymenko and Weisstein (1986) and Wong and Weisstein (1989a).
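The earlier description of on/off flicker as a stationary grating plus a contrast reversing grating of the same spatial frequency can be checked directly. The sketch below works in contrast units (deviation from mean luminance), so the uniform field shown on the off half cycle is zero everywhere; the sampling grid, the 4 c/deg grating and the 7.5 Hz rate are illustrative assumptions.

```python
import numpy as np

x = np.linspace(0.0, 2.0, 200, endpoint=False)  # position in degrees (assumed grid)
t = np.arange(0.0, 1.0, 0.001)                  # one second in 1 ms steps
grating = np.sin(2.0 * np.pi * 4.0 * x)         # contrast profile of a 4 c/deg grating
flicker_hz = 7.5

# Square wave: +1 on the first half of each flicker cycle, -1 on the second half.
square = np.where((t * flicker_hz) % 1.0 < 0.5, 1.0, -1.0)

# On/off flicker: full grating for half a cycle, then the uniform mean field.
on_off = np.outer((square + 1.0) / 2.0, grating)

# The same stimulus as a half-contrast static grating plus a half-contrast
# contrast reversing (counterphase) grating.
stationary = 0.5 * np.outer(np.ones_like(t), grating)
reversal = 0.5 * np.outer(square, grating)

print(np.allclose(on_off, stationary + reversal))  # True
```

The static component is what a contrast reversing stimulus lacks, which is consistent with the suggestion that on/off flicker drives pattern sensitive mechanisms in addition to the temporal channels.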
[Figure 6 graphic not reproduced; horizontal axis: log spatial frequency.]
Figure 6. Spatial and temporal channels combined in a single figure. (After Graham, 1989.)
In general, with the addition of flicker to the display, the appearance was unchanged or declined slightly, with the low spatial frequency less likely to be perceived as background. A dramatic decline in the spatial frequency effect was produced when the display with 1 and 8 c/deg underwent contrast reversal flicker at 15 Hz. In this case, the standard spatial frequency effect was actually reversed. The general effect of
flicker then is to obliterate spatial frequency based figure/ground effects and perhaps, when high and low spatial frequencies are compared, to reverse them (Klymenko et al., 1989; Klymenko and Weisstein, 1989b). If we assume that differences in temporal channel response to spatial frequency differences are more pronounced at low temporal frequencies than at high temporal frequencies, one might expect a reduction in spatial frequency produced figure/ground effects, but Kulikowski and Tolhurst found the ratio of flicker thresholds between low and high spatial frequency patterns to be approximately equal at low and high temporal frequencies (see Kulikowski and Tolhurst, 1973, Figures 7 and 8). They did however find that pattern thresholds for various spatial frequencies converged at high temporal frequencies, which might tend to flatten out figure/ground differences at high spatial frequencies if spatial channels play a role in figure/ground perception. To summarize, we have found that models that rely on spatial or temporal channels alone to explain figure and ground segmentation cannot explain all important aspects of the data we have gathered. The data lead us rather to consider a model where the perception of figure and ground results from interaction between spatial and temporal channels in the human visual system.
A model of figure-ground perception based on the interaction between spatial frequency channels and temporal frequency channels
If we assume both spatial and temporal channels are active and interacting in the perception of figure and ground, then our data are explained quite well. Figure 6 shows the most likely distribution and shape of spatial analyzers in log frequency space. Again, the ellipses represent isosensitivity curves where the channel's sensitivity is half its peak sensitivity. Notice that the high spatial frequency spatial channels are more sensitive at very low temporal frequencies than are the low spatial frequency channels. The extent of individual channels in the figure has been determined by threshold experiments, and we assume that they appear wider and overlap more extensively when suprathreshold stimuli of the type we use are processed. This overlap above threshold implies that both figure and ground pathways are activated over most of the spatial and temporal frequency range, with the possible exception of the most extreme stimuli (extremely low spatial frequencies flickering at high temporal frequencies, and stationary extremely high spatial frequency stimuli). The model assumes that spatial analyzers responding to a region bias the percept of that region towards figure, while temporal analyzers responding in a region bias the percept of that region towards ground. For any region, relative figureness is a monotonic function of the pooled spatial analyzer contrast response minus the pooled temporal analyzer contrast response. This can be described somewhat more formally as:
$$\text{Figureness} = F\left(\sum S|_{rc(a,\tau)} - \sum T|_{rc(a,\tau)}\right) \qquad (1)$$
where $S|_{rc(a,\tau)}$ is the pooled spatiotemporal luminance contrast response of the spatial frequency tuned channels and $T|_{rc(a,\tau)}$ is the pooled spatiotemporal luminance contrast response of the temporal frequency tuned channels for retinal region $a$ in the interval $\tau$. Figure, however, is a relation relative to ground, so a more general formulation would be
$$\text{Figureness} = \mathrm{Max}\left\{ F\left(\sum S|_{rc(a,\tau)} - \sum T|_{rc(a,\tau)}\right),\; F\left(\sum S|_{rc(b,\tau)} - \sum T|_{rc(b,\tau)}\right) \right\} \qquad (2)$$
where the two regions a and b are potential figures. The spatial and temporal responses are time varying functions showing random fluctuations even during fixed input, with corresponding perceptual effects such as reversal of figure/ground organization. There is another monotonic function which predicts the strength of a region's figureness, measured as the percentage of time during which region a is perceived to be figure.
$$\text{Vividness} = V\left(\sum S|_{rc(a,\tau)} - \sum T|_{rc(a,\tau)}\right) \qquad (3)$$
It may turn out that a ratio between spatial and temporal responses or a ratio between different regions in Equation 3 fits the experimental data most closely. Similarly, it is necessary to specify the channel response to stimulus contrast in order to accurately predict the effects of that important variable. A notable feature of the model is that it does not assign the calculation of figure/ground segmentation exclusively to either the spatial or temporal channels. Activity in both normally underlies figure/ground perception, and activity in either is sufficient for figure/ground segmentation. The absence of input from either spatial or temporal channels reduces Equation 3 to
$$\text{Strength} = V\left(\sum S|_{rc(a,\tau)} - \sum S|_{rc(b,\tau)}\right) \qquad (4)$$
or
$$\text{Strength} = V\left(\sum T|_{rc(a,\tau)} - \sum T|_{rc(b,\tau)}\right). \qquad (5)$$
This relationship will become important later when we reject the idea that either an M or P stream (e.g., Livingstone and Hubel, 1987 vs. Ingling and Grigsby, 1990) exclusively codes for depth (see also Cavanagh, 1989).
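The comparison described by Equations 1 and 2 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the use of tanh as the monotonic function F, and the channel response values are all assumptions chosen only to make the arithmetic concrete.

```python
import math
from typing import Dict, Sequence


def pooled(responses: Sequence[float]) -> float:
    """Sum the contrast responses of a set of channels (the sums in Equations 1-2)."""
    return float(sum(responses))


def figureness(spatial: Sequence[float], temporal: Sequence[float]) -> float:
    """Equation 1: F(pooled spatial response - pooled temporal response).

    The text only requires F to be monotonic; tanh is an arbitrary stand-in.
    """
    return math.tanh(pooled(spatial) - pooled(temporal))


def figure_region(regions: Dict[str, Dict[str, Sequence[float]]]) -> str:
    """Equation 2: the region with the largest figureness value is seen as figure."""
    return max(
        regions,
        key=lambda name: figureness(regions[name]["spatial"], regions[name]["temporal"]),
    )


# Hypothetical channel responses in arbitrary units (not measured data).
regions = {
    "a": {"spatial": [0.6, 0.5], "temporal": [0.2, 0.1]},  # higher spatial frequency, stationary
    "b": {"spatial": [0.3, 0.2], "temporal": [0.4, 0.5]},  # lower spatial frequency, flickering
}

print(figure_region(regions))  # region "a" has the larger spatial-minus-temporal signal
```

Because F is monotonic, the choice of tanh does not affect which region wins the comparison; only the sign and size of the spatial-minus-temporal difference matter.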
The model is sufficient to explain key aspects of our data. With low spatial, low temporal frequency sine waves, a region that contains the higher spatial frequency sine wave will be perceived to be figure. This is because the spatial channel response increases with increasing spatial frequency in this range. In addition, at lower temporal frequencies and throughout the spatial frequency range, the contribution from the temporal channels will diminish continuously as spatial frequency increases, further increasing the figureness signal associated with the higher spatial frequency. The following data are explained. Over the spatial frequency range that we have tested, the higher spatial frequency is more likely to be seen as figure. Over the temporal frequency range that we have tested, the higher temporal frequency is more likely to be seen as background. When the whole pattern flickers at high temporal frequencies, the differences between spatial frequencies will be diminished because the gradient is shallower above the 1 degree/second line. For the same reason, when two temporal frequencies are compared at the same spatial frequency with contrast reversal, the effect is greater for higher spatial frequencies.
The inhibitory interactions between the pathways find some support in the fact that specification of a region as figure or ground affects the spatial and temporal sensitivity to patterns in that region regardless of other stimulus characteristics. We also believe that changes in the appearance of a region associated with the shift from a ground to a figure percept can be explained by a shift in the composition of analyzers most responsive to the pattern. These changes include the following: texture elements appear larger when an ambiguous region is perceived as figure than when it is perceived as background (Maguire and Weisstein, 1991). [See also perceptual shrinkage associated with amodal contours, Kanizsa and Gerbino (1982).] Depth cues enhance the probability that a region will be seen as in front, even as a region appearing to be a figure brings it closer in depth. All these changes reflect a perceptual coupling of such stimulus dimensions as binocular disparity, size/spatial frequency, and motion/temporal frequency. We believe these perceptual effects are best explained by a shift in activity, in the transition from figure to ground, toward dominance of one pathway or the other, the pathways differing in sensitivity to all of the above stimulus dimensions (see Maguire and Weisstein, 1991, for a discussion). Such a transition in turn may be accomplished by an inhibitory coupling of temporal and spatial signals at some level of the visual system.
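The reversals attributed earlier to random fluctuations in the channel responses, and the percentage-of-time measure of figureness they give rise to, can be illustrated with a small simulation. This is a sketch under stated assumptions: the fluctuations are modelled as independent Gaussian noise added to each region's mean spatial-minus-temporal signal, and the function name, noise level, and sample count are hypothetical.

```python
import random


def percent_time_figure(
    mean_a: float,
    mean_b: float,
    noise_sd: float = 0.3,
    n_samples: int = 10_000,
    seed: int = 0,
) -> float:
    """Percentage of time samples in which region a carries the larger figureness signal.

    mean_a, mean_b: mean (pooled spatial - pooled temporal) signal of each region.
    Independent Gaussian noise is an assumption standing in for the unspecified
    random fluctuations of the channel responses.
    """
    rng = random.Random(seed)  # seeded so that repeated runs give the same result
    wins = 0
    for _ in range(n_samples):
        signal_a = mean_a + rng.gauss(0.0, noise_sd)
        signal_b = mean_b + rng.gauss(0.0, noise_sd)
        if signal_a > signal_b:
            wins += 1
    return 100.0 * wins / n_samples


# A larger mean difference between regions gives a more stable figure percept.
for difference in (0.0, 0.1, 0.5):
    print(difference, percent_time_figure(difference, 0.0))
```

With equal mean signals the two regions alternate as figure about half the time; as the mean difference grows, region a is figure for a larger share of the viewing period, which is the qualitative pattern the percentage-of-time measure is meant to capture.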
Fundamental implications of the combined model
The model has two major implications regarding figure and ground. The first is that the perception of figure and ground is completely relative. The comparator of Equation 3 does not require information about where the signals come from: spatial channels alone, temporal channels alone, or a combination of spatial and temporal channels. It merely assigns figure to the larger of the summed signals. This implies, for instance, that any manipulation that increases the activity of spatial channels in one location rather than an adjacent location will increase the perception of figureness in that location. The second implication is that the input from spatial and temporal channels to the figure mechanism in a region is antagonistic. Any manipulation that increases spatial channel activity over temporal channel activity increases figureness. Any stimulus manipulation that increases temporal channel activity over spatial channel activity decreases figureness. These principles allow us to make a number of predictions about figure/ground organization as a function of varying stimulus parameters. We consider some predictions based upon varying the contrast of test regions.
We have found that in the contrast region well above threshold, a stationary grating that appears as ground will reverse to figure as its contrast is raised, provided its contrast becomes sufficiently greater than its neighbor. The effect of increasing contrast in general is to increase channel response broadly. Temporal channels, however, show high contrast gain and response saturation at relatively low contrast (Pantle and Sekuler, 1969; Pantle, Lehmkuhle, and Caudill, 1978). This means that when one varies the contrast of stimuli well above contrast threshold, the effects are limited to the spatial channels since the temporal channels are saturated. Weisstein and Wong (1990) used a disc and annulus configuration filled with differing spatial frequencies and observed figure/ground organization. In one condition the spatial frequencies of the disc and annulus were the same, with a shift in phase defining the regions. They presented the disc for one minute and measured the percentage of time during which the disc was perceived as figure. They found that an increase in contrast in either region increased its figureness. The region of lower contrast would start as ground, and with increasing contrast it finally would be seen as figure.
In earlier experiments with random dot fields we looked at temporal modulation depth of the flickering fields (Wong and Weisstein, 1984). We found that 100% modulation of a region, flickering in the range of 6-12 Hz, produced the greatest depth separation and ground response to that region relative to a stationary field of dots (Wong and Weisstein, 1984). These 100% modulation fields are essentially on/off flickering stimuli. Modulations of the random dot fields less than 100% created substantial stationary components in the pattern. A modulation of 0% would simply have been a stationary field of random dots. So it is consistent with the above theory that removing stationary components from the display would decrease spatial channel activity and increase the ground response of the field.
The use of random dot stimuli makes contrast predictions based upon the contrast response to grating patterns somewhat difficult, since the random dot fields are broadband stimuli and may stimulate many channels. We intend to run low contrast experiments with contrast reversing sine wave patterns in the future to determine whether raising contrast will increase ground response in the contrast range where the temporal channels are monotonically increasing in activity.
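The contrast argument above can be made concrete with a short sketch. The chapter does not specify a contrast response function, so the hyperbolic saturating form used here, the semi-saturation constants, and the function names are all assumptions; the only property taken from the text is that the temporal channels saturate at much lower contrast than the spatial channels.

```python
def saturating_response(contrast: float, c50: float, r_max: float = 1.0) -> float:
    """Hyperbolic saturating contrast response (an assumed form, not from the chapter).

    c50 is the contrast at which the response reaches half of r_max.
    """
    return r_max * contrast / (contrast + c50)


def figureness_signal(
    contrast: float,
    c50_spatial: float = 0.5,    # hypothetical: spatial channels saturate late
    c50_temporal: float = 0.02,  # hypothetical: temporal channels saturate early
) -> float:
    """Spatial minus temporal channel response at a given stimulus contrast."""
    spatial = saturating_response(contrast, c50_spatial)
    temporal = saturating_response(contrast, c50_temporal)
    return spatial - temporal


neighbor = figureness_signal(0.3)  # fixed-contrast comparison region
for contrast in (0.2, 0.6):
    label = "figure" if figureness_signal(contrast) > neighbor else "ground"
    print(contrast, label)

# In the low contrast range the temporal response is still rising quickly,
# so raising contrast lowers the signal (more ground response).
print(figureness_signal(0.05) > figureness_signal(0.1))
```

Under these assumed parameters a region at lower contrast than its neighbor is labelled ground and becomes figure once its contrast exceeds the neighbor's, while at very low contrasts the direction of the contrast effect is reversed, which is the outcome the proposed low contrast experiments are meant to test.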
A model of figure-ground perception based on antagonistic P pathway and M pathway interactions in the visual system
The temporal and spatial channels we have described closely resemble the transient and sustained channels of earlier work (Kulikowski and Tolhurst, 1973; Breitmeyer, 1975; Breitmeyer and Ganz, 1976; Weisstein et al., 1975; Meyer and Maguire, 1977). With Breitmeyer (this volume) we feel a case can be made that these psychophysically defined channels correspond closely to the anatomically defined parvocellular, P, pathway and magnocellular, M, pathway in the primate. By this reasoning the P pathway corresponds to the spatial (sustained) channels, while the M pathway corresponds to the temporal (transient) channels. We can make the link explicit by rewriting Equations 1-3, substituting $P|_{rc(a,\tau)}$ (parvo response) and $M|_{rc(a,\tau)}$ (magno response) for the spatial and temporal frequency tuned channels, respectively.
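$$\text{Figureness} = \mathrm{Max}\left\{ F\left(\sum P|_{rc(a,\tau)} - \sum M|_{rc(a,\tau)}\right),\; F\left(\sum P|_{rc(b,\tau)} - \sum M|_{rc(b,\tau)}\right) \right\}$$
where two regions a and b are potential figures.
$$\text{Strength} = V\left(\sum P|_{rc(a,\tau)} - \sum M|_{rc(a,\tau)}\right)$$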
Since figure/ground segregation inevitably leads to perceptions of relative depth between figure and ground regions, a two pathway model of figure/ground segregation would appear to contradict the view that depth perception is largely accomplished by the M pathway (Livingstone and Hubel, 1987). There appears now to be ample evidence that depth perception is not simply accomplished by M pathways. The argument for an exclusive magnocellular mechanism is principally supported by demonstrations of difficulty in depth perception with isoluminant figures. The assumption that isoluminance leaves parvocellular function intact while eliminating magnocellular function, however, has been strongly challenged recently (Cavanagh, 1989; Ingling and Grigsby, 1990; Logothetis et al., 1990; Breitmeyer, this volume). Isoluminance does not appear to be an effective way of segregating magnocellular from parvocellular function. At the same time, other evidence does suggest a role for P pathways in depth perception. This includes the loss of fine stereoscopic discrimination in rhesus monkeys after parvocellular but not magnocellular lesions (Schiller et al., 1990) and the finding that monocular depth perception is normal with stabilized images, stimuli which should effectively isolate the P pathway (Ingling and Grigsby, 1990). The loss in depth perception and clarity of figure/ground reported with isoluminant stimuli, while not complete, is striking, but current research indicates that this may stem from loss of activity in both P and M systems (Logothetis et al., 1990). Reduction of activity in P and M pathways would lead to a loss of depth and figure/ground stability in our model. This can be seen by considering an extreme example, where one region strongly stimulates the M pathway and another strongly stimulates the P pathway. A large figure-ground difference is computed for these two regions. As magnocellular and parvocellular activity is reduced, this difference gets smaller and smaller, reducing apparent depth and stability of perception.
The correspondence of P pathway and spatial channels should be discussed. The P pathway has been described as a system specialized for color information (see Shapley, this volume). Color channels have been found to have extremely poor spatial resolution (Poirsson and Wandel, 1990), making them very poor candidates for the spatial channels. We have been looking, however, at luminance, not color contrast, and there are data to suggest that high spatial frequency sustained information is carried by the P pathway. Kelly (1981) found that the peak of the contrast sensitivity function was shifted to higher spatial frequencies for stabilized grating images. There is also considerable research in the macaque to suggest a relationship between the P pathway and the processing of high spatial frequency information. Parvocellular neurons generally have smaller receptive fields and show tuning to higher spatial frequencies (Derrington and Lennie, 1984). There is 2-DG uptake in parvocellular sites with high spatial frequency luminance grating stimuli (Tootell et al., 1988). Chemically induced lesions in the parvocellular layers reduce contrast sensitivity across the full range of spatial frequencies including high spatial frequencies (Merigan, 1989; Logothetis et al., 1990).
Suppression of M pathway response by diffuse red light and figure-ground perception
The explicit linkage of our spatial and temporal channels to P and M pathways respectively allows us to make some psychophysical predictions based upon what is known about the physiological properties of magnocellular and parvocellular pathways. Diffuse red light is unique in that it is an effective suppressor of magnocellular activity (Livingstone and Hubel, 1984, 1987; Derrington et al., 1984). Breitmeyer and Williams (1990) found a psychophysical analog, showing that diffuse red light could be used to suppress transient masking responses in a metacontrast paradigm. In a new set of experiments (Weisstein and Brannan, 1991) we have examined the effects of diffuse red light on figure/ground perception. We used a bipartite field comprised of horizontal gratings divided in the middle. On one side of the bipartite field was a 1 c/deg sinewave grating, on the other a 1.4 c/deg sinewave grating. Using achromatic gratings, the 1.4 c/deg grating consistently appeared "in front of" the 1 c/deg grating. However, when one side of the field was diffusely illuminated by red light, and the other by diffuse green light matched in luminance, the red grating consistently appeared "in front of" the green grating. This was true regardless of which spatial frequency was illuminated by the red light. In fact the result is the same when both regions have the same spatial frequency. The red side also displayed an interesting appearance: the red and black stripes appeared to be somewhat three-dimensional and wavy, as if the red was "in front of" the black, and the black "in front of" the green grating. We also tested very high spatial frequencies (15 and 21 c/deg). As achromatic gratings, neither appeared significantly in front of the other. When exposed to diffuse red and green light, no change in figure/ground appearance or depth occurred.
Finally, we looked at red and green gratings presented alone, to determine if the effect is absolute or relative. Although there was minimal waviness to the red 1 c/deg grating presented alone, the strong effect seen with the bipartite field was not present. These findings provide strong support for the model. If magnocellular activity in a region is reduced, by whatever means, be it by the spatial and temporal composition of the area or its spectral characteristics, the region is more likely to be perceived as a figure and in front of adjacent regions. The chromatic effect is not evident with high spatial frequencies, because the spatial patterns elicit minimal activity from the magnocellular pathway, hence there is little or no activity to suppress. Taken together with the wealth of evidence relating to spatial and temporal effects on figure-ground perception, these preliminary chromatic data support the hypothesis that the M pathway codes for background, while the P pathway codes for foreground. The actual appearance of a pattern is determined by comparison of M and P outputs.
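The logic of the red light experiment can be sketched numerically. The values below are hypothetical P and M responses in arbitrary units, and modelling the effect of diffuse red light as a multiplicative gain below 1 on the M response is an assumption made for illustration; the chapter reports only the direction of the effects.

```python
def region_signal(p_response: float, m_response: float, m_gain: float = 1.0) -> float:
    """P response minus gain-scaled M response for one region.

    m_gain < 1 is an assumed way to model suppression of the M pathway
    by diffuse red light; m_gain = 1 means no suppression.
    """
    return p_response - m_gain * m_response


RED_M_GAIN = 0.3  # hypothetical suppression factor under diffuse red light

# Hypothetical (P, M) responses for the gratings used in the experiment.
low_sf_1_0 = (0.5, 0.6)   # 1 c/deg grating
low_sf_1_4 = (0.6, 0.5)   # 1.4 c/deg grating
high_sf = (0.4, 0.02)     # 15 or 21 c/deg grating: minimal M activity

achromatic_1_0 = region_signal(*low_sf_1_0)
achromatic_1_4 = region_signal(*low_sf_1_4)
red_1_0 = region_signal(*low_sf_1_0, m_gain=RED_M_GAIN)

# Achromatic: the 1.4 c/deg grating carries the larger signal and appears in front.
print(achromatic_1_4 > achromatic_1_0)
# Red light on the 1 c/deg side: the red side now carries the larger signal.
print(red_1_0 > achromatic_1_4)
# High spatial frequency: there is almost no M activity for red light to suppress.
print(abs(region_signal(*high_sf, m_gain=RED_M_GAIN) - region_signal(*high_sf)))
```

The sketch treats the green side as unchanged from the achromatic case, which is a simplification. It reproduces the qualitative pattern reported above: suppression of M activity moves a low spatial frequency region toward figure, while a high spatial frequency region is barely affected.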
Further implications of the model
As originally proposed by Livingstone and Hubel (1987, 1988), the organization of the primate visual system represented a true parceling of function into different processing streams, with most form related functions handled exclusively by the M system. The model of figure/ground processing that we propose assumes that M and P systems overlap greatly in their response characteristics, with differences in sensitivity generally being differences of degree. The more general question is whether it is necessary or desirable for different analyzers, channels or pathways to code exclusively for one feature or "primitive" or another. This has certainly been a popular idea among vision scientists (Livingstone and Hubel, 1987; Treisman and Gormican, 1988; Zeki, 1978). We are often disappointed to find that still another promising class of detectors fails to code dimensions separably or shows broad responding across numerous dimensions. Empirically it appears that there is extensive overlap in response characteristics between the different analyzers, channels, and pathways. We do not believe that the extensive overlap in responding of "primitive" analyzers precludes their being labeled. We expect that analyzers carry labeled information, and that this information has sensory effects but is also crucial in producing perceptual properties. We perhaps fail to understand the labeling because these labels are terms in various heterarchical computations (e.g., Grossberg, 1987a,b). We seriously doubt that labels associated with analyzers reach thresholds as perceptual primitives of any sort. (In this connection, it is noteworthy that visual search paradigms where "primitive" features are supposed to "pop out" are heavily dependent on image context such as direction or lighting and structure of distractors.)
We believe that the response components of early visual processing overlap because perceptual relationships are coded by broad arrays often carrying redundant information. The current model of figure/ground segmentation will serve to illustrate the point. Figure and ground relationships are coded broadly by analyzers in both P and M streams. Only when that information has been integrated do the current figure/ground relationships emerge. These same analyzers simultaneously are labeled to carry information about size, direction of motion, etc. of the images in the visual field. Because all this information passes through the same sets of analyzers, we note empirical correlations between image size, direction of motion, etc. and figure/ground organization. Nor do we feel that such a model of necessity implies a "higher" level at which M and P input forms the basis for figure/ground computations. It appears that these computations take place in some sense at the same level as conceptually more fundamental feature extraction operations.
To state this another way, our model rests on the assumption that fundamental perceptual processing is part of what early visual components do. We have held to this idea for a long time (Weisstein, 1968, 1969, 1970, 1973; Weisstein and Maguire, 1978), but it is only recently, with the renewed interest in M and P streams, that the idea has received widespread acceptance. The most primitive sensory properties such as local brightness, chroma, edge orientation, etc., are influenced by complex perceptual relationships, and so we might expect that the analyzers whose responses appear to be most correlated with these elementary sensory experiences will show complex responses to a broad array of stimulus dimensions (Weisstein, 1973; Weisstein and Maguire, 1978; Maguire et al., 1990). Figure/ground relationships increase our confidence in this view. In a sense they illustrate the converse of what we found in early research. Earlier we found that complex perceptual relationships would affect simple sensory judgements. In our figure/ground experiments, simple sensory differences between regions affect fundamentally the way they are perceived, affecting in turn apparent form and depth. We need to expand our understanding of the visual system by building models that incorporate these traits in their fundamental organization.
Summary and conclusions
We have found, in a number of studies, correlations between the spatial and temporal frequency composition of regions of ambiguous displays and the determination of which regions of the image appear predominantly as figure. In a search for a mechanism, we have found that models that are restricted to spatial or temporal channels alone cannot explain all the data associated with this effect. A model which assumes antagonistic interactions between a set of temporal channels coding regions for ground and a set of spatial channels coding regions for figure appears to provide a fair explanation for our results. We associate these sets of channels with the magnocellular and parvocellular pathways in the primate visual system. New data on effects of diffuse red light on figure/ground perception, and the effects of stimulus contrast on figure/ground perception, support the model overall. We conclude that M and P pathways can be identified with neural signals for ground and figure processing respectively. The appearance of the segmented visual image will be determined in part by the relative strength of activity in the two pathways. Relative activity in the two pathways may also determine apparent size, depth, brightness and flicker frequency of the regions. We are currently pursuing experiments to explore these possibilities.
Acknowledgments - We would like to thank Davida Teller, Norma Graham, Pat Phillips, Nancy Jerome, Jesse Lemisch, and Andrea Hodelin.
References Alwitt. L.F. (1981). Two neural mechanisms related to modes of selective attention. Journal of Experimental Psychology: Human Perception and Perfomance, 7 . 324-332.
162
CHAPTER 5
Breitmeyer, B.G. (1975). Simple reaction time as a measure of the temporal response properties of transient and sustained channels. Vision Research, 15, 1411-1412. Breitmeyer, B.G. (1984). Visual Masking: An Integrative Approach. Oxford University Press, New York. Breitmeyer. B.G. and Ganz, L. (1976). Implications of sustained and transient channels for theories of visual pattern masking, saccadic suppression and information processing. Psychological Review, 83, 1-35.
Breitmeyer, B.G. and Williams, M.C. (1990). Effects of isoluminantbackground color on metacontrast and stroboscopic motion: Interactions between sustained (P) and transient (M) channels. Vision Research, 30. 1069-1075. Broadbent, D.E. (1977). The hidden preattentive processes. American Psychologist, 32, 109-118. Brown, J. and Weisstein, N. (1988). A spatial frequency effect on perceived depth. Perception and Psychophysics, 44. 157-166. Calis. G. and Leeuwenberg. E. (1981). Grounding the Figure. Journal of Experimental Psychology: Human Perception and Performance, 7 1386- 1397.
Cavanagh, P., Tyler, C.W., and Favreau, O.E. (1984). Perceived velocity of moving chromatic gratings. Journal of the Optical Society of America, Section A Optics and Image Science, 1, 893-899. Cavanagh. P. (1989). Pathways in early vision. In 2. Pylyshyn (Ed.) Computational Processes in Human Vision: An Interdisciplinary Perspective. Ablex, Norwood N.J, pp.. 254-289. Derrington, A.M. and Lennie, P. (1984). Spatial and temporal contrast sensitivities of neurons in lateral geniculate nucleus of macaque. Journal of Physiology (Lond),357. 219-240. Dreher, B. Fukada. Y. and Rodieck. R.W. (1976). Identification, classification, and anatomical segregation of cells with X-like and Y-like properties in the lateral geniculate nucleus of old-world primates. Journal of Physiology, ILond). 258,433-452. Frisby, J.P. and Mayhew, J.E. (1978). The relationship between apparent depth and disparity in rivalrous-texture stereograms. Perception, 7. 661-678. Graham, N. (1989). Visual Pattern Analyzers. New York, Oxford. Henning. G.B.. Hertz, B.G. and Broadbent. D.E. (1975). Some experiments bearing on the hypothesis that the visual system analyzes spatial patterns in independent bands of spatial frequency. Vision Research, 15,887-899. Hochberg. J. (1971). Perception: Space and movement. In J.A. Kling and L.A. Riggs (Eds.), Woodworth a n d Schlosberg's Experimental Psychology. New York: Holt, Rinehart. and Winston. Ingling. C.R. and Grigsby, S.S. (1990). Perceptual correlates of magnocellular and parvocellular channels: seeing form and depth in afterimages. Vision Research, 30.823-828. Johansson, G. (1976). Spatio-temporal differentiation and integration in visual motion perception. Psychological Research, 38. 379-393. Julesz. B. (1975). Experiments in the visual perception of texture. Scientij??American, 232,34-43. Julesz. B. (1978). Perceptual limits of texture discrimination and their
FIGURE AND GROUND
163
implications for figure-ground separation. In E. Leeuwenberg (Ed.) Formal Theories of Perception. New York. Wiley. Julesz, B. (1987). Preattentive human vision: link between neurophysiology and psychophysics. In Vernon and B. Mountcastle (Eds.), Handbook of Physiology Section 1-Nervous System Vol 5, Higher Functions of the Brain, Pt 2. American Physiological Society, Bethesda. Maryland. Kanizsa. G., and Gerbino, W. (1982). Amodal completion: Seeing or thinking. In J. Beck (Ed.). Organization and Representation in Perception, Lawrence Erlbaum Associates, Hillsdale N.J., pp. 167 -190. Kelly, D.H. (1981 ) . Disappearance of stabilized chromatic gratings. Science. 214, 1257-1258. King-Smith, P.E. and Kulikowski, J.J. (1975) Pattern and flicker detection analysed by subthreshold summation. Journal of Physiology. 249, 519-548. Klymenko, V. and Weisstein. N. (1986). Spatial frequency differences can determine figure-ground organization. Journal of Experimental Psychology: Human Perception and Performance, 12. 324-330. Klymenko, V., Weisstein, N.. Topolski, R. and Hsieh. C.H. (19891. Spatial and temporal frequency in figure-ground organization. Perception and Psychophysics, 45, 395-403. Klyrnenko, V. and Weisstein, N . (1989a). Figure and ground in space and time: Temporal response surfaces of perceptual organization. Perception, 18, 627-637. Klymenko, V. and Weisstein. N. (1989b). Figure and ground in space and time: 2. Frequency velocity and perceptual organization. Perception, 18. 639-648. Koffka, K. (1935). Principles of Gestalt Psychology, Harcourt Brace, New York. Kulikowski, J.J. and To1hurst.D.J. (1973). Psychophysical evidence for sustained and transient detectors in human vision. Journal of Physiology, 232, 149-162. Livingstone. M.S. and Hubel. D.H.(1984).Anatomy and physiology of a color system in the primate visual cortex. Journal of Neuroscience, 4. 309-356. Livingstone, M.S. and Hubel. D.H. (1987). 
Psychophysical evidence for separate channels for the perception of form, color, movement, and depth. Journal of Neuroscience, 7 . 34 16-3468. Logothetis, N.K., Schiller, P.H., Charles, E.R.. and Huthbert. A.C. (1990). Perceptual deficits and the activity of the color-opponent and broad-band pathways at isoluminance. Science, 247, 2 14-217. Lu, C. and Fender, D.H. (1972).The interaction of color and luminance in stereoscopic vision. Inuestigatiue Opthalmology and Visual Science, 11. 482-490. Mandler, M.B. and Makous. W. (1984). A three channel model of temporal frequency perception. Vision Research, 24, 188 1 - 1887. Maguire, W., Weisstein, N., and Klymenko, V. (1990). From visual structure to perceptual function. In K. Leibovic (Ed.). Vision: A convergence of disciplines. Springer Verlag. New York. 254-3 10. Maguire, W., and Weisstein. N. (1991). The effects of figure-ground organization on the perception of regional features. Manuscript in preparation.
164
CHAPTER 5
Merigan. W.H. (1989). Chromatic and achromatic vision of macaques: Role of the p pathway. Journal of Neuroscience, 9. 776-783. Meyer, G.E., and Dougherty. T. (1987). Effects of flicker-induced depth on chromatic subjective contours. Journal of Experimental Psychology: Human Percpetion and Performance, 13,355-360. Meyer. G.E., and Maguire, W.M. (1977). Spatial frequency and the mediation of short term visual storage. Science, 198. 524-525. Nakayama, K. Shimojo. S . and Silverman. G.H. (1989). Stereoscopic depth: Its relation to image segmentation, grouping, and the recognition of occluded objects. Perception. Norman, J., and Ehr1ich.S. (1987). Spatial frequency filtering and target indentification. Vision Research, 27. 87-96. Pantle. A., Lehmkuhle. S . , and Caudill. M. (1978). On the capacity of directionally selective mechanisms to encode different dimensions of moving stimuli. Perception, 7, 261-267. Pentland, A.P. (1985). The focal gradient: Optics ecologically salient. Investigative Ophthalmology and Visual Science, 26, 243. Pomerantz. J.R., and Kubovy. M. (1986). Theoretical approaches to perceptual organization. In K.R. Boff. L. Kaufman, and J.P. Thomas (Eds.), Handbook of Perception and Human Performance Vol 2: Cognitive Processes and Performance, Chapter 36. New York: Wiley. Ramachandron. V.S., and Anstis. S. (1986). Figure-ground segregation modulates apparent motion. Vision Research, 26, 1969-1975. Ramachandran, V.S., and Gregory, R.L. (1978). Does colour provide an input to human motion perception? Nature, 275. 55-56. Rubin. E. (1958). Figure and Ground. In D.C. Beardslee and M. Wertheimer (Eds.) Readings in Perception. Princeton .N.J., Van Nostrand. (Original work published 1921) Sachs. M.B.. Nachmias. J . , and Robson. J. (1971). Spatial-frequency channels in human vision. Journal of the Optical Society of America, 61. 1176-1186. Schiller. P.H., Logothetis, N.K., Charles, E.R. (1990). 
Functions of the colour-opponent and broad-band channels of the visual system. Nature, 343,68-70. Schor, C.M., and Howarth, P.A. (1986). Suprathreshold stereo-depth matches as a function of contrast and spatial frequency. Perception, 15. 249-258. Shapley, R.. Kaplan. E., and Soodak, R. (1981). Spatial summation and contrast sensitivity of X and Y cells in the lateral geniculate nucleus of the macaque. Nature, 292, 543-545. Shimojo. S., and Nakayama. K. (1990). Amodal representation of occluded surfaces: role of invisible stimuli in apparent motion correspondence. Perception. Shimojo, S . Silverman, GH. and Nakayama, K. (1988). An occlusionrelated mechanism of depth perception based on motion and interocular sequence. Nature, 333,265-268. Shulman. G.L. and Wilson, J. (1987). Spatial frequency and selective attention to local and global information. Perception. 16, 89- 101. Tootell. R.B.. Silverman, M.S.. Hamilton, S.L.. Switkes. E. and DeValois. R.L. (1988).Functional anatomy of macaque striate cortex. V. Spatial frequency. Journal of Neurophysiology. 8, 1610-1624. Watson, A.B. (1986). Temporal Sensitivity. In K. Boff. L. Kaufman. and J.P. Thomas (Eds) Handbook of Perception and Human Performance, Wiley. New York, Chapter 6.
FIGURE AND GROUND
165
Watson, A.B. and Nachmias, J. (1977). Patterns of temporal interaction in the detection of gratings. Vision Research, 17, 893-902.
Watson, A.B. and Robson, J.G. (1981). Discrimination at threshold: Labelled detectors in human vision. Vision Research, 21, 1115-1122.
Weisstein, N. (1968). A Rashevsky-Landahl neural net: Simulation of metacontrast. Psychological Review, 75, 494-521.
Weisstein, N. (1973). Beyond the yellow Volkswagen detector and the grandmother cell: A general strategy for the exploration of operations in human pattern recognition. In R. Solso (Ed.), Contemporary Issues in Cognitive Psychology: The Loyola Symposium. W.H. Winston and Sons, Washington D.C.
Weisstein, N. and Brannan, J.R. (1991). A low spatial frequency, red sine wave grating will float in front of gratings with the same or similar spatial frequency but other chromaticities: M and P interactions in figure-ground perception. Investigative Ophthalmology and Visual Science, 32 (suppl.), 1274.
Weisstein, N. and Harris, C.S. (1980). Masking and unmasking of distributed representations in the visual system. In C.S. Harris (Ed.), Visual Coding and Adaptability. Lawrence Erlbaum, Hillsdale, N.J.
Weisstein, N. and Maguire, W. (1978). Computing the next step: Psychophysical measures of representation and interpretation. In A.R. Hanson and E.M. Riseman (Eds.), Computer Vision Systems, pp. 243-260.
Weisstein, N., Ozog, G. and Szoc, R. (1975). A comparison and elaboration of two models of metacontrast. Psychological Review, 82, 375-343.
Weisstein, N. and Wong, E. (1986). Figure-ground organization and the spatial and temporal responses of the visual system. In E. Schwab and H.C. Nusbaum (Eds.), Pattern Recognition by Humans and Machines, vol. 2. New York, Academic Press.
Weisstein, N. and Wong, E. (1987). Figure-ground organization affects the early visual processing of information. In M.A. Arbib and A.R. Hanson (Eds.), Vision, Brain, and Cooperative Computation. Cambridge MA, MIT Press.
Westheimer, G. and McKee, S. (1980). Stereoscopic acuity with defocused and spatially filtered retinal images. Journal of the Optical Society of America, 70, 772-778.
Wong, E. and Weisstein, N. (1982). A new perceptual context-superiority effect: Line segments are more visible against a figure than against a ground. Science, 218, 587-589.
Wong, E. and Weisstein, N. (1983). Sharp targets are detected better against a figure, and blurred targets are detected better against a background. Journal of Experimental Psychology: Human Perception and Performance, 9, 194-202.
Wong, E. and Weisstein, N. (1984). Flicker induces depth: Spatial and temporal factors in the perceptual segregation of flickering and nonflickering regions in depth. Perception and Psychophysics, 35, 229-236.
Wong, E. and Weisstein, N. (1985). A new visual illusion: Flickering fields are localized in a depth plane behind nonflickering fields. Perception, 14, 13-17.
Wong, E. and Weisstein, N. (1987). The effects of flicker on the
CHAPTER 6
perception of figure and ground. Perception and Psychophysics, 41, 440-448.
Wong, E. and Weisstein, N. (1989). The effect of relative image velocities on the perception of figure and ground. Investigative Ophthalmology and Visual Science, 30 (suppl.), 74.
Wong, E. and Weisstein, N. (1991). Spatial frequency, perceived depth, and figure-ground perception. Manuscript in preparation.
Zeki, S.M. (1978). Uniformity and diversity of structure and function of rhesus monkey prestriate visual cortex. Journal of Physiology, 277, 273-290.
Applications of Parallel Processing in Vision, J. Brannan (Editor). © 1992 Elsevier Science Publishers B.V. All rights reserved.
Cooperative Parallel Processing in Depth, Motion and Texture Perception
DOUGLAS WILLIAMS
Introduction
There are two distinct geniculostriate pathways in the visual system: one variously called the parvocellular, P, or color-opponent system, and the other the magnocellular, M, or broad-band system. As discussed in more detail in earlier chapters of this book, these have different physiological properties and possibly subserve different visual functions (Zeki, 1978; Lennie, 1980; Van Essen and Maunsell, 1983; DeYoe and Van Essen, 1988). The broad-band pathway is assumed to mediate motion and depth perception, while the color-opponent system is involved in texture perception (Livingstone, 1988). Recent physiological results have cast doubt on a strict segregation of these tasks exclusively to one or the other of these pathways (Logothetis, Schiller, Charles, and Hurlbert, 1990). There are, however, many aspects of stereo, motion, and texture perception that do depend on parallel processing but are not related to the chromatic/broad-band pathway controversy. It is to these aspects of parallel processing that this chapter is devoted.

The parallel nature of visual processing arises because visual information is carried by independent, spatially localized mechanisms or filters (Campbell and Robson, 1968; Blakemore and Campbell, 1969; Graham and Nachmias, 1971; Sachs, Nachmias and Robson, 1971). The visual scene is in effect fragmented into local entities. Perception requires the integration of these local responses into a global construct. A major challenge confronting psychology is that of understanding how the billions of neurons within the human brain interact to process sensory information and generate behaviorally appropriate responses. What is known is that these myriad neurons are interconnected in networks whose complexity almost defies description.

The potential computational power represented by this rich interconnectivity has motivated the development of network models for brain function that are based upon parallel distributed processing (Rumelhart and McClelland, 1986). The informational unit in a parallel distributed network is represented not by the activity of an isolated processor but
rather by the parallel activity in a distributed, interconnected set of such processors. Parallel distributed processing is essentially a variant on a form of interaction that has been known for some time from the study of non-linear systems - cooperativity. If the local elements of a parallel network are extensively interconnected and are permitted to interact, then global behavior can be generated that would not occur if the mechanisms were isolated from each other. Such behavior is termed "cooperative." According to the Gestalt school of thought, the act of perception involves more than a simple assimilation of individual sensations. Cooperativity in fact complements well the Gestalt expression that the whole is greater than the sum of its parts.
Depth
The first evidence of cooperative neural parallel processing came from visual depth perception. Although we live in a three-dimensional world, its image on our retina is only two-dimensional. Each eye, however, views the world from a slightly different angle. By using this disparity between the two eyes, it is possible to recover the third dimension of information - depth. In order to do this, it is first necessary to determine which retinal projections in the left and right eyes correspond to the same object in the visual field. This is not a trivial problem. For example, consider the case in which the object is an array of four identical dots equally spaced in a horizontal row. There is no a priori information available to determine which retinal dot images in one eye correspond to which in the other eye. In fact, with four dots there are 24 different depth combinations (the 4! = 24 one-to-one pairings of left and right dot images) that will produce the same two retinal projections. Each combination is distinguished by which dot projection in one eye corresponds to which dot projection in the other eye. The human visual system must determine which of the 24 potential combinations is the correct one. The combinatorial difficulties can become horrendous for random dot stereograms, which consist of thousands of identical dots (Julesz, 1966). The problem is said to be under-constrained, in that there is not enough inherent information in the two retinal images to determine an unambiguous solution. In spite of this difficulty, the human visual system can reliably and quickly solve the correspondence problem. There is evidence that cooperative processes play a critical role in solving this problem.

Extensively interconnected parallel networks which are cooperative are capable of exhibiting three properties: multistable states, order-disorder transitions, and "hysteresis." Hysteresis is a form of memory in which a system, having reached a stable state, shows resistance to further change. A consequence of such behavior is that the system's response depends on the history of stimulation. Hysteresis has been demonstrated for binocular stereopsis (Fender and Julesz, 1967). Fender and Julesz found that the left and right images of a stereo pair of random dot patterns had to be moved to within 6 minutes of visual angle of each other before they fused into a single stereoscopic percept. Once fused, however, the disparity between the two halves could be slowly increased to 2 degrees before the single
DEPTH. MOTION AND TEXTURE PERCEPTION
fused percept split into two. Once fusion was lost, the stereo pair had to be returned to a disparity of 6 minutes before fusion was reestablished. The amount of disparity required to fuse or split apart the two stereograms thus depended on the initial perceptual condition and the direction of the disparity change. The response of the system is dependent on the history of stimulation. The lag in the change of state of the system (i.e., from fused to split and vice versa) with stimulation is indicative of hysteresis. In their experiments, Fender and Julesz used stabilized retinal images. As such, the hysteresis could not be attributed to the oculomotor system, but rather reflects neural parallel cooperative processing.

As discussed above, given the ambiguities inherent in the correspondence process, the computational task represented in binocular stereopsis is said to be under-determined. Marr and Poggio (1979) proposed two constraints which must be applied to the sensory information for the brain to resolve this computational difficulty. The first stipulates that each point in the retinal images be assigned one and only one disparity value. This requirement is based on the principle that every point in physical space has a unique position. The second constraint posits that disparity values vary smoothly almost everywhere as a consequence of the continuity of physical matter. Cooperative algorithms that implement these constraints have proven successful in resolving the matching problem and extracting disparity information (Sperling, 1970; Dev, 1975; Nelson, 1975; Marr and Poggio, 1976). The constraints are incorporated in these models by means of the type and configuration of interactions. In the most general terms, the cooperative interactions which are common to all the models are nonlinear excitation between units of similar disparity tuning and nonlinear inhibition between units tuned to different disparities. These interactions are sufficient to provide an unambiguous solution to the correspondence problem and also exhibit hysteresis.

Before considering other aspects of parallel processing, it is worth considering the function of binocular cooperativity. In normal vision, random small-amplitude eye movements occur. As a result, binocular images are constantly going in and out of registry. Since a cooperative system contains a lag between input and output (hysteresis), it will reduce such noise. Therefore, even though oculomotor noise is constantly present, our parallel cooperative system gives us perceptually stable binocular stereopsis.
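Two quantitative points from this section can be sketched in a few lines of Python. The first function counts the one-to-one pairings between left-eye and right-eye dot images, which gives 24 for the four-dot example. The second is a two-threshold switch that reproduces the pattern reported by Fender and Julesz: fusion is established at 6 minutes of disparity and lost at 2 degrees. The names `count_matchings`, `FusionSwitch` and `run` are illustrative assumptions, and the switch is only a descriptive caricature of hysteresis, not a model of the underlying neural mechanism.

```python
from itertools import permutations

FUSE_ARCMIN = 6.0     # disparity at which fusion is (re)established, from the text
SPLIT_ARCMIN = 120.0  # 2 degrees: disparity at which a fused percept splits


def count_matchings(n_dots):
    """Count one-to-one pairings of n left-eye dots with n right-eye dots."""
    return sum(1 for _ in permutations(range(n_dots)))


class FusionSwitch:
    """Two-threshold switch whose state depends on the history of disparity."""

    def __init__(self):
        self.fused = False  # assumed starting state: images not yet fused

    def update(self, disparity_arcmin):
        if self.fused and disparity_arcmin >= SPLIT_ARCMIN:
            self.fused = False          # fused percept breaks apart
        elif not self.fused and disparity_arcmin <= FUSE_ARCMIN:
            self.fused = True           # images brought close enough to fuse
        return self.fused


def run(disparities_arcmin):
    """Return the fused/unfused state after each disparity in the sequence."""
    switch = FusionSwitch()
    return [switch.update(d) for d in disparities_arcmin]
```

In a sequence such as `run([30, 6, 30, 100, 120, 30, 6])`, the same 30-minute disparity is unfused before fusion has been established, fused once it has, and unfused again after the percept has split, which is the history dependence described above.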
Motion
The motion correspondence problem
The correspondence problem is not unique to stereopsis. An analogous problem arises in motion perception. Within the framework of motion, correspondence is defined as the process that identifies elements in different views as representing the same object at different times. This maintains a perceptual identity of objects in motion. The difficulty of the motion correspondence problem can be illustrated using a pattern consisting of random dots. If a set of dots is displaced, there is no inherent information to determine which dot in the initial position matches which dot after the displacement. Intuitively it might
be expected that a dot is matched to, or is perceived to move to, the nearest displaced dot. However, this simple rule does not always hold (Ullman, 1979). There is evidence that the solution to this correspondence problem requires parallel cooperative interactions. The evidence comes in the form of a phenomenon called "pulling." If the elements in a parallel network are isolated, then the action of a few elements cannot influence the others. On the other hand, if cooperative interactions are permitted, the effect of a few elements can propagate throughout the network and change its overall state. Such pulling has been demonstrated in motion perception by Chang and Julesz (1984) using random dot stimuli. The stimuli are constructed such that the motion of every dot is ambiguous. That is, each dot has a potential matching dot after displacement to the left and to the right. Dots are thus likely to be perceived as moving left or right with equal probability. Chang and Julesz demonstrated that if only 4% of the dots are biased to move unambiguously in one direction, then all dots are perceived to be moving in that direction. That is, just 4% is sufficient to pull the entire system to one perceptual state. This phenomenon requires interactions between mechanisms. Just as in the case of binocular stereopsis, such a cooperative property in the visual system would maintain a stable motion percept in the presence of noise.
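The contrast between isolated and interacting elements can be illustrated with a toy calculation. The sketch below is a mean-field caricature in the style of an Ising model, which is an assumption introduced here for illustration and is not the analysis used by Chang and Julesz. The ambiguous dots carry a mean perceived direction between -1 (all left) and +1 (all right), the biased dots always contribute +1, and each update passes the population mean through `tanh` scaled by a hypothetical `coupling` strength.

```python
import math


def ambiguous_dot_bias(coupling, biased_fraction, steps=200):
    """Mean perceived direction of the ambiguous dots (-1 left, +1 right).

    coupling        : assumed strength of cooperative interaction (0 = isolated)
    biased_fraction : fraction of dots that unambiguously move right
    """
    a = 0.0  # ambiguous dots start with no preference
    for _ in range(steps):
        # Population mean: ambiguous dots contribute a, biased dots contribute +1.
        mean_direction = (1.0 - biased_fraction) * a + biased_fraction
        # Each ambiguous dot is nudged toward the population mean.
        a = math.tanh(coupling * mean_direction)
    return a
```

Under these assumptions, `ambiguous_dot_bias(2.0, 0.04)` settles close to +1, whereas with no coupling (`coupling=0.0`) or no biased dots (`biased_fraction=0.0`) the ambiguous dots stay at zero. The value 2.0 is arbitrary; any coupling strong enough to make the unbiased state unstable behaves similarly.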
Global coherent motion
In the preceding example we have seen a global percept determined by the activity of only a few local elements. There are several demonstrations in which a collection of localized motion vectors, each moving in a different direction, can produce the perception of global coherent motion in a single direction (Adelson and Movshon, 1982; Williams and Sekuler, 1984). An example of this phenomenon is a stochastic random dot cinematogram in which each dot takes an independent, two-dimensional random walk of constant step size (Williams and Sekuler, 1984). Specifically, all dots move the same distance from frame to frame. However, each individual dot's direction of displacement from one frame to the next is chosen at random from a uniform distribution of directions. The resulting perception is dependent on the range of the uniform distribution (Williams and Sekuler, 1984). If the range of the uniform distribution of motion directions extends over a full 360 degrees, only the local random motion of the individual dots is evident. This appears very similar to the noise of a detuned television set. However, if the range of the uniform distribution is 180 degrees or less, a percept of global coherent motion is generated and the pattern appears to flow en masse in the direction of the mean of the distribution. This is true even though the individual perturbations of the dots are still evident. The display looks like snowflakes that, although each perturbed individually by wind currents, appear to drift together in one direction. Control experiments demonstrated that the direction of the global coherent motion percept does not simply represent the average of the directions of motion of the local motion vectors (Williams and Sekuler, 1984). If the perception of global coherent motion is a result of cooperative processing, it would be reasonable to expect the percept to
exhibit hysteresis. That is, one could measure the transition points marking the change from global coherent motion to local random motion and vice versa by gradually changing the directional content of the stimulus between the two extremes of a uniform distribution with a range of 180 degrees or less and a uniform distribution with a range of 360 degrees. If the directional content of the stimulus at which these transitions occur depends on whether the perceptual change is from local to global motion or from global to local motion, then the results are indicative of hysteresis. The experimental results (Williams, Phillips, and Sekuler, 1986; Williams and Phillips, 1987a) confirm the existence of hysteresis for the global coherent motion percept. In addition, we were able to account for this hysteresis by cooperative, nonlinear excitatory and inhibitory interactions among direction-selective mechanisms for motion.

The stimuli were dynamic random dot cinematograms, comprising 512 dots, generated by computer. Each dot took an independent, two-dimensional random walk of constant step size (0.9 degrees). The direction in which any individual dot moved was independent of its own previous displacement as well as the displacements of the other dots (Williams and Sekuler, 1984). The direction of motion for each dot was chosen from either 1) one of two uniform distributions, or 2) a mixture of these two distributions. When all the dots drew their movements from a uniform distribution extending 360 degrees, only the local random motion of individual dots was evident. However, when all the dots drew their movements from a uniform distribution of 180 degrees or less, the dots appeared to flow together in the direction of the distribution mean, although individual perturbations were still evident. In all experiments the mean of the signal distribution was chosen to be upward. The 180 degree and 360 degree distributions are referred to as the signal and noise distributions, respectively. When dots drew their movements from a combination of these distributions, each dot's displacement came randomly from either distribution. The resulting perception depended on the relative proportion of noise versus signal.

Two types of trials were run in random order. In one type, the direction of motion for each dot was chosen initially from the signal distribution. This produced a percept of upward flow. After a random interval lasting up to 12 seconds, the proportion of dots drawing directions from the signal distribution was progressively decreased and simultaneously the proportion choosing from the noise distribution increased. This continued until the observer responded that the cinematogram had changed its appearance from upward flow to local random motion. After such a response, the proportion of signal continued to decrease for a random interval of up to 6 seconds. Then the process was reversed, with the proportion of signal increasing until the subject responded that the upward flow had reappeared. This response terminated the trial. For the second type of trial, the stimulus sequence was reversed. We started with all dots choosing directions of motion from the noise distribution, which generated an initial percept of local random motion. The points of transition to a perception of upward flow and back again to local random motion were then measured. These two different trial structures, signal first versus noise first, were designed to produce different histories of directional
exposure. This should reveal any perceptual biases dependent upon the history of stimulation. For both trial types, the signal to noise ratio was changed slowly, with the proportion of dots sampling from each distribution changing by only two dots per frame. At 10 Hz (the frame rate of our display), it took a minimum of 25 seconds for the display to shift from complete dependence on one distribution to complete dependence on the other (512 dots at two dots per frame is 256 frames). Observers viewed monocularly the center of a circular display subtending 16 degrees in diameter. The dots were presented at two

[Figure 1: hysteresis plots for observers J.F. and T.K.D., one panel per signal range; the perceptual states "Upward" and "Local" are plotted against Proportion of Signal (0 to 1).]

Figure 1. Perceptual transitions measured under two different histories of stimulus exposure for three signal distributions. The results, for both observers J.F. and T.K.D., were obtained for signal distributions whose ranges were: 180 degrees, top; 90 degrees, middle; 1 degree, bottom. Measurements were collected using a step size of 0.9 degree. Separate symbols mark the proportion of "signal" dots required for the perceptual transition from local random motion to global upward flow and for the perceptual transition from global upward flow to local random motion. Error bars indicate one standard deviation (100 measurements). In each panel the separation between transition points measured with the different exposure histories is an index of hysteresis. Note the narrowing and the leftward shift of the profiles with decreasing signal range.
times threshold luminance against a dim background. One hundred measurements were made over five sessions for each of three naive observers. This experiment was also repeated using two narrower ranges of signal distribution: 90 degrees and 1 degree. Each of the three signal distributions generated a different history of exposure. It is to be expected that as the distribution narrows the signal would become more effective in stimulating directionally selective visual elements that are tuned to upward motion. As a result, the occurrence of a transition between perceived states might require fewer dots.

The results from two observers are shown in Figure 1. Similar results were obtained for the third observer. For each observer, each panel illustrates data for a different range of signal. Notice that the transitions differ for the three signal ranges: 180 degrees, 90 degrees and 1 degree. More significantly, the transition from local random motion to upward flow requires a larger number of dots sampled from the signal distribution than does the transition from upward flow to local random motion. The two types of transition occur at significantly different ratios of signal to noise for all three signal distributions (P < 0.005).

We considered and rejected a number of alternative explanations before attributing our results solely to neural hysteresis. For example, the results were unaffected by the use of a fixation point, suggesting that eye movements played little or no role in the results. Also, the motion after-effect, or waterfall illusion, cannot be responsible for our results, since this after-effect would have facilitated rather than retarded the transition from upward motion to noise. Finally, given the slow time course for changing the signal proportion, reaction time could be dismissed as a possible explanation.

We needed a more complete account of how spatial parameters might affect hysteresis before we could develop a simple network to describe our results. We used a 180 degree signal distribution and made measurements under four additional conditions: 1) with a four-fold decrease in the spatial density of the cinematogram's dots, 2) with a four-fold decrease in the area of the display, 3) with a nine-fold decrease in step size, and 4) with the display shifted horizontally into the periphery of the visual field so that the nearest dot was 4 degrees from fixation. The results for all four conditions did not differ significantly from the original measurements (Williams and Phillips, 1987a). It is probable that extreme changes in the variables would affect the hysteresis characteristics. However, given these data and previous results suggesting that spatial variables have little effect on the dot interactions responsible for motion in cinematograms (Baker and Braddick, 1982; Williams and Sekuler, 1984), we chose not to treat space explicitly in the development of a model network to account for our results.

Our model comprises a set of direction-selective mechanisms which cover all 360 degrees of motion direction, with each mechanism having a gaussian profile for directional sensitivity. Based on previous results (Williams, Tweten, and Sekuler, 1984), the half-amplitude half-bandwidth of each mechanism's gaussian sensitivity profile was set to 30 degrees. The model, whose mathematical formulation is a modification of the cooperative neural network previously proposed by Wilson and
Cowan (1973), assumes nonlinear excitatory interactions among mechanisms sensitive to similar directions of motion and nonlinear inhibition among mechanisms sensitive to different directions. The dynamic response of this cooperative system can be represented by a pair of coupled differential equations. For the excitatory activity, E_i, in direction channel i:
[Equation 1 is not reproduced in this copy.]
where S is a nonlinear function of sigmoidal shape, P_i is the external input to channel i, and a_ij and b_ij are the excitatory and inhibitory weights, respectively, of channel j with respect to channel i. A similar equation gives the inhibitory activity, I_i, in channel i:
[Equation 2 is not reproduced in this copy.]
where S is again the nonlinear function of sigmoidal shape, and c_ij and d_ij are the excitatory and inhibitory weights, respectively, of channel j with respect to channel i. The functional form of the nonlinear sigmoidal function S is given by:
[Equation 3 is not reproduced in this copy.]
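The displayed equations are missing from this copy of the chapter. For orientation only, a form consistent with the symbols defined in the text, assuming the general structure of Wilson and Cowan (1973), could be written as follows. The time constant \tau, the slope \nu and the threshold \theta are assumptions introduced here; the chapter's own equations may differ in these details and in how the terms are scaled.

```latex
\begin{align}
\tau \frac{dE_i}{dt} &= -E_i + S\Big(\sum_j a_{ij} E_j - \sum_j b_{ij} I_j + P_i\Big) \tag{1} \\
\tau \frac{dI_i}{dt} &= -I_i + S\Big(\sum_j c_{ij} E_j - \sum_j d_{ij} I_j\Big) \tag{2} \\
S(x) &= \frac{1}{1 + e^{-\nu (x - \theta)}} \tag{3}
\end{align}
```

In this reading, a_{ij} and c_{ij} weight the excitatory activity E_j arriving at channel i, b_{ij} and d_{ij} weight the inhibitory activity I_j, and only the excitatory equation receives the external input P_i, as the text describes.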
In general terms, interactions such as those in Equations 1 and 2 promote the formation of stable coalitions between similarly tuned elements within the network (Feldman and Ballard, 1983). These neural coalitions can in turn produce various cooperative properties, including hysteresis. We constrained the parameters of the model so that the model behaved in what is defined (Wilson and Cowan, 1973) as the active transient mode. In this mode, the system shows hysteresis, switching back and forth between different states of activity. The perception of local random motion is represented in the model by a steady state of uniform activity across all mechanisms. Conversely, global upward flow is represented by a steady state in which all the activity is localized about the mechanism selective for upward movement. A transition point can be defined by the proportion of signal at which the network switches between these two states of activity.

The results from this model are shown in Figure 2. Dashed lines represent the transition points calculated from the model using a single parameter set. It is clear that the model captures both the leftward shift and the narrowing of the hysteresis profile with decreasing signal range. The model's behavior can be easily explained. As signal range decreases, more activity is concentrated in fewer motion-selective elements arrayed about the upward direction. As a consequence, a smaller proportion of signal dots is sufficient to indicate upward motion. Additionally, having fewer active elements reduces the opportunity for cooperative interactions in the network. This translates into a narrowing of the hysteresis profile.

Accepting the concept that the perception of motion direction is dependent on a cooperative network, there may be circumstances outside the laboratory in which the network's cooperativity might be especially useful. Observers must extract a mean direction vector from
a scene containing a large number of different local vectors in many naturally occurring situations. For example, we can determine the average direction in which the ocean's surf moves despite the fact that individual waves move along somewhat different paths. We can also judge the average direction in which the wind is blowing the leaves on a tree despite the random variations in that movement from one leaf to the next, or in any one leaf from one time to another. Faced with such

[Figure 2: hysteresis profiles in several panels; the perceptual states "Upward" and "Local" are plotted against Proportion of Signal (0.0 to 1.0).]

Figure 2. Hysteresis profiles as in Figure 1 but with additional data for the 0.1 degree step size. For a step size of 0.9 degree, one pair of symbols shows the proportion of signal dots required for the perceptual transition from local random motion to global upward flow and for the perceptual transition from global upward flow to local random motion. A second pair of symbols shows the same two transitions for a step size of 0.1 degree. The dashed lines mark the transition points calculated from a model incorporating cooperative interactions among direction-selective motion elements (see the text). The same parameter set for the model was used to fit the data in all panels. Note that for both step sizes, the model captures both the leftward shift and the narrowing of the hysteresis profile with decreasing signal range.
multivectoral stimuli, a cooperative network like the one described by our model would enhance the signal to noise ratio, thereby facilitating the perception of the mean direction of motion.
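How a self-exciting nonlinear unit produces different transition points for rising and falling signal can be sketched with a much simpler system than the multi-channel network described above. The Python sketch below reduces the upward-tuned coalition to a single activity variable E obeying dE/dt = -E + S(w*E + gain*p - theta), where p is the proportion of signal. This one-variable reduction, the function names, and the values of `w`, `gain` and `theta` are all assumptions chosen only so that the bistable region falls inside 0 to 1; they are not the chapter's model or its fitted parameters.

```python
import math


def sigmoid(x):
    """Logistic nonlinearity standing in for the sigmoidal function S."""
    return 1.0 / (1.0 + math.exp(-x))


def relax(activity, signal_proportion, w=10.0, gain=10.0, theta=10.0,
          dt=0.1, steps=200):
    """Euler-integrate dE/dt = -E + S(w*E + gain*p - theta) for a fixed p."""
    for _ in range(steps):
        drive = w * activity + gain * signal_proportion - theta
        activity += dt * (-activity + sigmoid(drive))
    return activity


def transition_points(n_levels=100):
    """Ramp the signal proportion up and then down; return (up, down)."""
    activity, up, down = 0.0, None, None
    for i in range(n_levels + 1):          # noise-first history: p goes 0 -> 1
        p = i / n_levels
        activity = relax(activity, p)
        if up is None and activity > 0.5:
            up = p                         # switch from "local" to "upward"
    for i in range(n_levels, -1, -1):      # signal-first history: p goes 1 -> 0
        p = i / n_levels
        activity = relax(activity, p)
        if down is None and activity < 0.5:
            down = p                       # switch from "upward" to "local"
    return up, down
```

With these assumed parameters the switch to the high-activity state on the rising ramp occurs at a clearly larger signal proportion than the switch back on the falling ramp, which mirrors the direction of the asymmetry reported in Figure 1. Reproducing the leftward shift and narrowing with signal range would require the full set of tuned channels, which this sketch omits.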
Recovery of 3-D Structure from 2-D Motion
3-D percept from stochastic 2-D motion
In the preceding experiment, if the signal range is 180 degrees then it requires approximately a 0.8 signal to noise ratio before there is a transition from local random motion to global upward flow (see Figure 1). With a range of 90 degrees this ratio falls to 0.5. For this range and for ranges less than 90 degrees, an interesting percept is produced at higher proportions of signal to noise: a global three-dimensional percept (Williams and Phillips, 1986). This percept resembles a side view of a rigidly rotating cylindrical volume, with dots appearing on the surface of the cylinder as well as embedded inside. This volume appears both to rotate about, and to translate along, the upward direction (the mean of the distribution). Perceived direction of rotation, either right-handed or left-handed, varied from observer to observer. Unlike binocular stereopsis, which uses the disparity between the two eyes to recover depth, this depth percept can be perceived monocularly. The sole cue to depth is the relative motion in the two-dimensional retinal image. The basis for the recovery of an unambiguous three-dimensional percept under such circumstances is unclear, since infinitely many combinations of three-dimensional structure and motion can project to the same two-dimensional retinal image. As in the case of the correspondence problems of binocular stereopsis and motion, recovery is an under-determined task.

We sought to determine whether a cooperative algorithm might also underlie the recovery of three-dimensional structure in our display (Williams and Phillips, 1986). Using methods similar to those used to demonstrate hysteresis for the global coherent motion percept, we looked for hysteresis in the occurrence and loss of our three-dimensional percept. Display dots were permitted to randomly choose their directions of motion from two uniform distributions while the proportion of dots choosing from each distribution was slowly changed. One of the distributions had a directional range of 90 degrees (mean direction, upward) and was used to generate the percept of an upwardly moving, rotating three-dimensional volume. The other distribution had a directional range of 360 degrees, which generated a percept of two-dimensional local random motion. We again refer to the first distribution as the signal, and to the second as noise, after the percept associated with each.

Two types of trials were presented in random order. In one, all dots initially chose directions of motion from the signal distribution, generating an initial percept of a three-dimensional volume. After a random amount of time, the proportion of dots choosing from the signal distribution decreased slowly (under computer control) while the proportion of dots choosing from the noise distribution increased slowly. This continued until the subject responded that the three-dimensional percept had given way to a flat, two-dimensional percept. Following this response, the proportion of signal continued to
decrease for a random time. The procedure was then reversed and the proportion of dots choosing from signal increased until the observer responded that the three-dimensional percept had been restored. The latter response terminated the trial. In the other trial type, the procedure was reversed, with all dots initially choosing their directions of motion from the noise distribution. In this type, perceptual transitions were from two-dimensional to three-dimensional and back again. I t should be noted that for both types of trials, as the display shifted from complete dependence on one distribution to complete dependence on the other, a third percept was evident. Between the three-dimensional percept of a moving cylinder and the twodimensional motion percept of local random motion there is the intermediate two-dimensional motion percept of global upward flow described earlier. Subjects were instructed to ignore the difference between the two-dimensional motion percepts of local random motion and global upward flow and only respond to a transition between twodimensional and three-dimensional structure. Transition measurements were therefore actually obtained for the perceptual transition between the three-dimensional percept of a moving cylinder and the twodimensional percept of global upward flow. The two types of trials produce different histories for the directional content of the display, and thus should expose perceptual effects dependent upon history of stimulation. One hundred measurements were made of the signal content in the display for each perceptual transition (i.e.. two-dimensional to three-dimensional and vice versa ). Further data were obtained for two narrower signal ranges, 40 and 10 degrees. The results from two observers at all three signal distributions are shown in Figure 3. with each signal range represented in a different panel. 
For all three signal distributions, the two types of transition occur at significantly different ratios of signal to noise (P < .005). Note that the transitions differ for the three signal ranges, i.e., decreasing the signal range generally shifts the transition points to the left. Although changes in the directional content, and thus the history, of the stimulus altered the hysteresis profile, rather large changes in certain spatial parameters of the display did not greatly affect an observer's results. This property is similar to the results obtained for motion hysteresis reported in the previous section. Specifically, a four-fold decrease in either dot density or display area did not significantly alter the hysteresis profile. Likewise, a nine-fold decrease in step size had no appreciable effect. However, in contrast to the results for motion hysteresis, displacing the stimulus horizontally into the periphery so that the nearest dot is 4 degrees from fixation did change the results. The form of this change will be important for our conceptualization of the recovery of structure from motion, which will be discussed later. At this point there will be no loss of generality by ignoring it. The other results, which demonstrate that the hysteresis profile is not changed by altering spatial parameters of the display, suggest that it should be possible to account for the hysteresis results solely on the basis of the history of the directional content of the stimulus. Furthermore, by demonstrating hysteresis in this three-dimensional percept, our results support a cooperative interpretation for the recovery of structure from motion.
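The sampling rule behind this stimulus can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' display software: the function names, the angle convention (0 degrees rightward, 90 degrees upward), and the choice to redraw each dot's distribution independently on every displacement are all assumptions made here for illustration. Step size and dot positions are left out.

```python
import random

SIGNAL_RANGE_DEG = 90.0   # signal: directions within 90 degrees, centred on upward
NOISE_RANGE_DEG = 360.0   # noise: directions spanning the full circle
UPWARD_DEG = 90.0         # assumed convention: 0 = rightward, 90 = upward


def sample_direction(p_signal, rng, signal_range=SIGNAL_RANGE_DEG):
    """Return one direction in degrees, wrapped to the 0-360 range.

    With probability p_signal the direction is drawn from the signal
    distribution, otherwise from the noise distribution. Both are uniform
    and centred on the upward direction.
    """
    width = signal_range if rng.random() < p_signal else NOISE_RANGE_DEG
    return (UPWARD_DEG + rng.uniform(-width / 2.0, width / 2.0)) % 360.0


def sample_frame(n_dots, p_signal, rng, signal_range=SIGNAL_RANGE_DEG):
    """Directions for all dots on one displacement (hypothetical helper)."""
    return [sample_direction(p_signal, rng, signal_range) for _ in range(n_dots)]


rng = random.Random(0)
all_signal = sample_frame(500, 1.0, rng)   # start of a signal-first trial
all_noise = sample_frame(500, 0.0, rng)    # start of a noise-first trial
```

Ramping `p_signal` slowly downward from 1.0, or upward from 0.0, reproduces the two stimulus histories; passing `signal_range=40.0` or `signal_range=10.0` gives the narrower signal distributions.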
Figure 3. Perceptual transitions for structure from motion measured under two different histories of stimulus exposure (indicated by the arrows) for three different signal distributions. The horizontal axis of each panel is the proportion of signal. The results were obtained for signal distributions whose ranges were: 90 degrees, top; 40 degrees, middle; 10 degrees, bottom. For observer T.P.D. measurements were collected using a step size of 0.9 degrees, while for L.V.W. it was 0.1 degree. Data points show the proportion of signal dots required for the perceptual transition from two-dimensional structure to three-dimensional structure (filled symbols) and for the perceptual transition from three-dimensional structure to two-dimensional structure (open symbols), with separate symbols for T.P.D. and L.V.W. In each panel the separation between transition points measured with the different histories is an index of hysteresis.
To solve the perceptual task of recovering three-dimensional structure from two-dimensional motion, the visual system requires some form of internal constraint. This is necessary because even though an infinity of three-dimensional structures can generate the same two-dimensional pattern of motion, the human brain exhibits the capacity to signal the single correct three-dimensional structure of a moving object (Wallach and O'Connell, 1953). Certain perceptual studies suggest that to recover structure from motion, the visual system "assumes" that the object being seen is rigid (Wallach and O'Connell, 1953; Gibson and Gibson, 1957; Green, 1961; Jansson and Johansson, 1973; Johansson, 1975). A rigidity-based scheme for the recovery of structure from motion reflects the observation that physical objects in motion, even those that deform with time, can be considered in a first approximation to be rigid over sufficiently short time intervals. In its simplest form, the rigidity constraint requires that three-dimensional Euclidean distances among a rotating object's elements do not change with time, even though two-dimensional distances in the projected image do change. As such, it implies that specific spatial-temporal relations among elements in a two-dimensional projection are critical to recovery of structure from motion. This assumption is challenged by our observation that a rigid three-dimensional percept can be obtained using a two-dimensional stimulus in which the motion is locally stochastic. The following examination of the trajectories of elements in our display demonstrates the inadequacy of the rigidity-based interpretation. For relatively short time intervals, there are some local similarities between our random-dot stimulus and the two-dimensional projection of an actual cylindrical volume that is rotating and moving upwards (Figure 4a). Consider the trajectories of points in such a moving cylinder.
For exposition, only right-handed rotation will be considered, that is, rotation in which the front surface of the cylinder moves toward the right. Now consider a set of points in this volume that lies along the line of sight at time t0. A short time later, at time t1, points in the front half of the cylinder will have moved upward and to the right, while those in the back will have moved upward and to the left. The amount of horizontal displacement increases with proximity of the point to the surface of the cylinder. As shown in Figure 4b, the two-dimensional projection of displacements of these points in the moving cylinder comprises a distribution of motion directions centered about the vertical, as does our random-dot stimulus. The similarity between the two-dimensional projection of a moving cylinder and our stimulus breaks down, however, when one examines the path of points for temporal durations longer than two frames. In the two-dimensional projection of a moving cylinder, points can change from rightward to leftward motion or vice versa at most twice for one complete revolution of the cylinder. By contrast, dots in our stimulus randomly change their direction of motion from frame to frame. For example, a dot moving in a rightward direction on one displacement has a 50% chance of moving in a leftward direction on the next. To explore why such random behavior does not disrupt the stability of the three-dimensional percept, we compared the
Figure 4. (a) Schematic representation of a three-dimensional cylinder which is simultaneously rotating about and moving along the vertical (labelled in the figure: axis of rotation, t0, t1). The positions of five representative points at two different times are shown. Initially (t0) points lie along the line of sight. A short time later (t1) points in the front half of the cylinder have moved upward and to the right, while those in the back have moved upward and to the left. (b) The two-dimensional projection, orthogonal to the line of sight, of the trajectories shown in Figure 4a. This is analogous to the directional content of our display, since the directions of motion in the stimulus are random samples chosen from a set of directions centered about the vertical mean of the distribution.
perceptibility of three-dimensional structure for three conditions, each of which defined different paths for the dots (Williams and Phillips, 1986). For each condition, the permissible directions were distributed uniformly over 90 degrees. In the first condition, all dots simply chose their directions of displacement at random from the uniform distribution. In the second condition, all dots again chose at random from the distribution, but now the direction of motion for any dot successively alternated between rightward and leftward. The final condition differed in that a dot moved only in the direction of the initial random choice for that dot. Except for the first displacement (the first two frames), the three conditions generate dramatically different dot paths. In spite of these differences, the resulting three-dimensional percepts looked remarkably similar with respect to both the shape of the cylinder and the speed of its rotation and translation. We next quantified the effect of dot path on the recovery of structure by obtaining frequency-of-seeing data for the three-dimensional percept as a function of the number of frames presented (frame rate: 10 Hz). A two-interval, forced-choice procedure was used. In one interval, chosen at random, dots moved according to one of the three conditions described above. In the other interval, dots randomly chose their directions of motion from a 180 degree uniform distribution producing a two-dimensional percept of global upward flow. The duration of both intervals was identical for a given trial but varied from trial to trial. Observers identified the interval in which the three-dimensional percept occurred. The results are shown in Figure 5 for two naive observers. Note that more than two frames (one displacement) are required to reliably perceive three-dimensional structure.
The data further indicate that the strength of the three-dimensional percept does not differ significantly over the three conditions, suggesting that the recovery of structure from motion does not depend on the detailed spatio-temporal relations among local motion vectors but rather on the distribution of motion directions present from frame to frame. A rigidity constraint will not suffice to account for the recovery of three-dimensional structure in our display. To generate a three-dimensional percept of a moving cylinder solely from the local directional content, the visual system must have the capacity to do in effect the reverse of the projection analysis outlined above. That is, it must sort the sampled two-dimensional directional distribution into apparent depth planes such that the closer the sampled direction is to the extremes of the uniform distribution, the closer it appears to the surface of the rotating volume. To perform this segregation task, two constraints are required. The first assumes that motion vectors moving in different directions in the same location of the visual field lie at different depths from the observer. This constraint reflects the observation that physical objects moving in different directions in the same region of the visual field must be at different depths in order to avoid collision. The second constraint requires that apparent depth varies smoothly almost everywhere with changes in the direction of motion. This constraint follows simply from the continuity of physical matter. These constraints closely parallel those postulated by Marr and Poggio (1979) for the recovery of depth in binocular stereopsis by cooperative interactions, which were discussed earlier in this chapter.
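The three dot-path conditions compared above differ only in how successive directions are chained for a single dot, which a short Python sketch can make concrete. The function names, the use of a signed angular offset from vertical (positive taken here to mean a rightward lean), and the random choice of the starting side in the alternating condition are assumptions for illustration, not details given in the text.

```python
import random

RANGE_DEG = 90.0  # uniform range of permissible directions, centred on upward


def offset(rng):
    """Signed offset from vertical in degrees (assumed: positive = rightward)."""
    return rng.uniform(-RANGE_DEG / 2.0, RANGE_DEG / 2.0)


def path_random(n_steps, rng):
    # Condition 1: a fresh random direction on every displacement.
    return [offset(rng) for _ in range(n_steps)]


def path_alternating(n_steps, rng):
    # Condition 2: random magnitude, but the horizontal sense flips on
    # successive displacements. The starting side is chosen at random.
    sign = 1.0 if rng.random() < 0.5 else -1.0
    steps = []
    for _ in range(n_steps):
        steps.append(sign * abs(offset(rng)))
        sign = -sign
    return steps


def path_fixed(n_steps, rng):
    # Condition 3: the first random choice is reused for every displacement.
    first = offset(rng)
    return [first] * n_steps


rng = random.Random(0)
paths = {
    "random": path_random(9, rng),
    "alternating": path_alternating(9, rng),
    "fixed": path_fixed(9, rng),
}
```

In each condition the directions taken across many dots on any one displacement are still spread uniformly over the 90 degree range; only the chaining of directions along an individual dot's path differs, which is the property the experiment isolates.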
Figure 5. Data from two observers (T.P.D., L.V.W.) showing the percentage of trials on which the observer (correctly) distinguished a three-dimensional from a two-dimensional motion percept, plotted against the number of frames presented. Results are shown for three different conditions, each plotted with its own symbol. For the first condition, all dots chose their directions of motion at random from a uniform distribution of range 90 degrees. For the second condition, all dots again chose directions of displacement from this distribution, but the direction of motion of any dot alternated between rightward and leftward on successive displacements. In the final condition, dots chose at random from the distribution only for the first displacement; for the remaining displacements, a dot moved in the direction of this initial random choice. Discriminability of the three-dimensional percept does not differ significantly over the three conditions.
The cooperative model that we present to describe these hysteresis data extends one that we proposed earlier to account for hysteresis in two-dimensional motion perception (see Equations 1-3). The earlier model comprised a set of direction-selective mechanisms that together covered all 360 degrees of motion direction, with each mechanism having a Gaussian profile for direction selectivity. Generally speaking, the cooperative interactions in this model consisted of nonlinear excitation among mechanisms sensitive to similar directions of motion and nonlinear inhibition among mechanisms sensitive to different directions. We have extended this model to describe the observed structure-from-motion hysteresis by constructing a set of layers for signalling apparent depth. Within each such layer, the nonlinear interactions among direction-selective mechanisms are as given for the earlier model. Additional cooperative interactions are required
Figure 6. Schematic of the interactions for the cooperative network used to recover three-dimensional structure from two-dimensional motion. Mechanisms are arranged by direction of motion, and separate symbols mark excitatory and inhibitory interactions. For clarity, only cooperative interactions leading from a single mechanism (highlighted) are depicted. Within a layer, mechanisms sensitive to similar directions of motion facilitate one another, while those sensitive to different directions of motion inhibit each other. Between layers, mechanisms sensitive to similar directions of motion inhibit each other. Mechanisms which lie in different depth layers and are sensitive to moderately different directions of motion are coupled by facilitatory interactions while those sensitive to very different directions inhibit each other. When the activity from competing directions of motion reaches a threshold within a single layer, this activity spreads to adjacent layers, corresponding to a transition from two-dimensional to three-dimensional structure. The spread of activity across the layers in the three-dimensional state is analogous to the projection analysis described in the text.
between the apparent depth layers, however, in order to account for the hysteresis in the three-dimensional percept (Figure 6). These cooperative interactions implement the two constraints proposed for the recovery of structure from motion. Inhibitory interactions among mechanisms lying in adjacent depth layers but responding to similar directions of motion serve to constrain activity from a single direction of motion to a single depth layer. Conversely, facilitatory interactions among mechanisms that lie in adjacent depth layers but respond to moderately different directions of motion serve to distribute the activity across depth layers. In short, when the activity from different directions of motion within a depth layer becomes sufficiently great, this activity "breaks out" into neighboring depth layers. The results from the model are shown in Figure 7. Dashed lines in each panel mark the transitions calculated for the respective signal ranges. For the purpose of the numerical simulation, a transition was defined by the proportion of signal at which the model network switches between the state in which all activity is confined to one depth layer (a two-dimensional percept) and the state in which activity spreads across several depth layers (a three-dimensional percept). The model captures the leftward shift of the hysteresis profile with decreasing signal range. With a smaller signal range, more activity is confined to fewer direction-selective mechanisms, thus a smaller proportion of signal is needed to switch between the states of the network. For each observer, a single parameter set was used to fit the results from all three signal ranges. Although we chose not to treat space explicitly in the model, other experimental results suggest how the cylindrical form of the three-dimensional percept that we have observed may arise.
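Why mutual support among active mechanisms leads to hysteresis can be illustrated with a deliberately reduced toy in Python. This is not the layered network described above and not a reconstruction of Equations 1-3: it collapses the whole network into one two-state unit whose active (three-dimensional) state adds support to itself. The threshold and feedback values are arbitrary assumptions, chosen only to make the effect visible, and all names are hypothetical.

```python
THRESHOLD = 80   # net drive (percent) needed to hold the 3-D state; arbitrary
FEEDBACK = 30    # extra drive the 3-D state supplies to itself; arbitrary


def next_state(signal_percent, in_3d):
    """Return True (3-D state) if input plus self-support exceeds threshold."""
    drive = signal_percent + (FEEDBACK if in_3d else 0)
    return drive > THRESHOLD


def transition_point(start_percent, step, in_3d):
    """Ramp the signal in steps of `step` until the state flips.

    Returns the signal level (percent) at which the flip occurs, or None
    if the state never changes within 0-100 percent.
    """
    level = start_percent
    while 0 <= level <= 100:
        if next_state(level, in_3d) != in_3d:
            return level
        level += step
    return None


# Noise-first history: signal rises until the 3-D state switches on.
up = transition_point(0, +1, in_3d=False)
# Signal-first history: signal falls until the 3-D state switches off.
down = transition_point(100, -1, in_3d=True)
hysteresis_width = up - down
```

Because the active state contributes to its own drive, it survives at signal levels too low to have triggered it, so the ascending transition lies above the descending one. Setting `FEEDBACK` to zero removes the gap, which is the sense in which hysteresis points to cooperative interactions.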
The magnitude of apparent depth of three-dimensional structure from two-dimensional motion decreases with retinal eccentricity (Hildreth and Koch, 1987). For example, if our stimulus is displaced horizontally into the periphery so that the nearest dot is 4 degrees from fixation, there is no three-dimensional percept (Williams and Phillips, 1986). This inhomogeneity of apparent depth with eccentricity is consistent with the percept of a three-dimensional cylindrical volume whose apparent depth falls off with distance from fixation.
3-D percept from stochastic 1-D motion
Thus a percept of structure from motion can result from a random dot cinematogram in which each dot takes an independent two-dimensional random walk. Although such a stimulus is stochastic, an apparently rigid organization of the dots into a rotating three-dimensional volume is observed. A somewhat more dramatic example of structure from stochastic local motion can be demonstrated using a random dot cinematogram in which each dot takes an independent one-dimensional random walk of variable step size. Instead of a distribution of directions as in the previous example, there is now a distribution of dot velocities along a single dimension. Again, a percept of a rigidly rotating three-dimensional volume is generated. Our results are difficult to reconcile with rigidity-based models of structure from motion. However, they can be interpreted in the context of cooperative interactions among velocity-selective mechanisms.
Figure 7. Hysteresis profiles as in Figure 3. Perceptual transitions for structure from motion measured under two different histories of stimulus exposure (indicated by the arrows) for three different signal distributions whose ranges were: 90 degrees, top; 40 degrees, middle; 10 degrees, bottom. The horizontal axis of each panel is the proportion of signal. For observer T.P.D. measurements were collected using a step size of 0.9 degrees, while for L.V.W. it was 0.1 degree. Data points show the proportion of signal dots required for the perceptual transition from two-dimensional structure to three-dimensional structure (filled symbols) and for the perceptual transition from three-dimensional structure to two-dimensional structure (open symbols), with separate symbols for T.P.D. and L.V.W. The dashed lines mark the transition points calculated from a model incorporating cooperative interactions among direction-selective motion elements.
The stimulus is again a random dot cinematogram, but in this case each dot takes an independent one-dimensional random walk of variable step size along the horizontal dimension. In the display, the direction and velocity with which any dot moves from one frame to the next are independent of those of the other dots as well as of its own previous displacements. Each dot has an equal probability of moving left or right from frame to frame. A dot's velocity from frame to frame is defined by a uniform probability distribution. The resulting motion percept depends on the range of the distribution. If the range of permissible velocities extends from 0 deg/sec to greater than 20 deg/sec, only a two-dimensional motion percept of "local random motion" of dots is evident. However, if the extent of the range is from 0 deg/sec to less than 20 deg/sec, a global three-dimensional percept results. The three-dimensional percept is that of the side view of a rigid rotating cylinder with dots appearing not only on the surface but embedded inside as well. As in the previous experiments, the perceived direction of rotation, either right-handed or left-handed, varied from observer to observer. There are local similarities between our random dot stimulus and the two-dimensional projection of an actual rotating cylinder (Figure 8). For convenience only right-handed rotation will be considered, that is, rotation in which the front surface of the cylinder moves toward the right. Consider a set of points which initially, at time t0, lie along the line of sight. A short time later, at time t1, points in the front half of the cylinder will have moved to the right, while those in the back will have moved to the left. The amount of horizontal displacement increases with proximity of the point to the surface of the cylinder. Now, horizontal motion in the cylinder will be preserved in a projection to a plane perpendicular to the line of sight. In the two-dimensional projection, shown at the bottom of Figure 8,
displacements to the right are projections of points in the front half of the cylinder, while motion to the left corresponds to points in the back. The larger the displacement, or equivalently the greater the velocity, the closer the corresponding point must be to the surface of the cylinder. Thus, the local two-dimensional projection of the rotating cylinder consists of left and right directions of motion and a distribution of motion velocities, as does our random dot stimulus. As was the case for three-dimensional structure from two-dimensional stochastic motion, the similarity between the two-dimensional projection and our one-dimensional stochastic motion stimulus breaks down when the path of points is examined for temporal durations longer than two frames. Dots randomly change direction from frame to frame in our stimulus. In comparison, for the two-dimensional projection points can change direction at most twice for one complete rotation of the cylinder - when it moves from the front to the back and vice versa. To explore why such random behavior does not disrupt the stability of the three-dimensional percept, we compared the perceptibility of three-dimensional structure for three conditions, each of which defined different paths for the dots. In the first condition, each dot chose both its direction and velocity at random for each displacement. In the second, each dot again chose its velocity at random, but its direction alternated between left and right on successive displacements. The final condition differed in that a dot
Figure 8. (a) Schematic representation of a three-dimensional cylinder which is rotating about the vertical (labelled in the figure: axis of rotation, line of sight, t0, t1). The positions of five representative points at two different times are shown. Initially (t0) points lie along the line of sight. A short time later (t1) points in the front half of the cylinder have moved to the right, while those in the back have moved to the left. (b) The two-dimensional projection, orthogonal to the line of sight, of the trajectories shown in Figure 8a. This is analogous to the directional and velocity content of our display, since the directions and velocities of motion in the stimulus are random samples chosen from a distribution of motion velocities which can be directed either to the left or to the right.
moved only at the velocity and in the direction of the random choice made for its first displacement. In spite of the differences in dot paths, the resulting three-dimensional percepts for the three conditions looked very similar with respect to both shape of the cylinder and speed of rotation (Williams and Phillips, 1987b). To quantify the effect of dot path on recovery, we repeated the analogous experiment from the two-dimensional random walk case. In the experiment the frequency of seeing three-dimensional structure for the three conditions was measured as a function of number of frames presented. Results are shown in Figure 9. The percentage of trials on which the observer reported three-dimensional structure is plotted against the number of frames presented. Again, detection of the three-dimensional percept does not differ significantly over the three conditions. Note also that almost six frames are required for reliable detection. The failure to find a difference for the three conditions suggests that recovery of structure does not depend on the specific trajectories of the dots over frames, but rather on the distribution of motion direction and velocity present from frame to frame. To generate a three-dimensional percept of a rotating cylinder solely from the local directional and velocity content, the visual system must do the reverse of the projection analysis. In order to do this, two constraints are required. These constraints extend the two proposed for the recovery of structure from motion which were discussed in the previous section. The first assumes that local motion vectors moving in different directions or at different velocities in the same region of the visual field are assigned to different depths. The second constraint requires that apparent depth varies smoothly almost everywhere. We next sought to determine if cooperative interactions might mediate this recovery of structure from motion by testing for hysteresis.
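The projection analysis, and its reverse, can be written out as a short geometric sketch in Python. It assumes orthographic projection, rigid rotation about the vertical axis through a small angle per frame (in radians), and points that start on the line of sight through the axis. The function names and the sign convention (positive depth toward the observer, positive displacement rightward) are assumptions. The sketch describes the geometry only; it makes no claim about how the visual system performs the inversion.

```python
import math


def rotate_about_vertical(x, z, angle):
    """Rotate the point (x, z) about the vertical axis by `angle` radians.

    x is horizontal image position and z is depth (positive toward the
    observer). A positive angle carries front points (z > 0) rightward,
    i.e. the right-handed rotation considered in the text.
    """
    return (x * math.cos(angle) + z * math.sin(angle),
            -x * math.sin(angle) + z * math.cos(angle))


def projected_displacement(z, angle):
    """Horizontal image displacement of a point starting at x = 0, depth z."""
    x_new, _ = rotate_about_vertical(0.0, z, angle)
    return x_new


def depth_from_displacement(dx, angle):
    """Reverse projection: depth implied by a horizontal displacement dx.

    Valid only for points that started on the line of sight (x = 0).
    """
    if math.sin(angle) == 0.0:
        raise ValueError("angle must produce a non-zero rotation")
    return dx / math.sin(angle)


angle_per_frame = 0.1                     # radians; arbitrary illustration
depths = [-1.0, -0.5, 0.0, 0.5, 1.0]      # five points along the line of sight
displacements = [projected_displacement(z, angle_per_frame) for z in depths]
recovered = [depth_from_displacement(dx, angle_per_frame) for dx in displacements]
```

Displacement grows in magnitude with distance from the axis and changes sign between front and back, so a signed velocity maps onto a single depth. This is the mapping that the first constraint, which assigns different velocities to different depths, presupposes.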
To search for hysteresis, we measured the transition between the two-dimensional and three-dimensional perceptual states for two different histories of the distribution of velocities. To do this, display dots were permitted to randomly choose their velocities from two uniform distributions while the proportion of dots choosing from each was slowly changed. One distribution, referred to as the signal, has a range of 13 deg/sec and generates a three-dimensional percept of a rotating cylindrical volume. The other, called noise, has a range of permissible velocities extending uniformly up to 26 deg/sec and generates a two-dimensional percept of local random motion. For one history of the distribution of velocities, all dots initially chose velocities from the signal distribution, corresponding to an initial three-dimensional percept. Over time, the proportion of dots choosing from the signal distribution was progressively decreased and simultaneously the proportion choosing from the noise increased until the three-dimensional percept gave way to a two-dimensional percept. The second history condition was just the reverse. Initially all dots chose velocities from the noise distribution. The structure would initially be two-dimensional. The proportion of signal was then increased until there was a transition to three-dimensional structure. The results will be indicative of hysteresis if the proportion of signal at the transitions is different for the two conditions. For both conditions dots chose their directions of motion either left or right at random from frame to frame.
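A minimal Python sketch of one frame update for this one-dimensional random walk follows. It assumes that both velocity distributions start at 0 deg/sec, that each dot redraws its distribution, speed and direction independently on every frame, and that the 10 Hz frame rate quoted for the two-dimensional experiment also applies here. None of these is stated explicitly for this experiment, and the function names are hypothetical.

```python
import random

SIGNAL_MAX = 13.0  # deg/sec, upper limit of the signal velocity distribution
NOISE_MAX = 26.0   # deg/sec, upper limit of the noise velocity distribution
FRAME_RATE = 10.0  # Hz; assumed, carried over from the 2-D experiment


def sample_velocity(p_signal, rng):
    """Signed horizontal velocity (deg/sec) for one dot on one frame."""
    upper = SIGNAL_MAX if rng.random() < p_signal else NOISE_MAX
    speed = rng.uniform(0.0, upper)
    direction = rng.choice((-1.0, 1.0))  # left or right with equal probability
    return direction * speed


def step_dots(xs, p_signal, rng):
    """Advance every dot's horizontal position (deg) by one frame."""
    return [x + sample_velocity(p_signal, rng) / FRAME_RATE for x in xs]


rng = random.Random(0)
positions = [rng.uniform(-2.0, 2.0) for _ in range(200)]  # arbitrary start
positions = step_dots(positions, p_signal=1.0, rng=rng)   # signal-first history
```

Lowering `p_signal` gradually from 1.0, or raising it from 0.0, gives the two histories whose transition points are compared.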
Figure 9. Data for observer T.P.D., showing the percentage of trials on which the observer (correctly) distinguished a three-dimensional from a two-dimensional motion percept, plotted against the number of frames presented. Results are shown for three different conditions (Condition 1, Condition 2 and Condition 3 in the figure legend). Each condition defines different paths for the dots in the stimulus which generates the three-dimensional percept. In Condition 1, each dot chooses both its direction and velocity of motion at random for each displacement. In Condition 2, each dot chooses its velocity at random, but its direction alternates between left and right on successive displacements. In Condition 3, each dot moves only at the velocity and in the direction of the random choice made for its first displacement. The three-dimensional percepts for these three conditions were very similar. Using a two-interval forced-choice procedure, observers were required to distinguish the three-dimensional percept generated by the three conditions from a two-dimensional motion percept. Discriminability was measured as a function of number of frames presented. Viewing was monocular. Each data point represents 60 measurements. Discriminability of the three-dimensional percept does not differ significantly over the three conditions. Furthermore, for both observers almost six frames were required for reliable discrimination for all three conditions. These results suggest that the recovery of structure from motion does not depend on the spatial relationships over time among local motion vectors but rather on the distribution of velocities and directions present from frame to frame.
What we obtained for one observer with a 13 deg/sec signal range is shown in the top panel of Figure 10. The two possible structural percepts, three-dimensional and two-dimensional, are plotted against the proportion of dots choosing velocities from the signal distribution. The solid square, together with the error bars, is the mean and standard deviation of 100 measurements of the proportion of signal dots at which there is a transition from two-dimensional to three-dimensional structure. For these measurements, all dots initially chose velocities from the noise, and it was necessary to increase the proportion of signal to 90% before the two-dimensional percept gave way to a three-dimensional percept. The open square is the data for the opposite transition from three-dimensional to two-dimensional. In this case all dots initially chose velocities from the signal distribution and it was necessary to decrease the proportion of signal to nearly 60% before three-dimensional structure gave way to two-dimensional. The data points are significantly different and consistent with hysteresis. Another way we manipulated the history of stimulation was by changing the range of the signal distribution. Using the same procedure, we measured perceptual transitions for two narrower signal ranges: 8 deg/sec and 3 deg/sec. The results for these two signal ranges are shown in the lower two panels of Figure 10, respectively. Note that decreasing the range alters the hysteresis profile by shifting transition points to the left (Williams and Phillips, 1987b). To account for the hysteresis results with a cooperative model which depends only on the directional content and the history of the distribution of velocities, we have assumed that there is a succession of apparent depth planes or layers. Each plane consists of a set of velocity-selective mechanisms which are selectively sensitive to rightward motion and another set selectively sensitive to leftward motion.
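The summary plotted for each history (a mean and standard deviation of repeated transition measurements) and the separation between the two means can be computed as in the Python sketch below. The numbers are placeholders invented for illustration and are not data from the chapter; the function names are likewise hypothetical.

```python
from statistics import mean, stdev


def summarize(transitions):
    """Mean and sample standard deviation of transition proportions."""
    return mean(transitions), stdev(transitions)


def hysteresis_width(ascending, descending):
    """Separation between the mean 2-D to 3-D and 3-D to 2-D transitions."""
    return mean(ascending) - mean(descending)


# Placeholder values only (NOT measurements from the chapter).
ascending = [0.88, 0.91, 0.90, 0.92, 0.89]    # noise-first: 2-D -> 3-D
descending = [0.61, 0.58, 0.60, 0.62, 0.59]   # signal-first: 3-D -> 2-D

up_mean, up_sd = summarize(ascending)
down_mean, down_sd = summarize(descending)
width = hysteresis_width(ascending, descending)
```

A width reliably greater than zero means the transition depends on the direction from which it was approached. Whether a given separation is statistically significant calls for a test on the full set of measurements, which this sketch does not attempt.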
Cooperative interactions were required both within and between depth layers to produce response profiles consistent with both the two-dimensional and three-dimensional percepts. In the most general terms, for velocity mechanisms selectively sensitive to the same direction, cooperative interactions within a single depth plane were such that mechanisms sensitive to similar velocities facilitate one another's responses, whereas mechanisms sensitive to different velocities inhibit each other. Interactions between different depth planes are just the reverse. Mechanisms sensitive to similar velocities of movements but lying in different depth planes inhibit one another's responses, while those sensitive to different velocities facilitate each other. Our efforts at providing a cooperative description of the previous hysteresis data are shown in Figure 11. The vertical dashed lines in each panel mark the transition points calculated from the model, using a single parameter set. As can be seen, the model captures the leftward shift of the hysteresis profile with decreasing signal range. Our results, both for the two-dimensional random walk stimulus and the one-dimensional random walk stimulus, challenge schemes for the recovery of structure from motion that are based solely on an assumption of rigidity or near-rigidity (Ullman, 1984; Grzywacz and Hildreth, 1987). Such schemes require as their input the sequential positions of points in a two-dimensional projection space, from which a three-dimensional volume that is consistent with a rigidity-based interpretation is constructed. The randomness of our stimuli, however,
Figure 10. Perceptual transitions for structure from motion measured under two different histories of stimulus exposure (indicated by the arrows) for three different signal velocity distributions. In each panel the percept (3-D or 2-D) is plotted against the proportion of signal. The results were obtained for signal distributions whose ranges were: 13 deg/sec, top; 8 deg/sec, middle; 3 deg/sec, bottom. Data points show the proportion of signal dots required for the perceptual transition from two-dimensional structure to three-dimensional structure (solid squares) and for the perceptual transition from three-dimensional structure to two-dimensional structure (open squares). In each panel the separation between transition points measured with the different histories is an index of hysteresis.
[Figure 11: three panels, one per signal range; vertical axis: perceived structure (2-D or 3-D); horizontal axis: Proportion of Signal, 0.0 to 1.0.]
Figure 11. Hysteresis profiles as in Figure 10. Perceptual transitions for structure from motion measured under two different histories of stimulus exposure (indicated by the arrows) for three different velocity signal distributions whose ranges were: 13 deg/sec, top; 8 deg/sec, middle; 3 deg/sec, bottom. Data points show the proportion of signal dots required for the perceptual transition from two-dimensional structure to three-dimensional structure (filled symbols) and for the perceptual transition from three-dimensional structure to two-dimensional structure (open symbols). The dashed lines mark the transition points calculated from a model incorporating cooperative interactions among direction-selective and velocity-selective motion elements.
would clearly confound such an algorithm. Instead, our experimental results suggest that the overall directional/velocity distribution, and not the individual dot path, is important to the generation of motion-based structure in stochastic stimuli. The demonstration of hysteresis in the occurrence and loss of this three-dimensional percept strongly argues for a cooperative algorithm in the recovery of structure from motion. This algorithm, which is constrained by considerations based upon the behavior of physical objects, incorporates cooperative interactions among direction-selective and velocity-selective mechanisms. The implementation of any constraint in the recovery of structure from motion must be compatible with the brain's computational machinery. The brain comprises a multitude of elements (neurons) extensively interconnected to form parallel networks. Networks of this complexity are predisposed to exhibit cooperative behavior. The plausibility of the algorithm is further strengthened by its modest demands on the brain's computational machinery and its robustness in the presence of noise.
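The interaction rule and the hysteresis it is meant to explain can be illustrated with a very small sketch. This is not the authors' model: the function names, the threshold and the feedback strength are hypothetical values chosen only to show how positive (cooperative) feedback makes the 2-D to 3-D transition occur at a higher signal proportion than the 3-D to 2-D transition.

```python
def interaction_sign(same_depth_plane: bool, similar_velocity: bool) -> int:
    """Sign of the interaction between two same-direction velocity mechanisms.

    +1 means facilitation, -1 means inhibition. Within a depth plane similar
    velocities facilitate and different velocities inhibit; between depth
    planes the rule is reversed, as described in the text.
    """
    return 1 if same_depth_plane == similar_velocity else -1


def sweep(proportions, start_state, threshold=60, feedback=30):
    """Toy bistable unit (hypothetical parameters, in percent signal dots).

    The percept is 3-D (state 1) when the signal proportion plus the
    cooperative feedback from the current state exceeds the threshold.
    """
    state = start_state
    trace = []
    for p in proportions:
        state = 1 if p + feedback * state > threshold else 0
        trace.append((p, state))
    return trace


def transition_point(trace, start_state):
    """First signal proportion at which the percept leaves its start state."""
    for p, state in trace:
        if state != start_state:
            return p
    return None


ascending = list(range(0, 101, 5))   # signal proportion increased over time
descending = ascending[::-1]         # signal proportion decreased over time

up = transition_point(sweep(ascending, 0), 0)      # 2-D to 3-D transition
down = transition_point(sweep(descending, 1), 1)   # 3-D to 2-D transition
print(up, down)
```

With these illustrative values the ascending history switches to 3-D at 65% signal while the descending history holds the 3-D percept down to 30%, so the separation between the two transition points grows with the strength of the feedback term.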
Texture Perception

The human visual system exhibits a remarkable ability to detect subtle differences in texture. What is texture? The Oxford English Dictionary reveals that the origin of the word lies in the textile industry. Specifically, it refers to the capacity to determine coarse versus fine weaves of cloth. We are familiar with humanly made textures such as different weaves of cloths or carpets, as well as textures created by the repeated random or nonrandom placement of local patterns such as in wallpaper. There are also naturally occurring textures: for example, different coarseness of bark on trees. In the most general terms textures are generated from an aggregate of local micropatterns or elements. In natural surfaces, the local repeated features are not identical but can be considered to have a fractal quality. Two examples of texture discrimination are shown in Figure 12. The textures are constructed of three element types: a T, an L and a +. All three elements consist of two line segments of equal length oriented at right angles to one another. The elements differ only by the exact location at which the two line segments meet. Shown in Figure 12a is a texture group of +s embedded in a texture field of Ls. In Figure 12b a texture of L elements is embedded in a texture of Ts. The orientations of the elements in the textures are randomized. The two examples demonstrate a robust property of texture discrimination: it can easily be divided into two categories, either effortless (texture differences "pop out" at the observer) or requiring scrutiny (differences are only revealed by a systematic search of the patterns). The +s pop out from the Ls as a parallel process, almost instantaneously, independent of the number of elements (Figure 12a). However, it requires time-consuming element-by-element scrutiny to detect the array composed of Ls among Ts (Figure 12b).
It is of interest to note that in isolation a single L and a single T can be effortlessly discriminated. It is only as the number of elements increases that segregation becomes difficult. In texture perception a global percept arises from the repeated local features. The fact that textures are devoid of recognizable global forms has been taken as justification for using such stimuli to determine primitive local features of visual information processing. Presumably if
Figure 12. Examples of textures composed of the texture elements T, L and +. In texture (a), at the top, a target group of four +s is effortlessly segregated from a field of Ls. However, it requires scrutiny to distinguish the group of four Ls from the background Ts shown in (b), at the bottom.
two element types generate respective textures which can be distinguished from each other then a specific feature in one of the element types could be an emergent feature of the visual system. Current theories attempt to explain texture discriminability in terms of specific features of the elements. A property of texture perception which is not dealt with by these theories is the asymmetry effect: the ease of distinguishing a texture A from a background texture B is not the same as that of distinguishing texture B from a background texture A (Julesz, 1981; Gurnsey and Browse, 1988; Treisman and Gormican, 1988). The existence of such perceptual asymmetries poses problems for feature-specific models of texture segregation because the response to the two micropattern types will remain the same even when the roles are switched. Our investigation suggests that this asymmetry, and texture segregation in general, crucially depend on fundamental nonlinear properties of the visual system which are also responsible for visual illusions such as subjective contour and spatial fill-in (Williams and Julesz, 1989; Williams and Julesz, 1990; Williams and Julesz, 1991). Both of these properties have been shown to result from parallel cooperative processing (Cohen and Grossberg, 1985; Grossberg and Mingolla, 1985; Grossberg and Todorovic, 1988).
Subjective closure

An example of perceptual asymmetry in texture segregation is shown in Figure 13 for the texture pair consisting of circles with gaps in the circumference and intact circles. As shown in Figure 13a we can effortlessly detect a gapped circle in a field of intact circles. One might then expect that it should be just as easy to detect an intact circle in a field of gapped ones. However, as shown in Figure 13b, this is not true. To detect a closed circle in a field of open ones is much more difficult and requires element-by-element scrutiny. Although the orientation of the gaps has been randomized in the figure, the asymmetry is still observed if all the gaps have the same orientation. This is demonstrated in Figure 14. The fact that a gapped circle is more easily detected has been taken to imply that line ends (terminators) marking the gap are a more significant or emergent feature for perception than connectedness (closure) (Treisman and Souther, 1985). To determine if the asymmetry can be attributed to a specific feature of the elements, we used a simplified version of the stimulus in Figure 13 which consisted of only two elements (Williams and Julesz, 1989). Using only two elements eliminates possible interelement effects as well as complications related to different numbers of targets and distractors. To minimize the effects of spatial inhomogeneity, elements were presented on the circumference of a ring which had a radius of 3 degrees of visual angle. The diameter of each element was 1 degree. Elements could occur in any location on the ring and the orientation of the gap was randomized. Observers fixated the center of the ring. During an experiment two different stimuli were presented. On half of the experimental trials, chosen at random, a stimulus consisting of a single target and distractor was presented for 48 msec. To limit the observer’s search time this stimulus was masked by a pattern in which each element was composed of both the target
Figure 13. An example of perceptual asymmetry is shown here for a texture element pair consisting of a closed circle and an open circle. The perception of an open circle in the field of closed circles in (a), at the top, yields easier discrimination than perception of a closed circle in a field of open ones in (b), at the bottom. The orientation of the gaps in the open circles has been randomized.
Figure 14. The texture pair composed of closed circles and open circles for the case in which the orientation of the gap is not randomized. The asymmetry is still observed. It is easier to detect an open circle in the field of closed ones in (a), at the top, than when the roles are reversed as shown in (b), at the bottom.
and distractor elements, juxtaposed next to one another. The masking pattern had a stimulus onset asynchrony of 112 msec and a duration of 48 msec. For the other trials the two elements of the stimulus were both distractor elements. This stimulus was also masked by a pattern with elements consisting of the juxtaposition of target and distractor. Observers were required to determine whether or not the target was presented during a trial. The stimulus consisting of two distractor elements provided a measure of the frequency with which an observer falsely reported a target present when in fact it was not. Over the course of 5 sessions 500 trials were presented. Four observers participated in the study. One of the authors served as an observer, while the other three observers were naive to the purpose of the experiments. The results for one observer are shown in Figure 15; similar results were obtained for the other observers. Results depended on which element served as target and which served as distractor. As shown in Figure 15a, with a target open circle and distractor closed circle the percent correct for detection is 73%. For two closed circle distractors the percent correct is 86%. If the roles of target and distractor are reversed (Figure 15b) then the percent correct for a target and distractor is 87%. By comparison the percentage correct for two open circle distractors is considerably less (57%). In fact this is near chance (50%). This high false alarm rate for detecting a closed circle in a stimulus consisting of just two open circles suggests that open circles are being perceptually closed by subjective contours. The perceptual phenomenon of subjective contours was first discovered by Schumann (1904) and significantly elaborated by Kanizsa (1976).
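The timing and scoring of these trials can be written out as a short sketch. The function names are illustrative, and two assumptions are made that the text does not state explicitly: that nothing is displayed between stimulus offset and mask onset, and that every error on a distractor-only trial counts as a false alarm.

```python
STIMULUS_MS = 48   # stimulus duration given in the text
SOA_MS = 112       # stimulus onset asynchrony between stimulus and mask
MASK_MS = 48       # mask duration given in the text


def interstimulus_interval(soa_ms, stimulus_ms):
    """Interval between stimulus offset and mask onset (assumed blank)."""
    return soa_ms - stimulus_ms


def trial_display_time(soa_ms, mask_ms):
    """Time from stimulus onset to mask offset."""
    return soa_ms + mask_ms


def false_alarm_rate(percent_correct_target_absent):
    """Percentage of distractor-only trials on which a target was reported."""
    return 100 - percent_correct_target_absent


isi = interstimulus_interval(SOA_MS, STIMULUS_MS)
total = trial_display_time(SOA_MS, MASK_MS)

# Percent correct on distractor-only trials reported for Figure 15.
fa_two_closed = false_alarm_rate(86)   # target open, distractors closed
fa_two_open = false_alarm_rate(57)     # target closed, distractors open
print(isi, total, fa_two_closed, fa_two_open)
```

Under these assumptions the interval between stimulus and mask is 64 msec, and the false alarm rate rises from 14% with two closed circles to 43% with two open circles, which is the contrast the text attributes to subjective closure.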
Additional support for the conclusion that open circles are perceptually closed follows if we take the proportion of incorrect responses for the target open and distractor closed stimulus (27%) to be the probability, p, that one circle is being perceptually closed. Then the probability that for two open circles at least one appears to be closed is 1-(1-p)(1-p). Taking p to be 27%, the probability that at least one of two open circles is being perceptually closed (approximately 52%) is shown by the arrow in Figure 15b. There is close agreement between data and theory. To eliminate the possibility that a response bias could be responsible for the high false alarm rate, a two alternative forced choice method was also used. The results of these experiments confirm that detection is easier if the target is the open circle rather than the closed one. Subjective closure of the open circles would explain the search asymmetry observed for a target open circle in a field of closed circles compared to a target closed circle in a field of open circles. Because of subjective closure some elements in the field of open circle distractors will be perceived as closed. Therefore, searching for a closed circle in a field of open ones is equivalent to searching for a closed circle in a field of both open and closed circles. This discrimination task is far more difficult than searching for an open circle in a field of closed ones since there is no unique figure element to pop out from the background elements. Thus rather than the gap being an emergent feature, it is its closure which leads to the asymmetry. This subjective closure would also explain why discrimination becomes more symmetric as the gap is
[Figure 15: two panels, Condition 1 (stimuli: 1 open and 1 closed; both closed) and Condition 2 (stimuli: 1 closed and 1 open; both open).]
Figure 15. Results are shown for the effect of reversing the role of target and distractor when the elements are an open and a closed circle. Results for the case in which the open circle served as target and the closed circle as distractor are shown in (a), at the top, and for the reverse case in (b), at the bottom. Note that the percentage correct for two distractor open circles, at 57%, is considerably less than for the other three stimuli.
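The closure-probability argument in the text can be evaluated numerically. The sketch below only restates the expression 1-(1-p)(1-p) for a pair of open circles; the function name is illustrative, and treating the closure of the two circles as independent events is the assumption already implicit in that expression.

```python
def prob_at_least_one_closed(p, n_open=2):
    """Probability that at least one of n_open open circles appears closed,
    assuming each is closed independently with probability p."""
    return 1 - (1 - p) ** n_open


p = 0.27  # proportion of incorrect responses, target open / distractor closed

at_least_one = prob_at_least_one_closed(p)
neither = 1 - at_least_one  # probability that both circles are seen as open

print(round(at_least_one, 3), round(neither, 3))
```

For p = 0.27 the expression gives roughly 0.47 for at least one circle appearing closed, and therefore roughly 0.53 for neither appearing closed. The text quotes approximately 52%, which is close to the latter figure and to the 57% correct observed for two open circles.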
widened (Treisman and Souther, 1985). As the gap size is increased the likelihood of subjectively closing the gap diminishes and a closed or open circle become equally detectable relative to their respective background elements. In addition to the gapped and intact circle, subjective closure can account for the search asymmetry for many other texture element pairs (Williams and Julesz, 1990). An example is the elongated S and 10. For these elements the gap also figures prominently. These elements consist of the same three horizontal line segments and the same two vertical line segments. The aspect ratio of vertical length to horizontal length is approximately 1/3. The difference between the elements is the location of the vertical gaps. Textures composed of these elements are indistinguishable (Julesz, 1981). It was proposed that the failure to segregate textures of these elements results from the fact that both elements have the same number of line terminators. To determine if there is subjective closure at the terminators of the contour we carried out an experiment analogous to the two element experiment with the gapped and intact circle. The discriminability of each of the elements S and 10 was separately tested against an 8 element, which contains no gaps. First consider the case for the S and 8. As with the gapped and intact circles, two stimulus conditions were considered. For the first condition, on half the trials chosen at random the stimulus consisted of a target S and a distractor 8. For the remaining trials both elements were 8s. Observers were required to determine if at least one of the elements contained a gap or gaps on each presentation. In the second condition the role of target and distractor was reversed. On half of the trials a target 8 and distractor S were presented. On the remaining trials the stimulus consisted of two distractor Ss.
In this condition, the observers were required to determine if one or more of the elements on each presentation was completely closed. For the first condition, with a target S and distractor 8, the percent correct for detection is approximately 63%. For two 8 distractors the percent correct is 83%. If the roles of target and distractor are reversed then the percent correct for a target and distractor is 80% while that for two distractors drops to 51%. The high false alarm rate for detecting at least one 8 in a stimulus consisting of just two Ss suggests that the gaps are being perceptually closed by subjective contours, as was the case for the gapped and intact circles. The analogous experiment for the 10 and 8 gave similar results. With a target 10 and distractor 8 the percent correct for detection is approximately 58%. For two 8 distractors the percent correct is 88%. If the roles of target and distractor are reversed, then the percent correct for a target and distractor is 79% while that for two distractors drops to 44%. Again the high false alarm rate for detecting at least one 8 in a stimulus consisting of just two 10s suggests that the gaps are being perceptually closed. The results suggest that the reason that the 10 and S are not discriminable is that the gaps are being closed by subjective contours and the elements become perceptually indistinguishable. This is also consistent with an observation by Julesz (1986) that if the aspect ratio of the line lengths is changed to 1, these elements can be discriminated since the gap is less likely to be closed by subjective closure. Two
element experiments like the ones described above confirm that the gaps are no longer perceived as closed for such elements. Thus the explanation that the failure to discriminate the S and 10 is due to having an equal number of terminators holds only if the gaps created by the terminators are sufficiently small to permit subjective contours to occur. To further investigate the property of closure in texture segregation we examined segregation with an element pair for which one has a closed contour (Williams and Julesz, 1990). The elements were a triangle and an arrow. Both the triangle and arrow are constructed of the same three line segments and angles. As can be seen in Figure 16, each can be effortlessly detected in a field of the others. We sought to determine if the closed nature of the triangle is critical for segregation. To do so gaps were introduced in the three
[Figure 17: gapped triangle and arrow elements shown in two panels, Condition 1 (End Gaps) and Condition 2 (Center Gaps).]
Figure 17. Two conditions for the location of the gaps in the triangle and arrow texture elements. Gaps occur at the ends of the line segments in Condition 1, while the gaps are inserted in the center of the line segments in Condition 2.
line segments of each element to disrupt the perception of closure. As with the previous experiments, elements were restricted to a ring, which has a radius of 3 degrees. The short side of the triangle and arrow subtends 1 degree of visual angle. A total of twelve elements, one target and eleven distractors, were evenly spaced around the ring. Measurements were made for the case in which the triangle served as the target and the arrow as distractor as well as for the reverse case with the arrow as target and the triangle as distractor. In each experiment, for half the trials chosen at random, both the target and distractors were presented. On the remaining trials only the distractors were presented. The subject was required to identify trials on which the target appeared. Two conditions for the location of gaps were considered (see Figure 17). In the first condition, gaps were introduced at the end of each of the line segments. For the second condition, gaps were placed in the center of the line segments. The different locations of the gaps were used to determine if geometric properties such as corners may be critical for segregation or if only closure is relevant. In order to monitor the detectability of the gaps which were introduced into the elements, we also measured discriminability for the intact element compared to the same element with gaps. Figure 18 shows the results in which the triangle served as target. Consider the first condition for which gaps occurred at the ends of lines (Figure 18a). If the target is a gapped triangle and the distractor is a gapped arrow, increasing gap size decreases discriminability (solid squares in Figure 18a). However, if the target is an intact triangle and the distractor is a gapped triangle, discrimination increases as gap size increases (open squares in Figure 18a).
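The ring arrangement used in this experiment (twelve elements evenly spaced on a ring of radius 3 degrees) can be laid out as follows. This is a minimal sketch: the function names and the starting angle are hypothetical, and degrees of visual angle are treated as flat screen coordinates, which is only a reasonable approximation for small eccentricities.

```python
import math


def ring_positions(n_elements, radius_deg=3.0, start_angle_deg=0.0):
    """Return (x, y) centers, in degrees of visual angle, of elements
    evenly spaced around a ring centered on fixation."""
    step = 360.0 / n_elements
    positions = []
    for i in range(n_elements):
        angle = math.radians(start_angle_deg + i * step)
        positions.append((radius_deg * math.cos(angle),
                          radius_deg * math.sin(angle)))
    return positions


def adjacent_separation_deg(n_elements, radius_deg=3.0):
    """Straight-line (chord) distance between neighboring element centers."""
    return 2 * radius_deg * math.sin(math.pi / n_elements)


positions = ring_positions(12)            # one target plus eleven distractors
separation = adjacent_separation_deg(12)  # center-to-center distance
print(len(positions), round(separation, 2))
```

With twelve elements the centers of neighboring elements lie about 1.55 degrees apart, compared with the 1 degree subtended by the short side of each element.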
Failure to discriminate an intact triangle from a gapped one for small gap sizes is consistent with the subjective closure of the lines comprising the gapped triangle. Comparison of the two data curves for this graph demonstrates that it is more difficult to discriminate a triangle and arrow as the closure of the triangle decreases, i.e., as gap size increases. As seen in Figure 18b, similar results were obtained when the gaps were placed in the center of the line segments. The fact that the results are the same irrespective of whether the gaps were placed at the end or in the middle of lines suggests that the geometric properties of the triangle are not critical for segregation. However, the fact that discrimination decreased with increasing gap size suggests that closure is important. Results were different when the arrow served as target (Figure 19). Note first that, comparing Figures 18 and 19, it is clear that there is a search asymmetry for the arrow and triangle even without a gap present (gap size = 0). Specifically, it is easier to detect an intact target arrow in a field of intact distractor triangles (95%) than an intact target triangle in a field of intact distractor arrows (75%). We will return to this asymmetry below. In general, the results for a target arrow with gaps at the ends of lines (Figure 19a) are similar to those in which the triangle served as the target (Figure 18). In particular, as gap size increased, discriminability of a target gapped arrow (triangle) from distractor gapped triangles (arrows) decreased, reaching 10% for the largest gap size (0.35 degree). However, as shown in Figure 19b, if the gap was placed in the center of the line segments, discriminability of a
[Figure 18: two panels, Condition 1 (End Gaps) and Condition 2 (Center Gap); horizontal axis: Gap Size (deg), 0.0 to 0.4; curves: target gapped triangle with distractor gapped arrow, and target triangle with distractor gapped triangle.]
Figure 18. Results are shown for the effect of introducing gaps into the elements when the target element was a triangle. In (a), at the top, data are shown for Condition 1 in which gaps occur at the ends of lines. In (b), at the bottom, data are shown for Condition 2 in which gaps are inserted in the center of line segments. Results are similar for both conditions.
[Figure 19: two panels, Condition 1 (End Gaps) and Condition 2 (Center Gap); horizontal axis: Gap Size (deg), 0.0 to 0.4; curves: target gapped arrow with distractor gapped triangle, and target arrow.]
Figure 19. Results are shown for the effect of introducing gaps into the elements when the target element was an arrow. In (a), at the top, data are shown for Condition 1 in which gaps occur at the ends of lines. In (b), at the bottom, data are shown for Condition 2 in which gaps are inserted in the center of line segments. Note that gaps have a less detrimental effect on detection of the arrow for Condition 2.
target arrow from distractor triangles was less affected by gap size (decreasing only to 50% at the largest gap size compared to approximately 10% for the other three conditions). This is true despite the fact that discriminability of an intact and gapped arrow is as strongly affected by gap size as for the previous conditions (open squares in Figures 18 and 19). Thus for the arrow, simple closure does not necessarily facilitate discriminability. However, for the triangle, which is closed, results suggest that this closure is significant. As pointed out above, there is an asymmetry for discrimination of intact arrows and triangles. It is easier to detect a target arrow in a field of distractor triangles than the reverse. Two element experiments demonstrate that the asymmetry can be attributed to the subjective closure of the open side of the arrow. Subjective closure of the side of the arrow would account for the result that discriminability between an arrow and triangle is diminished if the aspect ratio of the elements (the ratio between the lengths of the perpendicular lines in each element) is increased (Julesz, 1986). The increase in aspect ratio decreases one of the gaps in the arrow and the probability of subjective closure is increased, rendering the arrow and triangle less discriminable. In summary, the results suggest that closure, including subjective closure, is important for texture segregation. In addition, subjective closure can account for certain perceptual asymmetries of texture segregation. Thus, rather than the asymmetry representing an emergent feature of the visual system, it reflects nonlinear processes of the system which are also responsible for certain visual illusions. For subjective closure to occur, "units" which are not stimulated but which are in the neighborhood of those which are stimulated must become active. Such spreading of activation is easily explained within the context of a parallel cooperative system.
Models incorporating appropriate cooperative interactions do exhibit subjective closure phenomena (Grossberg and Mingolla, 1985).
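The idea that unstimulated units become active when their neighbors are stimulated can be illustrated with a toy network. This is not the Grossberg and Mingolla model or any model from the cited work: the update rule, the `reach` parameter and the ring of twelve units are hypothetical choices, made only to show that a small gap in a contour fills in while a larger one does not.

```python
def spread_activation(units, reach=2, max_iterations=50):
    """Settle a ring of binary units under a simple cooperative rule.

    An inactive unit switches on when there is at least one active unit
    within `reach` positions on both of its sides. All units are updated
    in parallel until the pattern stops changing.
    """
    n = len(units)
    state = list(units)
    for _ in range(max_iterations):
        new_state = list(state)
        for i in range(n):
            if state[i]:
                continue
            left = any(state[(i - k) % n] for k in range(1, reach + 1))
            right = any(state[(i + k) % n] for k in range(1, reach + 1))
            if left and right:
                new_state[i] = 1
        if new_state == state:
            break
        state = new_state
    return state


def gapped_ring(n_units, gap):
    """A circular contour of n_units with `gap` consecutive silent units."""
    return [0] * gap + [1] * (n_units - gap)


def is_closed(units):
    return all(units)


small_gap_closed = is_closed(spread_activation(gapped_ring(12, 3)))
large_gap_closed = is_closed(spread_activation(gapped_ring(12, 4)))
print(small_gap_closed, large_gap_closed)
```

With a reach of two units the three-unit gap closes after two parallel updates, whereas the four-unit gap stays open, which mirrors the observation that subjective closure becomes less likely as the gap is widened.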
Spatial fill-in

In contrast to the open and closed circle elements of the previous section, for numerous texture pairs search asymmetry is present if there are many distractors and one or more targets, but not if there is only a single target and a single distractor. For these texture pairs, since there is no asymmetry when only two elements are present, the asymmetry cannot be due to the elements themselves. The simplest element pair which shows this property of asymmetry is a long line (1 degree of visual angle) and a short line (1/2 degree) (Williams and Julesz, 1991). If a set of target long lines is presented in a field of distractor short lines, then it will be easier to detect the target than if the roles of target and distractor are reversed (Figure 20) (see also Treisman and Souther, 1985). With only a single target and distractor, discrimination remains at approximately 90% when the roles of target and distractor are reversed. This perceptual symmetry is present even if the SOA is decreased to 80 msec. The percentage correct for this SOA is reduced to approximately 70% irrespective of whether the long or short line was the target. Therefore, the asymmetry is not due to a change in the perception of these elements. Since asymmetry cannot be attributed to the elements
Figure 20. Perceptual asymmetry is shown for texture pairs composed of short line segments and long line segments. It is easier to detect the target long lines in the field of short lines than the reverse.

themselves, we decided to investigate if it depended on the relationship between elements. To study this possibility, we first decided to examine the effect of spacing between elements. As shown in Figure 21, increasing the spacing facilitates the detection of short lines in the field of long lines. In fact, increasing the spacing between elements eliminates the asymmetry. To determine the effect of spacing more precisely, the discriminability for two conditions was compared (see Figure 22). In one case, six elements were equally spaced over half the circumference of a circle which was 3 degrees in radius. The portion of the circumference on which the elements occurred varied randomly over trials. This case is referred to as the dense presentation condition. For the second case, the sparse presentation condition, elements were spaced at twice the separation of the first condition. Six elements were equally spaced over the entire circumference of the circle. For both conditions, there was one target and five distractors.
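The two spacing conditions can be made concrete with a short sketch. The function names and the arc starting angle are hypothetical. The dense spacing of 30 degrees of arc is inferred from the statement that the sparse condition (six elements over the whole circle, hence 60 degrees apart) uses twice the dense separation; it is not a value quoted in the text.

```python
import math


def element_angles(n_elements=6, dense=True, arc_start_deg=0.0):
    """Angular positions (degrees around the circle) of the elements.

    Sparse: elements equally spaced over the whole circumference.
    Dense: half that separation, so the elements occupy about half the
    circle. arc_start_deg stands in for the random placement over trials.
    """
    spacing = 360.0 / n_elements
    if dense:
        spacing /= 2.0
    return [(arc_start_deg + i * spacing) % 360.0 for i in range(n_elements)]


def arc_separation_deg(spacing_deg, radius_deg=3.0):
    """Separation of neighbors along the circle, in degrees of visual angle."""
    return radius_deg * math.radians(spacing_deg)


dense_angles = element_angles(dense=True)
sparse_angles = element_angles(dense=False)
dense_sep = arc_separation_deg(30.0)
sparse_sep = arc_separation_deg(60.0)
print(dense_angles, sparse_angles, round(dense_sep, 2), round(sparse_sep, 2))
```

On a circle of radius 3 degrees this places neighboring elements about 1.57 degrees apart along the arc in the dense condition and about 3.14 degrees apart in the sparse condition.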
Figure 21. Texture pairs composed of the same short line segments and long line segments as shown in Figure 20 but for large spacing between elements. Increasing element separation facilitates detection of the target short line segments. In fact, the perceptual asymmetry as seen in Figure 20 is eliminated by increased separation.
Figure 23 shows the effect of reversing the role of target and distractor for the two conditions. In the first condition, the dense
[Figure 22: two panels, Dense Presentation and Sparse Presentation.]
Figure 22. The two stimulus presentation conditions for measuring the effects of inter-element spacing are shown. In the dense presentation condition, six elements are evenly spaced over half the circumference of the circle. In the sparse presentation condition, the spacing is doubled and the six elements are equally spaced over the entire circumference.
[Figure 23: two panels, one per presentation condition, each comparing target long with distractor short against target short with distractor long.]
Figure 23. Results are shown for the two element separations described in Figure 22. For the dense presentation shown in (a), at the top, discriminability was easier if the long line served as target and the short line as distractor rather than the reverse. For the sparse presentation shown in (b), at the bottom, performance is independent of target choice.
presentation, performance critically depended on whether the short or the long line served as the target. If the target was a long line and the distractors were all short lines, the percent correct for discrimination was 87%. Performance fell to 52% if the roles were reversed; i.e., if the target was a short line and the distractors were long lines. For the second condition, the sparse condition, there was no evidence of asymmetry in discrimination. Whether the short line or the long line served as the target stimulus, performance was constant at approximately 70%. Notice that increasing the spacing reduced discrimination from 87% to 72% correct when the target was a long line, but increased it from 52% to 71% correct if the target was a short line. Thus, the closer spacing has two opposing effects, depending on which stimulus was designated as the target: 1) enhancing performance (target - long line); or 2) diminishing performance (target - short line). To further investigate element interactions on texture segregation, we examined performance for three elements traditionally used in texture investigations (T, L, and +, shown in Figure 12) (Julesz, 1981; Bergen and Adelson, 1988; Gurnsey and Browse, 1988; Julesz and Krose, 1988; Treisman and Gormican, 1988). As in the case of the short and long line, for these elements search asymmetry is also present if there are many distractors and one or more targets, but not if there is only a single target and a single distractor. Thus the asymmetry, and texture discrimination in general, for these elements cannot depend solely on the canonical form of the elements. First consider the element pair T and L. Performance changes significantly if the role of target and distractor is switched. Percent correct discrimination for a single target T in a circular field of eleven distractor Ls is 60%. With the role of target and distractor reversed (i.e., target L and distractor T) performance drops to 30%.
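The size of each asymmetry reported above can be summarized as a simple difference in percent correct between the two target assignments. The difference score and the dictionary labels are introduced here for illustration only; the percentages are the ones quoted in the text.

```python
# (percent correct with the first element as target,
#  percent correct with the roles reversed), as quoted in the text
results = {
    "long vs short line, dense": (87, 52),
    "long vs short line, sparse": (72, 71),
    "T vs L, twelve elements": (60, 30),
}


def asymmetry(pair):
    """Difference in percent correct between the two target assignments."""
    first_as_target, roles_reversed = pair
    return first_as_target - roles_reversed


asymmetries = {name: asymmetry(pair) for name, pair in results.items()}
for name, value in asymmetries.items():
    print(f"{name}: {value} percentage points")
```

On this measure the dense line display shows a difference of 35 percentage points, the sparse display essentially none (1 point), and the T and L display 30 points.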
To investigate the effects of interelement relationships on the perceptual asymmetry, we compared discriminability for a target T in a circular field of distractor Ls for two conditions, each of which generates different interelement spacing around the target T. The conditions depended on the orientation of the target T. The T is not rotationally symmetric for a 90 degree rotation. On the other hand, the + is rotationally symmetric and the L is mirror symmetric under a rotation of 90 degrees. In the preceding experiment, the orientations of the T and L were randomized at each location. The elements could take eight possible orientations which sampled 360 degrees of rotation in steps of 45 degrees. For one condition in the next experiment, the orientation of the T was chosen at random from one of the four possible orientations for which the top cross bar is oriented more tangentially to the circumference of the circle, rather than normal to it (Figure 24). The other condition was just the obverse: the four possible orientations consisted of those for which the top cross bar is oriented more normally to the circumference of the circle, rather than tangential to it (Figure 24). Of course the four possible orientations for a particular target T which is more tangential or normal to the circumference will depend on the location of the T on the circumference of the circle. Thus for both of the conditions all eight possible orientations of the T will occur during the experiment as they did for the original experiment. The orientation of L was randomized in the same manner as in the original experiment. Results
Figure 24. Examples of the two categories of orientation of T relative to the circumference of the circle are shown at the top. For one category the cross bar of the T is more tangential to the circumference. In the other category it is more normal. At the bottom are examples of the two categories of orientation in a circle of distractor Ls.

revealed that discrimination is better if the cross bar is oriented more tangentially to the circumference of the circle, rather than more normal (Figure 25). For comparison, the results of the original experiment, for which the T took on all eight possible orientations independent of location, are included. The percentage correct discrimination for this completely random choice case was 60%. With the cross bar tangential in orientation, discrimination improved to 84%. With the bar normal, discriminability was reduced to 43%. The experiment confirms that discrimination of a T from a field of Ls does not depend only on the canonical form of the elements but is also affected by interelement relationships. However, from this paradigm it is not possible to determine if the change in performance is due to the difference in the
shape of the spaces between elements for the two orientations, or to the line segments of the T being more easily perceived in the tangential case, or to some more complex interaction.
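The tangential/normal split used in this experiment can be made concrete with a short sketch. It assumes that a T's orientation is given by the direction of its stem, that the cross bar is perpendicular to the stem, and that an element's location is given by its angular position on the circle; the function names and the example position of 30 degrees are hypothetical, not taken from the text. At positions that are multiples of 90 degrees the diagonal orientations fall exactly halfway between tangential and normal, and the text does not say how such cases were handled, so the sketch simply reports them as ties.

```python
# Illustrative sketch only: names and the tie rule are assumptions.
# Angles are in degrees, measured counterclockwise.

T_ORIENTATIONS = [0, 45, 90, 135, 180, 225, 270, 315]  # stem directions, 45-degree steps


def axial_difference(a_deg, b_deg):
    """Smallest angle (0 to 90) between two undirected axes."""
    d = (a_deg - b_deg) % 180
    return min(d, 180 - d)


def classify_cross_bar(stem_deg, position_deg):
    """Label a T as 'tangential', 'normal', or 'tie' for a given ring position."""
    bar_axis = stem_deg + 90          # cross bar is perpendicular to the stem
    tangent_axis = position_deg + 90  # tangent is perpendicular to the radius
    d = axial_difference(bar_axis, tangent_axis)
    if d < 45:
        return "tangential"
    if d > 45:
        return "normal"
    return "tie"


def orientations_by_category(position_deg):
    """Group the eight T orientations by category for one ring position."""
    groups = {"tangential": [], "normal": [], "tie": []}
    for stem in T_ORIENTATIONS:
        groups[classify_cross_bar(stem, position_deg)].append(stem)
    return groups


if __name__ == "__main__":
    print(orientations_by_category(30))
```

For a position away from the multiples of 90 degrees, the eight orientations split four and four, which matches the description that each condition draws from four possible orientations that depend on the location of the T.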
[Figure 25 plot, "Effect of Orientation": vertical axis from 0 to 100; conditions on the horizontal axis: random, tangential, normal.]
Figure 25. Results are shown for the effect of orientation of the T (see Figure 24) on the discrimination of a target T from distractors L. Performance is better if the cross bar of the T is oriented more tangential to the circumference of the circle than if it is more normal to it. Also shown is performance when orientation was chosen completely at random.

To carefully examine the contribution of the shape of the space between elements we need a pair of elements which are themselves different, but which can generate the same interelement spaces. Such a pair can be derived from the element pair T and +. For the T and +, as for the elements T and L, there is a performance asymmetry with many distractors and one or more targets. Specifically, for a circular arrangement of elements, the single target T was correctly
discriminated from eleven + distractors 83% of the time, while correct discrimination was only 56% when + served as the target. There is no asymmetry if there is only a single target and a single distractor.

Before analyzing interelement spaces in texture patterns composed of Ts and +s, it is instructive to examine a modified version of these elements for which the shape of their interelement spaces could be the same while the elements themselves were different. These stimuli, shown in Figure 26, were a + and a short T (equivalent to a + with one of the four arms missing). The effect of reversing the roles of target and distractor for these elements was measured with elements presented on the circumference of a circle 3 degrees in radius. There were 1 target and 11 distractors equally spaced around the ring. On half the trials both a target and distractors were presented. These trials were randomly interspersed with trials in which only the distractors were presented. In both cases the stimuli were masked with patterns for which the elements consisted of juxtaposing the target and distractor.
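The display geometry and trial mix just described can be sketched as follows. This is a minimal illustration, not the authors' stimulus code: the function names, the starting phase of the ring, and the use of a seeded random generator are all assumptions, and the sketch does not model element shapes, orientations, or masks.

```python
import math
import random


def ring_positions(n_elements=12, radius_deg=3.0, phase_deg=0.0):
    """(x, y) positions, in degrees of visual angle, equally spaced on a circle."""
    step = 360.0 / n_elements
    positions = []
    for i in range(n_elements):
        angle = math.radians(phase_deg + i * step)
        positions.append((radius_deg * math.cos(angle), radius_deg * math.sin(angle)))
    return positions


def make_schedule(n_trials, rng):
    """Shuffled list of booleans: True = target present, on half the trials."""
    half = n_trials // 2
    schedule = [True] * half + [False] * (n_trials - half)
    rng.shuffle(schedule)  # randomly intersperse the two trial types
    return schedule


if __name__ == "__main__":
    rng = random.Random(0)  # hypothetical seed, for repeatability only
    slots = ring_positions()
    target_slot = rng.randrange(len(slots))  # assumed: target location varies by trial
    print(target_slot, make_schedule(8, rng))
```

Twelve slots correspond to the 1 target plus 11 distractors on the 3 degree ring; what occupies the target's slot on distractor-only trials is not stated in the text and is left out of the sketch.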
Figure 26. The configuration is shown for the short T element, on the left, and the + element, on the right. For these elements, the shape of their interelement spaces can be the same even though the elements themselves are different.
If + is the target in a circle of short T distractors, it is virtually impossible to detect the target. In fact, performance on this task was at chance (50%). Observation of the interelement spaces in this condition reveals a logical explanation. An intact target + will present right angles to each of its neighboring elements. One side of the short T is also composed of right angles. With respect to any neighboring short T, the + will be indistinguishable from one side of a short T. Thus the shape of the interelement space between a + and a short T can always be duplicated between two neighboring short Ts. The similarity of interelement spaces renders the target indistinguishable from the distractors. Performance improved to 69% if the roles of target and distractor were switched. It is easier to distinguish a target short T from distractor +s. This improvement is expected when the shapes of the interelement spaces are examined. The interelement spacing between
Figure 27. Perceptual asymmetry is shown for texture pairs composed of + elements and short T elements. It is easier to detect the group of four short T elements in the field of + elements in (a), at the top, than the group of four + elements in the field of short T elements in (b), at the bottom.
Figure 28. Perceptual asymmetry is shown for texture pairs composed of T elements and + elements. As shown in (a), at the top, the perception of the T elements in a field of + elements produces stronger texture segregation than the + elements in the field of T elements shown in (b), at the bottom.
neighboring distractor +s consists of regions bordered by right angles. The target short T can, for certain orientations, present a flat line segment to the neighboring + element. This shape of interelement space cannot be generated between two +s. The inconsistency of the exact shape of the interelement space between target and distractor in this condition enables the detection of the short T for these trials. Segregation based on interelement spacing can account for the asymmetric results observed using the element pair + and short T (Figure 27).

A similar analysis can be used to account for the asymmetry between a + and a regular T. For example, in Figure 28, since the + is symmetric, a field of distractor +s consists of regular spaces between elements. The T, however, is not symmetric, so four target Ts form irregular interelement spaces and are easily discriminated from the +s (Figure 28a). In comparison, a field of distractor Ts contains both regular and irregular spaces between elements. The four target +s form regular spaces, but these are not unique since they can also occur in the distractor field of Ts. As seen in Figure 28b, the +s are therefore more difficult to detect. Similar analyses can be applied to the other texture pair combinations of the elements T, L, and +.
Extension of subjective contour and spatial fill-in

By considering the properties of subjective contour and fill-in we can provide a more parsimonious explanation of texture segregation and related phenomena. For example, a fundamental issue in texture perception is why texture elements which segregate in isolation become less and less discriminable as the number of elements in the aggregates increases. Consider the texture pair T and L. When they are presented together as a single pair, at most only one side of each element will form a space with the other element. All other sides are surrounded by empty space, emphasizing the unique aspects of the element. But for this same pair encompassed in a field of elements, each side of nearly every element interacts with up to four other elements to form a variety of interelement spaces. Thus, as the number of elements increases, the number of indistinguishable interelement spaces will also increase.

Interelement spacing can provide an explanation for other phenomena in visual perception, for example perceptual grouping, initially investigated by Pomerantz in 1974. In these experiments, the same microelements in different arrangements can group to form unique objects in certain cases but not in others. This is demonstrated in Figure 29, where the microelements are two curved lines. For each set (A and B) subjects are asked to distinguish the top two pairs from the lower two. Note that in Figure 29A all lines are parallel to each other. When the lines face each other (as in the lower left pair), they group into an entity which is easily discriminated from the other three arrangements of these same lines. On the other hand, when the lines are perpendicular to each other (as seen in Figure 29B), no pair results in perceptual grouping, and the pairs cannot be distinguished from each other without scrutiny. This result becomes intuitively sensible when one considers the spaces between the microelements, as well as the spaces between the stimuli. In Figure 29A, the interelement space between the two curved
lines in the lower left pair is much larger than for the other pairs, resulting in easy discriminability. Conversely, in Figure 29B the interelement spaces, and even the interstimulus spaces, are all quite similar, accounting for the increased difficulty of distinguishing between these pairs.
Figure 29. In (A) are four stimuli created by juxtaposing two curved line elements. The stimuli exhibit perceptual grouping. In (B), the same four stimuli as in (A) but with the right hand element rotated 90 degrees. These stimuli do not show perceptual grouping. (From Pomerantz, 1974).

Recently, researchers in the areas of artificial intelligence and computer vision have stressed the need to develop edge detectors which mimic humans' ability to perceive the edges of objects (Walters, 1987). The efficacy of these edge detectors has been tested using the segmentation of line-drawn cartoons. It is true that the human visual system is extremely good at edge detection, as well as the segmentation and recognition of line-drawn cartoons and caricatures. However, successful navigation about the typical human environment requires more than just the detection of a collection of edges and boundaries. Visual information processing is object based, which requires fill-in between edges and boundaries. For example, detecting the edges of a closed versus an open doorway is useless without the ability to detect the door itself. Our results strongly imply that human texture discrimination depends on the shape of the interelement space, suggesting that cartoon and caricature segmentation is dependent on the fill-in process.
Serial processing

As mentioned in the introduction, texture segregation can be divided into two categories depending on whether discrimination is effortless or requires scrutiny. Results reported by Bergen and Julesz (1983) and Treisman and Gelade (1980) have suggested that tasks requiring scrutiny involve the serial processing of individual elements. More recent findings by Kröse and Julesz (1989) imply that these tasks may represent a combination of serial and parallel processing.
[Figure 30 plot: vertical axis from 0 to 100; horizontal axis: Total Number of Elements (# of Targets = # of Distractors), from 0 to 12.]
Figure 30. Data from one observer showing the percentage of trials on which the observer (correctly) distinguished a group of target Ls from an equal number of distractor Ts as a function of the number of total elements. Results are shown for four different interstimulus intervals: 40, 60, 80, and 100 ms, each plotted with its own symbol. The solid line represents the prediction of a hypergeometric distribution for detecting a target in an equal number of distractors as the total number increases. For all interstimulus intervals tested, the results cannot be fit by a hypergeometric distribution.

Does textural segregation based on the shape of interelement spaces represent a serial process? In the studies of Bergen and Julesz (1983) and Treisman and Gelade (1980), detection of a target element was measured for increasing numbers of distractor elements. For both studies, results were consistent with a serial search of the elements. For Bergen and Julesz (1983), the results (percentage correct as a
function of number of distractor elements) were well fit by a hypergeometric distribution, which describes a serial process of sampling without replacement. Therefore, to assess the nature of processing in our task, we measured the detection of a target L against distractor Ts for increasing numbers of elements. To further test the hypergeometric model, we kept the number of targets and distractors equal as the number of elements increased. This had the added benefit of ensuring that the signal to noise ratio of target to distractor remained constant. Results are shown in Figure 30. The solid line represents the prediction of a hypergeometric distribution for detecting a target in an equal number of distractors as the total number increases. The other curves are the detection data for different interstimulus intervals (40, 60, 80, and 100 ms) between the stimulus and the mask which terminated the search. For all durations tested, the results cannot be fit by a hypergeometric distribution. Even when detection fell below 100% with only two elements, the remainder of the results were better than would be predicted by a hypergeometric distribution. In other words, subjects were able to discriminate better than would be expected if each element were being scanned by focal attention independently. This suggests that textural discrimination in this task is not simply a serial process. A better explanation of these results is based on the shape of interelement spaces. Since the elements are oriented at random, the shape of interelement spaces also changes at random from trial to trial. Thus, the percent correct may reflect the proportion of the trials on which element orientations yield unique and discriminable interelement spaces, rather than the probability of success in a serial search task for a given number of elements. This interpretation is supported by the earlier finding that a tangentially oriented T is more easily detected than a normally oriented T when both are presented in a field of Ls.
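The kind of prediction a serial, sampling-without-replacement account makes can be illustrated with a short calculation. The sketch below is an assumption-laden illustration rather than the model actually fitted in Figure 30: it supposes that the observer inspects a fixed number of elements before the mask arrives, that a trial is answered correctly whenever at least one inspected element is a target, and that the observer otherwise guesses. The parameter names `n_scanned` and `guess_rate` are hypothetical.

```python
from math import comb


def p_scan_hits_target(n_elements, n_targets, n_scanned):
    """P(at least one target among n_scanned elements drawn without replacement)."""
    if not 0 <= n_targets <= n_elements:
        raise ValueError("n_targets must lie between 0 and n_elements")
    n_scanned = min(n_scanned, n_elements)
    n_distractors = n_elements - n_targets
    # Hypergeometric probability that every scanned element is a distractor.
    p_miss = comb(n_distractors, n_scanned) / comb(n_elements, n_scanned)
    return 1.0 - p_miss


def percent_correct(n_elements, n_targets, n_scanned, guess_rate=0.5):
    """Percent correct if the observer guesses whenever the scan finds no target."""
    p_hit = p_scan_hits_target(n_elements, n_targets, n_scanned)
    return 100.0 * (p_hit + (1.0 - p_hit) * guess_rate)


if __name__ == "__main__":
    # Equal numbers of targets and distractors, as in the experiment above.
    for total in (2, 4, 6, 8, 10, 12):
        print(total, round(percent_correct(total, total // 2, n_scanned=1), 1))
```

With a single target the hit probability reduces to the number of elements scanned divided by the total number of elements, which is the familiar signature of serial search; how the solid line in Figure 30 was parameterized is not stated here, so the numbers this sketch prints should not be read as that curve.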
Summary

It is well documented that a perceived visual image can differ from the actual scene. For example, incomplete boundaries in a scene can be completed by illusory or subjective contours. Illusory feature "fill-in" can also occur, for example in our completion of the visual field over the blind spot where the optic nerve exits the retina. Our results suggest that these fundamental properties of visual processing play a crucial role in texture discrimination and can account for the perceptual asymmetries described above. These results have forced us to re-examine the process of texture segregation in general. Our results suggest that visual processing of textures is not limited to simply the collective responses of isolated elements. Instead, it appears that an active visual process encodes the stimulus without necessarily keeping the individual elements intact. The completion of local boundaries into global structures (subjective contours) as well as the fill-in between boundaries to segment images (spatial fill-in) are fundamental to the process of texture perception. Both processes require the spreading of activation to unstimulated areas, which can be demonstrated to result from cooperative processing (Grossberg and Mingolla, 1985). Perceptual asymmetries in texture segregation are therefore not merely troubling anomalies but rather reveal information about parallel cooperative processes which are critical to visual perception.
Conclusion

This chapter was devoted to aspects of parallel processing in depth, motion and texture perception which are not dependent on the color / broadband dichotomy. In each case, information is processed by means of vastly parallel regimes of units which are themselves interconnected. This interconnectivity (cooperativity) gives rise to states and properties in the system which could not take place if the units were isolated from each other. These states and properties are directly relevant to the complicated process of real-world perception. For example, global structures are generated over the network even though input information is localized. This enables the integration of local information into global entities. Cooperative systems also exhibit the useful dynamic property called hysteresis. Evidence for this form of memory has been found in both depth and motion perception. Perceptually, hysteresis enables the system to lock up and maintain a perceptual state in the presence of noise and small perturbations. A crucial advantage of such interaction for perception is that it can be tailored to implement constraints on the processing of information. In this chapter we have presented numerous examples for which the configuration of interactions enables the visual system to eliminate ambiguities which are a result of the physical limitations of the initial processing stage of the system. This characteristic is well illustrated by the ability of a cooperative system to extract the correct three dimensional structure from two dimensional motion as well as uniquely resolve the correspondence problem. Finally, properties such as subjective closure and fill-in also arise out of cooperative parallel processing. This allows the division of information into meaningful substructures which permit the segmentation of a visual scene into localized objects.
Thus the value of cooperative parallel systems is to allow the integration of fragmented information into real-world perception.
References

Adelson, E.H., and Movshon, J.A. (1982). Phenomenal coherence of moving gratings. Nature, 300, 523-525.
Baker, C., and Braddick, O. (1982). Does segregation of differently moving areas depend on relative or absolute displacement? Vision Research, 22, 851-856.
Bergen, J.R., and Adelson, E.H. (1988). Early vision and texture perception. Nature, 333, 363-364.
Bergen, J.R., and Julesz, B. (1983). Parallel versus serial processing in rapid pattern discrimination. Nature, 303, 696-698.
Blakemore, C., and Campbell, F.W. (1969). On the existence of neurones in the human visual system selectively sensitive to the orientation and size of retinal images. Journal of Physiology, London, 203, 237-260.
Caelli, T., and Julesz, B. (1978). On perceptual analyzers underlying visual texture discrimination: Part I. Biological Cybernetics, 28, 167-175.
Caelli, T., Julesz, B., and Gilbert, E. (1978). On perceptual analyzers underlying visual texture discrimination: Part II. Biological Cybernetics, 29, 201-214.
Campbell, F.W., and Robson, J.G. (1968). Applications of Fourier analysis to the visibility of gratings. Journal of Physiology, London, 197, 551-566.
Chang, J.J., and Julesz, B. (1984). Cooperative phenomena in apparent movement perception. Vision Research, 24, 1781-1788.
Cohen, M.A., and Grossberg, S. (1985). Neural dynamics of brightness perception: features, boundaries, diffusion and resonance. Perception and Psychophysics, 36, 428-456.
Dev, P. (1975). Perception of depth surfaces in random-dot stereograms: A neural model. International Journal of Man-Machine Studies, 7, 511-528.
DeYoe, E.A., and Van Essen, D.C. (1988). Concurrent processing streams in monkey visual cortex. Trends in Neuroscience, 11, 219-226.
Fender, D., and Julesz, B. (1967). Extension of Panum's fusional area in binocularly stabilized vision. Journal of the Optical Society of America, 57, 819-30.
Fogel, I., and Sagi, D. (1989). Gabor filters as texture discriminators. Biological Cybernetics, 61, 103-113.
Gibson, J.J., and Gibson, E.J. (1953). Continuous perceptive transformations and the perception of rigid motion. Journal of Experimental Psychology, 54, 129-138.
Graham, N., and Nachmias, J. (1971). Detection of grating patterns containing two spatial frequencies: a comparison of single-channel and multiple-channel models. Vision Research, 11, 251-259.
Green, B.F. (1961). Figure coherence in the kinetic depth effect. Journal of Experimental Psychology, 62, 272-282.
Grossberg, S., and Mingolla, E. (1985). Neural dynamics of perceptual grouping: Textures, boundaries, and emergent segmentations. Perception and Psychophysics, 38, 141-171.
Grossberg, S., and Todorovic, D. (1988). Neural dynamics of 1D and 2D brightness phenomena: A unified model of classical and recent phenomena. Perception and Psychophysics, 43, 241-277.
Grzywacz, N.M., and Hildreth, E.C. (1987). Incremental rigidity scheme for recovering structure from motion: position-based versus velocity-based formulations. Journal of the Optical Society of America A, 4, 503-518.
Gurnsey, R., and Browse, R. (1987). Micropattern properties and presentation conditions influencing visual texture discrimination. Perception and Psychophysics, 41, 239-252.
Helmholtz, H. von (1896). Handbuch der Physiologischen Optik. Dritter Abschnitt. Zweite Auflage (Voss, Hamburg). [English translation: Helmholtz's Treatise on Physiological Optics by J.P.C. Southall, 1924 (The Optical Society of America), republished by Dover Publications, 1962.]
Hildreth, E.C., and Koch, C. (1987). The analysis of visual motion: from computational theory to neuronal mechanisms. Annual Review of Neuroscience, 10, 477-533.
Jansson, G., and Johansson, G. (1973). Visual perception of bending motion. Perception, 2, 321-326.
Johansson, G. (1975). Visual motion perception. Scientific American, 232, 76-88.
Julesz, B. (1962). Visual pattern discrimination. IRE Trans. Info. Theory IT-8, 84-92.
Julesz, B. (1981). Textons, the elements of texture perception and their interactions. Nature, 290, 91-97.
Julesz, B. (1986). Texton gradients: The texton theory revisited. Biological Cybernetics, 64, 245-251.
Julesz, B., and Bergen, J.R. (1983). Textons, the fundamental elements in preattentive vision and perception of textures. Bell Systems Technical Journal, 62(6), 1619-1645.
Julesz, B., Gilbert, E.N., and Victor, J.D. (1978). Visual discrimination of textures with identical third-order statistics. Biological Cybernetics, 31(3), 137-40.
Julesz, B., and Kröse, B. (1988). Visual texture perception: Features and spatial filters. Nature, 333, 302-303.
Kanizsa, G. (1976). Subjective contours. Scientific American, 234, 48-52.
Kröse, B.J.A., and Julesz, B. (1989). The control and speed of shifts of attention. Vision Research, 29(11), 1607-1619.
Lennie, P. (1980). Parallel visual pathways: a review. Vision Research, 20(7), 561-594.
Livingstone, M.S. (1988). Art, illusion and the visual system. Scientific American, 258, 78-85.
Logothetis, N.K., Schiller, P.H., Charles, E.R., and Hurlbert, A.C. (1990). Perceptual deficits and the activity of the color-opponent and broad-band pathways at isoluminance. Science, 247, 214-217.
Marr, D., and Poggio, T. (1976). Cooperative computation of stereo disparity. Science, 194, 283-287.
Marr, D., and Poggio, T. (1979). A theory of human stereopsis. Proceedings of the Royal Society of London, 204, 301-324.
Nelson, J.I. (1975). Globality and stereoscopic fusion in binocular vision. Journal of Theoretical Biology, 49, 1-88.
Pomerantz, J.R. (1974). Perceptual organization in information processing. In M. Kubovy and J.R. Pomerantz (Eds.), Perceptual organization. Hillsdale, NJ: Erlbaum, pp. 141-180.
Posner, M.I. (1980). Orienting of attention. Quarterly Journal of Experimental Psychology, 32, 3-25.
Rubenstein, B., and Sagi, D. (1990). Spatial variability as a limiting factor in texture discrimination tasks: implications for performance asymmetries. Journal of the Optical Society of America, 9, 1632-1643.
Rumelhart, D.E., and McClelland, J.L. (1986). Parallel Distributed Processing. Cambridge: MIT Press.
Sachs, M.B., Nachmias, J., and Robson, J.G. (1971). Spatial frequency channels in human vision. Journal of the Optical Society of America, 61, 1176-1186.
Schumann, F. (1904). Einige Beobachtungen über die Zusammenfassung von Gesichtseindrücken zu Einheiten. Psychological Studies, 1, 1-32.
Treisman, A., and Gelade, G. (1980). A feature integration theory of attention. Cognitive Psychology, 12, 97-136.
Treisman, A., and Gormican, S. (1988). Feature analysis in early vision: Evidence from search asymmetries. Psychological Reviews, 95, 15-48.
Treisman, A., and Souther, J. (1985). Search asymmetry: A diagnostic for preattentive processing of separable features. Journal of Experimental Psychology: General, 114, 285-310.
Ullman, S. (1979). The Interpretation of Visual Motion. Cambridge: MIT Press.
Van Essen, D.C., and Maunsell, J.H.R. (1983). Hierarchical organization and functional streams in the visual cortex. Trends in Neuroscience, 63, 370-375.
Voorhees, H., and Poggio, T. (1988). Computing texture boundaries from images. Nature, 333, 364-367.
Wallach, H., and O'Connell, D.N. (1953). The kinetic depth effect. Journal of Experimental Psychology, 45, 205-217.
Walters, D. (1987). Selection of image primitives for general-purpose visual processing. Computer Vision, Graphics, and Image Processing, 37, 261-298.
Weisstein, N., and Harris, C.S. (1974). Visual detection of line segments: An object-superiority effect. Science, 186, 752-755.
Williams, D.W., and Julesz, B. (1989). The significance of closure for texture segregation. Supplement to Investigative Ophthalmology and Vision Science, 30, 160.
Williams, D.W., and Julesz, B. (1990). Perceptual asymmetries in texture segregation can be explained by bottom-up processing alone. Optical Society of America Annual Meeting Technical Digest, 15, 51.
Williams, D.W., and Julesz, B. (1991). Fill-in between texture elements is critical for texture segregation. Supplement to Investigative Ophthalmology and Vision Science, 32, 1038.
Williams, D.W., and Phillips, G.C. (1986). Structure from motion in a stochastic display. Journal of the Optical Society of America, 13, 12.
Williams, D.W., and Phillips, G.C. (1987a). Cooperative phenomena in the perception of motion direction. Journal of the Optical Society of America, 4, 878-885.
Williams, D.W., and Phillips, G.C. (1987b). Rigid three-dimensional percept from stochastic one-dimensional motion. Journal of the Optical Society of America, 13, 35.
Williams, D.W., and Sekuler, R. (1984). Coherent global motion percepts from stochastic local motions. Vision Research, 24, 55-62.
Williams, D.W., Phillips, G.C., and Sekuler, R.W. (1986). Hysteresis in the perception of motion direction: Evidence for neural cooperativity. Nature, 324, 253-255.
Williams, D.W., Tweten, S.D., and Sekuler, R.W. (1984). Using motion metamers to investigate the mechanisms of motion. Supplement to Investigative Ophthalmology and Vision Science, 25, 14.
Wilson, H.R. (1973). Cooperative phenomena in a homogeneous cortical tissue model. In H. Haken (Ed.), Synergetics, pp. 207-219.
Wilson, H.R. (1977). Hysteresis in binocular grating perception: contrast effects. Vision Research, 17, 843-851.
Wilson, H.R., and Bergen, J.R. (1979). A four mechanism model for threshold spatial vision. Vision Research, 19, 19-32.
Wilson, H.R., and Cowan, J.D. (1973). A mathematical theory of the functional dynamics of cortical and thalamic nervous tissue. Kybernetik, 13, 55-80.
Zeki, S.M. (1978). Functional specialization in the visual cortex of the rhesus monkey. Nature, 274, 423-428.
Applications of Parallel Processing in Vision, J. Brannan (Editor). © 1992 Elsevier Science Publishers B.V. All rights reserved.
Parallel and Serial Connections Between Human Color Mechanisms QASIM ZAIDI
Introduction

The concept of a discrete number of independent mechanisms functioning in parallel was first used in human vision by Young (1802) as an explanation for the trivariance of color vision. Young's idea has proven to be correct for the first stage of color processing by the visual system. The current picture of the color system consists of a series of stages, each with a number of mechanisms functioning in parallel. Successive stages are composed of mechanisms that combine the outputs of the previous stage. The first few stages of color processing are simple and are understood better than those of any other sensory system. Analogies from color mechanisms have been used in models for spatial frequency discrimination, motion perception and adaptation. However, a number of color phenomena are extremely complex (Woodworth, 1938; Albers, 1963) and cannot be satisfactorily explained in terms of known mechanisms. Based on accumulated knowledge and technological advances in stimulus generation, it has now become possible to study the functional properties of color mechanisms at higher levels. If mechanistic explanations of complex color phenomena are successful, they will provide general insights into the processing of complex information by other parts of the nervous system.

This essay presents a historical and formal examination of color vision mechanisms with an emphasis on the theoretical and empirical reasoning used to identify independent parallel mechanisms and their interactions at successive stages of a serial system. The intention is to give an extended discussion of basic concepts and details of a small number of experiments instead of providing an exhaustive survey. More complete references can be found in a number of books including LeGrand (1957), Graham (1965), Brindley (1970), Rodieck (1973), Evans (1974), Boynton (1979), Pokorny et al. (1979), Wyszecki and Stiles (1982), and Nassau (1983). Some important historical papers have been collected by MacAdam (1970).
Trichromatic representation of lights

Two lights may appear of identical color to a human observer, yet be distinguishable when passed through a simple prism. For observers with normal color vision, given any four spectrally distinct lights, it is always possible to place two in one half of a visual field and two in the other, or three in one half and one in the other, and by adjusting the radiances of three of the four lights to make the two halves of the field appear identical. This property of human vision, where the adjustment of three independent controls makes an exact color match possible, whereas two are generally not enough, is called trichromacy.

Newton (1675) was the first to note that when sunlight is dispersed by means of a prism, the color of different rays corresponds to their physical properties. In modern terms, visible light is a small section of the electromagnetic spectrum and lights of different wavelengths correspond to different perceived colors, from violet at the short wavelength end of the visible spectrum (400 nm) to red at the long wavelength end (700 nm) (Figure 1). Newton (1704) showed that the different rays of the spectrum can be combined to form a single beam of uniform color and that the result of combining arbitrary proportions of different colors can be represented in a diagram (Figure 2). This diagram shows that the domain of colors is more restricted than the domain of lights, and would have demonstrated trichromacy if it had been exactly correct.
Figure 1. Visible wavelengths of the electromagnetic spectrum with associated colors.

Based on Newton's work, Maxwell (1860) deduced that the result of any mixture of colors, however complicated, can be defined by its relation to three suitably chosen colors: any color which has the same relation to the standard colors will be identical in appearance, though its physical composition may be different. In modern terminology, the standard colors are called primaries, the numbers that relate the unit amount of a color to unit amounts of each of the primary colors are called tri-stimulus values, and the set of tri-stimulus values for monochromatic lights is called the set of color-matching functions. Maxwell (1860) measured color matching functions by using a colorimeter that was
essentially a reverse spectroscope enabling sunlight to be matched to a mixture of three spectral colors. The primaries were 456.9, 528.1, and 630.2 nm and tri-stimulus values were expressed in terms of the slit widths of the instrument. The color-matching procedure is diagrammed in Figure 3a. First, a mixture of the three primaries $\Gamma_i$, $\Gamma_j$, and $\Gamma_k$ was matched to W, a unit amount of a standard white:
ad( + aj4 + a k I k = w
(1)
In Equation 1, "=" refers to a color match between the two sides of the equation, "+" is summation of energy over each wavelength, and the coefficients are the energy of each primary in the match. To measure the tri-stimulus values of a spectral light of wavelength $\lambda$, a match was made to the standard white after replacing the appropriate primary with the test wavelength:
$$b_\lambda \lambda + b_j I_j + b_k I_k = W \tag{2}$$
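By equating the left sides of Equations 1 and 2 and solving for $\lambda$, the tri-stimulus values for $\lambda$ were derived algebraically as the coefficients of the primaries in Equation 3:
$$\lambda = \frac{a_i}{b_\lambda} I_i + \frac{a_j - b_j}{b_\lambda} I_j + \frac{a_k - b_k}{b_\lambda} I_k \tag{3}$$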
Maxwell's work provided the first empirical evidence for trichromacy and has been replicated by a number of later studies with more precise control over the stimuli (Konig and Dieterici, 1886; Abney, 1906; Wright, 1929; Guild, 1931; Stiles and Burch, 1955, and others). Stiles and Burch (1955) used the "maximum saturation" procedure devised by Guild (1931) to measure tri-stimulus values in a particularly direct way. As shown in Figure 3b, a mixture of the test light and the appropriate primary was matched to a mixture of the other two primaries:
$$c_\lambda \lambda + c_i I_i = c_j I_j + c_k I_k \tag{4}$$
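where the coefficients were the measured radiances of the monochromatic lights. The tri-stimulus values were derived directly by solving Equation 4 for $\lambda$:
$$\lambda = -\frac{c_i}{c_\lambda} I_i + \frac{c_j}{c_\lambda} I_j + \frac{c_k}{c_\lambda} I_k \tag{5}$$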
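The algebra behind Equations 3 and 5 can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and the numerical readings are hypothetical and are not Maxwell's or Stiles and Burch's data. Each function takes the coefficients measured in the match and returns the tri-stimulus values of the test light with respect to three primaries, where the first primary is the one replaced by the test light (Maxwell's method) or added to it (maximum saturation method).

```python
def maxwell_tristimulus(a, b_test, b_j, b_k):
    """Tri-stimulus values from two matches to the same white (Equations 1-2).

    a        -- (a_i, a_j, a_k): primary amounts in the white match
    b_test   -- amount of the test light that replaced primary i
    b_j, b_k -- amounts of the two remaining primaries in the second match
    """
    a_i, a_j, a_k = a
    # Equate the left sides of the two matches and solve for the test light.
    return (a_i / b_test, (a_j - b_j) / b_test, (a_k - b_k) / b_test)


def max_saturation_tristimulus(c_test, c_i, c_j, c_k):
    """Tri-stimulus values from one match: test + primary i = primaries j + k."""
    # Moving primary i to the other side of the match makes its value negative.
    return (-c_i / c_test, c_j / c_test, c_k / c_test)


# Hypothetical readings, chosen so that the arithmetic is exact.
maxwell = maxwell_tristimulus((2.0, 3.0, 1.0), b_test=4.0, b_j=1.0, b_k=1.5)
guild = max_saturation_tristimulus(c_test=2.0, c_i=0.5, c_j=1.0, c_k=3.0)
print(maxwell)  # (0.5, 0.5, -0.125)
print(guild)    # (-0.25, 0.5, 1.5)
```

A negative tri-stimulus value simply records that the corresponding primary sits on the same side of the match as the test light.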
A standardized set of color matching functions (CIE, 1931) for a 2° foveal field is shown in Figure 4. A two-dimensional representation similar to Newton's, the CIE (1931) chromaticity diagram, is shown in Figure 5. The derivations that lead to Equations 3 and 5 assume that the physical operations in Equations 1, 2, and 4 can be treated as mathematical operations: a color match like equality; $W$, $\lambda$, and the $I$'s like unit vectors; the wavelength-by-wavelength sum like vector addition; and the coefficients like scalar multipliers of the unit vectors
that represent the primaries and the test wavelength. This isomorphism between experimental and mathematical operations was implicit in Newton's work. For a mixture of known amounts of primary colors, Newton determined the color of the compound by a center of gravity principle that was "accurate enough for practice, though not mathematically accurate." Grassman (1853) showed that the center of gravity rule, which is mathematically equivalent to vector addition, is justified if the "impression" of colors obeys three propositions: three-dimensionality, continuity and additivity. In modern mathematical terminology, Grassman's propositions are satisfied if a color match is an equivalence relation and if the space of all color matches is a positive cone in a three-dimensional linear vector space (Schrodinger, 1920; Resnikoff, 1974; Krantz, 1975).

An equivalence relation in a set X is a relation $x \sim y$ between certain pairs of elements of X satisfying the following conditions (Lang, 1966):
(i) Identity: $x \sim x$ for all x in X.
(ii) Commutativity: If $x \sim y$ then $y \sim x$.
(iii) Transitivity: If $x \sim y$ and $y \sim z$, then $x \sim z$.

A linear vector space V is a set of objects that satisfy the following propositions (Lang, 1966):
(i) Superposition: If u and w are elements of V, their sum u + w is an element of V.
(ii) Scalar multiplication: If u is an element of V and c is a number, then cu is an element of V.
(iii) Null element: There is an element of V, denoted by 0, such that 0 + u = u + 0 = u.

Further, if $u_1, \ldots, u_n$ are elements of V, and $a_1, \ldots, a_n$ are real numbers, then a linear combination of $u_1, \ldots, u_n$ is written as:
$$a_1 u_1 + \ldots + a_n u_n \tag{6}$$
The elements $u_1, \ldots, u_n$ are called linearly independent if
$$a_1 u_1 + \ldots + a_n u_n = 0 \tag{7}$$
only happens for all $a_1, \ldots, a_n$ equal to zero, i.e. no $u_i$ can be written as a linear combination of the others. If every element of V can be written as a linear combination of a set of linearly independent elements $u_1, \ldots, u_n$, then $u_1, \ldots, u_n$ are called a basis of the vector space. Every basis of a vector space has the same number of elements, and this number is called the dimension of the vector space. A vector u written as a linear combination of one basis $u_1, \ldots, u_n$ can easily be rewritten as a linear combination of another basis $w_1, \ldots, w_n$. Each $w_i$ can be written as a linear combination of $u_1, \ldots, u_n$; therefore, by solving a system of n simultaneous linear equations, each $u_i$ can be written as a linear combination of $w_1, \ldots, w_n$. Simple substitution then gives the required expression for u.

A color match requires only the judgement of identity without the extraction of any subjective quality, and should satisfy the requirements of an equivalence relation. The identity and commutativity conditions should be satisfied by the precision of the equipment and procedure used in the experiment. The transitivity condition is limited
by the finiteness of color discrimination. A number of experiments have shown that unless the luminance is high enough to significantly alter the optical density of the visual photopigments, or low enough to allow rods to intrude, foveal color matches satisfy the requirements for a three-dimensional linear vector space:
a) Any color can be expressed as a linear combination of three colors (Maxwell, 1860; and others).
b) Brightening or dimming of all constituents by a constant factor (scalar multiplication) does not disturb a color match (von Kries, 1896; Trezona, 1954). Stiles (1955), in a single experiment on a 2° field (581.4 nm plus 445 nm matched to 526 nm plus 649 nm), found a small breakdown, but one that could well be within experimental error.
c) Trezona (1953, 1954) did a number of experiments to test the invariance to superposition of color matches, i.e. if A matches B and C matches D, will A plus C match B plus D? The deviations measured were generally smaller than a just discriminable difference.
Von Kries (1878) investigated matches of sunlight against mixtures of pairs of complementary spectral lights before and after adaptation to various colored lights, and failed to detect any alteration in the amounts or wavelengths of the complementaries required. Additionally, when color matching functions measured in two different primary systems are transformed to a common primary system, the two sets of color matching functions are similar. For example, Smith, Pokorny and Zaidi (1983) showed that the average of the color matching functions measured by Wright and Guild was similar to the average of the color matching functions measured by Stiles and Burch, corrected for differences in pre-receptoral filters. These sets of evidence justify the representation of any color as a vector in a three-dimensional linear space. The mixture of two colors can be represented by the vector sum of the two colors.
Figure 2. Newton's (1704) color mixture diagram. The circumference represents colors in the Sun's spectrum. The mixture of any two colors lies on the straight line joining their locations in the diagram.
In a color matching experiment, the only formal condition that limits the choice of primaries is that it should not be possible to match any one of them by any combination of the other two. Color matches can be represented in a space with any three primaries as orthogonal axes. Moreover, the scale in which a primary is measured is independent of the scales for the other primaries. Consequently, angles and distances are arbitrary in a color matching space. However, ratios of segments along any line and the parallelism of lines are well defined. Transformation of a color matching space to another trio of primaries must preserve these last two properties, i.e. belong to the class of affine transformations (Schrodinger, 1920).
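A change of primaries can be sketched as the solution of three simultaneous linear equations. The sketch below is illustrative only: the helper names and the numerical primaries are hypothetical, and exact rational arithmetic is used so that the checks are free of rounding error. It re-expresses tri-stimulus values in a new trio of primaries, refuses a trio in which one primary can be matched by the other two, and confirms that the midpoint of two colors is preserved, as expected of a transformation that keeps ratios of segments along a line.

```python
from fractions import Fraction


def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve3(m, v):
    """Solve m @ x = v by Cramer's rule; fails if the columns are dependent."""
    d = det3(m)
    if d == 0:
        raise ValueError("one primary can be matched by the other two")
    solution = []
    for col in range(3):
        replaced = [[v[r] if c == col else m[r][c] for c in range(3)]
                    for r in range(3)]
        solution.append(det3(replaced) / d)
    return solution


def to_new_primaries(color, new_primaries):
    """Re-express old tri-stimulus values in terms of three new primaries.

    new_primaries -- three triples, each a new primary measured in the old ones
    """
    # Column j of the matrix holds the old tri-stimulus values of new primary j.
    m = [[Fraction(new_primaries[j][i]) for j in range(3)] for i in range(3)]
    return solve3(m, [Fraction(x) for x in color])


NEW = [(2, 1, 0), (0, 1, 1), (1, 0, 1)]   # hypothetical new primaries
c1 = to_new_primaries((3, 2, 1), NEW)
c2 = to_new_primaries((1, 4, 3), NEW)
mid = to_new_primaries((2, 3, 2), NEW)    # midpoint of the two colors above
print(c1)
print(mid == [(p + q) / 2 for p, q in zip(c1, c2)])  # True
```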
Figure 3. (a) Maxwell's method for measuring color matching functions. (b) Maximum saturation method for measuring color matching functions.
Mechanisms underlying trichromacy

A three-dimensional linear representation is compatible with many different mechanistic substrates. It is simplest to assume that the visual system is at some stage restricted to three independent wavelength-sensitive linear channels, and that the information carried by each channel at a given instant and for a given point on the retina is expressible as the value of one continuous variable. The other stages of
the visual system cannot contain fewer than three independent channels, otherwise color discrimination would be mono- or dichromatic. They may, however, contain more than three independent channels peripheral to the critical stage, and, as a result of interactions, more than three non-independent channels central to it. The first clear mechanistic explanation of trichromacy was given by Young (1802), who suggested that there was a continuous series of kinds of light, but only three kinds of sensitive particles in the retina, each preferentially but not exclusively sensitive to light from one part of the spectrum. The essential property of Young's mechanisms was explicated by Maxwell (1857): "Each nerve acts, not as some have thought, by conveying to the mind the knowledge of the length of an undulation of light, or of its periodic time, but simply by being more or less affected by the rays which fall on it. The sensation of each elementary nerve is capable only of increase or diminution, and no other change." This statement was given the physical interpretation that a photopigment can signal the number but not the wavelength of the quanta caught, and called the Principle of Univariance by Naka and Rushton (1960). The postulate generally used to link the response of mechanisms to empirical facts about color matches is Von Kries' (1878) implicit assumption that a match is established between two spectrally distinct lights if, and only if, the matched lights are alike in their effects on all three mechanisms.
Brindley (1957, 1960) showed that the assumption is justified if the underlying mechanisms possess the following properties: a) Unidimensionality: For a receptor, given two distinct lights A and B, there is always a number n such that nB provokes the same response as A; b) Substitutability: If a receptor gives the same response to a light A as to a light B, then for any light C, (A+C) has the same effect as (B+C), and reciprocally, if (A+C) and (B+C) have equal effects, then so do A and B. A mathematically more elegant way to build a mechanistic scheme is to represent the mechanisms as three linearly independent linear functionals (also see Resnikoff, 1974; Krantz, 1975). Let S be the infinite-dimensional linear vector space generated by the infinite basis $I_{400}, \ldots, I_{700}$. For every real number $\lambda$ between 400 and 700, each $I_\lambda$ is equal to one at wavelength $\lambda$ and zero for all other wavelengths from 400 nm to 700 nm, i.e. S is the set of all visible spectral energy distributions. Then each mechanism can be regarded as a bounded linear functional F that assigns a real number to every vector (light) in S. If $I_x$ and $I_y$ are two elements of S, and a and b are real numbers, then a linear functional is defined by the property that the scalar assigned to a linear combination of vectors is equal to the linear combination of the scalars assigned to the individual vectors (Friedman, 1956):
$$F(a I_x + b I_y) = a F(I_x) + b F(I_y) \tag{8}$$
A functional is bounded if there exists a constant m such that for all $I_x$ in S:
$$|F(I_x)| \le m \, \|I_x\| \tag{9}$$
It can be shown that a linear functional is continuous if and only if it is bounded. Moreover, if $F(I_x)$ is a continuous linear functional on S, there
exists a vector F such that
$$F(I_x) = \langle F, I_x \rangle \tag{10}$$
where $\langle F, I_x \rangle$ is the scalar product of the vectors F and $I_x$. Therefore, the functional can be characterized by one vector. The characteristic vector expressed in the basis $I_{400}, \ldots, I_{700}$ is the spectral sensitivity of the mechanism, with each $\langle F, I_\lambda \rangle$ equal to the response to a unit amount of monochromatic light of wavelength $\lambda$. In the continuous case, the scalar product is equal to $\int F(\lambda) I_x(\lambda) \, d\lambda$. The response of each mechanism is a scalar product, so no information about the constituents of the incident light is present in the response, consistent with the Principle of Univariance. A continuous linear functional satisfies both of Brindley's postulates: if the receptor has a non-zero response to both A and B, unidimensionality is satisfied if n is taken equal to $\langle F, A \rangle / \langle F, B \rangle$; substitutability is a direct consequence of Equation 8. Consequently, these mechanisms will also satisfy Von Kries' assumption. Three linearly independent continuous linear functionals $F_1$, $F_2$, and $F_3$ generate a three-dimensional linear vector space. A set of linear functionals is linearly independent if the $F_i$ are linearly independent vectors. Addition and scalar multiplication of linear functionals are defined as follows:
$$(F_i + F_j)(I) = F_i(I) + F_j(I) \tag{11}$$
$$(cF_i)(I) = cF_i(I) \tag{12}$$
The null functional is the functional $F(I) = 0$ for all I in S, i.e. $F = (0, \ldots, 0)$. Every light I in S can be represented as a three-dimensional point $(F_1(I), F_2(I), F_3(I))$ in this vector space. It is straightforward to prove that three functional mechanisms are consistent with the tri-variance and linearity of color matches. A corollary is the proof of Konig and Dieterici's (1886) assumption that, in trichromatic vision, the spectral sensitivities of the three channels determining trichromacy must be linear functions of the tri-stimulus values, with coefficients independent of test wavelength. The proof begins by assuming that, given a system consisting of three independent linear functionals $F_1$, $F_2$, and $F_3$, the number of primaries required for a color match is n. A color match for test wavelength $\lambda$ and primaries $I_1, \ldots, I_n$ is defined as before:
$$\lambda = a_1 I_1 + \ldots + a_n I_n \tag{13}$$
By Von Kries' assumption, the response of each color mechanism should be equal for the two sides of the match. Therefore, for each mechanism $F_i$:
$$F_i(\lambda) = F_i(a_1 I_1 + \ldots + a_n I_n) \tag{14}$$
Because the three mechanisms are linear, responses to a mixture can be separated into responses to each primary:
$$F_i(\lambda) = a_1 F_i(I_1) + \ldots + a_n F_i(I_n) \tag{15}$$
Therefore, if $I_w$ matches $I_x$ and $I_y$ matches $I_z$, then $(I_w + I_y)$ will match $(I_x + I_z)$.
Figure 4. Tristimulus values for spectral lights. The values x, y, and z are the amounts of the three CIE (1931) primaries required to color match a unit amount of energy having the indicated wavelength.

To see that trichromacy can be derived from the proposed scheme, consider solving the three simultaneous linear equations (Equation 15) for the unknowns $a_j$ as a function of the different numbers of primaries in the match. If for a wavelength $\lambda$ only one primary $I_1$ is needed for a color match, then either two of the mechanisms have zero response for $\lambda$ and $I_1$, or $F_i(\lambda)/F_i(I_1)$ is equal for all three mechanisms.
If for a test wavelength $\lambda$ only two primaries are needed, then either one of the mechanisms has zero response for $\lambda$, $I_1$, and $I_2$, or the response of one mechanism is the same linear combination of the responses of the other two for $\lambda$, $I_1$, and $I_2$. If three primaries are used in the match, the set of equations will have a unique solution expressing the $a_j$ as linear combinations of $F_i(\lambda)$. If four or more primaries are used, there will be an infinite number of solutions, with all the extra primaries expressed as linear combinations of three primaries. Therefore, this set of mechanisms forms a sufficient substrate for trichromatic matches. Furthermore, given a three-primary match, for any wavelength $\lambda$, the spectral sensitivities of each of the three mechanisms $F_i(\lambda)$ are expressed in Equation 15 as linear combinations of the tri-stimulus values $a_j$, with coefficients $F_i(I_j)$ independent of wavelength $\lambda$.
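The three-primary case of Equation 15 can be sketched numerically. The sketch below is illustrative only: the spectrum is sampled at five wavelengths, the three sensitivity vectors are hypothetical numbers rather than measured spectra, and the helper names are invented for the example. Each mechanism responds with a scalar product, the match coefficients are obtained by solving the three simultaneous equations, and the final check confirms Von Kries' assumption that the mixture and the test light have equal effects on all three mechanisms.

```python
from fractions import Fraction

WAVELENGTHS = [450, 500, 550, 600, 650]   # nm, coarse sampling
# Hypothetical spectral sensitivities (characteristic vectors) F1, F2, F3.
MECHANISMS = [
    [1, 4, 9, 8, 3],
    [2, 7, 9, 4, 1],
    [9, 3, 1, 0, 0],
]


def response(sensitivity, light):
    """Scalar product <F, I>: a single number per mechanism (univariance)."""
    return sum(f * e for f, e in zip(sensitivity, light))


def monochromatic(wavelength):
    """Unit-energy light at one of the sampled wavelengths."""
    return [Fraction(int(w == wavelength)) for w in WAVELENGTHS]


def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve3(m, v):
    """Solve m @ x = v by Cramer's rule."""
    d = det3(m)
    if d == 0:
        raise ValueError("primaries are not independent for these mechanisms")
    return [det3([[v[r] if c == col else m[r][c] for c in range(3)]
                  for r in range(3)]) / d for col in range(3)]


def match_coefficients(test, primaries):
    """Solve F_i(test) = sum_j a_j F_i(I_j) for the three a_j (Equation 15)."""
    m = [[response(f, p) for p in primaries] for f in MECHANISMS]
    v = [response(f, test) for f in MECHANISMS]
    return solve3(m, v)


primaries = [monochromatic(w) for w in (450, 550, 650)]
test = monochromatic(500)
a = match_coefficients(test, primaries)
mixture = [sum(a_j * p[n] for a_j, p in zip(a, primaries))
           for n in range(len(WAVELENGTHS))]
print(a)
print(all(response(f, mixture) == response(f, test) for f in MECHANISMS))  # True
```

With these made-up sensitivities the coefficient of the 650 nm primary comes out negative, which corresponds to adding that primary to the test side of the match.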
Figure 5. CIE (1931) chromaticity diagram showing the curved locus of spectral colors. The straight line joining 400 to 700 nm is the line of purples. All additive mixtures of the colors R and G lie on the line RG. Similarly, all real colors, i.e. all mixtures of the spectral colors, lie within the solid boundary.
Derivation of photopigment spectral sensitivities

Light absorption by photopigments has linear properties similar to those attributed to the three color mechanisms above, and there is evidence from microspectrophotometry and electrophysiology that
there are three classes of cone photopigment in the human fovea (Dartnall et al., 1983; Schnapf et al., 1987). Therefore, it is generally assumed that trichromacy is determined by the quantum catch of three cone photopigments S, M and L (short-, middle- and long-wavelength sensitive). The derivation of the spectral sensitivities of these photopigments as unique linear combinations of color matching functions requires additional information to pick from among the infinite number of possible combinations. A number of derivations have been based on Konig's hypothesis that congenital dichromats have reduced forms of trichromatic vision, i.e. dichromats accept all color matches made by normal trichromats but not vice versa (e.g. Konig and Dieterici, 1893; Thomson and Wright, 1953; Vos and Walraven, 1971; Smith and Pokorny, 1975; Vos et al., 1990). In a three-dimensional representation of trichromatic vision, due to the linearity of dichromatic color matches (Nagy, 1984), color confusions of dichromats form a simple picture and the derivation of cone spectra is easy to intuit. Since almost all published derivations represent colors in a two-dimensional projection plane (Vos and Walraven, 1971; Wyszecki and Stiles, 1982), a simpler three-dimensional derivation based on Maxwell's suggestions is presented in this section. For notational convenience, the three mechanisms determining trichromacy are assumed to be L, M, and S cones. The characteristic vector, or spectral sensitivity, of each of the cone types is given by:
$$L = (l_{400}, \ldots, l_{700}) \tag{17}$$
$$M = (m_{400}, \ldots, m_{700}) \tag{18}$$
$$S = (s_{400}, \ldots, s_{700}) \tag{19}$$
Each element in a vector represents the relative proportion of quanta absorbed by a class of cones from light of the subscripted wavelength. For example:
$$l_{520} = \langle L, I_{520} \rangle \tag{20}$$
Because of the linearity of the mechanisms, the quanta absorbed by a class of cones from a heterochromatic light I is simply the sum of the quanta absorbed from each of the monochromatic constituents of I, i.e.
$$l_I = \langle L, I \rangle \tag{21}$$
It would be easy to measure the spectral sensitivity of each cone type if three lights P, D, and T could be found such that each of them excited one type of cone only. If:
$$l_P = 1, \quad m_P = 0, \quad s_P = 0; \tag{22}$$
$$l_D = 0, \quad m_D = 1, \quad s_D = 0; \tag{23}$$
$$l_T = 0, \quad m_T = 0, \quad s_T = 1; \tag{24}$$
then all three spectral sensitivities could be measured directly by an
experiment in which a normal trichromat matched a unit amount of monochromatic light of each wavelength to a mixture of the three primaries P, D and T. Each match would yield an equation:
$$I_\lambda = p_\lambda P + d_\lambda D + t_\lambda T \tag{25}$$
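From Equations 20 and 22-24, it is obvious that $l_\lambda = p_\lambda$, $m_\lambda = d_\lambda$, and $s_\lambda = t_\lambda$; therefore Equation 25 can be rewritten so that the coefficients in the match are the cone spectral sensitivities for $\lambda$:
$$I_\lambda = l_\lambda P + m_\lambda D + s_\lambda T \tag{26}$$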
However, this trio of real lights P, D and T does not exist; therefore the derivation will resort to a method suggested by Maxwell (1860) and revived by Nuberg and Yustova (1955) and Judd (1964). For a trio of arbitrary real primaries R, G, and B, if the color match for any monochromatic light $I_\lambda$ can be written as:
$$I_\lambda = r_\lambda R + g_\lambda G + b_\lambda B \tag{27}$$
then the complete set of color matching functions can be depicted as:
$$R = (r_{400}, \ldots, r_{700}) \tag{28}$$
$$G = (g_{400}, \ldots, g_{700}) \tag{29}$$
$$B = (b_{400}, \ldots, b_{700}) \tag{30}$$
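Maxwell's method requires assuming that spectral distributions (but not necessarily real lights) P, D and T exist, and can be matched to the three real primaries R, G and B, such that:
$$P = r_P R + g_P G + b_P B \tag{31}$$
$$D = r_D R + g_D G + b_D B \tag{32}$$
$$T = r_T R + g_T G + b_T B \tag{33}$$
Substituting Equations 31, 32, and 33 into 26:
$$\begin{aligned} I_\lambda = {} & l_\lambda r_P R + l_\lambda g_P G + l_\lambda b_P B \\ & + m_\lambda r_D R + m_\lambda g_D G + m_\lambda b_D B \\ & + s_\lambda r_T R + s_\lambda g_T G + s_\lambda b_T B \end{aligned} \tag{34}$$
Equating the amounts of R, G and B in Equations 27 and 34:
$$r_\lambda = l_\lambda r_P + m_\lambda r_D + s_\lambda r_T \tag{35}$$
$$g_\lambda = l_\lambda g_P + m_\lambda g_D + s_\lambda g_T \tag{36}$$
$$b_\lambda = l_\lambda b_P + m_\lambda b_D + s_\lambda b_T \tag{37}$$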
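This triplet of simultaneous linear equations can be solved for $l_\lambda$, $m_\lambda$, and $s_\lambda$ by any of the traditional methods. A particularly convenient method is to represent Equations 35-37 in matrix notation:
$$\begin{pmatrix} r_\lambda \\ g_\lambda \\ b_\lambda \end{pmatrix} = \begin{pmatrix} r_P & r_D & r_T \\ g_P & g_D & g_T \\ b_P & b_D & b_T \end{pmatrix} \begin{pmatrix} l_\lambda \\ m_\lambda \\ s_\lambda \end{pmatrix} \tag{38}$$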
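The solution requires pre-multiplying both sides by the inverse of the matrix of coefficients:
$$\begin{pmatrix} l_\lambda \\ m_\lambda \\ s_\lambda \end{pmatrix} = \begin{pmatrix} r_P & r_D & r_T \\ g_P & g_D & g_T \\ b_P & b_D & b_T \end{pmatrix}^{-1} \begin{pmatrix} r_\lambda \\ g_\lambda \\ b_\lambda \end{pmatrix} \tag{39}$$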
Therefore, the spectral sensitivities of the three cone types $l_\lambda$, $m_\lambda$, and $s_\lambda$ can be derived for any wavelength $\lambda$ from a color match of $I_\lambda$ with any three real lights R, G and B, if the nine coefficients of the matrix can be determined. These coefficients can be obtained from color matches made by congenital dichromats using the same primaries R, G and B. Congenital dichromats accept the matches made by trichromats, but also match some pairs of colors that are distinct to a trichromat; therefore, their color vision is a reduced form of trichromatic vision. It is assumed that protanopes lack L cones, deuteranopes M cones, and tritanopes S cones. For example, a protanope may match two lights J and K that are distinct to a trichromat, i.e., $J = K$ for a protanope but $J \neq K$ for a trichromat. This can only be true if $m_J = m_K$ and $s_J = s_K$ but $l_J \neq l_K$. According to Maxwell's method, if $P = J - K$, then P will satisfy the requirements of Equation 22. Notice that even though J and K are real lights, P may be a spectral distribution with negative entries and hence not a physically realizable light. P is important only as a mathematical entity. Because cone absorptions are linear operations, the M cone absorption from P is equal to the M cone absorption from J minus the M cone absorption from K, i.e. zero:
$$m_P = \langle M, P \rangle = \langle M, (J - K) \rangle = \langle M, J \rangle - \langle M, K \rangle = m_J - m_K = 0 \tag{40}$$
Similarly $s_P = 0$ but $l_P \neq 0$, i.e. P excites only L cones. Since P is just the difference between two colors, the three coefficients of the matrix that correspond to P in Equation 39 can be derived from the tri-stimulus values for J and K. If for normal trichromats the color matches for J and K are represented by Equations 41 and 42:
$$J = r_J R + g_J G + b_J B \tag{41}$$
$$K = r_K R + g_K G + b_K B \tag{42}$$
then, because of the linearity of dichromatic color matches, the
coefficients for P in Equation 31 are simply the difference between the coefficients for the pair of confusion colors J and K:
$$r_P = (r_J - r_K) \tag{43}$$
$$g_P = (g_J - g_K) \tag{44}$$
$$b_P = (b_J - b_K) \tag{45}$$
It is worth noting that the difference between all pairs of protanopic confusion colors will give an estimate of the same nP, where n is some constant. Similarly, the coefficients corresponding to D and T can be derived by finding pairs of confusion colors for congenital deuteranopes and tritanopes. Then the spectral sensitivities L, M and S (Equations 17-19) for the three cone types can be derived from empirical color matching functions R, G, and B (Equations 28-30), by rewriting Equation 39 so that each row of variables is an array instead of a single wavelength:
$$\begin{pmatrix} L \\ M \\ S \end{pmatrix} = \begin{pmatrix} r_P & r_D & r_T \\ g_P & g_D & g_T \\ b_P & b_D & b_T \end{pmatrix}^{-1} \begin{pmatrix} R \\ G \\ B \end{pmatrix} \tag{46}$$
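The whole derivation, from confusion pairs to cone excitations, can be sketched as follows. This is a minimal sketch under stated assumptions: the confusion pairs and the two sampled color matches are hypothetical numbers, the helper names are invented for the example, and exact rational arithmetic stands in for real measurements. Difference vectors of confusion pairs give P, D and T; a second protanopic pair is checked to be parallel to the first; and the three simultaneous equations are then solved wavelength by wavelength.

```python
from fractions import Fraction


def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve3(m, v):
    """Solve m @ x = v by Cramer's rule."""
    d = det3(m)
    if d == 0:
        raise ValueError("confusion vectors are not linearly independent")
    return [det3([[v[r] if c == col else m[r][c] for c in range(3)]
                  for r in range(3)]) / d for col in range(3)]


def confusion_vector(j, k):
    """Difference of the tri-stimulus values of two lights a dichromat confuses."""
    return [Fraction(a) - Fraction(b) for a, b in zip(j, k)]


def is_parallel(u, v):
    """True if the cross product of u and v vanishes."""
    cross = (u[1] * v[2] - u[2] * v[1],
             u[2] * v[0] - u[0] * v[2],
             u[0] * v[1] - u[1] * v[0])
    return all(c == 0 for c in cross)


def cone_excitations(rgb, p, d, t):
    """Solve for (l, m, s) from one color match (r, g, b)."""
    # Columns of the coefficient matrix are the confusion vectors P, D, T.
    m = [[p[i], d[i], t[i]] for i in range(3)]
    return solve3(m, [Fraction(x) for x in rgb])


# Hypothetical confusion pairs, given as tri-stimulus values in R, G, B.
P = confusion_vector((5, 3, 1), (1, 2, 1))    # protanope
D = confusion_vector((2, 4, 1), (1, 1, 1))    # deuteranope
T = confusion_vector((1, 1, 3), (1, 1, 1))    # tritanope
P2 = confusion_vector((9, 4, 2), (1, 2, 2))   # second protanopic pair
print(is_parallel(P, P2))  # True

# Hypothetical color matching functions sampled at two wavelengths.
cmf = {500: (6, 7, 4), 600: (9, 5, 2)}
fundamentals = {wl: cone_excitations(rgb, P, D, T) for wl, rgb in cmf.items()}
print(fundamentals)
```

Because each confusion vector is known only up to a constant n, the relative heights of the three derived sensitivities are arbitrary in such a calculation.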
This algebraic derivation can also be pictured geometrically. Each I can be represented as a vector in the three-dimensional space formed by R, G, and B as axes. The three vectors P, D, and T are the vector differences of pairs of colors confused by congenital protanopes, deuteranopes and tritanopes respectively. Another way to visualize the confusion vectors is as follows: For any color E and any scalar n, let F = E + nP. Then, using Equation 22 to add cone excitations from nP to excitations from E, $m_E = m_F$ and $s_E = s_F$, but $l_E \neq l_F$, i.e. E and F can be distinguished by a trichromat but not by a protanope. So for any color vector in RGB space, adding a vector parallel to P results in a confusion pair for protanopes. Similarly, adding a vector parallel to D or T results in confusion pairs for deuteranopes and tritanopes respectively. Reciprocally, the difference vectors between pairs of confusion colors for an individual dichromat are all parallel. If the space is transformed to P, D, and T as axes, then each I will be represented in a cone-excitation space. Because color spaces are affine, the relative heights of the cone sensitivities cannot be ascertained from color matching data.

In practice there are a number of problems with the derivation of Konig fundamentals. The first problem is a logistic one. Generally, dichromatic color confusions were measured in chromaticity co-ordinates (Pitt, 1935; Wright, 1952) because colorimetric procedures were more accurate than available radiometric instruments. Since accurate instruments to measure tri-stimulus values are now available, it would be preferable to measure difference vectors, since in tri-stimulus values, every pair of confusion colors for a dichromat would provide an independent estimate of the same confusion vector and lead to greater numerical precision. In a chromaticity diagram, P, D, and T
each project to a single point called the convergence point for that class of dichromat (Figure 6), unless the plane of projection is parallel to one of the confusion vectors. The three pairs of dichromatic convergence coordinates only provide six of the coefficients needed in Equation 46, so auxiliary assumptions and data are needed for the remaining three. A usual strategy is to assume particular weights for the contributions of the three cone types to the luminosity function (Vos and Walraven, 1971; Smith and Pokorny, 1975; Vos et al., 1990).

Figure 6. Dichromatic color confusions converging to single points in the CIE chromaticity diagram for (a) protanope, (b) deuteranope, and (c) tritanope.

The second problem is that even if observers possess photopigments of identical spectral sensitivity, their color matching functions will be different due to differences in the amount of lens and macular pigments and in the optical density of the photopigments. A set of average color matching functions for a particular class of observer then depends on the characteristics of the sample of observers measured (Zaidi et al., 1989). More importantly, it is not possible to test Konig's reduction hypothesis by direct confrontation of trichromatic and dichromatic matches (Alpern and Pugh, 1977; Pokorny and Smith, 1977). A reduction system can be tested for an individual dichromat by testing that the differences between pairs of confusion colors are all parallel in RGB space, or equivalently, that in a chromaticity space the lines joining confusion colors converge to a single point. This problem has been dealt with by either using WDW normalization (Wright, 1928-1929) or using published estimates to correct to an average observer (Smith and Pokorny, 1975). A third problem is the potential of relying too heavily
on the precision of measurement of the dichromatic convergence points (Nimeroff, 1970; Walraven, 1974). Historically, the empirical estimates of convergence points have been adjusted iteratively so that the derived cone sensitivities are consistent with pigment nomograms and can predict not only color mixture data, but also a variety of other types of measurements on dichromats and monochromats, including spectral sensitivity and luminosity (Smith and Pokorny, 1972, 1975). A fourth problem is that whereas protanopes and deuteranopes have been shown to lack one of the normal pigments in the long-wave part of the spectrum (Rushton, 1972), tritanopes with complete absence of S cone pigment may not exist (Pokorny, Smith and Went, 1981). Despite all these problems, estimates of cone sensitivities, derived from color mixture and other methods, converge to a set very close to that shown in Figure 7 (Stockman, 1989). This set (Smith and Pokorny, 1975) is based on the following transformation from Judd's (1951a) modification of the CIE (1931) color matching functions:
$$\begin{pmatrix} L \\ M \\ S \end{pmatrix} = \begin{pmatrix} 0.15514 & 0.54312 & -0.03286 \\ -0.15514 & 0.45684 & 0.03286 \\ 0 & 0 & 0.01608 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} \tag{47}$$
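The geometry of a convergence point, discussed above for the chromaticity diagram, can be sketched as follows. The sketch is illustrative only: the confusion vector and the colors are hypothetical, the projection used is the simple one onto the plane R + G + B = 1, and the helper names are invented for the example. Two confusion lines are built by adding multiples of the same confusion vector to two different colors, and their intersection is compared with the chromaticity of the confusion vector itself.

```python
from fractions import Fraction


def chromaticity(t):
    """Project integer tri-stimulus values onto R + G + B = 1, keeping (r, g)."""
    total = sum(t)
    return (Fraction(t[0], total), Fraction(t[1], total))


def line_through(p, q):
    """Coefficients (a, b, c) of the line a*x + b*y = c through points p and q."""
    a = q[1] - p[1]
    b = p[0] - q[0]
    return a, b, a * p[0] + b * p[1]


def intersection(line1, line2):
    """Intersection of two lines given as (a, b, c) triples."""
    a1, b1, c1 = line1
    a2, b2, c2 = line2
    d = a1 * b2 - a2 * b1
    if d == 0:
        raise ValueError("confusion lines are parallel in this projection")
    return ((c1 * b2 - c2 * b1) / d, (a1 * c2 - a2 * c1) / d)


def add_scaled(e, n, p):
    """Return E + nP, a color a protanope confuses with E if P is protanopic."""
    return tuple(x + n * y for x, y in zip(e, p))


P = (4, 1, 0)                      # hypothetical protanopic confusion vector
E1, E2 = (1, 2, 3), (2, 1, 1)      # two hypothetical colors
F1, F2 = add_scaled(E1, 1, P), add_scaled(E2, 2, P)
lines = [line_through(chromaticity(e), chromaticity(f))
         for e, f in ((E1, F1), (E2, F2))]
print(intersection(*lines) == chromaticity(P))  # True
```

The two chromaticity coordinates recovered this way fix only the direction of the confusion vector, which is why each convergence point supplies two rather than three coefficients.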
Recently, additional support for the theory explicated above has come from electrophysiological recordings from single cones of the human retina (Schnapf et al., 1987). Four facts about the electrophysiological measurements are particularly germane. First, the response of a cone is the same for equal quanta caught from lights of different wavelengths, consistent with the Principle of Univariance. Second, the measured spectral sensitivities cluster into three discrete classes with minimal variability within each class. Third, the measured sensitivities provide a good fit to human color matching functions on the assumption that for a color match the quantal catch for each class of cone is identical for both sides of the match, and with adjustments for optical density of photopigments and pre-retinal absorption. Fourth, within the freedom provided by these adjustments, the action spectra of the cones, that is, the reciprocal of the amount of light of each wavelength required to produce a constant response, closely resemble the Smith-Pokorny fundamentals.
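For readers who want to compute the Smith-Pokorny fundamentals from Judd-modified tri-stimulus values, the transformation of Equation 47 can be applied as a plain matrix product. The sketch below rests on an assumption that should be checked against the original publication: the matrix entries that are not legible in the printed equation (the first entry of the second row and the zeros) are filled in with the values commonly tabulated for Smith and Pokorny (1975). The function name and the sample tri-stimulus values are hypothetical.

```python
# Assumed full form of the Equation 47 matrix (rows give L, M, S).
SMITH_POKORNY = [
    [0.15514, 0.54312, -0.03286],
    [-0.15514, 0.45684, 0.03286],
    [0.0, 0.0, 0.01608],
]


def xyz_to_lms(xyz):
    """Apply the 3x3 transformation to one (x, y, z) triple."""
    return [sum(c * v for c, v in zip(row, xyz)) for row in SMITH_POKORNY]


# Hypothetical Judd-modified tri-stimulus values for some light.
x, y, z = 0.5, 0.4, 0.3
l, m, s = xyz_to_lms((x, y, z))
print(l, m, s)
# The L and M rows sum to (0, 0.99996, 0), so L + M is very nearly y.
print(abs((l + m) - 0.99996 * y) < 1e-12)  # True
```

Under the assumed entries, the sum of the L and M excitations reproduces the y value almost exactly, which is one instance of the luminosity weighting strategy mentioned earlier.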
Cone interactions and opponent mechanisms

Because the outputs of cones of different classes are combined very early in the retina, it is difficult to find other psychophysical tasks besides color matching where each class of cones functions independently. One notable exception is dark adaptation after intense bleaches, where sensitivity as measured by a method of revived afterimages seems to be regulated within each cone class independently (Williams and MacLeod, 1979). Another possible case is the detection of small, brief pulses of light (Krauskopf and Srebro, 1965). In most color experiments, the results indicate some degree of interaction between different classes of cones. For example, Stiles' (1939, 1959) two-color increment threshold technique identified seven foveal $\pi$-mechanisms, some of which are now thought to involve post-receptoral interactions
(e.g. Pugh and Mollon, 1979). Wright (1946) examined the possibility of parallel independent adaptation processes in the three cone systems by testing the assumption of superposition (additivity) for binocular color matches. With the two eyes in different states of adaptation, an observer matched stimulus $s_1$ in one eye by the mixture $(r_1, g_1, b_1)$ in the other eye and stimulus $s_2$ by $(r_2, g_2, b_2)$. However, $s_3$, equal to $s_1 + s_2$, was matched by $(r_3, g_3, b_3)$, some of whose components were considerably less than the sum predicted by the assumption of additivity, e.g., $r_3 \ll r_1 + r_2$. This failure of additivity was consistent with inhibition between different classes of cone, and indicated a breakdown of the Von Kries (1905) coefficient rule of independent cone adaptations. Additional violations of the Von Kries rule have been described by a number of authors, including Brewer (1954), Hurvich and Jameson (1958) and Shevell (1978). Failures of additivity have also been used by Boynton et al. (1964), Guth et al. (1969), Pugh (1976), Mollon and Polden (1977), and others to demonstrate cone interactions in increment threshold measurements.
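The logic of the additivity test can be sketched with a toy version of the coefficient rule. This is an illustration under simplifying assumptions, not a model of Wright's experiment: matches are expressed directly in cone excitations rather than in r, g, b amounts, the gains are hypothetical, and the function names are invented. If each cone class is scaled by its own gain independently of the others, the binocular match is a linear function of the stimulus, so the match to $s_1 + s_2$ must equal the sum of the separate matches.

```python
def von_kries_match(cones, gains_test_eye, gains_match_eye):
    """Cone excitations needed in the matching eye under the coefficient rule.

    Each cone class is scaled by its own gain, independently of the others,
    and the match equates the scaled signals of the two eyes.
    """
    return [c * gt / gm
            for c, gt, gm in zip(cones, gains_test_eye, gains_match_eye)]


def additivity_ratio(m1, m2, m3):
    """Match for s1 + s2 divided, per component, by the sum of separate matches."""
    return [c / (a + b) for a, b, c in zip(m1, m2, m3)]


TEST_GAINS = (0.5, 1.0, 0.25)     # hypothetical adapted eye
MATCH_GAINS = (1.0, 1.0, 1.0)     # hypothetical unadapted eye
s1, s2 = (4.0, 2.0, 8.0), (2.0, 6.0, 4.0)
s3 = tuple(a + b for a, b in zip(s1, s2))
m1 = von_kries_match(s1, TEST_GAINS, MATCH_GAINS)
m2 = von_kries_match(s2, TEST_GAINS, MATCH_GAINS)
m3 = von_kries_match(s3, TEST_GAINS, MATCH_GAINS)
print(additivity_ratio(m1, m2, m3))  # [1.0, 1.0, 1.0]
```

Since r, g, b amounts are linear combinations of cone excitations, the same prediction holds for them; a measured ratio well below one, as in Wright's result, is therefore inconsistent with independent scaling.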
Figure 7. Log of the spectral sensitivities of the visual pigments hypothesized to be active in foveal color matches of normal trichromats, plotted against wavelength (nm).

Identification of the chromatic properties of the mechanisms formed by interactions between cones had its beginning in Hering's (1878) phenomenal observation that any color can be described in terms of six basic sensations: red, green, yellow, blue, white and black
(English translation in Abney, 1895). Red-green and yellow-blue are opponent sensations along a single dimension, because only one sensation of each pair can be elicited by a uniform color. The two pairs function in parallel, so that red can be seen simultaneously with yellow or blue and the same is true of green. Instead of tri-stimulus color matching functions, Hering proposed a trio of "valence" curves (Figure 8). For each monochromatic light, the red-green valence curve
indicates the amount of redness or greenness and the yellow-blue curve the amount of yellowness or blueness. The third valence curve indicates lightness or darkness. The red-green valence curve has three lobes, red in the long and short wavelengths and green in the middle; the yellow-blue curve has two lobes, yellow in the long and blue in the short wavelengths; the lightness curve has a peak in the middle wavelengths and tapers off to both sides.

Figure 8. Hering's valence curves derived from Konig fundamentals (Schrodinger, 1925).

Hering was noncommittal about a peripheral or central locus for opponent color mechanisms, but other investigators incorporated opponent mechanisms into stage theories (Donders, 1881; Von Kries, 1905). In stage theories, a peripheral stage consists of three types of cones with sensitivities given by Konig fundamentals; a more central stage consists of mechanisms that linearly combine the outputs of the first stage into two subtractive chromatic mechanisms and one additive achromatic mechanism. Other versions of
these theories include those of Adams (1923, 1942), Müller (1924, 1930), Judd (1949, 1951b), Jameson and Hurvich (1955), Hurvich and Jameson (1958), Guth and Lodge (1973), and Ingling (1977).

Schrödinger (1925) showed that Hering type valence curves are equivalent to straight lines through white in a chromaticity diagram. The essential insight was that color matching functions are a linear transform of cone spectral sensitivities. If valence curves could be identified with linear combinations of cone spectra as proposed by Von Kries (1905), then valences should be a linear transform of color matching functions too. The explicit derivation is easy to describe in matrix notation. If RG, YB, and LD, the red-green, yellow-blue and light-dark valences associated with a spectral light λ, are independent linear combinations of the quanta caught by L, M, and S cones, then:

[Equation 48]

Substituting the values of L, M, and S from Equation 46 into 48, the valences can be described as linear combinations of color matching functions:

$$\begin{bmatrix} RG \\ YB \\ LD \end{bmatrix} = \begin{bmatrix} r_{RG} & g_{RG} & b_{RG} \\ r_{YB} & g_{YB} & b_{YB} \\ r_{LD} & g_{LD} & b_{LD} \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix} \qquad (49)$$
where:

[Equation 50: the coefficient matrix, expressed with a matrix inverse]
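The matrix argument can be sketched as follows. The symbols C (the matrix of cone weights that defines the valences) and A (the matrix relating color matching functions to cone quantum catches) are labels assumed here for illustration, and the direction in which the linear relation of Equation 46 is written is also an assumption; only the general structure, a linear map composed with an inverse, is implied by the text.

```latex
% Assumed notation: C and A are hypothetical 3x3 matrices, both invertible.
\begin{aligned}
\begin{pmatrix} RG \\ YB \\ LD \end{pmatrix}
  &= C \begin{pmatrix} L \\ M \\ S \end{pmatrix},
&
\begin{pmatrix} R \\ G \\ B \end{pmatrix}
  &= A \begin{pmatrix} L \\ M \\ S \end{pmatrix}
\\[4pt]
\Longrightarrow\quad
\begin{pmatrix} RG \\ YB \\ LD \end{pmatrix}
  &= C A^{-1} \begin{pmatrix} R \\ G \\ B \end{pmatrix},
&
C A^{-1} &=
\begin{pmatrix}
 r_{RG} & g_{RG} & b_{RG} \\
 r_{YB} & g_{YB} & b_{YB} \\
 r_{LD} & g_{LD} & b_{LD}
\end{pmatrix}
\end{aligned}
```

Under these assumptions each valence is a fixed weighted sum of the three color matching functions, which is what makes the null set of each valence a plane in RGB space.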
In three-dimensional RGB space, lights with zero RG valence are described by Equation 51, which defines the null RG plane:

$$RG = r_{RG} R + g_{RG} G + b_{RG} B = 0 \qquad (51)$$
In a two-dimensional chromaticity space plotted in barycentric coordinates r, g, and b given by Equation 52:

$$r = \frac{R}{R+G+B}; \qquad g = 1 - r - b; \qquad b = \frac{B}{R+G+B} \qquad (52)$$
the null plane projects to a straight line given by Equation 53:

$$b = \frac{g_{RG}}{g_{RG} - b_{RG}} + \frac{r_{RG} - g_{RG}}{g_{RG} - b_{RG}}\, r \qquad (53)$$
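The projection from the null plane to the null line can be checked numerically. The sketch below is an illustration only: the coefficient values and the test light are hypothetical, and the function names are not from the source. The line is obtained by dividing the null-plane equation by R + G + B and substituting g = 1 - r - b.

```python
def chromaticity(R, G, B):
    """Return (r, b) chromaticity coordinates; g is 1 - r - b."""
    total = R + G + B
    return R / total, B / total


def valence(coefs, R, G, B):
    """Linear valence: weighted sum of the tristimulus values."""
    r_coef, g_coef, b_coef = coefs
    return r_coef * R + g_coef * G + b_coef * B


def null_line(coefs):
    """Return (intercept, slope) of the null line b = intercept + slope * r."""
    r_coef, g_coef, b_coef = coefs
    denom = g_coef - b_coef
    if denom == 0:
        raise ValueError("null line is parallel to the b axis")
    return g_coef / denom, (r_coef - g_coef) / denom


# Hypothetical RG coefficients and a light chosen to lie on the null plane.
rg_coefs = (1.0, -2.0, 0.5)
light = (2.0, 1.5, 2.0)

plane_value = valence(rg_coefs, *light)
r, b = chromaticity(*light)
intercept, slope = null_line(rg_coefs)
b_on_line = intercept + slope * r
```

Any light with zero valence should give a chromaticity (r, b) that satisfies the line equation, as this example does.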
Similar derivations lead to straight null lines in chromaticity space for YB and LD. Schrödinger derived the coefficients for RG and YB in Equation 49 by using the chromaticity coordinates of psychologically unique hues (Figure 9). Unique blue and unique yellow are hues that are neither reddish nor greenish, and unique red and green are neither yellowish nor bluish. The line passing through unique yellow, achromatic white and unique blue was taken as the null line for RG and the coefficients derived from an equation similar to Equation 53. Similarly, the line passing through unique red, achromatic white and unique green was used to derive the coefficients for YB. The coefficients for LD were derived from a new kind of line, the alychne, representing the null of Exner's luminosity function. The three valences derived using König and Dieterici's chromaticity diagram, when plotted on a wavelength axis, resembled Hering's qualitative valence curves (Figure 8).
Figure 9. König's color triangle with the alychne and null lines for opponent mechanisms passing through white (Schrödinger, 1925).
Schrödinger's procedure provides a method for critically testing the efficacy of Hering type hue judgments to reveal the properties of second stage mechanisms. Dimmick and Hubbard (1939a) measured the spectral locations of the unique hues: 477, 515 and 583 nm for blue, green, and yellow respectively. The line joining unique blue and yellow
passed through the reddish side of white. The chromatic location of unique red was ascertained by adding unique blue to a spectral red to get a color that was neither "yellowish" nor "bluish" (Dimmick and Hubbard, 1939b). Unique red was complementary to 493.6 nm and not to unique green, implying that the line joining unique red to achromatic white is not collinear with the line joining achromatic white to unique green. Consequently, the yellow-blue hue system is not a linear combination of cone inputs. Burns et al. (1984) measured the loci of constant unique hues at equal luminance. Their data plotted on a chromaticity diagram do not fall on straight null lines as required to generate hue valence curves.

Jameson and Hurvich (1955) attempted to measure valence curves directly by extending the Dimmick and Hubbard type of hue cancellation technique to all four unique hues. For each of the unique hues, hue cancellation results in colors of a constant hue but varying saturation. In a chromaticity diagram, these points are just a subset of constant hue data like that of Abney (1910), Burns et al. (1984), and others. Further work with the hue-cancellation technique, e.g. Larimer, Krantz and Cicerone (1975), Ikeda and Ayama (1980) and Pokorny et al. (1981), has provided more evidence that hue judgments do not satisfy assumptions of linearity.

The spectral sensitivity or valence of a post-receptoral mechanism enables the response of the mechanism to a compound light I to be predicted from its response to monochromatic lights I_λ. Nonlinear post-receptoral mechanisms cannot be said to have a spectral sensitivity or a valence in the usual sense. A post-receptoral mechanism whose response is a function H of the quantum catches of L, M, and S cones has a response H(<L,I>, <M,I>, <S,I>) to light I. The response of this mechanism can be written as a single inner product with the spectral distribution of the light I if and only if H is a linear function.
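The role of linearity can be illustrated with a small numerical sketch. Everything in it is an assumption made for illustration: the four-sample spectra, the cone sensitivity values, and the two example functions H are hypothetical. It compares the response to a compound light with the prediction obtained by summing responses to its monochromatic components.

```python
# Hypothetical cone sensitivities sampled at four wavelengths (illustrative only).
L_SENS = [0.1, 0.6, 1.0, 0.4]
M_SENS = [0.2, 1.0, 0.7, 0.1]
S_SENS = [1.0, 0.2, 0.0, 0.0]


def catch(sensitivity, light):
    """Quantum catch: inner product of a sensitivity with a light spectrum."""
    return sum(s * e for s, e in zip(sensitivity, light))


def response(H, light):
    """Response of a mechanism H to a light, via the three cone catches."""
    return H(catch(L_SENS, light), catch(M_SENS, light), catch(S_SENS, light))


def predicted_from_monochromatic(H, light):
    """Sum of responses to unit monochromatic lights, weighted by energy."""
    total = 0.0
    for i, energy in enumerate(light):
        unit = [0.0] * len(light)
        unit[i] = 1.0
        total += energy * response(H, unit)
    return total


def linear_H(l, m, s):
    return l - 2.0 * m + 0.5 * s


def nonlinear_H(l, m, s):
    return (l - m) / (l + m)


compound = [1.0, 2.0, 0.5, 3.0]
linear_error = abs(response(linear_H, compound)
                   - predicted_from_monochromatic(linear_H, compound))
nonlinear_error = abs(response(nonlinear_H, compound)
                      - predicted_from_monochromatic(nonlinear_H, compound))
```

For the linear mechanism the two quantities agree to rounding error, so a single spectral sensitivity summarizes it; for the ratio-type mechanism they differ substantially, which is the sense in which a nonlinear mechanism has no valence curve.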
The sensitivity of a nonlinear mechanism can be described only as a function of quanta caught by the three classes of cones.

A related issue is the independence of hue judgments across the three Hering dimensions. It was recognized by Purdy (1931) that this issue could be tested by measuring the perceived change in hue with changes in radiance (Bezold-Brücke effect) and the perceived change in saturated hues due to addition of an achromatic white (Abney effect). Purdy showed that hues that were invariant to changes in radiance or saturation were not the same as the psychologically unique hues, contradicting the independence of Hering's hue mechanisms. Measurements of the Bezold-Brücke and Abney effects for an extensive set of wavelengths (Nagy, 1980; Burns et al., 1984) cannot be explained by simple response nonlinearities in independent hue mechanisms, but require interactions across hue dimensions. Judgments based on the appearance of colors are probably the result of higher level processing of cone signals and not a suitable method to isolate the properties of an early opponent stage.

Schrödinger's mathematical analysis of second stage mechanisms need not be restricted to hue judgments alone. Linear post-receptoral mechanisms can be identified with straight null lines in chromaticity space as a general rule. In addition, hue judgments are not the only data on which second stage mechanisms have to be based. Steps in this direction were taken by Guth and Lodge (1973) and Guth, Massof and Benzschawel (1980), who based their model on a variety of
threshold measurements and hue judgments.

A different approach to second-stage mechanisms was taken by Krauskopf et al. (1982), who directly searched for three linear second-stage mechanisms defined with reference to cone signals rather than unique hues. The experiments that follow are more easily described if colors are represented on a set of axes defined using estimates of cone spectra. The space is depicted in three planes in Figure 10 (a three-dimensional picture of a subset of this space can be found in Derrington et al., 1984). To derive this space, the Smith-Pokorny cone excitations, L, M, and S (Equation 47), are converted to luminance units l, m, and s for each light:

$$l = \frac{L}{L+M}; \qquad m = 1 - l; \qquad s = \frac{S}{L+M} \qquad (54)$$
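Equation 54 translates directly into code. The sketch below assumes, as the Smith-Pokorny scaling does, that L + M is proportional to luminance; the function name and the example excitations are hypothetical.

```python
def luminance_units(L, M, S):
    """Convert cone excitations (L, M, S) to (l, m, s) as in Equation 54."""
    luminance = L + M
    if luminance <= 0:
        raise ValueError("L + M must be positive")
    l = L / luminance
    m = 1.0 - l
    s = S / luminance
    return l, m, s


# Hypothetical excitations for a light with L + M = 2.
l, m, s = luminance_units(1.33, 0.67, 0.042)
```

Because l and m always sum to one, only l (or m) and s are needed to specify chromaticity at a fixed luminance.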
l and m are defined relative to one another, whereas s can be independently multiplied by any constant. The MacLeod and Boynton (1979) chromaticity diagram uses s and l (or m) as axes and depicts colors on a plane of unit luminance. Similar transformations were used previously by LeGrand (1949) and Rodieck (1973). The three axes are called rg, yv and ld so that their names can function as mnemonics based on the approximate colors, red, green, yellow, violet, light, dark, at the ends of the axes. In the space depicted in Figure 10, the transformation from the units of Equation 54 to the axes is:
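$$\begin{bmatrix} rg \\ yv \\ ld \end{bmatrix} = \begin{bmatrix} 5.68 & -11.27 & 0 \\ -1.235 & -1.235 & 58.82 \\ 1 & 1 & 0 \end{bmatrix} \begin{bmatrix} l \\ m \\ s \end{bmatrix} - \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \qquad (55)$$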
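The reverse transform is easy to see in the figure:

$$\begin{bmatrix} l \\ m \\ s \end{bmatrix} = \begin{bmatrix} .665 & .059 & 0 \\ .335 & -.059 & 0 \\ .021 & 0 & .017 \end{bmatrix} \begin{bmatrix} 1 + ld \\ rg \\ yv \end{bmatrix} \qquad (56)$$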
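The forward and reverse transforms can be checked against the points labeled in Figure 10. The sketch below is a reconstruction under assumptions: the coefficients 5.68, -11.27, -1.235, 58.82, .665, .059, .021 and .017 are read from the printed equations, while the form of the ld row (l + m - 1, with l, m, s scaled by luminance as in the figure triplets) and the coefficient .335 are inferred from the figure coordinates rather than quoted. Function names are hypothetical.

```python
WHITE = (0.665, 0.335, 0.021)  # (l, m, s) at the origin of the space


def to_cardinal(l, m, s):
    """Map luminance-scaled excitations (l, m, s) to (rg, yv, ld)."""
    rg = 5.68 * l - 11.27 * m
    yv = -1.235 * (l + m) + 58.82 * s
    ld = (l + m) - 1.0  # assumed form: zero at unit luminance
    return rg, yv, ld


def from_cardinal(rg, yv, ld):
    """Approximate inverse, using the rounded coefficients of the reverse transform."""
    scale = 1.0 + ld
    l = 0.665 * scale + 0.059 * rg
    m = 0.335 * scale - 0.059 * rg
    s = 0.021 * scale + 0.017 * yv
    return l, m, s


origin = to_cardinal(*WHITE)            # close to (0, 0, 0)
bright = from_cardinal(0.0, 0.0, 1.0)   # top of the ld axis
dark = from_cardinal(0.0, 0.0, -1.0)    # bottom of the ld axis
round_trip = from_cardinal(*to_cardinal(0.7, 0.3, 0.03))
```

Because the printed coefficients are rounded, the white point maps to the origin only to within about 0.002, and a round trip reproduces its input only to about three decimal places.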
A number of properties of this space are apparent on examining the figure: (a) the origin of the space is at a white of mid-radiance; (b) along the rg axis, s is constant and l and m change in opposite directions so as to keep their sum, i.e. luminance, constant; (c) along the yv axis, l and m are constant and only s varies; (d) along the ld axis, the cone excitations increase proportionally from zero, keeping the ratios equal to the ratios of excitation at white.

In the Krauskopf et al. (1982) experiments, two lines in color space were taken to be mutually independent if color discriminations along each line were not affected by prolonged exposure to a field temporally modulated along the other line. First, the isoluminant plane shown in Figure 10a was examined. With the observer adapted to the mid-white, thresholds were measured for pure chromatic changes towards different test directions. The field was then sinusoidally modulated in time symmetrically around white along one of the test directions. After habituation, the chromatic discrimination thresholds
[Figure 10 plots not reproduced. Labeled points on the ld axis, given as (l, m, s): (0, 0, 0) at ld = -1; (.665, .335, .021) at ld = 0; (1.33, .670, .042) at ld = 1.]
Figure 10. Three planes depicting a three-dimensional color space in cardinal axes. (a) (top panel, preceding page) rg vs yv on the isoluminant plane, cone excitations from MacLeod and Boynton (1979); (b) (bottom panel, preceding page) rg vs ld; (c) yv vs ld. The axes are solid lines; the dotted lines depict the boundary of real lights. The units of the main axes are given on the left and bottom; the triplets in parentheses are (l, m, s) excitations; the numbers on the curved boundary are wavelengths of spectral lights.

were remeasured, and the log of the ratio of the post to pre thresholds taken as an index of desensitization. The main result was that for rg and yv axes, thresholds were raised in the direction of habituation but virtually unchanged in the orthogonal direction. Habituation along one of the intermediate directions, however, elevated thresholds fairly uniformly around the plane, including in the orthogonal direction. There were thus only two independent or cardinal directions in the isoluminant plane as defined by selective habituation. By a similar procedure, the ld axis was found to be the third independent or cardinal axis of color space. As is clear from Figure 10, modulations along ld lead to the maximum modulation of cone signals. Since this modulation had negligible effects on discriminations along chromatic axes, the desensitizing effect of chromatic modulation must occur after cone interactions. The results of this experiment would be consistent with the existence, at some stage, of three linear post-receptoral mechanisms with cone weights given by the 3 x 3 matrix of coefficients in Equation 55.

Two questions could be raised about the cardinal directions experiment: 1) Can independent directions be found using lines joining unique hues à la Schrödinger (1925)? The
answer was given in the negative by Krauskopf et al. (1982). 2) Are linear cardinal mechanisms approximations to underlying nonlinear mechanisms like those revealed by hue judgments? If the habituating mechanisms are nonlinear combinations of cone signals, then modulating along any straight line will excite more than one mechanism and, therefore, reduce the selectiveness of the desensitization. To the extent that the independence of the directions is almost perfect, the probability of underlying nonlinear mechanisms is low.

What stage of the visual system is being isolated by the habituation procedure? Additional evidence that selective habituation isolates mechanisms past the first stage comes from an experiment designed to search for static response nonlinearities in the visual system (Zaidi and Hood, 1988). Thresholds were measured for probe increments on brief flashes superimposed on a steady white background. Both probes and flashes were changes along the cardinal directions. The threshold results showed the presence of post-opponent response compression along both rg and yv axes, and independence between the two chromatic cardinal directions. However, thresholds for probes along rg or yv were raised by ld flashes, consistent with pre-opponent response compression. Unlike increment threshold measurements, the habituation procedure seems not to be affected by first stage nonlinearities.

Evidence from electrophysiology, however, indicates that the site of habituation may not be an early opponent stage. Since the work of DeValois et al. (1966) and Wiesel and Hubel (1966), it has been known that chromatic signals from the retina to the cortex pass mainly through P-cells of primate lateral geniculate nucleus. Derrington, Krauskopf and Lennie (1984) studied macaque LGN cells using modulations around white in cardinal color space. Based on their chromatic properties, P-cells clustered into two classes corresponding closely to the two chromatic cardinal mechanisms.
They found, however, as had Wiesel and Hubel (1966), that the structure of the receptive fields of P-cells enabled them to carry luminance as well as chromatic signals (see also Ingling and Martinez-Uriegas, 1983). Luminance and chromatic signals are not separated until the striate cortex (D'Zmura and Lennie, 1986), making the retina or the LGN an unlikely site for cardinal mechanisms. Additionally, at least in anesthetized macaques, the response of P-cells is not reduced by prolonged exposure to temporal modulation. It is probable that habituation occurs at a later stage, but that the desensitization data reflect signals arriving from a stage consisting of independent cardinal mechanisms. This scheme would also be consistent with results on the effect of color direction on the perception of coherent motion. Krauskopf and Farell (1990) found that drifting gratings modulated along different cardinal directions appear to slip past one another, whereas when the directions of the modulation are rotated by 45° in color space, the gratings cohere. This task isolates the properties of motion-selective mechanisms, but seems to reflect the independence of chromatic signals from an earlier stage.
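The desensitization index used in the habituation experiments, the log of the ratio of post- to pre-habituation thresholds, can be sketched as follows. The base of the logarithm is not stated in the text, so base 10 is an assumption here, and the threshold values are hypothetical numbers chosen to mimic selective habituation along rg.

```python
import math


def desensitization_index(pre_threshold, post_threshold):
    """Log10 of the post/pre threshold ratio; 0 means no change."""
    if pre_threshold <= 0 or post_threshold <= 0:
        raise ValueError("thresholds must be positive")
    return math.log10(post_threshold / pre_threshold)


# Hypothetical thresholds (arbitrary units) before and after rg habituation.
pre = {"rg": 1.0, "yv": 1.0}
post = {"rg": 2.5, "yv": 1.05}

index = {axis: desensitization_index(pre[axis], post[axis]) for axis in pre}
selective = index["rg"] > 10 * abs(index["yv"])
```

A large index along the habituated axis combined with an index near zero along the orthogonal axis is the pattern the text describes for the cardinal directions; habituation along intermediate directions would instead give similar indices in all test directions.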
Higher order interactions

At present, one of the most interesting questions in color vision is the identification of mechanisms beyond the linear opponent stage. A
reanalysis of the data of Krauskopf et al. (1982) showed that the desensitizing effect of isoluminant chromatic modulation along lines intermediate to the two cardinal directions was greatest in the same direction as the habituating modulation. An explanation of this effect required the desensitization of not only second stage mechanisms but also of higher level mechanisms maximally sensitive to different isoluminant colors distributed about the color circle. These higher level mechanisms were also consistent with the transient elevation of thresholds after changes in adaptation state, and with data on the detection and discrimination of isoluminant changes in color (Krauskopf et al., 1986).

Another clue to the properties of higher level mechanisms was provided by experiments on pure chromatic induction using a modulation nulling technique (Krauskopf, Zaidi, and Mandler, 1986). When average adaptation was kept constant, induced colors were in the direction complementary to the inducing color with respect to the test color. However, the functions relating nulling to inducing amplitude were different for the three cardinal directions. Therefore, simultaneous chromatic contrast in different directions along the color circle could not be predicted by the lateral interaction of either like receptors or like second-stage opponent mechanisms, and was a consequence of lateral interaction within higher-level chromatic mechanisms. Similar multiple mechanisms have been invoked to explain changes in perceived color following habituation to temporal modulation (Webster and Mollon, 1990), and for contrast detection in luminance and chromatic noise (Gegenfurtner and Kiper, 1990). In addition to these psychophysical results, cells in the striate cortex do not cluster into two chromatic classes as in the LGN, but seem to be tuned to many different directions in color space (Lennie et al., 1990).
A different sort of property was revealed by the desensitizing effect of sawtooth chromatic modulation. Prolonged viewing of sawtooth modulation along a color line raised thresholds more for step changes in color when they were in the same direction as the slow phase of the habituation stimulus than when they were in the opposite direction (Krauskopf et al., 1982). When the sawtooth modulation in the test area was induced by surround modulation, the relation between the sign of the sawtooth and the magnitude of threshold elevation was reversed (Krauskopf and Zaidi, 1986). These results could be explained by color mechanisms that respond to change in one color direction but not its complement. Similar mechanisms have been revealed by an experiment that involved the detection of purely temporal step changes in color on a background that was modulated chromatically in time around circles in an isoluminant color plane (Zaidi and Halevy, 1990).

Higher order color mechanisms could reflect linear, nonlinear, temporal, and spatial interactions between second stage mechanisms. Questions about the number and function of higher order mechanisms and the rules of combination of second-stage mechanisms remain unresolved. What could be the functional reason to have multiple mechanisms at one stage of a system that was restricted to just three at a prior stage? Part of the answer could be that inhibition and excitation within narrow bands of color space may serve some purpose. Part of the answer may have to wait for a better understanding of the functional role of color in a visual system. Knowledge of the rules that govern color appearance in real scenes is very incomplete. It is sometimes stated
without explicit evidence that the appearance of all colors can be described in a three-dimensional system. The tri-variance of color matches is well established when both the test and the match color are seen with the same adaptation. However, if the test and the match color are seen with widely different adaptations, it may be impossible to match the test not only with a combination of three primaries, but also with a light of any spectral composition (Hunt, 1953; Bartleson, 1978). When considering the dimensions of color signals, the dimensions of space and time, i.e. the effect of spatial configuration and temporal sequence, should be added to the dimensions provided by the three independent classes of cones (see also Judd, 1961). It is likely that the manifold of apparent colors has more than three dimensions.

Acknowledgements - The help of Noreen Flanigan, John Krauskopf, Ben Sachtler, Art Shapiro, and graduate students in the Sensation and Perception seminar at Columbia is gratefully acknowledged. This work was partially supported by the National Eye Institute under grant EY07556 to the author.
References

Abney, W. de W. (1895). Colour Vision. S. Low, Marston and Co., London.
Abney, W. de W. (1906). Modified apparatus for the measurement of colour and its application to the determination of color sensations. Philosophical Trans., 205, 333-355.
Abney, W. de W. (1910). On the change in hue of spectrum colors by dilution with white light. Proceedings of the Royal Society of London, A 83, 120-127.
Adams, E.Q. (1923). A theory of color vision. Psychological Review, 30, 56.
Adams, E.Q. (1942). X-Z planes in the 1931 ICI system of colorimetry. Journal of the Optical Society of America, 32, 168.
Albers, J. (1963). Interactions of Color. Yale University Press, New Haven.
Alpern, M., and Pugh Jr., E.N. (1977). Variation in the action spectrum of erythrolabe among deuteranopes. Journal of Physiology (London), 266, 613-646.
Bartleson, C.J. (1978). Comparison of chromatic-adaptation transforms. Color Research and Application, 3, 129-136.
Boynton, R.M., Ikeda, M., and Stiles, W.S. (1964). Interactions among chromatic mechanisms as inferred from positive and negative increment thresholds. Vision Research, 4, 87-117.
Boynton, R.M. (1979). Human Color Vision. Holt, Rinehart and Winston, New York.
Brewer, W.L. (1954). Fundamental response functions and binocular color matching. Journal of the Optical Society of America, 44, 207.
Brindley, G.S. (1957). Two theorems in colour vision. Quarterly Journal of Experimental Psychology, 12, 101-104.
Brindley, G.S. (1960). Two more visual theorems. Quarterly Journal of Experimental Psychology, 12, 110-112.
Brindley, G.S. (1970). Physiology of the Retina and Visual Pathway. Williams and Wilkins, Baltimore.
Burns, S.A., Elsner, A.E., Pokorny, J., and Smith, V.C. (1984). The Abney effect: chromaticity coordinates of unique and other constant hues. Vision Research, 24, 479-489.
CIE (1931). CIE Proceedings. Cambridge University Press, Cambridge.
D'Zmura, M., and Lennie, P. (1986). Mechanisms of color constancy. Journal of the Optical Society of America, 1, 1662-1672.
Dartnall, H.J.A. (1983). Human visual pigments: microspectrophotometric results from the eyes of seven persons. Proceedings of the Royal Society of London, B 220, 115-130.
Derrington, A.M., Krauskopf, J., and Lennie, P. (1984). Chromatic mechanisms in lateral geniculate nucleus of macaque. Journal of Physiology, 357, 241-265.
DeValois, R.L., Abramov, I., and Jacobs, G.H. (1966). Analysis of response patterns in LGN cells. Journal of the Optical Society of America, 56, 966-977.
Dimmick, F.L., and Hubbard, M.R. (1939a). The spectral location of psychologically unique yellow, green and blue. American Journal of Psychology, 52, 242-254.
Dimmick, F.L., and Hubbard, M.R. (1939b). The spectral components of psychologically unique red. American Journal of Psychology, 52, 348-353.
Donders, F.C. (1881). Über Farbensysteme. Archives of Ophthalmology, 27, 155.
Evans, R.M. (1974). The Perception of Color. John Wiley and Sons, New York.
Friedman, B. (1956). Principles and Techniques of Applied Mathematics. John Wiley and Sons, New York.
Gegenfurtner, K., and Kiper, D.C. (1990). Contrast detection in luminance and chromatic noise. Investigative Ophthalmology and Visual Science, 31, 109.
Graham, C.H. (1965). Vision and Visual Perception. John Wiley and Sons, New York.
Grassman, H. (1853). Zur Theorie der Farbenmischung. Poggendorf Annals Physics, 89, 69.
Guild, J. (1931). The colorimetric properties of the spectrum. Philosophical Trans. Royal Society of London, A 230, 149.
Guth, S.L., Donley, N.V., and Marrocco, R.T. (1969). On luminance additivity and related topics. Vision Research, 9, 537-575.
Guth, S.L., and Lodge, H.R. (1973). Heterochromatic additivity, foveal spectral sensitivity, and a new color model. Journal of the Optical Society of America, 63, 450-462.
Guth, S.L., Massof, R.W., and Benzschawel, T. (1980). Vector model for normal and dichromatic color vision. Journal of the Optical Society of America, 70, 197-212.
Hardy, A.C. (1936). Handbook of Colorimetry. Massachusetts Institute of Technology: the Technology Press, Cambridge, Massachusetts.
Hering, E. (1878). Zur Lehre vom Lichtsinne. Carl Gerold's Sohn, Wien.
Hunt, R.W.G. (1953). The perception of color in 1 degree fields for different states of adaptation. Journal of the Optical Society of America, 43, 479.
Hurvich, L.M., and Jameson, D. (1958). Further development of a quantified opponent-colours theory. In Visual Problems of Colour, 2, 693-723, NPL Symposium No. 8, Her Majesty's Stationery Office, London.
Ikeda, M., and Ayama, M. (1980). Additivity of opponent chromatic valence. Vision Research, 20, 995-999.
Ingling Jr., C.R. (1977). The spectral sensitivity of the opponent color channels. Vision Research, 17, 1083-1089.
Ingling, C.R., and Martinez-Uriegas, E. (1983). The relationship between spectral sensitivity and spatial sensitivity for the primate r-g X-channel. Vision Research, 23, 1495-1500.
Jameson, D., and Hurvich, L.M. (1955). Some quantitative aspects of an opponent-colors theory. I. Chromatic responses and spectral saturation. Journal of the Optical Society of America, 45, 546-552.
Judd, D.B. (1949). Response functions for types of vision according to the Müller theory. Journal of Research National Bureau of Standards (Washington, DC), 42.
Judd, D.B. (1951a). Report of U.S. Secretariat Committee on Colorimetry and Artificial Daylight. In CIE Proceedings, Stockholm, 1, Part 11. Bureau Central de la CIE, Paris.
Judd, D.B. (1951b). Basic correlates of the visual stimulus. In S.S. Stevens (Ed.), Handbook of Experimental Psychology, Chapter 22, 811-867, John Wiley and Sons, New York.
Judd, D.B. (1961). A five-attribute system of describing visual appearance. American Soc. Test. Mater. Spec. Tech. Publ., 2 95, 1-15.
Judd, D.B. (1964). Relation between normal trichromatic vision and dichromatic vision representing a reduced form of normal vision. Acta Chromatica, 1, 524-527.
König, A., and Dieterici, C. (1886). Die Grundempfindungen und ihre Intensitäts-Vertheilung im Spectrum. Sitz. Akad. Wiss., 29, 805-829, Berlin.
König, A., and Dieterici, C. (1893). Die Grundempfindungen in normalen und anomalen Farben Systemen und ihre Intensitäts Verteilung im Spectrum. Z. Psychol. Physiol. Sinnesorg., 4, 241-347.
König, A. (1903). Gesammelte Abhandlungen. Barth-Verlag, Leipzig.
Krantz, D.H. (1975). Color measurement and color theory: I. Representation theorem for Grassman structures. Journal of Mathematical Psychology, 12, 283-303.
Krauskopf, J., and Srebro, R. (1965). Spectral sensitivity of color mechanisms: derivation from fluctuations of color appearance near threshold. Science, 150, 1477-1479.
Krauskopf, J., Williams, D.R., and Heeley, D.M. (1982). The cardinal directions of color space. Vision Research, 22, 1123-1131.
Krauskopf, J., Zaidi, Q., and Mandler, M.B. (1986). Mechanisms of simultaneous color induction. Journal of the Optical Society of America, A3, 1752-1757.
Krauskopf, J., Williams, D.R., Mandler, M.B., and Brown, A.M. (1986). Higher order color mechanisms. Vision Research, 26, 23-32.
Krauskopf, J., and Zaidi, Q. (1986). Induced desensitization. Vision Research, 26, 753-762.
Krauskopf, J., and Farell, B. (1990). The influence of color on the perception of coherent motion. Manuscript submitted for publication.
von Kries, J. (1878). Beitrag zur Physiologie der Gesichtempfindungen. Arch. Anat. Physiol. Lpz., 23, 15-24.
von Kries, J. (1896). Über die Funktion der Netzhautstäbchen. Z. Psychol. Physiol. Sinnesorg., 9, 81-123.
von Kries, J. (1905). Die Gesichtsempfindungen. In W. Nagel (Ed.), Handb. d. Physiol. des Menschen, 3, 102-282.
Lang, S. (1966). Linear Algebra. Addison-Wesley, Reading, Massachusetts.
Larimer, J., Krantz, D.H., and Cicerone, C.M. (1975). Opponent-process additivity. I. Yellow/blue equilibria and nonlinear models. Vision Research, 18, 723-731.
LeGrand, Y. (1949). Les seuils différentiels de couleurs dans la théorie de Young. Rev. d'Opt., 28, 261.
LeGrand, Y. (1957). Light, Colour, and Vision. John Wiley and Sons, New York.
Lennie, P., Krauskopf, J., and Sclar, G. (1990). Chromatic mechanisms in striate cortex of macaque. Journal of Neuroscience, 10, 649-699.
MacAdam, D.L. (1970). Sources of Color Science. M.I.T. Press, Cambridge, MA.
MacLeod, D.I.A., and Boynton, R.M. (1979). Chromaticity diagram showing cone excitation by stimuli of equal luminance. Journal of the Optical Society of America, 69, 1183-1186.
Maxwell, J.C. (1857). The diagram of colors. Trans. Royal Society of Edinburgh, 21, 275-298.
Maxwell, J.C. (1860). On the theory of compound colours and the relations of the colours of the spectrum. Phil. Trans., 150, 57-84.
Mollon, J.D., and Polden, P.G. (1977). An anomaly in the response of the eye to light of short wavelengths. Phil. Trans. Royal Society (London), B278, 207-247.
Müller, G.E. (1924). Darstellung und Erklärung der verschiedenen Typen der Farbenblindheit. Vandenhoek and Ruprecht, Göttingen.
Müller, G.E. (1930). Über die Farbenempfindungen. Z. Psychol., Ergänzungsb., 17, 18.
Nagy, A.L. (1980). Short-flash Bezold-Brücke hue shifts. Vision Research, 20, 361-368.
Nagy, A.L. (1984). Additivity of dichromatic color matches to short-wavelength lights. Journal of the Optical Society of America, 1, 1087-1090.
Naka, K.I., and Rushton, W.A.H. (1966). An attempt to analyze colour reception by electrophysiology. Journal of Physiology, 185, 556-586.
Nassau, K. (1983). The Physics and Chemistry of Color: The Fifteen Causes of Color. John Wiley and Sons, New York.
Newton, I. (1674). An hypothesis explaining the properties of light. In T. Birch (Ed.), History of the Royal Society of London, 1757, 111, London.
Newton, I. (1704). Opticks. Sam. Smith and Benj. Walford, London.
Nimeroff, I. (1970). Deuteranopic convergence point. Journal of the Optical Society of America, 60, 960-969.
Nuberg, N.D., and Yustova, E.N. (1955). Trudy Gos. Opticheskogo Instituta, 24, 33-93, Moscow.
Pitt, F.H.G. (1935). Characteristics of dichromatic vision. Medical Research Council, Rep. of Committee on Physiology of Vision, No. XIV, Her Majesty's Stationery Office, London.
Pokorny, J., and Smith, V.C. (1977). Evaluation of single-pigment shift model of anomalous trichromacy. Journal of the Optical Society of America, 67, 1196-1209.
Pokorny, J., Smith, V.C., Verriest, G., and Pinckers, A.J.L.G. (1979). Congenital and Acquired Color Vision Defects. Grune and Stratton, New York.
Pokorny, J., Smith, V.C., Burns, S.A., Elsner, A., and Zaidi, Q. (1981). Modeling Blue-Yellow opponency. Proc. 4th Int. Congr. Int. Color Assoc. (AIC), Berlin.
Pokorny, J., Smith, V.C., and Went, L.N. (1981). Color matching in autosomal dominant tritan defect. Journal of the Optical Society of America, 71, 1327-1334.
Pugh, E.N. (1976). The nature of the pi1 mechanism of W.S. Stiles. Journal of Physiology, 257, 713.
Pugh, E.N., and Mollon, J.D. (1979). A theory of the pi1 and pi3 color mechanisms of Stiles. Vision Research, 19, 293-312.
Purdy, D. McL. (1931). Spectral hue as a function of intensity. American Journal of Psychology, 43, 541.
Resnikoff, H.L. (1974). Differential geometry and color perception. Journal of Mathematical Biology, 1, 97-131.
Rodieck, R.W. (1973). The Vertebrate Retina, Principles of Structure and Function. W.H. Freeman and Co., San Francisco.
Rushton, W.A.H. (1972). Visual pigments in man. In H.J.A. Dartnall (Ed.), Handbook of Sensory Physiology, 1, Photochemistry of Vision, 7, 364-394, Springer-Verlag, New York.
Schnapf, J.L., Kraft, T.W., and Baylor, D.A. (1987). Spectral sensitivity of human cone photoreceptors. Nature, 325, 439-491.
Schrödinger, E. (1920). Grundlinien einer Theorie der Farbenmetrik im Tagessehen. Ann. Physik, 63, 397-447, 481-520.
Schrödinger, E. (1925). Über das Verhältnis der Vierfarben zur Dreifarbentheorie. Sitz. Akad. Wiss. Wien, 134.
Shevell, S.K. (1978). The dual role of chromatic backgrounds in color perception. Vision Research, 18, 1649-1661.
Smith, V.C., and Pokorny, J. (1972). Spectral sensitivity of color blind observers and the cone pigments. Vision Research, 12, 2059-2071.
Smith, V.C., and Pokorny, J. (1975). Spectral sensitivity of the foveal cone photopigments between 400 and 700 nm. Vision Research, 15, 161-171.
Smith, V.C., Pokorny, J., and Zaidi, Q. (1983). How do sets of color-matching functions differ? In L.T. Sharpe (Ed.), Color Vision, Academic Press, London.
Stiles, W.S. (1939). The directional sensitivity of the retina and the spectral sensitivities of the rods and cones. Proceedings of the Royal Society (London), B 127, 64.
Stiles, W.S., and Burch, J.M. (1956). Interim report to the CIE on the NPL investigation of colour matching, 2. Zurich.
Stiles, W.S. (1955). The basic data of colour-matching. Physical Society Yearbook, 44-65, London.
Stiles, W.S. (1959). Color vision: The approach through increment threshold sensitivity. Proceedings of the National Academy of Science, 46, 100.
Stockman, A. (1989). Middle and long wavelength cone spectral sensitivities. Rank Prize Funds Symposium on Color Vision.
Thomson, L.C., and Wright, W.D. (1953). The convergence of the tritanopic confusion loci and the derivation of the fundamental response function. Journal of the Optical Society of America, 43, 890-894.
HUMAN COLOR MECHANISMS
2 59
Trezona. P.W. (1953). Additivity of colour equations. Proceedings of the Physics Society (Londonl, B66. 548. Trezona, P.W. (1954). Additivity of colour equations. Proceedings of the Physical Society (Londonl, B67, 513. Vos, J.J., and Walraven. P.L. (1971). On the derivation of the foveal receptor primaries. Vision Research, 11. 799-818. Vos, J.J.. Estevez. 0.. and Walraven. P.L. (1990). Improved color fundamentals offer a new view on photometric additivity. Vision Research, 30, 937-943. Walraven. P.L. (1974). A closer look a t the tritanopic convergence point. Vision Research, 14. 1339-1343. Webster, M.A., and Mollon, J.D. (1990). Changes in perceived color and lightness following selective adaptation of postreceptoral visual channels. Investigative Ophthalmology and Visual Science, 31, 109. Wiesel, T.N.. and Hubel, D.H. (1966). Spatial and chromatic interactions in the lateral geniculate body of the rhesus monkey. Journal of Neurophysiology. 29, 1115-1156. Williams, D.R.. and MacLeod, D.I.A. (1979). Interchangeable backgrounds for cone afterimages. Vision Research, 19, 867-877. Woodworth. R.S. (1938). Experimental Psychology. Henry Holt and Co.. New York. Wright, W.D. (1928). A re-determination of the trichromatic coefficients of the spectral colours. Trans. Optical Society, 30, 141. Wright, W.D. (1946). Henry Kimpton. London. Wright, W.D. (1952). Characteristics of tritanopia. Journal of the Optical Society of America, 42, 509. Sons, New York. Young, T. (1802). On the theory of light and colors. Phil. Trans. , 92, 20-7 1. Zaidi. Q. (1986). Adaptation and color matching. VisionResearch , 26. 1925-1938. Zaidi. Q., and Hood, D. (1988). Sites of instantaneous nonlinearities in the visual system. Investigative Ophthalmology and Visual Science, 29. 163. Zaidi, Q.. Pokorny, J., and Smith, V.C. (1989). Sources of individual differences in anomaloscope equations for tritan defects. Clinical Vision Sciences, 4, 89-94. Zaidi, Q., and Halevy, D. 
(1990).Mechanisms that signal color changes. Investigative Ophthalmology and Visual Science, 31, 109.
Parallel Processing and Visual Abnormalities
Applications of Parallel Processing in Vision, J. Brannan (Editor). © 1992 Elsevier Science Publishers B.V. All rights reserved.
Sensory and Perceptual Processing in Reading Disability
MARY C. WILLIAMS and WILLIAM LOVEGROVE
Introduction
Specific reading disability is a broad term which encompasses reading disabilities arising from a number of sources. A specific-reading-disabled (SRD) child is defined here as one of normal or better intelligence with no known behavioral or organic disorders who, despite normal schooling and average progress in other subjects, has a reading disability of at least 2.5 years (Badcock and Lovegrove, 1981; Critchley, 1964; Lovegrove et al., 1978, 1986; Slaghuis and Lovegrove, 1984; Stanley, 1975). Since reading involves a dynamic visual processing task that requires the analysis and integration of visual pattern information across fixation-saccade sequences, studies in the area of reading disability have explored the possibility that visual processing abnormalities contribute to reading difficulties. A number of studies have provided evidence for basic visual processing differences between normal and disabled readers, especially at early stages of visual processing. Differences have been reported in visual information store duration (Lovegrove and Brown, 1978; Stanley, 1975; Stanley and Hall, 1973a), in the rate of transfer of information from visual information store to short term memory (Lovegrove and Brown, 1978; Stanley and Hall, 1973a), and in the characteristics of visual short term memory itself (Stanley and Hall, 1973b). These results indicate that some disabled readers process information more slowly and have a more limited processing capacity than normal readers. Studies that used tasks relying less on dynamic visual processing and temporal resolution, and more on pattern-formation processes and long-term visual memory, however, have failed to show visual processing differences between normal and disabled readers (Benton, 1962, 1975; Vellutino, 1977, 1979a, 1979b, 1987; Vellutino et al., 1975a, 1975b), although the validity of these studies has been called into question (Fletcher and Satz, 1979a, 1979b).
Thus the long-standing debate as to whether visual factors play a significant role in reading disabilities has been complicated by the differences in methodological factors and the failure to distinguish between the measurement of temporal versus pattern-formation processes.
CHAPTER 8
Two processing systems
It has been suggested that the processing of temporal and pattern information is accomplished by two separate but interactive subsystems in vision with different spatiotemporal response characteristics (Kulikowski and Tolhurst, 1973; Tolhurst, 1973). Psychophysical work on visual spatiotemporal processing channels has indicated that low spatial frequency channels may elicit faster visual responses than high spatial frequency channels and are more sensitive to temporally modulated or moving stimuli, while high spatial frequency channels seem to be best designed for the detection of stationary patterns and resolution of fine pattern detail (Breitmeyer, 1975; Breitmeyer and Ganz, 1977; Breitmeyer et al., 1981; Tolhurst, 1975; Vassilev and Mitov, 1976; Watson and Nachmias, 1977). Additionally, for a given spatial frequency stimulus, response is faster and more transient when motion is being detected than when pattern is being detected (Kulikowski and Tolhurst, 1973; Watson and Nachmias, 1977). This conception of separable motion and pattern processing systems has been extensively studied and elaborated over the past two decades (Breitmeyer and Ganz, 1976; Maunsell, 1987; Stone et al., 1979), and has been incorporated in what has come to be known as the transient/sustained theory of visual perception. Breitmeyer and Ganz (1976) and Weisstein, Ozog and Szoc (1975), among others, have proposed two separate but overlapping subsystems in the visual system that respond selectively to different spatial and temporal frequencies. The transient system is most sensitive to low spatial frequencies, has a high temporal resolution, and responds transiently to quickly moving targets and to stimulus on- and offsets. The sustained system is most sensitive to high spatial frequencies, has a long response persistence and low temporal resolution, and responds in a sustained fashion to stationary or slowly moving targets.
This dual processing system has recently been reconceptualized in terms of the magnocellular and parvocellular systems of the primate visual system, which differ in color, acuity, speed, and contrast sensitivity (Livingstone and Hubel, 1987, 1988). The magnocellular and parvocellular systems are closely analogous to the previously proposed transient and sustained systems, respectively. [It should be noted that the validity of the transient/sustained distinction in human vision has been questioned (Burbeck, 1981; Lennie, 1980). While this distinction is still quite schematic and requires additional work for confirmation, it has provided a useful organizing hypothesis in the study of reading disability, and has been found to be a remarkable predictor of the visual processing characteristics of the reading disabled. The usefulness of the transient/sustained analysis as an organizing tool thus seems to warrant its continued use at the present time.] It has also been suggested that these subsystems contribute to different aspects of vision and have different perceptual functions. The transient system is thought to be involved in the perception of motion and depth, brightness discrimination, the control of eye movements, and the localization of targets in space, and seems to function to accomplish a quick global analysis of a visual scene. The sustained system seems to be best designed for the identification of patterns,
resolution of fine detail, and the perception of color (Breitmeyer, 1984; Breitmeyer and Ganz, 1976; Weisstein et al., 1975; Livingstone and Hubel, 1988). Although these two subsystems operate in parallel, it is believed that the transient system has temporal precedence: it operates preattentively and functions as an early warning system. It performs a global analysis of the incoming stimulus, parsing the field into units and regions and coding the position and movement of objects in space. The transient system may function to direct the sustained system to particularly salient areas where it might be most efficacious to perform a more detailed analysis of the shape and color of objects. The functioning of the sustained system, then, would depend to a degree on the prior output of the transient system.
Transient and sustained channels and reading
It has been demonstrated physiologically (Singer and Bedworth, 1973) and psychophysically (Breitmeyer and Ganz, 1976) that transient and sustained systems may mutually inhibit each other. In particular, if the sustained system is responding when the transient system is stimulated, the transient activity can terminate the sustained activity. These two subsystems and the interactions between them may serve a number of functions essential to the reading process. When reading, the eyes move through a series of rapid eye movements called saccades. Saccades are separated by fixation intervals lasting 200-250 msec. It is during these stationary periods that information from the printed page is seen. The average saccade length is 6-8 characters, or about 2 degrees of visual angle (Rayner and McConkie, 1976). Saccadic eye movements function to bring unidentified regions of text into foveal vision for detailed analysis during fixations. Foveal vision is the area of high acuity in the center of vision extending approximately 2 degrees (6-8 letters) around the fixation point on a line of text. Beyond this, acuity drops off rather dramatically. The role of transient and sustained subsystems in reading has recently been considered by Breitmeyer (Breitmeyer, 1980, 1983; Breitmeyer and Ganz, 1976). Figure 1 represents the hypothetical activity in the transient and sustained channels over a sequence of 3 fixations of 250 msec duration separated by 2 saccades of 25 msec duration. The sustained channel response occurs during fixations and may last for several hundred milliseconds. This response provides the details of what is being seen. The transient channel response is initiated by eye movements and lasts for much shorter durations. Consequently both systems are involved in reading. The duration of the sustained response may outlast the physical duration of the stimulus.
This is one form of visible persistence produced by the activation of sustained channels. The duration of visible persistence can reach several hundred milliseconds and increases with increasing spatial frequency (Bowling et al., 1979; Meyer and Maguire, 1977). If sustained activity (as shown in Figure 1, panel 2) generated in a preceding fixation persists into the succeeding one, it would interfere with processing in the second fixation. Consequently, it is evident that
for tasks such as reading, persistence across saccades presents a problem as it may lead to superimposition of successive inputs. Breitmeyer proposes that the problem posed by visible persistence is solved by rapid saccades, as shown in the bottom two panels of Figure 1.
Figure 1. A hypothetical response sequence of sustained and transient channels during three 250 msec fixation intervals separated by 25 msec saccades (panel 1). Panel 2 illustrates persistence of sustained channels acting as a forward mask from preceding to succeeding fixation intervals. Panel 3 shows the activation of transient channels shortly after each saccade which exerts inhibition (arrows with minus signs) on the trailing, persisting sustained activity generated in prior fixation intervals. Panel 4 shows the resultant sustained channel response after the effects of the transient-on-sustained inhibition have been taken into account.
Saccades not only change visual fixations, they also activate short latency transient channels (panel 3), which are very sensitive to stimulus movement. This transient activity, in turn, inhibits the sustained activity persisting from a previous fixation and prevents it from interfering with the succeeding one (Breitmeyer and Ganz, 1976; Matin, 1974). The result is a series of clear, unmasked, and temporally
segregated frames of sustained activity, each one of which represents the pattern information contained in a single fixation (Figure 1, panel 4).
In these terms, clear vision on each fixation results from the interactions between sustained and transient channels. The two subsystems and the interactions between them, therefore, seem to be important in facilitating normal reading. A deficit in either the transient or the sustained system, or in their interaction, may have harmful consequences for reading.
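The scheme in Figure 1 can be sketched numerically. The following is only a minimal illustration under stated assumptions, not Breitmeyer's model: the three 250 msec fixations and 25 msec saccades come from the text, whereas the 150 msec persistence value, the simplification that inhibition acts exactly at the onset of the next fixation, and all function names are assumptions introduced here.

```python
# Sketch of Figure 1: sustained activity across three fixations, with and
# without transient-on-sustained inhibition. All times are in msec.

FIXATION_MS = 250      # fixation duration, from the text
SACCADE_MS = 25        # saccade duration, from the text
N_FIXATIONS = 3        # number of fixations, from the text
PERSISTENCE_MS = 150   # ASSUMED: how long sustained activity outlasts a fixation


def fixation_onsets():
    """Onset time of each fixation."""
    return [i * (FIXATION_MS + SACCADE_MS) for i in range(N_FIXATIONS)]


def sustained_intervals(transient_inhibition):
    """(start, end) of the sustained response generated by each fixation.

    Without inhibition the response persists PERSISTENCE_MS past the end of
    its fixation (panel 2). With inhibition, the transient burst that follows
    the next saccade is ASSUMED to cut the response off at the onset of the
    next fixation (panel 4).
    """
    onsets = fixation_onsets()
    intervals = []
    for i, start in enumerate(onsets):
        end = start + FIXATION_MS + PERSISTENCE_MS
        if transient_inhibition and i + 1 < len(onsets):
            end = min(end, onsets[i + 1])
        intervals.append((start, end))
    return intervals


def forward_masking_ms(intervals):
    """Msec for which each response intrudes on the following fixation."""
    onsets = fixation_onsets()
    return [max(0, intervals[i][1] - onsets[i + 1])
            for i in range(len(onsets) - 1)]


print(forward_masking_ms(sustained_intervals(False)))  # [125, 125]
print(forward_masking_ms(sustained_intervals(True)))   # [0, 0]
```

With the assumed persistence, each frame of sustained activity would intrude 125 msec into the following fixation; with inhibition switched on, the frames are temporally segregated.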
Transient and sustained channels and reading disability
There is evidence that this transient-sustained relationship is different in normal and disabled readers. Lovegrove and coworkers have shown that visual processing differences between normal and disabled readers are evident when transient system processing is involved, but fail to surface under sustained processing conditions. For example, several studies have compared SRDs and controls on measures of visible persistence. Visible persistence is one measure of temporal processing in spatial frequency channels and refers to the continued perception of a stimulus after it has been physically removed. Visible persistence is assumed to reflect ongoing neural activity initiated by the stimulus presentation. In adults, the duration of visible persistence increases with increasing spatial frequency (Bowling et al., 1979; Meyer and Maguire, 1977). In a series of experiments, Badcock and Lovegrove (1981) and Slaghuis and Lovegrove (1985) have compared visible persistence in SRD and normal readers aged 8 to 15 years. The duration of visible persistence was determined by measuring the temporal separation required for the detection of a blank interval between two successively presented gratings. In normal readers, the duration of visible persistence varied as a function of the spatial frequency of the test grating. Lower spatial frequency stimuli produced shorter visible persistence durations. Visible persistence increased monotonically with increasing spatial frequency (Figure 2). The SRDs had a significantly smaller increase in visible persistence duration with increasing spatial frequency than did controls. In the SRD group, visible persistence was longer for low spatial frequencies and shorter for the higher spatial frequencies as compared with controls (Figure 2). The slope of the function relating visible persistence to spatial frequency was significantly flatter in SRDs than in normal readers (Lovegrove et al., 1980). The slope was 14.9 for normals but only 4.8 for SRDs. A normal sloping function can be explained in terms of transient-on-sustained inhibition. At low spatial frequencies, transient inhibition is relatively strong. At the offset of the stimulus, a transient system response inhibits the prolonged activity of the sustained channels. Because of this action, visible persistence is relatively short. As spatial frequency increases, transient system activity decreases, producing less sustained system inhibition. At high spatial frequencies, sustained channel activity persists longer after stimulus offset, resulting in an increase in visible persistence duration. This difference in visible persistence as a function of spatial frequency between controls and SRDs can be explained by suggesting a
disparate type of transient-sustained interaction present in those with specific reading disability. In this group, the existence of a transient system deficit would elevate visible persistence at low spatial frequencies by creating deficient transient-on-sustained inhibition. At higher spatial frequencies, if SRDs do have a weak transient system, their sustained systems are "disinhibited" from the normal tonic transient-on-sustained inhibition compared to controls (Lovegrove, Martin, and Slaghuis, 1986). This would increase the activity in their sustained systems and produce shorter persistence durations. This is argued to be the case because a manipulation known to reduce transient system activity - uniform field flicker masking (Breitmeyer et al., 1981) - has a much greater effect on visible persistence in controls than in SRDs. Furthermore, uniform field flicker masking reduces the persistence differences between the two groups.
Figure 2. Duration of visible persistence as a function of spatial frequency (c/deg) for controls and reading disabled.
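The slopes quoted in the text (14.9 for normal readers, 4.8 for SRDs) summarize how quickly persistence grows with spatial frequency. As a rough sketch of how such a slope is obtained, the code below fits a least-squares line to persistence durations. The data points and intercepts are invented for illustration and are not the published measurements; the slope is assumed to be in msec per c/deg, which the text does not state; the function name is hypothetical.

```python
def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# HYPOTHETICAL data: spatial frequencies (c/deg) and persistence (msec).
# Intercepts of 100 and 140 msec are assumptions chosen so that the two
# lines cross, as the group comparison in the text implies.
spatial_freqs = [1, 2, 4, 8, 12]
controls = [100 + 14.9 * f for f in spatial_freqs]
srds = [140 + 4.8 * f for f in spatial_freqs]

control_slope = least_squares_slope(spatial_freqs, controls)
srd_slope = least_squares_slope(spatial_freqs, srds)
print(round(control_slope, 1), round(srd_slope, 1))
```

With these invented intercepts the SRD line lies above the control line at 1 c/deg and below it at 12 c/deg, which mirrors the qualitative pattern described for Figure 2.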
Other evidence for a transient system deficit in specific reading disability has been advanced by the study of contrast sensitivity. Contrast sensitivity is the reciprocal of the minimum amount of contrast needed to perceive a grating pattern. Sensitivity is greatest for patterns of intermediate frequency and decreases for patterns that are of lower or
higher spatial frequency. Contrast sensitivity plotted as a function of stimulus spatial frequency is referred to as the contrast sensitivity function (CSF).
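For readers unfamiliar with the measure, the relation between threshold and sensitivity can be sketched as follows. The chapter does not say how grating contrast was defined; the sketch assumes the conventional Michelson definition and takes sensitivity as the reciprocal of the threshold contrast. The function names and example luminances are illustrative only.

```python
import math


def michelson_contrast(l_max, l_min):
    """Grating contrast from peak and trough luminance (ASSUMED definition)."""
    return (l_max - l_min) / (l_max + l_min)


def contrast_sensitivity(threshold_contrast):
    """Sensitivity is the reciprocal of the contrast needed for detection."""
    return 1.0 / threshold_contrast


def log_sensitivity(threshold_contrast):
    """Base-10 log sensitivity, a common way of plotting a CSF."""
    return math.log10(contrast_sensitivity(threshold_contrast))


# Illustrative values only: a grating swinging between 55 and 45 units of
# luminance has a contrast of 0.1; an observer who just detects a contrast
# of 0.01 has a sensitivity of 100.
print(michelson_contrast(55.0, 45.0))
print(contrast_sensitivity(0.01))
```

A lower threshold therefore means a higher sensitivity, so a group that needs more contrast to detect low spatial frequency gratings appears lower on the CSF at those frequencies.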
Figure 3. Contrast sensitivity function for controls and reading disabled, plotted against spatial frequency (c/deg).
CSFs have been measured in five separate samples of disabled and normal readers aged 8-14 years (Lovegrove et al., 1980, 1982; Martin and Lovegrove, 1984). SRDs showed a consistent pattern of lower sensitivity to low spatial frequencies (1-4 c/deg) than did
controls (Figure 3). The pattern of differences at the high spatial frequencies (12-16 c/deg) is less clear-cut. In some studies, the two groups did not differ in the CSF and in others the SRDs were slightly more sensitive than controls in that range. The differences between the groups were greatest with stimulus durations ranging from 150-500 msec. It should be noted that the magnitude of the differences between the groups on measures of pattern CSF is not as great as that found on measures of visible persistence (see Lovegrove, Martin, and Slaghuis, 1986). The finding of a small but consistent sensitivity loss at low spatial frequencies in SRDs is consistent with the proposal of a transient system deficit as argued by Lovegrove et al. (1982). The pattern CSF data are consistent with the visible persistence data reported above. SRDs are less sensitive than controls at low spatial frequencies but not at high spatial frequencies, where they are sometimes more sensitive. As the pattern CSF experiments used a two-alternative temporal forced choice procedure, the differences between the groups are unlikely to result from criterion differences. The finding that SRDs are at least as sensitive as controls to high spatial frequencies is also consistent with the general finding that SRDs have normal or correctable-to-normal acuity. This finding also further explains why some studies have, and some have not, found differences in visual processing between the two groups. The presence or absence of visual processing differences should reflect the channel whose activity has been measured. Transient system functioning can be investigated more directly by the measurement of flicker contrast sensitivity where, instead of a static display of a stimulus, a test grating is counterphased. It has been argued that flicker thresholds are mediated by the transient system (Kulikowski and Tolhurst, 1973).
A transient system deficit, then, should result in decreased sensitivity to flicker in SRDs, and this decrement should increase as temporal frequency increases. In these experiments, 13-year-old subjects detected a 2 c/deg sine wave grating that counterphased at 5, 10, 15, 20, and 25 Hz. On the average, controls were found to be more sensitive than SRDs across the range of temporal frequencies tested (Figure 4; Martin and Lovegrove, 1987). The sensitivity difference between the two groups increased with increasing temporal frequency. Similar results were obtained by Brannan and Williams (1988a) using uniform field flicker. These results add further to the argument that a difference in transient system function exists between SRDs and normal readers. In a second experiment, the flicker contrast sensitivity function was determined for the same two groups using spatial frequencies from 1 to 12 c/deg counterphasing at 20 Hz (Martin and Lovegrove, 1987). These results also indicated that the controls were more sensitive than SRDs across all spatial frequencies, the differences being larger at the higher spatial frequencies. A further series of experiments has been conducted comparing sustained system processing in controls and SRDs (Lovegrove et al., 1986). Using procedures, equipment, and subjects similar to those of the experiments outlined above, this series has failed to show any significant differences between the two groups in orientation or spatial frequency tuning. This implies that either there are no differences
between the groups in the functioning of their sustained systems, or that such differences are small compared to the transient system differences demonstrated.
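The counterphased gratings used in the flicker experiments described above can be sketched as a luminance profile whose contrast reverses over time. The 2 c/deg spatial frequency and the 5-25 Hz temporal frequencies come from the text; the sinusoidal form of the temporal modulation, the mean luminance, the contrast value, and the function name are assumptions made for illustration.

```python
import math


def counterphase_luminance(x_deg, t_sec, mean_lum=50.0, contrast=0.2,
                           sf_cpd=2.0, tf_hz=20.0):
    """Luminance of a counterphasing sine grating at position x and time t.

    The spatial sinusoid is multiplied by a temporal sinusoid, so bright and
    dark bars exchange places every half temporal period while the mean
    luminance stays constant. mean_lum and contrast are ASSUMED values.
    """
    spatial = math.sin(2.0 * math.pi * sf_cpd * x_deg)
    temporal = math.cos(2.0 * math.pi * tf_hz * t_sec)
    return mean_lum * (1.0 + contrast * spatial * temporal)


# At the centre of a bright bar (a quarter cycle of a 2 c/deg grating is
# 0.125 deg), the bar is brightest at t = 0 and darkest half a period later.
bright = counterphase_luminance(0.125, 0.0)
dark = counterphase_luminance(0.125, 0.025)   # half of the 50 msec period at 20 Hz
print(bright, dark)
```

Because only the contrast changes sign over time, a stimulus of this kind carries no static pattern, which is one reason such displays are taken to favor transient channels.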
Figure 4. Flicker contrast sensitivity for a counterphasing 2 c/deg grating as a function of temporal frequency (Hz) for controls and reading disabled.
In summary, four converging lines of evidence suggest a transient system deficit in SRDs. The results are internally consistent and consistent with the proposal of a transient system deficit. The differences between the groups are quite large and discriminate well between individuals in the different groups, with approximately 75% of SRDs showing reduced transient system sensitivity (Slaghuis and Lovegrove, 1985). At the same time, evidence to date suggests that the two groups do not differ in sustained system functioning. The two
findings taken together may help to explain some of the confusion reported in the literature over many years. In these terms, whether or not differences are found will depend on which system is investigated.
Perceptual consequences of a transient deficit
The studies reported above demonstrate that a large subgroup of disabled readers do have visual deficits. The visual deficits are specific spatiotemporal processing abnormalities, are systematic, and occur early in the visual processing hierarchy: that is, in transient system operations. Given what is known of the perceptual functions that the transient system performs, it is reasonable to expect that children with a reading disability stemming from transient malfunctions would show deficits in global, preattentive processing operations and on tasks requiring fine temporal resolution. A number of studies have expanded on Lovegrove's work by studying the perceptual consequences of a transient deficit in reading disabled children. These studies employed subject populations consisting of children reading at least one year below grade level ("disabled readers"), and children reading at or above grade level ("normal readers"). Since normal performance on standardized reading tests includes a range of scores +/- one standard deviation from the mean, this classification criterion would theoretically designate 84% of a normally distributed population as normal readers, and 16% as disabled readers, which is consistent with most estimates of the prevalence of reading disability in the general population (e.g., Critchley, 1964). All children in these studies were aged 8-12 years, were of normal or above normal intelligence, had normal color vision and normal or corrected-to-normal visual acuity, and scored within the normal range on tests of auditory discrimination. The normal and disabled reader groups were matched for age and IQ. These studies have demonstrated the existence of a number of perceptual deficits in disabled readers that would be predicted by a transient deficit hypothesis, i.e., the perceptual skills affected are those that are most likely to be mediated by the transient system.
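The 84% and 16% figures quoted above follow from the standard normal distribution. As a quick check, assuming reading scores are exactly normal and the cutoff lies exactly one standard deviation below the mean (the function name is illustrative), the proportions can be computed as follows.

```python
import math


def normal_cdf(z):
    """Cumulative probability of the standard normal distribution at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Proportion scoring more than one standard deviation below the mean
# ("disabled readers" under the criterion described in the text).
below_cutoff = normal_cdf(-1.0)
# Everyone else, including those scoring above +1 SD, counts as "normal".
at_or_above_cutoff = 1.0 - below_cutoff

print(round(below_cutoff * 100), round(at_or_above_cutoff * 100))  # 16 84
```

The exact values are about 15.9% and 84.1%, which round to the percentages given in the text.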
Williams and Bologna (1985) found that disabled readers show stronger perceptual grouping effects than normal readers, and are less proficient at selective attention operations. Given that perceptual grouping and selective attention operations have been linked to the activity of transient visual channels (Williams and Weisstein, 1980), these findings suggest that the suspected transient deficit in disabled readers may be producing deficits in perceptual grouping and selective attention operations. There is evidence that this inclination towards global processing in disabled readers is a consequence of a sluggish response from the transient system. Brannan and Williams (1987) found that disabled readers were not able to utilize information provided by a cue to target location in a target detection task if the cue preceded the target by less than 50 msec, whereas normal readers could utilize such information at shorter temporal separations. Likewise, it has been shown that disabled readers require more time than either normal readers or adults to make accurate judgements when asked to specify which of two simple words was presented first (Brannan and Williams, 1988b; May, Williams, and Dunlap, 1988). Given that the transient system is a likely candidate for the mediation of temporal order, these results suggest that disabled readers may have a slow or sluggish transient system. Transient operations may then consume too much of available processing capacity in disabled readers, leading to an over-restriction to global processing operations, and difficulty in proceeding to detail processing operations.
Figure 5. Search time in seconds plotted as a function of the target's position in the array for adults (triangles), normal readers (circles), and disabled readers (diamonds). (a) Clear image arrays (top panel). (b) Blurred image arrays (bottom panel).
Beyond a unitary deficit explanation
Although there is considerable evidence that the visual processing of disabled readers is characterized by poor temporal processing, which surfaces in what have been conceptualized as transient processing operations, it is misleading to consider the visual processing abnormalities under a unitary transient deficit hypothesis. For example, a number of studies have found that the functioning of sustained pattern formation processes is affected by the poor temporal processing of disabled readers. Williams, Brannan and Lartigue (1987) investigated detail processing operations in normal and disabled readers using a visual search task, where subjects were required to search for a target letter embedded in a list of distractor letters. This task requires a scrutiny of local differences between target and distractor items. They found that search times were much longer in disabled readers than in either normal readers or adults (Figure 5a). This difference was diminished, however, when high spatial frequencies (above 15 cycles/degree) were removed from the display by image blurring (Figure 5b). This method of spatial frequency filtering produces 100% contrast reduction at spatial frequencies above 15 cycles/degree, and reduces the contrast of the remaining high spatial frequencies as well. Such contrast reduction would decrease the amplitude of the high spatial frequency component of visual response, and as a result, it is quite plausible that the latency and/or rise time would be increased as well. If high spatial frequency channels mediate local processing operations, then this manipulation may weaken and/or slow the temporal development of local information. This would reestablish the temporal precedence of global information, and simulate a normal relationship between global and local operations.
Thus the over-restriction to global processing and consequential difficulty in the progression to detail processing produced by a sluggish transient system in disabled readers may be overridden by manipulating the relative timing of low and high spatial frequency information. Although the primary visual deficit of disabled readers seems to involve temporal, or transient, processing operations, the sustained pattern formation processes are necessarily affected through a different pattern of interactions over time.
Visual masking studies
More direct measures of the time course of visual processing in normal and disabled readers have been obtained in recent visual masking studies (Williams et al., 1989, 1990a). In visual masking, two temporally separated visual stimuli are presented, and one stimulus, called the mask, interferes with the processing of the other stimulus, called the target. Williams et al. (1990a) employed a masking of pattern by light paradigm to measure visual integration and persistence characteristics of normal and disabled readers. Masking of pattern by light is a special case of visual masking where the visibility of the target pattern is reduced by a spatially uniform luminance mask flash that overlaps the target. It is assumed that target visibility is degraded to the extent that the sensory responses to the target and mask persist and overlap in time or are integrated during a brief temporal interval. Thus masking by light constitutes a measure of the temporal resolution limits of the visual system imposed by either response persistence or response integration. Disabled readers showed more prolonged masking as compared with normal subjects (Figure 6a, b, c), suggesting that their visual processing is characterized by a longer integration time and/or longer visual persistence. Disabled readers also showed enhancement effects rather than masking effects when stimuli were presented in the peripheral retina, suggesting that their peripheral visual processing is characterized by a disinhibition or enhancement of sustained pattern information due to a diminished inhibitory effect imposed by peripheral transient channels. This disinhibition effect provides additional evidence for the proposal that sustained pattern formation processes are affected by the temporal processing characteristics of disabled readers through an abnormal pattern of interactions over time.
Figure 6. Masking functions obtained from adult subjects (a), normal readers (b), and disabled readers (c) under foveal (circles) and peripheral (triangles) viewing. Accuracy (measured as percent correct) for detecting the target lines when preceded or followed by the masking stimulus at various delays (msec) is plotted relative to the target-lines-alone accuracy level. Negative delays indicate that the mask preceded the target (forward masking), and positive delays indicate that the mask followed the target (backward masking).
b.
20
40
60
80
Delay lmrecl
Figure 7 . A schematic U-shaped metacontrast function together with hypothetical visual responses to a target and to a mask. (a) Schematic U-shaped metacontrast function: accuracy is plotted against delay. (b) Hypothetical visual responses. (1) simultaneous onset of target and mask. Transient responses do not overlap with sustained responses, and there is no masking, as shown by the arrow labelled on the left. (2) Target leads the mask by 20 msec. The transient response to the mask slightly overlaps the beginning of the sustained response to the target and some interference occurs. There is a small amount of masking, as shown by arrow 2 on the left. (3) difference in onsets of target and mask produce maximum overlap of transient and sustained components, and thus. the greatest amount of interference. As shown by arrow 3 on the left, this is the point of maximum masking. (4) Target leads mask by 60 msec. The transient response to the mask again only slightly overlaps the sustained response to the target, and the amount of interference is again small. Masking begins to decrease, as shown by arrow 4 on the left. (5)Target leads mask by a long delay. No interference occurs, and from this point on, no masking occurs either.
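The timing assumptions summarized in Figure 7 can be illustrated with a small numerical sketch. It is not the model used by any of the cited authors: the Gaussian response shapes, the latencies (50 msec for the transient response, 90 msec for the sustained response), the widths, and the scaling to a 25-point accuracy drop are all hypothetical values chosen so that the overlap of the mask's transient response with the target's sustained response is greatest at an intermediate delay.

```python
import math


def gaussian(t, peak, width):
    """Unit-height Gaussian response profile (an assumed shape)."""
    return math.exp(-0.5 * ((t - peak) / width) ** 2)


def masking_strength(delay_ms,
                     sustained_peak=90.0, sustained_width=12.0,
                     transient_peak=50.0, transient_width=8.0,
                     step_ms=1.0, window_ms=400.0):
    """Overlap of the mask's transient response with the target's sustained response.

    The target appears at t = 0 and the mask at t = delay_ms.
    All latencies and widths are hypothetical illustration values.
    """
    overlap = 0.0
    t = 0.0
    while t <= window_ms:
        sustained_to_target = gaussian(t, sustained_peak, sustained_width)
        transient_to_mask = gaussian(t, delay_ms + transient_peak, transient_width)
        overlap += sustained_to_target * transient_to_mask * step_ms
        t += step_ms
    return overlap


delays = [0, 20, 40, 60, 80, 120]
strengths = [masking_strength(d) for d in delays]
strongest = max(strengths)

# Relative accuracy: 0 is the target-alone baseline, negative values are masking.
# The factor of 25 is an arbitrary scale for the maximum accuracy drop.
relative_accuracy = [-25.0 * s / strongest for s in strengths]
dip_delay = delays[relative_accuracy.index(min(relative_accuracy))]

for d, a in zip(delays, relative_accuracy):
    print(f"delay {d:3d} msec: relative accuracy {a:6.1f}")
print("dip at", dip_delay, "msec")
```

In this sketch the dip falls at the delay equal to the latency difference between the two responses. Raising `transient_peak` (a more sluggish transient response) moves the dip to a shorter delay, which is the direction of inference the two-component theories draw from dip location.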
Additional measures of the time course of visual processing in normal and disabled readers have been obtained by Williams et al. (1989). In this study a metacontrast masking paradigm was used to index processing rate in both foveal and peripheral vision. In metacontrast, a target is briefly presented and is followed at various delays by a spatially adjacent masking stimulus. Accuracy for the target is measured as a function of the delay between the target and the mask. The time course of the accuracy function is thought to reveal the time course of the processing of the target and mask. The accuracy functions typically obtained in metacontrast experiments are U-shaped, much like the schematic one shown in Figure 7a. Accuracy first decreases, reaches a low point at an intermediate delay, and then increases again to baseline level. Two-component metacontrast theories (Breitmeyer and Ganz, 1976; Matin, 1975; Weisstein, 1968, 1972; Weisstein et al., 1975) attribute U-shaped metacontrast functions to the interaction of transient and sustained components of visual response. These models posit metacontrast masking as the result of the transient response to the later occurring mask catching up with, and inhibiting, the slower sustained response to the target. For this to occur, the mask must be delayed in time relative to the target. Figure 7b illustrates these timing assumptions. The dip, or lowest accuracy point in the function, is the point of maximum inhibition. As the dip shifts rightward toward longer delays, it could be assumed that some aspect of the transient (inhibitory) response to the mask is traveling faster. This is simply because something that occurs later has to travel faster to catch up. Thus, according to these models, dips at long delays between the target and mask imply fast processing, and dips at short delays imply slower processing.
Figure 8. Metacontrast functions obtained from adults, normal readers, and disabled readers. Accuracy (measured as percent correct) for detecting the target lines when followed by the masking stimulus at various delays (msec) is plotted relative to the target lines-alone accuracy level (horizontal line), which was set at a level between 70 and 80% before each session.
Williams et al. (1989), using diagonal line segments as targets and a surrounding outlined square as a masking stimulus, obtained the metacontrast functions shown in Figure 8. The differences in dip location in the functions obtained from adults, normal readers, and disabled readers indicated that the rate of foveal visual processing was fastest in normal adults, slowest in reading disabled children, and intermediate in normal reading children. These findings are consistent with previous reports of increased temporal resolution with age (Brannan and Williams, 1988a,b), and sluggish temporal processing in disabled readers, as described above. The magnitude of metacontrast masking increased in the peripheral retina in adults and normal readers (Figure 9a,b), which is consistent with previous reports of increased masking effects in the periphery (Kolers and Rosner, 1960; Stewart and Purcell, 1970; Williams and Weisstein, 1981). There was, however, an absence of metacontrast masking in disabled readers with peripheral presentations (Figure 9c), a finding which is compatible with Geiger and Lettvin's (1987) finding that dyslexic subjects show a smaller magnitude of simultaneous lateral masking in the periphery. Geiger and Lettvin attribute the reduced masking effect to an attentional strategy of dyslexic subjects to allocate more processing capacity to peripheral as compared to foveal areas of the visual field. An alternate explanation can be derived from the two-component masking theories described above, which attribute metacontrast masking to the inhibition of relatively slow pattern formation processes by short-latency temporal processing channels. These theories would predict that a temporal processing deficit would lead to an attenuation or elimination of metacontrast.
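The two quantities read off a metacontrast function in this discussion, accuracy relative to the target-alone baseline and the delay at which the dip occurs, can be computed as in the sketch below. The percent-correct values are invented purely for illustration and are not data from Williams et al. (1989); the group labels and function names are likewise hypothetical. The baseline of 75% is simply a value inside the 70 to 80% range mentioned in the figure captions.

```python
def relative_accuracy(percent_correct_by_delay, target_alone_percent):
    """Express accuracy at each delay relative to the target-alone baseline."""
    return {delay: score - target_alone_percent
            for delay, score in percent_correct_by_delay.items()}


def dip_location(relative_scores):
    """Return (delay, relative accuracy) at the point of maximum masking."""
    dip_delay = min(relative_scores, key=relative_scores.get)
    return dip_delay, relative_scores[dip_delay]


# Hypothetical percent-correct values by delay (msec); NOT experimental data.
example_groups = {
    "group_A": ({0: 75, 30: 70, 60: 62, 90: 55, 120: 63, 150: 72, 180: 75}, 75),
    "group_B": ({0: 75, 30: 64, 60: 58, 90: 66, 120: 72, 150: 75, 180: 75}, 75),
}

dips = {}
for name, (scores, baseline) in example_groups.items():
    dips[name] = dip_location(relative_accuracy(scores, baseline))
    print(name, "dip at", dips[name][0], "msec, relative accuracy", dips[name][1])
```

Under the two-component account, the later dip of the hypothetical group_A would be read as a faster transient response than that of group_B, and the depth of the dip as the strength of transient-on-sustained inhibition.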
The role of visual masking in the reading process
The implication of these different patterns of response persistence and integration and of transient/sustained interactions in the visual processing of normal and disabled readers can best be understood within the context of the role of masking in the reading process (Breitmeyer and Ganz, 1976; Breitmeyer, 1980, 1983, 1984; Matin et al., 1972). As described above, sustained channel activity is generated during each fixation interval of a fixation-saccade sequence in reading (Figure 1). Due to the long response persistence of sustained channels, this sustained channel activity could interfere, via forward masking by integration, with the sustained activity generated during the following fixation. The finding that disabled readers show longer visual integration times (Figure 6) suggests that this masking effect may be more severe in disabled readers than in normal readers. The models additionally propose that transient channel stimulation produced by a saccade normally serves to inhibit the trailing persistence of the sustained channels, thus producing clear fixation intervals. Metacontrast masking, then, serves as an afferent neural mechanism for saccadic suppression. Breitmeyer (1980, 1983) has suggested that metacontrast, as a short-range mechanism for saccadic suppression, is too weak in the fovea to produce the amount of inhibition necessary to suppress forward masking by integration effects. He suggests that an additional long-range suppression mechanism, where peripheral transient mechanisms generated by eye movements inhibit the foveal sustained response, would be required to produce the necessary saccadic suppression. Breitmeyer and Valberg (1979) have, in fact, provided evidence for the existence of such a long-range suppression mechanism. The fact that disabled readers show U-shaped metacontrast functions in the fovea (Figure 8) suggests that transient-on-sustained inhibitory interactions are occurring, and thus the short-range mechanism for saccadic suppression is operational in these readers. The fact that the dip in the metacontrast function of the disabled readers occurs at a shorter delay than that of the other subject groups suggests that their foveal transient response is sluggish, which may render the metacontrast mechanism less effective in producing saccadic suppression. Increased fixation durations or intersaccade intervals would be required to compensate for a sluggish metacontrast mechanism in order to produce clear fixation intervals, an effect commonly observed in the reading behavior of disabled readers (Pirozzolo, 1979).
Figure 9. Metacontrast functions obtained from adults (a) (top panel, preceding page); normal readers (b) (bottom panel, preceding page); and disabled readers (c) with foveal (circles) and peripheral (triangles) viewing. Accuracy (measured as percent correct) for detecting the target lines when followed by a masking stimulus at various delays (msec) is plotted relative to the target lines-alone accuracy level (horizontal line), which was set at a level between 70 and 80% before each session.
One major performance difference between disabled and normal readers involved the peripheral transient inhibitory response (Figure 9), with disabled readers failing to show local transient inhibitory operations in the periphery. Since the long-range masking mechanism proposed by Breitmeyer (1980, 1983) depends on the spatial pooling of local transient activity in the periphery, it may be that the long-range mechanism for saccadic suppression is dysfunctional in disabled readers.
The time course of the processing of words
Williams, Weisstein, and LeCluyse (1990) investigated how the temporal processing differences observed between normal and disabled readers affect the processing of words. A metacontrast masking procedure was employed to obtain estimates of the time course of the processing of words in these subject groups. Williams et al. used the single letters "S" and "N" as targets. The target letters were presented either alone, with a three-letter mask that together with the target formed a word, or followed at various delays by the three-letter mask (Figure 10).
Figure 10. Target letters alone (S, N; top panel), target letters with the three-letter masks that together with the target form words (RUST, RUNT; middle panel), and three-letter masks alone (RU T; bottom panel).
Figure 11 shows the metacontrast functions collected on normal and disabled readers. The points located above the baseline represent enhancement of accuracy for detecting the targets when followed by a mask over accuracy for the target letters when presented alone. Normal readers showed significant enhancement of accuracy for the target letters when followed by a word mask at simultaneous presentation of the target letter and word mask (0 delay) and at the shortest delay between the target and mask (30 msec). This enhancement is a manifestation of the phenomenon known as the word-letter effect (Johnston and McClelland, 1973; Matthews, Weisstein, and Williams, 1974; Reicher, 1969; Wheeler, 1970), where letters are detected better within the context of words than when presented alone. Disabled readers failed to show a word-letter effect. According to two-component metacontrast theories (Breitmeyer, 1984; Breitmeyer and Ganz, 1976; Matin, 1975; Weisstein, Ozog, and Szoc, 1975), the early portion of the metacontrast function is where the sustained, pattern formation responses to the target and mask can interact, suggesting that the enhancement effect found in normal readers is related to the sustained, high spatial frequency component of visual response. Disabled readers did not show the enhancement effect that was found with normal readers, suggesting that the sustained pattern information of the target with a word mask is different in normal and disabled readers.
Figure 11. Metacontrast functions collected on normal and disabled readers with clear stimulus images. Accuracy for detecting the target letters when followed by a word mask at various delays (msec) is plotted relative to accuracy for the target letters-alone (horizontal line). Positive accuracy indicates that the mask enhanced the visibility of the target, and negative accuracy indicates that the mask impaired the visibility of the target.
Another important aspect of these data is that the metacontrast functions are U-shaped, with maximum masking, or a dip in accuracy, occurring at intermediate delays. The temporal location of the dip occurs at longer delays in the function obtained from normal readers as compared with that of disabled readers, suggesting that the processing of words is slower in disabled readers. Specifically, metacontrast theories would interpret this difference as indicating that the transient response to the word mask has a slower rise time or longer latency in disabled readers. Disabled readers also showed a smaller magnitude of masking at the dip, which is consistent with either a weaker transient response to the word or a more resilient sustained response to the target. Thus the time course of the processing of words appears to be different in normal and disabled readers.
Williams, Weisstein, and LeCluyse (1990) also measured the time course of the processing of words using blurred stimulus images to evaluate the contribution of high spatial frequency information. Image blurring was accomplished by covering the monitor screen with a sheet of frosted acetate. By measuring contrast reduction of sine wave gratings on an oscilloscope screen, it was determined that this method of image blurring produced contrast reduction mainly in the high spatial frequency range. The blurred stimulus images were found to produce a distinctly different set of functions than were obtained with the clear stimulus images (Figure 12). In the function obtained from normal readers, the enhancement effect was eliminated and the dip shifted leftward to a shorter delay. The finding that diminishing high spatial frequency response diminishes the enhancement effect implies that high spatial frequency channels in the visual system are involved in the processing of words. The leftward shift of the dip can be accounted for in one of two ways. According to metacontrast theories, temporal shifts of the dip from one delay to another would result if either the timing of the transient response to the mask or the sustained response to the target changed. Since image blurring presumably leaves the transient component of response unaltered, it seems more likely that the sustained response to the target becomes relatively faster. Accordingly, the word-mask would have to be delayed less in order for it to coincide in time with the target, producing a shift in the maximal masking effect to a shorter delay.
The function obtained from disabled readers showed enhancement and a late dip, characteristics evident in the data of normal readers in the clear image condition. The clear image data suggested that the transient response to the word stimulus may be faster in normal than in disabled readers. In disabled readers, there may be a lack of temporal separation and therefore less temporal distinction between transient and sustained responses (Williams et al., 1987) due to a sluggish response from the transient system. Image blurring, which is thought to diminish the contrast of mainly high spatial frequencies (Ikeda and Wright, 1972), may function to decrease the amplitude of the sustained component of visual response, and as a result, may increase the latency and/or rise time as well. Decreasing the amplitude of the longer-persisting high spatial frequency response may function to eliminate this temporal smearing and to create a temporal separation between transient and sustained components of visual response. The result may be a disinhibition of each component of response, thus allowing for a normal pattern of interactions over time. The fact that there was an increase in the magnitude of masking with image blurring suggests that when sustained activity is diminished, the relationship between the two systems may be restored, and transient-on-sustained inhibition improved. Comparison of the normal reader data obtained with clear stimulus images and the disabled reader data obtained with blurred stimulus images shows quite clearly that image blurring renders the performance of disabled readers comparable to that of normal readers in the processing of words.
Figure 12. Metacontrast functions collected on normal and disabled readers with blurred stimulus images. Accuracy for the target letters when followed by a word mask at various delays (msec) is plotted relative to accuracy for the target letters-alone (horizontal line). Positive accuracy indicates that the mask enhanced the visibility of the target, and negative accuracy indicates that the mask impaired the visibility of the target.
Williams, Weisstein, and LeCluyse (1990) investigated the nature of the contribution of transient, low spatial frequency information to the processing of words by observing changes produced by ramping the stimuli. Since the transient component of visual response is more sensitive to high temporal frequencies than to low temporal frequencies, and ramping a stimulus diminishes high temporal frequencies, ramping a stimulus should diminish the activation of the transient component of visual response. Metacontrast theories attribute U-shaped metacontrast functions to transient versus sustained latency differences, and would predict that an increase in the contribution of sustained relative to transient channel processing should produce a shift in maximal masking effects to shorter delays. Moreover, since the masking effect depends on the inhibitory effect exerted by the transient response to the mask, diminishing the transient component of response should attenuate the overall masking effect. Both predictions were borne out in the functions produced by the word-mask in normal subjects (Figure 13). The temporal functions produced by the word-mask showed an attenuated masking effect and a shift in dip location to shorter delays. Moreover, the masking effect, though diminished, was broader or more prolonged, as would be predicted if the contribution of higher spatial frequency channels, having a longer response persistence, was increased relative to that of the shorter persistence low spatial frequency response. The fact that the data of normal subjects in the ramped condition resemble those of disabled readers in the clear image condition supports the contention that the visual processing of disabled readers is characterized by a slower transient response and increased visual persistence. This finding is consistent with previous studies showing that temporal processing differences between normal and disabled readers disappear when transient system activity is reduced (Slaghuis and Lovegrove, 1984). The data of disabled readers in the ramped condition (Figure 13) are not distinctly different from those in the clear image condition, indicating that with clear images, the visual processing of disabled readers is already characterized by a slower transient response and increased visual persistence.
The masking studies described above provide additional evidence that the visual processing differences observed between normal and disabled readers are related to the relative timing of low and high spatial frequency channels in disabled readers. The results of these studies suggest that a sluggish transient system in disabled readers may result in a lack of temporal separation between transient and sustained processes. The functioning of the sustained system is affected through a different pattern of interactions over time. Image blurring, which was assumed to reestablish normal temporal relationships, also rendered the performance of disabled readers comparable to that of normal readers.
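The premise of the ramping manipulation, that a gradual onset and offset carries less high temporal frequency energy than an abrupt one, can be checked with a short numerical sketch. Everything specific here is an assumption made for illustration: the 256-sample window, the pulse and ramp durations, the linear ramp shape, and the cutoff bin separating "low" from "high" temporal frequencies. The actual ramp profile used by Williams, Weisstein, and LeCluyse (1990) is not given in the text.

```python
import cmath
import math


def pulse(n_samples, onset, duration, ramp):
    """Luminance pulse sampled at equal time steps.

    ramp = 0 gives an abrupt (rectangular) pulse; ramp > 0 gives linear
    on and off ramps of that many samples. All values are hypothetical.
    """
    offset = onset + duration
    signal = []
    for i in range(n_samples):
        if i < onset or i >= offset:
            value = 0.0
        elif ramp > 0 and i < onset + ramp:
            value = (i - onset) / ramp
        elif ramp > 0 and i >= offset - ramp:
            value = (offset - i) / ramp
        else:
            value = 1.0
        signal.append(value)
    return signal


def high_frequency_fraction(signal, cutoff_bin):
    """Fraction of non-DC spectral energy at or above cutoff_bin (plain DFT)."""
    n = len(signal)
    low = 0.0
    high = 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(signal))
        energy = abs(coeff) ** 2
        if k >= cutoff_bin:
            high += energy
        else:
            low += energy
    return high / (low + high)


# With an assumed 1 msec sample spacing, bin 8 of 256 is roughly 31 Hz.
abrupt = pulse(256, onset=64, duration=128, ramp=0)
ramped = pulse(256, onset=64, duration=128, ramp=32)

abrupt_fraction = high_frequency_fraction(abrupt, cutoff_bin=8)
ramped_fraction = high_frequency_fraction(ramped, cutoff_bin=8)
print(f"abrupt pulse: {abrupt_fraction:.4f} of energy above cutoff")
print(f"ramped pulse: {ramped_fraction:.4f} of energy above cutoff")
```

The ramped pulse retains a much smaller share of its energy above the cutoff, which is the sense in which ramping is expected to reduce the drive to channels tuned to high temporal frequencies. The sketch says nothing about how strongly any real channel would respond.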
The effect of wavelength on the time course of visual processing
Recent psychophysical and physiological data indicate that color or wavelength differentially affects the response characteristics of transient and sustained processing channels, and that wavelength can affect the relative contributions of transient and sustained channels to the processing of a stimulus. Physiological observations of the primate visual system indicate that there are differences in the color selectivity of these systems (Livingstone and Hubel, 1988), and that a steady red background light attenuates the response of transient channels (Dreher, Fukuda, and Rodieck, 1976; Kruger, 1977; Schiller and Malpeli, 1978). A recent investigation by Breitmeyer and Williams (1990) provides evidence that variations in wavelength produce similar effects in the human visual system. They found that the magnitude of both metacontrast and stroboscopic motion was decreased when red as compared with equiluminant green or white backgrounds were used. According to transient-sustained theories of metacontrast and stroboscopic motion, these results indicate that the activity of transient channels is attenuated by red backgrounds. Williams, Breitmeyer, and Lovegrove (1990), using a metacontrast paradigm, additionally found that the rate of processing in transient channels increases as wavelength decreases, and that red light enhances the activity of sustained channels. Other human psychophysical studies have shown that transient channels are not sensitive to changes in hue when luminance transients are not also present (Bowen et al., 1977), and that the time course of visual processing is wavelength-specific (Walters, 1970; Foster, 1979).
Figure 13. Metacontrast functions collected on normal and disabled readers with ramped stimulus images. Accuracy for the target letters when followed by a word mask at various delays (msec) is plotted relative to accuracy for the target letters-alone (horizontal line). Positive accuracy indicates that the mask enhanced the visibility of the target, and negative accuracy indicates that the mask impaired the visibility of the target.
Solman, Dain, and Keech (1990) recently investigated the effect of color on the visual processing characteristics of reading disabled as compared with normal subjects. Measuring color mediated contrast sensitivity in normal and disabled readers, they found that there is a decrement in contrast sensitivity with increasing spatial frequency in disabled but not normal readers when colored versus noncolored filters were used. It was postulated that restricting the wavelength of the incoming light reduces the level of sustained activity. This manipulation thus may have compensated for a transient system deficit by limiting the activity in the sustained system, thus improving transient-on-sustained inhibition.
Williams, Faucheux and LeCluyse (1990) utilized a metacontrast paradigm to obtain direct measures of the effects of color on temporal visual processing in normal and disabled readers. Using white diagonal lines as targets, and a white, red, or blue 12 c/deg flanking grating as a mask, Williams et al. obtained the metacontrast functions shown in Figures 14 and 15. Normal readers showed differences in both enhancement and dip location with the different colored masks (Figure 14).
Figure 14. Metacontrast functions collected on normal readers with masks varying in wavelength. Accuracy for the target lines when followed by a flanking grating mask at various delays (msec) is plotted relative to accuracy for the target lines-alone (horizontal line). Positive accuracy indicates that the mask enhanced the visibility of the target, and negative accuracy indicates that the mask impaired the visibility of the target.
The fact that the delay of maximum masking occurred at a shorter delay for the red as compared with the other masks suggests that the processing rate in transient channels is slowest for the red masks. This finding may be related to previous findings that red light inhibits the activity of transient channels (Dreher, Fukuda, and Rodieck, 1976; Breitmeyer and Williams, 1990). Along the same lines, the fact that the delay of maximum masking occurred at a longer delay for the blue as compared with the other masks suggests that blue light may enhance the processing rate in transient channels. Next, consider the differences found in the magnitude of masking at the dips in the functions. Again, this is the point in the metacontrast function where the transient response to the mask maximally overlaps with, and inhibits, the sustained response to the target. Since the target was always the same, differences in the magnitude of masking at the dip can be attributed to differences in the response magnitude of transient channel activity generated by the mask. The fact that there was a smaller magnitude of masking with the red as compared with the shorter wavelength masks suggests that transient channels respond less vigorously to long wavelength stimuli.
At simultaneous presentation of the target and mask, target identification accuracy was enhanced over the accuracy level for the targets when they were presented alone. This finding is consistent with previous reports of contextual information enhancing the detectability of briefly presented targets (Weisstein and Harris, 1974; Williams and Weisstein, 1981, 1984). According to masking models based on transient/sustained theory, this is the part of the function where the sustained components of response to the target and mask can interact. Since the enhancement effect varied with the wavelength of the mask, it appears that the sustained component of visual response is sensitive to variations in wavelength. The results indicate that the sustained channels respond with greater sensitivity to red light as compared with blue and white light.
Figure 15. Metacontrast functions collected on disabled readers with masks varying in wavelength. Accuracy for the target lines when followed by a flanking grating mask at various delays (msec) is plotted relative to accuracy for the target lines-alone (horizontal line). Positive accuracy indicates that the mask enhanced the visibility of the target, and negative accuracy indicates that the mask impaired the visibility of the target.
Disabled readers also showed differences in dip location and magnitude of masking with the wavelength of the mask (Figure 15). Overall, dip locations occurred at shorter delays for disabled as compared with normal readers, suggesting that the processing rate in transient channels is slower in disabled readers. As with the normal readers, however, the processing rate in transient channels appears to be slowest with the red mask and fastest with the blue mask. Disabled readers generally showed a smaller magnitude of masking than normal readers, again suggesting that, overall, transient channels respond less vigorously in disabled readers. As with the normal readers, there was a smaller magnitude of masking with the red as compared with the shorter wavelength masks, suggesting that transient channels respond less vigorously to long wavelength stimuli. Finally, it is interesting to note that the function produced by the blue mask in disabled readers is similar in time course to the function produced by the white mask in normal readers. This finding suggests that blue light produces a normal time course of processing in disabled readers, and is consistent with the contention that blue light may enhance the processing rate in transient channels.
The effect of image blurring and wavelength on reading performance
Given the systematic effects of image blurring and color on the perceptual performance of the reading disabled, and the fact that these manipulations can render their performance comparable to that of normal readers, Williams, LeCluyse and Faucheux (1990) investigated the effects of image blurring and color on actual reading performance. To assess reading performance, reading comprehension was measured for standardized reading passages under three temporal presentation conditions. In the first condition, the passages were presented one word at a time, each word being centered on a computer monitor. In this reading condition, eye movements were not required for successful reading. In the second condition, the passages were presented one line at a time, with the words painted from left to right in a moving window fashion. In this condition, eye movements were required, but were guided by the presentation of the text. In the third condition, the passages were presented one line at a time, with all of the words in each line being painted simultaneously. This was a free eye movement condition: eye movements were required and were under each subject's control. The grade level and presentation rate of the passages were determined by each subject's performance on a standardized reading test. The passages were presented with white, red, blue, and white-blurred text on a black background in separate blocks.
Figure 16 shows the reading comprehension scores obtained with clear and blurred white text. When the passages were presented with clear images, the three presentation conditions did not have differential effects on the reading comprehension of normal readers. Disabled readers, on the other hand, performed more poorly in the free eye movement condition than in the other conditions. When the passages were presented with blurred images, the performance of normal readers deteriorated as eye movement control became more difficult. Disabled readers showed an opposite trend: their performance improved in the free eye movement condition, again demonstrating the beneficial effects of image blurring on the performance of this subject group.
Figure 16. Reading comprehension (measured as percent correct on literal recall questions) for clear white text (circles) and blurred white text (triangles) under three temporal presentation conditions. (a) (top panel) Normal readers. (b) (bottom panel) Disabled readers.
Figure 17 shows the reading comprehension scores obtained with the white as compared with colored text. The pattern of results with the blue text is very similar to that found with the blurred text. The red text, on the other hand, had a detrimental effect on the reading performance of both normal and disabled readers.
These results can best be interpreted within the model of fixation-saccade processes in reading described above (Breitmeyer and Ganz, 1976; Breitmeyer, 1980, 1983; Matin et al., 1972). First, the blur manipulation improved the reading comprehension performance of normal readers in the no eye movement condition, and decreased their performance in the free eye movement condition. Under normal circumstances, when no eye movements are involved in reading, there would be a lack of transient inhibition generated by eye movements, and thus a greater degree of masking by integration from one fixation to the next. Image blurring, which may function to diminish the trailing persistence of high spatial frequency channels, may serve to diminish this masking effect and thus render the reading process more efficient. Thus the reading performance of normal readers actually improves with blurred images in this condition. Similarly, the reading performance of normal readers may improve in this condition when blue text is used if the use of a short wavelength stimulus results in a faster, more vigorous response from the transient channels, resulting in improved transient-on-sustained inhibition (Williams, Breitmeyer, and Lovegrove, 1990; Williams, Faucheux and LeCluyse, 1990). Again, improving transient-on-sustained inhibition would function to diminish the trailing persistence of high spatial frequency channels. Finally, the deterioration of the reading performance of normal readers in this condition when red text is used may be due to an increased masking by integration effect.
This increased masking by integration effect would be the result of an increase in sustained channel sensitivity and weaker transient-on-sustained inhibition produced by a long wavelength stimulus (Williams, Breitmeyer, and Lovegrove, 1990; Williams, Faucheux and LeCluyse, 1990). When eye movements are involved, image blurring and color may disrupt the normal pattern of transient-sustained interactions because the timing and sensitivity of transient and sustained channel responses are altered. Thus a decrement in performance is observed in the eye movement conditions with both color and the blur manipulation. Disabled readers performed poorly in the free eye movement condition when clear, white text was used, but performed better in the conditions requiring no eye movements or guided eye movements. In the latter conditions it could be assumed that transient activity was minimized. Disabled readers showed an improvement in reading performance with blurred images when eye movements were required, again suggesting that the image blurring manipulation may be reestablishing a normal pattern of transient-sustained interactions in visual processing. When red text was used, disabled readers showed a smaller decrement in performance in the no eye movement condition than normal readers showed. This result may be related to the finding that disabled readers do not show the increase in sustained channel sensitivity that normal readers show with red stimuli (as indexed by an
[Figure 17 plot: top panel NORMAL READERS, bottom panel DISABLED READERS; vertical axis ticks from 50 to 100; legend WHITE, BLUE, RED.]
Figure 17. Reading comprehension (measured as percent correct on literal recall questions) for clear white text (circles), blue text (triangles), and red text (squares) under three temporal presentation conditions. (a) (top panel) Normal readers. (b) (bottom panel) Disabled readers.
enhancement effect in the metacontrast function, Figure 15), although they do show a less vigorous response from transient channel activity (as indexed by a smaller magnitude of masking, Figure 15). The increased masking by integration effect, therefore, may be smaller in disabled readers in this condition, resulting in a smaller decrement in performance. When blue text was used, the effects on reading performance were similar to those produced by the blur manipulation, suggesting that the use of the short wavelength stimulus simulates a normal pattern of visual processing during reading. In a related study, Lovegrove and Macfarlane (1990) measured the number of errors, comprehension, and reading rate under the same three temporal presentation conditions. Here it was found that disabled readers made fewer errors in the no eye movement condition than controls, and more errors in the free eye movement condition. Furthermore, reading rate increased in the no eye movement condition as compared with the free eye movement condition in disabled readers, and there was a trend, although not significant, towards higher comprehension scores in the no eye movement condition. Lovegrove and Macfarlane interpreted their results within the framework of a central-peripheral processing dichotomy. The no eye movement condition required the processing of only central (foveal) information, whereas the free eye movement condition required the integration of central and peripheral information from successive fixations. When disabled readers only have to process central information, which requires little transient system involvement, they make fewer errors and read at a faster rate than in the condition that approximates normal reading. This improvement increased over a five session training period, but no increase in reading age on a post-test of reading ability was noted. There were several differences between Williams et al.'s and Lovegrove and Macfarlane's studies that may account for the fact that normal and disabled readers did not show differences in comprehension scores in the latter study. Most importantly, in Lovegrove and Macfarlane's study reading rate was increased on each trial that subjects correctly answered 80% of the comprehension questions posed, while in Williams et al.'s study reading rate remained constant. Thus, improvement in comprehension would have been obscured by a speed-accuracy tradeoff in the former study. Williams, LeCluyse and Faucheux (1990) also measured several characteristics of eye movement patterns for reading passages presented one page at a time. The image blurring manipulation was found to increase reading rate and decrease span of apprehension (number of characters processed in each fixation) in disabled readers. This manipulation decreased the span of apprehension in normal readers, but had no effect on reading rate. This result may be related to the finding that disabled readers show a lack of metacontrast masking effects in peripheral vision (Williams et al., 1989). It may be that, with clear stimulus images, this lack of peripheral inhibition results in some degree of interference of peripheral visual information with the reading process. Decreasing the amount of peripheral information available in each fixation may diminish the degree of interference experienced, resulting in an increase in the efficiency of the reading process as indexed by increased reading rate.
Remaining questions
A number of important questions concerning reading disability remain. First, does a transient system deficit exist before children begin to read or is it produced because disabled readers are delayed in reading acquisition? A corollary question is whether subsequent reading performance can be predicted on the basis of visual measures. To answer this, Lovegrove et al. (1986b) measured contrast sensitivity in 123 kindergarten pre-readers with a mean age of 5 years, 11 months. The sample included all children in a particular classroom, thus ensuring a broad sample of school achievement. Contrast thresholds were determined using 2 and 4 c/deg gratings with exposure durations of 350 msec. Vocabulary and digit span performance were also measured. Two years later the reading performance of these children was assessed. The extent to which the vocabulary, digit span, and contrast sensitivity scores predicted reading ability 2 years later was determined using a stepwise multiple regression procedure. On the first step, only contrast sensitivity was entered, with the multiple R = 0.27 (p < .01). Contrast sensitivity, therefore, was a significant predictor of reading ability. Vocabulary was entered on the second step, making the multiple R = 0.35. On the third step, digit span was entered, increasing the multiple R to 0.40. The finding that the visual measures predict later reading ability better than the vocabulary test suggests a very close link between visual processing and reading ability. On the basis of these studies, a low level visual processing difference between future good readers and poor readers is present before formal reading instruction begins. The visual deficits are not the result of a disability in reading. Other studies have considered whether the visual deficits associated with reading disability are the result of a developmental lag which corrects with age.
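As a rough aid to reading the stepwise regression reported above, each multiple R can be squared to give the proportion of variance in reading ability accounted for, and the gain at each step taken as the difference from the previous step. The sketch below performs only this arithmetic on the three reported R values; the variable names are illustrative, and the increments are descriptive quantities, not significance tests.

```python
# Multiple R reported after each step of the stepwise regression.
steps = [
    ("contrast sensitivity", 0.27),
    ("vocabulary", 0.35),
    ("digit span", 0.40),
]

previous_r2 = 0.0
increments = []
for predictor, multiple_r in steps:
    r2 = multiple_r ** 2                      # proportion of variance explained
    gain = r2 - previous_r2                   # added by the predictor entered at this step
    increments.append((predictor, round(r2, 4), round(gain, 4)))
    previous_r2 = r2

for predictor, r2, gain in increments:
    print(f"{predictor}: R^2 = {r2:.4f}, increment = {gain:.4f}")
```

On these figures, contrast sensitivity alone accounts for about 7% of the variance, and the full three-predictor model for 16%.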
Brannan and Williams (1988a) measured flicker thresholds for homogeneous fields of light in 8, 10, and 12 year old normal and disabled readers. A multiple regression analysis of their data was performed using reading level, age, and flicker threshold, and showed that flicker threshold accounted for 56% of the unique variance in reading level, with age contributing another 10%. Post hoc comparisons of flicker detection performance were made between normal and disabled readers matched for reading level (e.g., 10 year old normal readers and 12 year old disabled readers reading at the same grade level). Flicker sensitivity differences remained when the children were matched for reading ability, indicating that the differences between the groups in flicker sensitivity were not due primarily to differences in reading skill: there was some fundamental difference in visual processing between normal and disabled readers. Brannan and Williams (1988b) obtained similar results with measures of perceptual grouping and temporal order judgments; performance on these perceptual tasks was not found to be due to differences in reading skill, but to a fundamental temporal processing difference between the groups. Questions may also be raised about the possibility that reading disabled children also have attentional deficits. Attentional deficits
could result in poor test sensitivity and therefore could account for an apparent transient system deficit. There are two major considerations that argue against this explanation for the experimental results. In addition to the fact that in many of the studies testing was conducted with simple, nonverbal stimuli using a two-alternative forced choice paradigm, there were clear instances when the reading disabled subjects performed better or were more sensitive than controls. For example, disabled readers were shown to have slightly higher contrast sensitivity than controls to high spatial frequencies (Figure 3). Furthermore, Williams and Bologna (1985) found that providing the optimal processing strategy in a selective attention task does not eliminate performance differences between normal and disabled readers, suggesting that the performance differences are attributable to automatic rather than directed attention factors. Additionally, although normal and disabled readers differ in their ability to allocate attention across visual space, their performance in a target detection task is equivalent when prior allocation of attention is not a factor (Brannan and Williams, 1987). In the metacontrast studies reported here, the location of the target was known and did not vary. As such, selective attention mechanisms would be concentrated on a single spatial location, and transient activity in shifting or allocating attention would not be involved. Thus differences between subject groups in the integrity of the transient system as related to the ability to flexibly allocate and direct attention would not enter into task requirements. It is possible, however, that automatic attention processes, being mediated by the transient system, are involved in the performance differences reported here.
Finally, the question remains as to how a transient system deficit in disabled readers relates to the observation that disabled readers frequently have various forms of language deficiency. Some of these language based difficulties are phonological coding deficits, phonemic segmentation deficits, poor vocabulary development, and difficulty in discriminating grammatical and syntactic differences among words and sentences. One possible involvement of a transient system deficit in language deficiencies can be seen by considering the problems that disabled readers would have in integrating the inputs from successive fixations. As described above, in normal readers this problem could be solved by the transient system inhibiting the trailing persistence of sustained responses generated during each fixation. In disabled readers, there is a probability that inputs from successive fixations would be superimposed, making the task of reading confusing. This could manifest itself in a number of ways. Disabled readers may see only parts of words, and if they did not know which fixation the information came from, they could know very little about the spatial arrangement of the letters. This may lead to reading errors and word or letter reversals. In addition, depending on exactly where the reader fixated each time, the amount and type of masking or interference which occurred may vary from one reading task to another. As a result, it would be difficult to learn any systematic grapheme-to-phoneme rule if the appearance of the graphemes was in some way unstable. It has been found that many visually disabled readers also demonstrate a phonological coding deficit, in the form of poor performance on a test of nonsense words that followed regular grapheme-phoneme rules (Lovegrove et al., 1986).
Consequently it is likely that many disabled readers can have both a visual processing deficit and some form of language difficulty.
Conclusions
Research conducted over a number of years has demonstrated a transient system deficit in a majority of specifically disabled readers tested. Differences have been found between normal and disabled readers in measurements of visible persistence, pattern contrast sensitivity, and temporal or flicker contrast sensitivity. The foveal temporal processing of disabled readers has been found to be sluggish compared to that of normal readers. Peripheral visual processing has been found to be characterized by a lack of inhibitory processes. Differences in the temporal processing of normal and disabled readers surface in the processing of both simple stimuli and words. The primary deficit appears to be in the response properties of the transient subsystem of the visual system, although the response of the sustained system may be affected through a different pattern of interactions with the transient system over time. Measures of sustained system function that presumably do not involve a dynamic interaction between transient and sustained systems fail to demonstrate visual processing differences between normal and disabled readers. The temporal processing differences between the groups have been found to have consequences for the processing of words and for reading performance. Image blurring and color have the effect of reestablishing a normal pattern of visual processing in disabled readers. A transient system deficit was found in approximately 75% of the disabled readers tested, seems to be present before reading instruction, and predicts reading ability reasonably well.
References
Badcock, D. and Lovegrove, W. (1981). The effects of contrast, stimulus duration, and spatial frequency on visible persistence in normal and specifically disabled readers. Journal of Experimental Psychology: Human Perception and Performance, 7, 495-505.
Benton, A. (1962). Dyslexia in relation to form perception and directional sense. In J. Money (Ed.), Reading Disability: Progress and Research Needs in Dyslexia. Baltimore: Johns Hopkins Press.
Benton, A. (1975). Developmental dyslexia: Neurological aspects. In W.J. Freidman (Ed.), Advances in Neurology, Vol. 1. Current Reviews of Higher Order Nervous System Dysfunction. New York: Raven Press.
Bowen, R., Pokorny, J. and Cacciato, D. (1977). Metacontrast masking depends on luminance transients. Vision Research, 17, 971-975.
Bowling, A., Lovegrove, W. and Mapperson, B. (1979). The effect of spatial frequency and contrast on visible persistence. Perception, 8, 529-539.
Brannan, J.R. and Williams, M. (1987). Allocation of visual attention in good and poor readers. Perception and Psychophysics, 41, 23-28.
Brannan, J.R. and Williams, M. (1988a). The effects of age and reading ability on flicker thresholds. Clinical Vision Sciences, 3(2), 137-142.
Brannan, J.R. and Williams, M. (1988b). Developmental versus sensory deficit effects on perceptual processing in the reading disabled. Perception and Psychophysics, 44(5), 437-444.
Breitmeyer, B. (1975). Simple reaction time as a measure of the temporal response properties of transient and sustained channels. Vision Research, 15, 1411-1412.
Breitmeyer, B. (1980). Unmasking visual masking: A look at the "why" behind the veil of the "how." Psychological Review, 87, 52-69.
Breitmeyer, B. (1983). Sensory masking, persistence, and enhancement in visual exploration and reading. In K. Rayner (Ed.), Eye Movements in Reading: Perceptual and Language Processes. New York: Academic Press.
Breitmeyer, B. (1984). Visual Masking: An Integrative Approach. Oxford: Oxford University Press.
Breitmeyer, B. and Ganz, L. (1976). Implications of sustained and transient channels for theories of visual pattern masking, saccadic suppression, and information processing. Psychological Review, 83, 1-36.
Breitmeyer, B. and Ganz, L. (1977). Temporal studies with flashed gratings: Inferences about human transient and sustained channels. Vision Research, 17, 861-865.
Breitmeyer, B., Levi, D.M. and Harwerth, R.S. (1981). Flicker-masking in spatial vision. Vision Research, 21, 1377-1385.
Breitmeyer, B. and Valberg, A. (1979). Local, foveal, inhibitory effects of global, peripheral excitation. Science, 203, 463-465.
Breitmeyer, B. and Williams, M. (1990). Effects of isoluminant background color on metacontrast and stroboscopic motion: Interactions between sustained (P) and transient (M) channels. Vision Research.
Burbeck, C.A. (1981). Criterion-free pattern and flicker thresholds. Journal of the Optical Society of America, 71, 1343-1350.
Critchley, M. (1964). Developmental Dyslexia. London: Heinemann.
Dreher, B., Fukuda, Y. and Rodieck, R. (1976). Identification, classification and anatomical segregation of cells with X-like and Y-like properties in the lateral geniculate nucleus of old-world primates. Journal of Physiology, 258, 433-452.
Fletcher, J. and Satz, P. (1979a). Unitary deficits hypothesis of reading disability: Has Vellutino led us astray? Journal of Learning Disabilities, 12(3), 155-159.
Fletcher, J. and Satz, P. (1979b). Has Vellutino led us astray? A rejoinder to a reply. Journal of Learning Disabilities, 12, 168-171.
Foster, D. (1978). Interactions between blue- and red-sensitive colour mechanisms in metacontrast masking. Vision Research, 19, 921-931.
Geiger, G. and Lettvin, J. (1987). Peripheral vision in persons with dyslexia. New England Journal of Medicine, 316, 1238-1243.
Ikeda, H. and Wright, M. (1972). Receptive field organization of "sustained" and "transient" retinal ganglion cells which subserve different functional roles. Journal of Physiology, 227, 769-800.
Johnston, J.C. and McClelland, J.L. (1973). Visual factors in word perception. Perception and Psychophysics, 14, 365-370.
Kolers, P. and Rosner, B. (1960). On visual masking (metacontrast): Dichoptic observations. American Journal of Psychology, 73, 2-21.
Kruger, J. (1977). Stimulus dependent color specificity of monkey lateral geniculate neurones. Experimental Brain Research, 30, 297-311.
Kulikowski, J.J. and Tolhurst, D.J. (1973). Psychophysical evidence for sustained and transient detectors in human vision. Journal of Physiology London, 232, 149-162.
Lennie, P. (1980). Parallel visual pathways: A review. Vision Research, 20, 561-594.
Livingstone, M.S. and Hubel, D.H. (1987). Psychophysical evidence for separate channels for the perception of form, color, movement, and depth. Journal of Neuroscience, 7, 3416-3468.
Livingstone, M.S. and Hubel, D.H. (1988). Segregation of form, color, movement and depth: Anatomy, physiology and perception. Science, 240, 740-749.
Lovegrove, W., Billing, G. and Slaghuis, W. (1978). Processing of visual contour orientation information in normal and disabled reading children. Cortex, 14, 268-278.
Lovegrove, W., Bowling, A., Badcock, D., and Blackwood, M. (1980a). Specific reading disability: differences in contrast sensitivity as a function of spatial frequency. Science, 210, 439-440.
Lovegrove, W., Bowling, A., Slaghuis, W., Geeves, E., and Nelson, P. (1986b). Contrast sensitivity scores of pre-readers as predictors of reading ability. Perception and Psychophysics, 40, 440-444.
Lovegrove, W. and Brown, C. (1978). Development of information processing in normal and disabled readers. Perceptual and Motor Skills, 46, 1047-1054.
Lovegrove, W., Heddle, M. and Slaghuis, W. (1980b). Reading disability: spatial frequency specific deficits in visual information store. Neuropsychologia, 18, 111-115.
Lovegrove, W. and Macfarlane, T. (1990). How can we help SRDs in learning to read? Unpublished honors thesis.
Lovegrove, W., Martin, F., Bowling, A., Blackwood, M., Badcock, D. and Paxton, S. (1982). Contrast sensitivity functions and specific reading disability. Neuropsychologia, 20, 309-315.
Lovegrove, W., Martin, F. and Slaghuis, W. (1986). A theoretical and experimental case for a visual deficit in specific reading disability. Cognitive Neuropsychology, 3, 225-267.
Martin, F. and Lovegrove, W. (1984). The effects of field size and luminance on contrast sensitivity differences between specifically reading disabled and normal children. Neuropsychologia, 22, 73-77.
Martin, F. and Lovegrove, W. (1987). Flicker contrast sensitivity in normal and specifically disabled readers. Perception, 16, 215-221.
Matin, E. (1974). Saccadic suppression: A review and analysis. Psychological Bulletin, 81, 899-915.
Matin, E. (1975). The two-transient (masking) paradigm. Psychological Review, 82, 451-461.
Matin, E., Clymer, A. and Matin, L. (1972). Metacontrast and saccadic suppression. Science, 178, 179-182.
Matthews, M., Weisstein, N. and Williams, A. (1974). Masking of letter features does not remove the word-superiority effect. Bulletin of the Psychonomic Society, 4, 262.
Maunsell, J.H.R. (1987). Physiological evidence for two visual subsystems. In L. Vaina (Ed.), Matters of Intelligence. Amsterdam: Reidel.
May, G., Williams, M. and Dunlap, W. (1988). Temporal order judgements in good and poor readers. Neuropsychologia, 26(6), 917-924.
Meyer, G. and Maguire, W. (1977). Spatial frequency and the mediation of short-term visual storage. Science, 198, 524-525.
Pirozzolo, F. (1978). The neuropsychology of developmental reading disorders. New York: Praeger.
Rayner, K. and McConkie, G. (1976). What guides a reader's eye movements? Vision Research, 16, 829-837.
Reicher, G.M. (1968). Perceptual recognition as a function of meaningfulness of stimulus material. Journal of Experimental Psychology, 61, 275-280.
Schiller, P. and Malpeli, J. (1978). Functional specificity of lateral geniculate nucleus laminae of the rhesus monkey. Journal of Neurophysiology, 41, 766-767.
Singer, W. and Bedworth, N. (1973). Inhibitory interaction between X and Y units in cat lateral geniculate nucleus. Brain Research, 48, 291-307.
Slaghuis, W. and Lovegrove, W. (1984). Flicker masking of spatial frequency dependent visible persistence and reading disability. Perception, 13, 527-534.
Slaghuis, W. and Lovegrove, W. (1985). Spatial-frequency mediated visible persistence and specific reading disability. Brain and Cognition, 4, 238-240.
Solman, R., Dain, S. and Keech, S. (1990). Colour mediated contrast sensitivity in disabled readers. Submitted.
Stanley, G. (1975). Visual memory processes in dyslexia. In D. Deutsch and J. Deutsch (Eds.), Short-term memory. New York: Academic Press.
Stanley, G. and Hall, R. (1973a). A comparison of dyslexics and normals in recalling letter arrays after brief presentations. British Journal Experimental Psychology, 43, 301-304.
Stanley, G. and Hall, R. (1973b). Short-term visual information processing in dyslexia. Child Development, 44, 841-844.
Stewart, A. and Purcell, D. (1970). U-shaped masking functions in visual backward masking: Effects of target configuration and retinal position. Perception and Psychophysics, 7, 253-256.
Stone, J., Dreher, B. and Leventhal, A.G. (1978). Hierarchical and parallel mechanisms in the organization of the visual cortex. Brain Research Review, 1, 345-354.
Tolhurst, D.J. (1973). Separate channels for the analysis of the shape and movement of a moving visual stimulus. Journal of Physiology London, 231, 395-402.
Tolhurst, D.J. (1975). Reaction times in the detection of gratings by human observers: A probabilistic mechanism. Vision Research, 15, 1143-1148.
Vasselev, A. and Miltov, D. (1976). Perception time and spatial frequency. Vision Research, 16, 28-32.
Vellutino, F., Steger, J., DeSetto, L. and Phillips, F. (1975a). Immediate and delayed recognition of visual stimuli in poor and normal readers. Journal of Experimental Child Psychology, 223-232.
Vellutino, F., Steger, J., Kaman, M. and DeSetto, L. (1975b). Visual form perception in deficient and normal readers as a function of age and orthographic-linguistic familiarity. Cortex, 11, 22-30.
Vellutino, F. (1977). Alternative conceptualizations of dyslexia: evidence in support of verbal-deficit hypothesis. Harvard Educational Review, 47, 334-354.
Vellutino, F. (1978a). The validity of perceptual deficit explanations of reading disability: A reply to Fletcher and Satz. Journal of Learning Disabilities, 12(3), 160-167.
Vellutino, F. (1978b). Dyslexia: Theory and Research. London: M.I.T. Press.
Vellutino, F. (1987). Dyslexia. Scientific American, 256(3), 34-41.
Walters, J. (1970). Metacontrast: The effects obtained with consecutively presented concentric disks and rings of different wavelengths. American Journal of Optometry, 46, 634-638.
Watson, A.B. and Nachmias, J. (1977). Patterns of temporal interaction in the detection of gratings. Vision Research, 17, 193-202.
Weisstein, N. (1968). A Rashevsky-Landahl neural net: Simulation of metacontrast. Psychological Review, 75, 484-521.
Weisstein, N. (1972). Metacontrast. In D. Jameson and L.M. Hurvich (Eds.), Handbook of sensory physiology, Vol. 7/4. Visual Psychophysics, pp. 233-72.
Weisstein, N. and Harris, C. (1974). Visual detection of line segments: an object-superiority effect. Science, 186, 752-755.
Weisstein, N., Ozog, G. and Szoc, R. (1975). A comparison and elaboration of two models of metacontrast. Psychological Review, 82, 325-342.
Wheeler, D.D. (1970). Processes of word recognition. Cognitive Psychology, 1, 58-65.
Williams, M. and Bologna, N. (1985). Perceptual grouping in good and poor readers. Perception and Psychophysics, 33, 367-374.
Williams, M.C., Brannan, J.R., and Lartigue, E.K. (1987). Visual search in good and poor readers. Clinical Vision Sciences, 1, 367-371.
Williams, M., Breitmeyer, B. and Lovegrove, W. (1990). Metacontrast with masks varying in spatial frequency and wavelength. Submitted to Vision Research.
Williams, M., Faucheux, A. and LeCluyse, K. (1990). The time course of the processing of stimuli with different wavelengths in normal and disabled readers. Submitted.
Williams, M., LeCluyse, K. and Bologna, N. (1990). Masking by light as a measure of visual integration time and persistence in normal and disabled readers. Clinical Vision Sciences.
Williams, M. and LeCluyse, K. (1990). Perceptual consequences of a temporal processing deficit in reading disabled children. Journal of the American Optometric Association, 61, 111-121.
Williams, M., LeCluyse, K. and Faucheux, A. (1990). Effects of image blurring and color on reading performance in normal and disabled readers. In preparation.
Williams, M., Molinet, K. and LeCluyse, K. (1989). Visual masking as a measure of temporal processing in normal and disabled readers. Clinical Vision Sciences, 4(2), 137-144.
Williams, M. and Weisstein, N. (1980). Perceptual grouping produces spatial-frequency specific effects on metacontrast. Investigative Ophthalmology and Visual Science, 165.
Williams, M. and Weisstein, N. (1981). Spatial frequency response and perceived depth in the time-course of object-superiority. Vision Research, 21, 631-646.
Williams, M. and Weisstein, N. (1984). The effect of perceived depth on metacontrast functions. Vision Research, 24(10), 1278-1282.
Williams, M., Weisstein, N. and LeCluyse, K. (1990). The time course of the processing of words in normal and disabled readers. Submitted.
Applications of Parallel Processing in Vision, J. Brannan (Editor). © 1992 Elsevier Science Publishers B.V. All rights reserved.
How Can the Concept of Parallel Channels Aid Clinical Diagnosis?
M. FELICE GHILARDI, MARCO ONOFRJ, and JULIE R. BRANNAN
Introduction
In a clinical setting, electrophysiological and psychophysical techniques can be successfully employed to assess parallel processing in several diseases which affect the human visual system. In particular, visual evoked potentials (VEPs) elicited by patterned stimuli can provide new insights into the pathophysiology, diagnosis, and management of a number of different neurological and neuro-ophthalmological disorders (e.g., glaucoma, optic neuropathy, maculopathy, multiple sclerosis, and Parkinson's disease). If appropriate stimulus parameters are provided, the VEP can provide valuable information on the function of separate channels of the visual pathway. The most conspicuous peaks of the human transient VEP are an early negativity at around 70 msec (N70), and a following positivity at about 100 msec (P100). These two components probably reflect different processes in the visual system, since they respond differently to manipulation of stimulus characteristics. For instance, a stimulus with spatial frequency below 1 cycle per degree (cpd) generates a small N70 and a large P100; as spatial frequency increases, the contributions of N70 and P100 gradually reverse, with N70 dominating the response at higher spatial frequencies. Furthermore, N70 and P100 can be differentially impaired in diseases which randomly affect the structures of the visual system, such as multiple sclerosis. Pharmacological manipulation can selectively affect a VEP obtained with one stimulus, but not another: the VEP of patients with Parkinson's disease and schizophrenia exhibits spatial frequency dependent abnormalities, related to dopaminergic deficiency. Lastly, in ocular hypertension and glaucoma abnormal responses to stimuli with low contrast and low spatial frequency seem to indicate early damage of magnocellular pathways.
In this chapter, we will describe how considering visual parallel processes can improve our understanding of many diseases, and enhance the accuracy of clinical diagnosis.
CHAPTER 9
Stimulus characteristics of the VEP used in clinical studies
In the majority of clinical studies, checkerboards modulated in counterphase at low (less than 4 Hz) temporal frequencies have been used. As stimulus contrast is increased from zero, VEP amplitude progressively increases until about 30% contrast, when it becomes constant (Spekreijse et al., 1973). Typically, for clinical purposes VEP testing is performed at high contrast, thus primarily addressing parvo elements. Obtaining low contrast VEPs is diagnostically useful both in multiple sclerosis, by revealing early foveal defects (Kupersmith and Siegel, 1988), and in the early diagnosis of glaucoma (Quigley and Hendrickson, 1984). Counterphase presentation, or pattern reversal, is produced by modulating the luminance of adjacent checks or bars by waveforms that differ in phase by 180°. If the temporal frequency of the stimulus is F Hz, the pattern reversal VEP will have a fundamental frequency of 2F Hz and the following even harmonics. The reasons for choosing checks rather than sinusoidal gratings range from historical to practical and cost-related considerations: checks were the stimuli employed in the first clinical study published on pattern VEP (Halliday et al., 1972); N70 and P100 peaks are more defined and have higher amplitude (thus, easier to measure) when a checkerboard with high contrast is used; additionally, most of the companies which produce equipment for clinical neurophysiology labs provide CRTs generating sinusoidal waveforms, but only at additional cost. Comparison of these VEP results with the extensive body of single neuron recordings and psychophysical data presents some difficulties: in fact, the VEP to checkerboard stimulation is not equivalent to the sum of the VEPs obtained with the individual Fourier components of the checkerboard (Spekreijse et al., 1973).
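The frequency doubling of the pattern reversal response described earlier in this section can be illustrated with a toy simulation. The sketch assumes, purely for illustration, that the response is identical for either contrast polarity, modelled here as full-wave rectification of a sinusoidal modulation. The 2 Hz stimulus frequency, the sampling values, and the function name are hypothetical choices, not parameters taken from the studies cited.

```python
import cmath
import math

def harmonic_amplitude(signal, freq_hz, duration_s):
    """Amplitude of the Fourier component at freq_hz, by a direct DFT sum."""
    n = len(signal)
    total = sum(
        x * cmath.exp(-2j * math.pi * freq_hz * duration_s * i / n)
        for i, x in enumerate(signal)
    )
    return 2.0 * abs(total) / n

F = 2.0          # stimulus temporal frequency in Hz (illustrative value)
DURATION = 1.0   # seconds of simulated response
N = 1000         # number of samples

# Luminance modulation of one check; its neighbour is 180 degrees out of phase.
stimulus = [math.sin(2 * math.pi * F * i * DURATION / N) for i in range(N)]

# Assumed toy response: the same for either polarity (full-wave rectification).
response = [abs(s) for s in stimulus]

# Amplitudes at F, 2F, 3F and 4F.
amplitudes = {k: harmonic_amplitude(response, k * F, DURATION) for k in range(1, 5)}
for k, amp in amplitudes.items():
    print(f"{k}F = {k * F:.0f} Hz: amplitude {amp:.4f}")
```

Under this assumption the component at F is absent, the largest component lies at 2F, and the remaining energy falls at even harmonics such as 4F.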
However, from a survey of the existing literature it is apparent that in a normal human population, there are many similarities between the results obtained with checks and sinusoidal gratings. The amplitude of the response has a peak at about 10' of arc (equal to 4-5 cpd) for both stimuli (Regan and Richards, 1971; Plant et al., 1983), which corresponds to the peak of contrast sensitivity curves obtained with psychophysical measurements and to the largest receptive field of retinal ganglion cells. Manipulation of spatial frequency also yields similar results for checks and gratings in human pathological conditions. In diseases affecting the macula, the VEP is typically abnormal when checks less than 14' (Papakostopoulos et al., 1984) or sinusoidal gratings of more than 2 cpd (Bodis-Wollner et al., 1987) are used. Furthermore, in Parkinson's disease, the spatial frequency-dependent abnormality is evident with both gratings and checkerboards (Onofrj et al., 1986; Tartaglione et al., 1986). The topographical distribution of the VEP obtained with full and hemifield stimuli is similar for both checkerboards and gratings. In both cases, full field responses consist of a negative (N70) - positive (P100) complex with a symmetrical distribution over the posterior derivations, while polarity reverted counterparts are recorded over the anterior derivations. With hemifield stimulation, the negative - positive complex
is distributed ipsilaterally to the stimulated visual field, and polarity-reversed activities are distributed on the side of the unstimulated visual field. In patients with visual field defects, checkerboards and gratings yield similar results (Onofrj, 1990).
N70 and P100
Recent evidence suggests that N70 and P100 may represent different functions of the visual system and have different origins. The majority of studies investigating the origin of the two peaks have used pattern onset-offset stimulation, which is not commonly employed in clinical VEP testing. Although there is general agreement about the cortical origin of N70, there are conflicting conclusions about the precise source. Using a short pulse onset stimulus (50 msec), Jeffreys (1977) asserted that the first peak (at 70 msec) was an extrafoveal component originating in area 17, while Halliday and Michael (1970), using a longer pulse stimulus, located the origin of all the peaks in extrastriate areas. Darcey et al. (1980) proposed that the two peaks are generated either by one source that reverses its polarity or by two very close sources of opposite polarity. Lesevre (1982) suggested that the origin of the early negative peak is in the foveal cortex of area 17, while the following peak actually consists of two components: one at 100 msec, probably coming from area 18, and the other at 120 msec, from area 19. Using principal component analysis, Maier and colleagues (1986) concluded that the origin of the first peak is in area 18, and that the origins of the following peak are in both areas 17 and 18. It is possible that the conflicting conclusions reached by these studies on the origin of the two peaks are related to important differences in stimulus and analysis selection, and in electrode montage. Lesevre (1982) pointed out that N70 amplitude does not change with increasing field size from 2 to 20 degrees (as P100 does), but N70 disappears when the central 5 degrees of visual field are occluded. N70, best elicited by stimuli of spatial frequency higher than 1 cpd, should mostly represent contrast processing, while P100 should mostly represent luminance processing.
In an extensive study of the effect of spatial frequencies, stimulus retinal location, and scalp recording from multiple electrodes in 30 normal subjects, Onofrj (1990) showed that N70 can be absent in 30-40% of normal subjects when stimuli of less than 2 cpd are used, while it is always present when spatial frequencies equal to or more than 2 cpd are used. When the central part of the retina is stimulated, N70 increases proportionally to spatial frequency, with a peak around 3-5 cpd. For stimuli eccentric to the fovea, N70 can also be elicited (but with smaller amplitude) by coarser stimuli, since the size of the receptive field of ganglion cells increases with eccentricity. Thus, the amplitude of N70 is probably related to the number of ganglion cells per unit area: their density is maximal at the central retina, and the size of the ganglion cell receptive field increases as a function of retinal eccentricity (Hubel and Wiesel, 1968). It has been shown that P100 has high sensitivity to luminance (Riemslag et al., 1982; Kriss and Barrett, 1985). When coarse stimuli are used, the reversal of contrast will induce an abrupt focal variation of luminance at the foveal level, where the ganglion cells are dense and their receptive fields are small. In more eccentric parts of
the retina, the effect of focal luminance variation is less relevant because of different densities and sizes of the center-surround system. Due to abrupt focal luminance variation, coarse patterns elicit a P100 with higher amplitude from the cortical representation area of the fovea. The foveal projections are represented mostly at the tip of the occipital pole and can, therefore, project electric fields over ipsilateral and contralateral derivations. Thus, a disproportionate focal luminance variation at the fovea (as compared to the parafoveal regions) can explain why P100 often spreads all over the posterior derivations of the scalp and why this spreading is more common when coarse stimuli are used. Because of the cortical magnification coefficient of foveal representation, the activity of other retinal projections will be concealed. These data support the hypothesis that N70 is mostly a foveal contrast-dependent component, while P100 probably also represents parafoveal luminance processing. Lastly, upper and lower field stimulation performed in normal subjects with checkerboards (Michael and Halliday, 1971; Kriss and Halliday, 1980; Lesevre and Joseph, 1979; Lesevre, 1982) and gratings (Skrandies, 1984; Previc, 1988) reveals a further important distinction between N70 and P100: it is possible that P100 is mostly generated by lower field stimulation, while major contributions to N70 come from upper hemifield stimulation. In fact, at the midline electrode (about 5 cm above the inion), lower field stimulation elicits in normal subjects a well defined positive peak at about 100 msec, in some cases preceded by a small negative deflection at 70-80 msec. In contrast, upper field stimulation produces a prominent negative peak at about 70-80 msec, followed by a broader, not well defined positive peak at about 110-120 msec at the same electrode location.
One interpretation of these changes relates to the cortical projection of the upper and lower visual fields to the ‘floor’ and the ‘roof’ of the calcarine fissure, respectively. N70 abnormalities of the VEP recorded from the midline electrode could be related to the presence of upper visual field defects. Conversely, absence or attenuation of P100 could be related to lower visual field defects. An interesting interpretation of this phenomenon is given by Previc (1988), who attributes a magnocellular origin to the lower hemifield-dominated P100 and a parvocellular origin to the upper hemifield-dominated N70. This author notes that demyelinative diseases seem to be mostly associated with magnocellular dysfunction, since contrast sensitivity studies have shown that low spatial and high temporal frequencies (which are tied to the magnocellular pathway, see Derrington and Lennie, 1984) were primarily impaired in patients with multiple sclerosis (Regan and Maxner, 1986; Medjbeur and Tulunay-Keesey, 1986). If P100 and N70 represent mostly magnocellular and parvocellular processes, respectively, then in patients with multiple sclerosis one should find a greater degree of abnormality for P100 than for N70. Although a certain degree of caution should be exercised in fully accepting this hypothesis, our data (Ghilardi et al., in press), in agreement with the study by Collins and colleagues (1978), show that in patients with multiple sclerosis P100 is more often abnormal than N70.
Visual pathways: Their relevance to the clinical VEP
Separation between visual pathways occurs early in the mammalian retina, where bipolar cells connect exclusively to either rods or cones and carry their signals in parallel to the ganglion cell level. Each region of the retina is subserved by a range of ganglion cells with different receptive field sizes, which on average increase with eccentricity from the fovea (Dow et al., 1984). The center size of the largest human foveal ganglion cells is smaller than 28'. As extensively discussed in Chapter 1 of this book, in the cat retina there are two main groups of ganglion cells, called X and Y ganglion cells (Enroth-Cugell and Robson, 1966), although a strict functional subdivision of these groups in higher visual processing has been questioned by Lennie (1980). X-cells, which have linear properties, are mostly concentrated in the fovea, while Y-cells, which are characterized by strong non-linear processes (Shapley and Victor, 1978), are distributed more equally throughout the retina. Synaptic transmission and conduction velocity are, on average, faster for Y cells (Sestokas et al., 1987). In the primate, large bodied neurons (alpha- or M-cells), roughly corresponding to Y-cells of the cat, synapse in the magnocellular layer of the lateral geniculate nucleus, whose neurons are more sensitive to low contrast stimuli than those in the parvocellular layer (Shapley et al., 1981). The majority of retinal small bodied ganglion cells of primates (beta or parvo cells) have properties similar to those of X cells in the cat, and are mostly distributed in the fovea. In general, M cells have larger receptive fields than P cells in corresponding retinal areas. Thus, by selecting the appropriate stimulus for obtaining a VEP, one can theoretically stimulate different pathways. This has clinical relevance: certain types of retinal pathology may, at their onset, selectively damage a specific type of ganglion cell.
For example, glaucoma predominantly damages M-neurons: in both this disease and ocular hypertension, degraded foveal vision is characterized by a spatial contrast sensitivity loss which is most evident for low spatial frequencies flickering at high rates. Some neurotoxins, such as acrylamide monomer, can cause an optic neuropathy with apparent selectivity for small (parvo) ganglion cells. Studies by Merigan and Eskin (1986) in the monkey revealed that acrylamide affects color vision and mid- to high spatial frequencies, with the VEP showing corresponding abnormalities. Mixing of photoreceptor input occurs through gap junctions between photoreceptors and through the network of horizontal, amacrine and interplexiform cells. In the mammalian retina, amacrine cells play a very important role in the modulation of center-surround mechanisms. Several neurotransmitters have been identified in the primate amacrine cell population, such as glutamate, GABA, and acetylcholine. Dopaminergic amacrine cells have a particularly important role in the rod pathway. In the rhesus monkey, these neurons, which correspond to the A18 amacrine cells described by Kolb and colleagues (1981), are mostly distributed around the fovea and virtually overlap the rod distribution (Mariani et al., 1984). They receive presynaptic input mainly from cone bipolar cells and from non-dopaminergic amacrine cells, while their major output is on the glycinergic rod amacrine AII cells, which connect the rod bipolars to cone bipolars and to ganglion
cells (Hokoc and Mariani, 1987). It has been shown that dopamine reduces the sensitivity of ganglion cells to light stimuli (Jensen and Daw, 1988), suggesting that dopaminergic amacrine cells are responsible for the surround response in the rod system through their synapses with the rod AII amacrine cells (Daw et al., 1990). The distinction between center and surround is sharpest for foveal ganglion cells. For this reason one can assume that dopamine plays a major role in the signal processing of parvocellular pathways. Jensen and colleagues also suggest that serotonin is likely to increase the signal in the ON-pathway through a feedback synapse onto the rod bipolar terminal. As will be discussed later in the chapter, catecholaminergic retinal circuits play an important role in the visual dysfunction of patients with Parkinson's disease. Cholinergic circuits are also present in the mammalian retina: amacrine cells and both on- and off-center ganglion cells (Masland and Mills, 1979) respond to acetylcholine (Daw et al., 1982). However, it seems that they have little interaction with the dopaminergic retinal system (Kamp, 1986). It has been shown in cats that either transient or permanent cholinergic blockers produce a decrement in the low spatial frequency pattern VEP (Kirby et al., 1986). These data can be relevant for Alzheimer's disease, where a predominant loss of acetylcholinergic neurons and a cortical reduction of choline acetyl transferase (the enzyme which synthesizes acetylcholine) have been confirmed by several biochemical and histological studies. In patients with Alzheimer's disease, flash VEP and pattern VEP obtained with very low spatial frequencies are more affected than pattern VEP to higher spatial frequencies (Wright et al., 1984; Laurian et al., 1982; Visser et al.,
1976). Hinton and colleagues (1986) have found widespread axonal degeneration of the optic nerves in 8 out of 10 patients with Alzheimer's disease and a corresponding reduction in the number of ganglion cells in the retina of 3 out of 4 patients. While confirmation of these results is needed, it will be important to establish whether a specific subset of retinal ganglion cells and optic nerve fibers is affected: from the results of the electrophysiological studies previously mentioned, one should expect major damage to large (alpha or magno) ganglion cells. GABA is present in the retina and throughout the entire visual system. Its involvement in epilepsy can be documented with the VEP: visual cortical responses recorded from patients with photosensitive epilepsy show an enhanced negative waveform (N70) (Ratliff and Zemon, 1984), similar to the VEP obtained from cats following topical application of bicuculline to visual cortex (Zemon et al., 1980). Glycine, opiates and glutamate are neurotransmitters present in the mammalian retina and interact in many different circuits. Their relevance to corresponding human retinal pathology, and to visual pathology in general, is not yet defined.
Parkinson's disease
Parkinson's disease is a disorder of the extrapyramidal system, characterized by loss of dopaminergic neurons in the basal ganglia. The clinical features of this disease are mainly motor disturbances, such as bradykinesia, rigidity, and tremor, which can be temporarily reversed by dopamine agonists and/or precursors. However, systems other than the nigrostriatal pathways are also affected: many biochemical,
anatomical, clinical, and neurophysiological studies have demonstrated a more extensive and complex impairment of sensory systems. In the last decade, several papers have also described an involvement of the visual system. Abnormal VEPs in over 50% of patients have been reported in several independent studies (Bodis-Wollner and Yahr, 1978; Delwaide et al., 1980; Gawel et al., 1981; Kupersmith et al., 1982; Tartaglione et al., 1984; Onofrj et al., 1986), while others found delays in only 20% of the patients (Regan and Neima, 1984). Normal VEPs were reported in other studies (Ehle et al., 1982; Halliday, 1982; Yaar, 1981; Dinner et al., 1985), and some researchers found only VEP amplitude reductions (Hansch et al., 1982) or VEP asymmetries (Mintz et al., 1981). A wide variety of stimulus conditions were used in these studies: the visual stimuli ranged from uniform field stimulation to checks and gratings of differing spatial frequencies. The VEP abnormalities are more evident when the stimulus spatial frequency is in the range of the peak of the contrast sensitivity curve (2-5 cpd), while with stimuli of lower spatial frequencies the VEP can be mostly normal (Ehle et al., 1982; Hansch et al., 1982; Dinner et al., 1985; Onofrj et al., 1986). Thus, pattern element size is a crucial variable in Parkinson's disease: a patient may show an abnormal VEP to 2-5 cpd without any impairment at lower spatial frequencies. In fact, the percentage of abnormal VEPs increased by 21% when spatial frequency was increased from 1 to 4 cpd in 26 Parkinsonian patients studied by Onofrj and colleagues (1986). Psychophysical testing performed in a large number of Parkinsonian patients with normal visual acuity confirmed these results (Kupersmith et al., 1982; Regan and Neima, 1984; Regan and Maxner, 1987; Bodis-Wollner et al., 1987; Bulens et al., 1987).
[Figure 1: VEP traces; printed trace labels 128-121, 130-120, 158-130, 154-133]
Figure 1. VEPs obtained with gratings of 1 and 4 cpd from the right and left eyes of a patient with Parkinson's disease. Notice that the latency decrease of the P100 peak following Sinemet administration is more evident for the higher spatial frequency. The 99.5% confidence limit of the P100 latency was 129 msec for 1 cpd and 141 msec for 4 cpd.
Besides spatial frequency selection, another important variable in all these studies is the effect of therapy with dopaminergic agents (Figure 1). In fact, acute administration of Sinemet (L-dopa + carbidopa)
significantly reduces P100 latency in Parkinsonian patients for spatial frequencies equal to or higher than 2 cpd (Onofrj et al., 1986) and decreases the threshold for spatial frequencies at the peak of contrast sensitivity curves in both normal (Domenici et al., 1985) and Parkinsonian subjects (Bodis-Wollner et al., 1987). Visual changes are striking in patients treated with a dopamine precursor who show on-off phenomena, that is, sudden switches between a symptom-free state (on-state) and a state of severe disability (off-state). During the off-state, VEP (Bodis-Wollner et al., 1982; Onofrj et al., 1986) and contrast sensitivity (Bodis-Wollner et al., 1987) abnormalities are prominent, while they are less evident in the on-state. The on-off phenomenon is generally understood in terms of waxing and waning dopaminergic deficiency at the synaptic cleft. Hence, a change in the VEP, PERG and contrast sensitivity of these patients may be related to the efficiency of dopamine at postsynaptic sites.
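Because the upper limit of normal P100 latency depends on spatial frequency (the caption of Figure 1 gives 129 msec for 1 cpd and 141 msec for 4 cpd), a latency has to be judged against the limit for its own stimulus. The sketch below illustrates that bookkeeping only. The two limits come from the Figure 1 caption; the recorded latencies, the function name, and the data layout are hypothetical values chosen for illustration and are not measurements from the chapter.

```python
# Upper limits of normal P100 latency (msec), keyed by spatial frequency (cpd).
# These two values are the 99.5% confidence limits quoted in the Figure 1 caption.
UPPER_LIMIT_MS = {1.0: 129.0, 4.0: 141.0}

def is_delayed(latency_ms, cpd):
    """True when the latency exceeds the limit for that spatial frequency."""
    return latency_ms > UPPER_LIMIT_MS[cpd]

# Hypothetical recordings: (spatial frequency in cpd, pre-drug msec, post-drug msec).
recordings = [
    (1.0, 127.0, 122.0),
    (4.0, 155.0, 132.0),
]

# For each recording: delayed before treatment, delayed after, latency change.
report = [
    (cpd, is_delayed(pre, cpd), is_delayed(post, cpd), pre - post)
    for cpd, pre, post in recordings
]

for cpd, pre_delayed, post_delayed, change in report:
    print(f"{cpd} cpd: delayed before={pre_delayed}, after={post_delayed}, "
          f"latency decrease={change} msec")
```

Keeping the limit in a table keyed by spatial frequency avoids applying a single cut-off across stimuli, which would misclassify responses to the finer grating.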
Figure 2. VEPs to different spatial frequencies recorded from a patient with atypical paranoid disorder before and after treatment with haloperidol. Notice that with 0.5 cpd stimuli no latency changes were observed; however, P100 latency increases with increasing spatial frequency.
The importance of dopamine in vision has also been confirmed in another patient population. In fact, while dopaminergic therapy decreases the abnormal pre-treatment VEP latency in Parkinsonian patients, administration of phenothiazine compounds to schizophrenic patients delays the normal pre-treatment P100 latency (Bodis-Wollner et al., 1982; Onofrj et al., 1986). We have recorded the VEP to 1, 2, and 4 cpd stimuli in patients with acute paranoid state, both before and after acute intravenous administration of haloperidol (Onofrj et al., 1986). All had abnormal VEP prior to treatment. Following drug treatment, a significant increase in P100 latency was evident in all patients, mostly for spatial frequencies of 2 and 4 cpd (see Figure 2). To summarize, the visual system abnormalities in Parkinsonian patients are spatial frequency-dependent and linked to a dopaminergic deficiency. The most likely site of this pharmacological deficit is the retina. Reduced PERG (Gottlob et al., 1987; Nightingale et al., 1986) and flash ERG amplitude (Nightingale et al., 1986; Jaffe et al., 1987;
Ellis et al., 1987) have been reported in Parkinsonian patients. However, the results of ERG studies are controversial: in some patients, the ERG responses became normal following dopaminergic treatment, while there were patients with paradoxical responses, that is, a deterioration of the ERG following the treatment. Nevertheless, pharmacological (Harnois et al., 1988) and histochemical (Nguyen-Legros et al., 1988) studies seem to indicate the presence of a dopaminergic deficit in the retinas of Parkinsonian patients.
[Figure 3: VEP traces; panels labeled 0.5, 1.2, 2.5, and 3.5 cpd; traces labeled Pre-MPTP, 40 d, and 70 d]
Figure 3. VEPs to four different spatial frequencies recorded in a monkey before and after MPTP administration. Notice that we were able to obtain stable and reproducible VEPs in the same and in different pre-MPTP testing sessions. Vertical lines indicate the latency of P100. Following MPTP, P100 was progressively delayed and reduced in amplitude in the first 20 days, while on day 30 no further worsening of the response was seen. From day 30 to day 40, a partial improvement is evident; no further improvements were seen 70 days post-MPTP. Notice that the changes were spatial frequency dependent: greatest for 3.5 cpd and minimal for 0.5 cpd stimuli.
The intravenous administration of N-methyl-4-phenyl-1,2,5,6-tetrahydropyridine (MPTP) causes persistent and severe Parkinsonism in both humans (Ballard et al., 1985) and monkeys (Burns et al., 1983), with the major signs (tremor, rigidity and bradykinesia) being comparable to those present in advanced stages of idiopathic Parkinson's disease. The MPTP-treated monkey is the closest animal
model to the human idiopathic Parkinson's disease. We have studied the pattern VEP and ERG in MPTP-treated monkeys, following a protocol fully described elsewhere (Ghilardi et al., 1988). The stimuli were vertical gratings of 0.5, 1.2, 2.5, and 3.5 cpd, counterphase modulated at 1 Hz. Prior to MPTP administration, we obtained in all the monkeys clear and reproducible pattern VEP and ERG, which resembled the responses recorded in humans. From 15-20 days post-MPTP, the development of the Parkinsonian syndrome was accompanied by a significant amplitude decrease and latency increase of the pattern VEP, while the pattern ERG amplitude was significantly reduced. These abnormalities were spatial frequency dependent, the largest alterations occurring with the higher spatial frequencies. Forty days post-MPTP, the monkeys had partially recovered from the Parkinsonian syndrome: significantly prolonged VEP latencies (compared to the pre-MPTP values) were consistently recorded for the 1.2, 2.5, and 3.5 cpd gratings, but not for 0.5 cpd. VEP amplitude was within the normal range for the 0.5 and 1.2 cpd patterns, but not for the 2.5 and 3.5 cpd stimuli (see Figure 3). From 40 days on, the amplitude of the ERG elicited by the 0.5 and 1.2 cpd gratings fell within the normal range, while the ERG amplitude for 2.5 and 3.5 cpd was significantly reduced when compared to baseline data. All ERG and VEP parameters measured within 40-45 days post-MPTP remained stable up to 3 years later. The administration of Sinemet induced a complete, yet temporary, recovery of both the Parkinsonian signs and the electrophysiological abnormalities, which was seen as early as 15 minutes after Sinemet and lasted for about 6-7 hours. The VEP and ERG changes were spatial frequency-dependent: the higher the spatial frequency, the greater the improvement.
These electrophysiological abnormalities are similar to those found in Parkinsonian patients: in both, they are spatial frequency-dependent and reversible following dopamine precursor administration. The VEP and ERG alterations of our monkeys are probably due to the significant retinal dopamine decrease caused by MPTP (Ghilardi et al., 1988). The pathophysiology of retinal damage can be explained by taking into account the mechanism of action of MPTP and the vicinity of the anatomical structures involved in the process. Intravenously injected MPTP reaches the eye (Lynden et al., 1985), and there it is likely to be transformed into MPP+ by retinal astrocytes. MPP+ could, at this point, penetrate into the dopaminergic amacrine cells, whose distribution peaks from 2 to 3 mm around the fovea (Mariani et al., 1985). The hypothesis of a retinal origin for the visual dysfunction in Parkinsonian patients is further supported by the effects of intraocular administration of 6-hydroxydopamine on the pattern VEP and ERG of aphakic monkeys (Figure 4). As in MPTP-treated monkeys, VEP and ERG abnormalities were spatial frequency-dependent, being most pronounced for the higher spatial frequencies (2.5 and 3.5 cpd), whereas lower spatial frequencies (0.5 and 1.2 cpd) were less impaired (Ghilardi et al., 1989). The findings in Parkinsonian patients, as well as in MPTP- and 6-hydroxydopamine-treated monkeys, can be explained in terms of disruption of center-surround mechanisms due to dopaminergic deficiency. In fact, the role of dopaminergic amacrine neurons is to modify spatial summation of the photoreceptor signals in horizontal
[Figure 4: three panels, each plotted against spatial frequency (0.5, 1.2, 2.5, and 3.5 cpd)]
Figure 4. Phase differences and amplitude ratios of the pattern electroretinogram and VEP recorded in three monkeys following intravitreal injections of 6-hydroxydopamine and saline solution. In the upper part of the figure, the normal range (95% confidence interval) is shown as a shaded area, which was derived from the data collected from both eyes prior to the injections. Notice that the 6-hydroxydopamine treated eyes show abnormal phase shifts with spatial frequencies of 1.2 cpd or more, while the saline treated eyes produced responses within the normal range. In the lower part, the amplitude ratios of the saline treated eyes show little variability, while the responses from the 6-hydroxydopamine treated eyes are either reduced or absent for the higher spatial frequencies.
cells (Piccolino et al., 1984) and, thus, to modify the response of bipolar (Negishi and Drujan, 1979) and ganglion cells (Jensen and Daw, 1984). In other words, the center-surround mechanisms are dependent upon the spatial summation of horizontal cells, which, in turn, depends upon the coupling of horizontal cells. Dopamine modulates the electric coupling of horizontal cells and, thus, can modify the size of the antagonistic surround of bipolar and ganglion cells. The effect of dopamine deficiency is the disruption of the center-surround mechanisms, which could explain the spatial frequency-dependent abnormality seen in our MPTP-treated monkeys and in Parkinsonian patients, and suggests that the visual dysfunction in this disease may be primarily related to an impairment of retinal parvocellular pathways.
Multiple sclerosis
It is well established that the VEP is abnormal in patients with optic neuritis or multiple sclerosis (Halliday et al., 1972, 1973; Lowitzsch et al., 1976; Shahrokhi et al., 1978; Chiappa, 1980; Kjaer, 1980). A review of the available literature shows that in patients with multiple sclerosis an abnormal VEP may occur to one stimulus but not to another, suggesting that this disease randomly affects different parts and substructures of the visual system. The first variable to take into account is stimulus size: is there a selective spatial frequency impairment in multiple sclerosis? The diagnostic yield increases by exploring the VEP to various spatial frequencies (Milner et al., 1974; Gambi et al., 1980; Neima and Regan, 1984; Oishi et al., 1985; Novak et al., 1988). In one of the studies with the largest number of multiple sclerosis patients (n=304), Oishi and colleagues (1985) studied the VEP to 25', 50' and 100' checks and found that the VEPs of 75% of the patients were abnormal to at least one check size. Forty-nine eyes showed "discrepancy among the check sizes": 29 eyes were abnormal only to 25' checks, 19 only to 100' checks, and 1 eye showed abnormality to both 25' and 100', but not 50' checks. Using smaller checks (7', 14', 28', and 56' of arc), Novak and colleagues (1988) found abnormal VEP in 90 out of 127 patients with multiple sclerosis. Their data suggest that 14' checks were the most sensitive stimuli, since they revealed additional abnormalities in 11 patients and were seldom normal (8 out of 66 patients) in the presence of a 28' check abnormality. These conclusions are in agreement with those of Plant (1983), who found that only sinusoidal gratings of spatial frequencies equal to or more than 2 cpd (15' of arc) produced significant latency delays in 11 patients with optic neuritis compared to the controls.
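Check sizes in minutes of arc and grating frequencies in cpd are used side by side in this chapter, so a small conversion sketch may help when comparing studies. It assumes that the fundamental spatial frequency of a checkerboard lies along the check diagonal, so that one period spans the square root of 2 times the check width, and that a grating "size" in minutes of arc refers to one bar (half a cycle). These conventions and the function names are assumptions introduced here; they are not stated in the text, although they reproduce the pairs it quotes (15' checks and 2.8 cpd, 12' checks and 3.5 cpd, 15' bars and 2 cpd).

```python
import math

ARCMIN_PER_DEGREE = 60.0

def check_fundamental_cpd(check_arcmin):
    """Fundamental spatial frequency (cycles/degree) of a checkerboard,
    assuming the fundamental is oriented along the check diagonal, so
    that one period equals sqrt(2) times the check width."""
    return ARCMIN_PER_DEGREE / (math.sqrt(2) * check_arcmin)

def grating_cpd(bar_arcmin):
    """Spatial frequency (cycles/degree) of a grating whose single bar
    (half a cycle) subtends `bar_arcmin` minutes of arc."""
    return ARCMIN_PER_DEGREE / (2.0 * bar_arcmin)

for size in (7, 12, 14, 15, 28, 31, 56):
    print(f"{size}' checks -> {check_fundamental_cpd(size):.1f} cpd fundamental")

print(f"15' bars -> {grating_cpd(15):.1f} cpd grating")
```

Under these assumptions a check of a given width has a higher fundamental frequency than a grating with bars of the same width, which is worth keeping in mind when check and grating results are set against each other.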
The combined effect of small field size and low spatial frequency checks on the detection of VEP abnormalities has been reported by different authors (Hennerici et al., 1977; Diener and Scheibler, 1980; Oepen et al., 1982). Hennerici et al. (1977) used two stimuli: a foveal rectangle with 45' of arc visual angle and a checkerboard pattern (check size 1°), which subtended 20° of visual angle. With the checkerboard, P100 delays were evident in 61% of the cases of multiple sclerosis, while with the 'foveal' rectangle 88% of the patients had abnormal VEP. Combining the results obtained using a 'foveal' rectangle of 1° and a checkerboard of 15° with a check size of 1', Diener and Scheibler (1980) found abnormal VEP in 100% of
patients with clinically definite and probable multiple sclerosis, while 71% of the subjects with the clinically possible form had abnormal VEP. These authors reported that the detection rate of abnormal VEP was higher with the 'foveal' rectangle, although 8.1% of the patients showed abnormal checkerboard and normal 'foveal' VEPs. It must be pointed out that the 'foveal' stimulus of these studies can cause predominantly luminance, as opposed to spatial contrast, changes. Furthermore, with a small field size (less than 3-4') of any stimulus kind, patient cooperation is of the utmost importance, since poor fixation could account for VEP abnormalities. Victor and colleagues (1986) have diagnosed a mainly parafoveal retinal dysfunction in one patient with unilateral optic neuritis, using sinusoidal vertical gratings of low (1.5 cpd) and higher (8 cpd) spatial frequencies and central maskings of 1' and 4'. The amplitude of the VEP to full-field gratings of 1.5 cpd was similar in the normal and affected eyes. In this patient, the parafoveal visual loss was not dense enough to manifest itself as a scotoma, and was underestimated by perimetry: only the VEP obtained with a low spatial frequency pattern in conjunction with foveal masking could demonstrate a parafoveal retinal dysfunction. Contrast sensitivity studies confirm that different types of spatial frequency loss occur in multiple sclerosis and optic neuritis. In different patients, low, intermediate, or high spatial frequencies may be affected (Regan et al., 1977; Sjostrand and Frisen, 1978; Bodis-Wollner et al., 1979; Zimmern and Plant, 1979; Hess and Plant, 1980). N70 has been measured and considered a marker of visual pathway dysfunction in only a few multiple sclerosis studies (Collins et al., 1978; Kaufman and Celesia, 1985), since it has been considered an 'unreliable' peak. However, as previously pointed out, when proper stimulus conditions are used, N70 is stable and reliable.
This early negative peak, which occurs around 70-80 msec, is a post-synaptic cortical component of the VEP and reflects visual processes that are different from those shown by P100. In fact, N70 shows clear spatial tuning, while P100 does not (Parker et al., 1982; Plant et al., 1983; May and Lovegrove, 1987; Previc, 1988; Onofrj, 1990); furthermore, contrast (Kulikowski, 1977; Previc, 1988), field size and retinal location of the stimulus (Lesevre and Joseph, 1979; Lesevre, 1982; Onofrj, 1990) affect the two components differently. Because of the characteristic 'disseminating' pattern of multiple sclerosis, visual processes may be dissociated at any point along their pathways. Thus, multiple sclerosis can provide a model to evaluate whether N70 and P100 reflect sequential or parallel processing. In a retrospective study (Ghilardi et al., 1991), we have investigated the relationship of the N70 and P100 components of the VEP to vertical sinusoidal gratings (2.3 cpd) in 98 patients with multiple sclerosis and 59 matched controls. Of all the 196 eyes tested in patients with multiple sclerosis, P100 was absent in 12 (6.12%), P100 latency was prolonged in 109 (55.61%), and normal in 75 eyes (38.26%). Significant interocular differences for P100 latency were found in 49 patients. On the other hand, N70 was absent in 29 eyes (14.8%), had prolonged latency in 68 eyes (34.7%), and was normal in 99 (51.3%). Twenty-four patients showed significant interocular differences for N70 latency. A delayed N70-P100 peak interval was found in 58 out of 167 eyes which also had a delayed P100, and in 2 eyes with normal P100 and N70 latencies. In patients with
multiple sclerosis, the P100/N70 amplitude ratio was less than 1 in 11 eyes and more than 12 in 22 eyes, while in our controls this ratio was less than 1. The total number of eyes with N70 and/or P100 abnormalities was 137 (69.9%). Eighty eyes (40.8%) had abnormal latency of both P100 and N70, while 41 eyes showed P100 delays without corresponding N70 changes. Seventeen eyes had abnormal N70, but normal P100 latency. N70 and P100 appeared to be more often absent in the definite than in the possible multiple sclerosis group. N70 was absent in 26% and 5% of the eyes of patients with definite and possible multiple sclerosis, respectively, while P100 was absent in 13% of the eyes of the definite group and always present in the eyes of subjects with possible multiple sclerosis. These results show that N70 and P100 can be independently affected in patients with multiple sclerosis, and they have important clinical and theoretical implications. From a clinical point of view, the combined use of N70 and P100 latencies increased the diagnostic yield by almost 10%. A few papers have described the results of the combined use of N70 and P100 peak latencies in multiple sclerosis studies performed with pattern reversal mode. Kaufman and Celesia (1985) measured both N70 and P100 latencies in a small group of multiple sclerosis patients. Using 15' of arc checks (fundamental spatial frequency of 2.8 cpd), VEPs were abnormal in 13 out of 18 eyes; with 31' check size (fundamental spatial frequency equal to 1.3 cpd) they found "VEP abnormalities" in 23 out of 28 tested eyes. Unfortunately, more details are not available for comparison with our data. Results similar to ours were obtained by Collins and colleagues (1978) in 98 patients with multiple sclerosis and 50 controls. The stimuli were checks of 12' of arc, whose fundamental spatial frequency corresponds to 3.5 cpd. The latencies of several peaks, including N70 and P100, and other parameters were measured.
N70 and P100 were always present in the control population, while among the multiple sclerosis patients, N70 latency was not measurable in 26 eyes (13%) and outside the normal range in 52 eyes (27%); P100 was absent in 16 eyes (8%) and its latency was delayed in 71 eyes (36%). A direct comparison between N70 and P100 diagnostic yields was not discussed in this paper, although the authors stated that some aspect of the negative components paralleled P100 latency, and that N70 could be abnormal in the presence of a normal P100. These findings agree with the results of our study. In both studies: 1) P100 had the highest diagnostic yield; 2) N70 was present in all the controls; 3) N70 was absent in almost the same number of eyes (Collins and colleagues: 13%; ours: 14%), while P100 was absent in 8% and 6% of the two populations, respectively. Furthermore, in our study, non-parametric tests revealed that N70 was more often absent than P100: in fact, N70 was not measurable in 26% and 5% of the eyes of the patients with definite and possible multiple sclerosis, respectively, while P100 was absent in 13% of the definite multiple sclerosis eyes and always present in the possible multiple sclerosis eyes. These findings raise a question: Is the absence or, better, the non-detectability of N70 a hallmark of visual pathway dysfunction? As discussed earlier, N70 is very sensitive to variation of stimulus parameters, such as spatial frequency and contrast level: its amplitude is highest for medium to high spatial frequencies (Parker et al., 1982; Plant et al., 1983; May and Lovegrove, 1987; Previc, 1988;
Onofrj, 1990), that is, 4 to 6 cpd, near the peak of foveal contrast sensitivity (Campbell and Green, 1965). Moreover, while spatial frequencies of 1 cpd may fail to produce a reliable N70 in normal subjects, N70 is always present when spatial frequencies of 2 and 4 cpd are used (Onofrj, 1990). All our controls had a reliable N70 for 2.3 cpd gratings. Thus, the absence of N70 in our patients with multiple sclerosis should be considered a pathological sign. What is the meaning of an N70 delay or absence? One of the possible variables to be taken into account is visual acuity. We found abnormal P100 in 7 out of 9 patients with visual acuity equal to 20/200; in these same patients N70 was absent, while in the other two patients with visual acuity of 20/200, N70 and P100 latencies were within normal limits. A practical advantage of using sinusoidal gratings is that one can obtain a reliable VEP even with blur due to non-optimal correction. Bobak and colleagues (1987) have demonstrated that the effect of up to 2 diopters of blur on the P100 latency of the VEP obtained with 2.3 cpd sinusoidal gratings is negligible in normal observers. In a non-cycloplegic eye with 20/15 visual acuity, +2 diopters of blur corresponds to a visual acuity of approximately 20/200. Thus, it is unlikely that low visual acuity accounts for the P100 abnormalities in this group of patients. The role of visual blur on N70 latency and on N70 and P100 amplitudes obtained with sinusoidal gratings has not yet been established. Different authors have found that a small amount of blur attenuates the amplitude of the P100 obtained with checks (Regan and Richards, 1971; Spekreijse et al., 1973). Traces obtained using 3 diopters of blur and different check sizes (Sokol and Moskowitz, 1981: Fig. 1) seem to indicate that amplitudes are more affected than latencies.
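As a rough cross-check on the check sizes quoted in this section (12', 15' and 31' of arc), the sketch below converts a check width into a fundamental spatial frequency. It assumes the common convention that the fundamental Fourier component of a checkerboard lies along the diagonal, with a period of the square root of 2 times the check width; the function name `check_size_to_cpd` is illustrative and is not taken from the cited studies. Under that assumption the 12' and 15' checks come out close to the quoted 3.5 and 2.8 cpd, while the 31' checks come out near 1.4 cpd, slightly above the quoted 1.3 cpd.

```python
import math


def check_size_to_cpd(check_arcmin: float) -> float:
    """Fundamental spatial frequency (cycles/degree) of a checkerboard.

    Assumption: the fundamental lies along the check diagonal, so one
    full cycle spans sqrt(2) * check width. 60 arcmin = 1 degree.
    """
    if check_arcmin <= 0:
        raise ValueError("check size must be positive")
    return 60.0 / (math.sqrt(2.0) * check_arcmin)


# Check sizes mentioned in the text, in minutes of arc.
for size in (12, 15, 31):
    print(f"{size}' checks -> {check_size_to_cpd(size):.2f} cpd")
```

The conversion only applies to checkerboards; for the sinusoidal gratings used in our own study the spatial frequency (2.3 cpd) is specified directly.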
On the other hand, the absence of N70 cannot be explained in terms of low visual acuity only: in fact, this potential was also absent in 18 subjects with visual acuity better than or equal to 20/30. There are different hypotheses to explain the relatively higher undetectability of N70 compared to P100 in our multiple sclerosis population. We can easily exclude the possibility that in our patients with multiple sclerosis P100, but not N70, had a higher amplitude than in normals, thus "hiding" or annulling the preceding negative deflection. In fact, our data show a significant amplitude decrease for both N70 and P100 peaks. Another possibility is that in patients with multiple sclerosis, both P100 and N70 have similar amplitude decrements, but, since N70 has a lower signal-to-noise ratio than P100, the final effect is a disappearance of N70 with a relative preservation of P100. Although the P100/N70 ratio was less than 1 in 11 eyes, in our multiple sclerosis population with detectable N70 and P100 peaks the P100/N70 amplitude ratio tended to increase compared to normals (9.96 versus 2.78). This increase was more evident for the definite multiple sclerosis group versus the controls (11.53 versus 2.78; p < 0.05), suggesting that N70 amplitude decreases more than P100. In addition to the clinical diagnostic relevance of our findings, the different results obtained for N70 and P100 support the hypothesis that they represent different signal processes of the visual system and/or have different origins. In a multiple sclerosis population, the dissociation between N70 and P100 diagnostic yield is maintained even when the stimuli are horizontal bars. Orientational abnormalities in multiple sclerosis populations have been shown by both VEP and contrast sensitivity studies
(Camisa et al., 1981; Kirkham and Coupland, 1982; Kupersmith et al., 1983; Regan et al., 1980). We have analyzed the effect of stimulus orientation on the P100 and N70 peaks in 53 patients with multiple sclerosis and 40 normal subjects, using vertical and horizontal sinusoidal gratings of 2.3 cpd. In the control population there were no significant differences between the measurements obtained with the two stimulus orientations. In multiple sclerosis patients, the VEP obtained with horizontal stimuli was more often abnormal than that obtained with vertical stimuli (71% versus 60%); P100 was more frequently abnormal than N70 (75% versus 57%); and the P100/N70 dissociation held for both vertical and horizontal orientations. If N70 reflects mostly foveal processes and P100 is mostly a parafoveally generated peak, as discussed earlier, it is possible that, in general, parafoveal pathways are more affected in multiple sclerosis, while in definite multiple sclerosis an additional involvement of foveal processes occurs, since N70 was more often absent in the definite than in the possible multiple sclerosis eyes. Furthermore, although there was no difference in the distributions of N70 latency between definite and possible multiple sclerosis eyes, it appears that this peak was more often absent in the definite than in the possible multiple sclerosis eyes. However, two lines of evidence would seem to contradict this hypothesis. First, the fact that color desaturation is a prominent feature early in the disease (Fallowfield and Krauskopf, 1984) suggests that the parvo, not the magno, pathways may be involved. Second, Hess and Plant (1983) reported a relative sparing of the high and low temporal frequency mechanism in resolved optic neuritis, a result that would suggest a relative sparing of magno pathways. In conclusion, from the data available, it seems that in multiple sclerosis the magnocellular and parvocellular pathways can be independently, but not preferentially, affected.
N70 and P100 measurements could provide a means to separate the involvement of the two pathways not only in multiple sclerosis, but also in other neuro-ophthalmological diseases.
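The eye counts reported above for the 196 patient eyes can be tallied in a few lines. The following is only a sketch: it assumes that an "abnormal" peak means one that is either absent or delayed, and the variable and function names are illustrative rather than taken from Ghilardi et al. (1991). Under that assumption the tallies reproduce the 41 eyes with P100 abnormalities alone and the 17 eyes with N70 abnormalities alone; the latter are about 8.7% of eyes, which is the gain of "almost 10%" in diagnostic yield obtained by adding N70.

```python
TOTAL_EYES = 196  # 98 patients, two eyes each

# Eye counts as reported in the text for each VEP component.
P100 = {"absent": 12, "prolonged": 109, "normal": 75}
N70 = {"absent": 29, "prolonged": 68, "normal": 99}
BOTH_ABNORMAL = 80  # eyes reported with both peaks abnormal


def percent(count, total=TOTAL_EYES):
    """Return count as a percentage of total."""
    return 100.0 * count / total


def abnormal(counts):
    """Assumption: an abnormal peak is one that is absent or delayed."""
    return counts["absent"] + counts["prolonged"]


# Eyes in which only one of the two peaks is abnormal.
p100_only = abnormal(P100) - BOTH_ABNORMAL
n70_only = abnormal(N70) - BOTH_ABNORMAL

print(f"P100 abnormal: {abnormal(P100)} eyes ({percent(abnormal(P100)):.1f}%)")
print(f"N70 abnormal:  {abnormal(N70)} eyes ({percent(abnormal(N70)):.1f}%)")
print(f"P100 only: {p100_only} eyes, N70 only: {n70_only} eyes")
print(f"Yield gained by adding N70: {percent(n70_only):.1f}%")
```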
Glaucoma
Glaucoma is a disease primarily of the optic nerve, characterized by increased intraocular pressure and atrophy of the optic nerve resulting in visual field loss. In its earliest stages, glaucoma primarily affects large-bodied ganglion cells (Bodis-Wollner, 1989). Results using a monkey model of glaucoma suggest that early damage in glaucoma primarily affects magnocellular neurons (Quigley and Hendrickson, 1984; Marx et al., 1988; Glovinsky, Quigley, and Dunkelberger, 1991). In patients with early glaucoma, an abnormal VEP is best elicited by stimuli of low spatial and higher temporal frequencies (Atkin et al., 1979; Bobak et al., 1983). This stimulus also discriminates monkeys with laser-induced glaucoma from normal monkeys (Marx et al., 1988). Temporal contrast sensitivity can be a better indicator of early glaucoma than standard perimetry (Lustgarten et al., 1990), perhaps due to increased detection of early magnocellular loss. We have suggested (Ghilardi et al., 1990) that changes in the VEP to stimuli with lower spatial and higher temporal frequency may provide the earliest signs of glaucoma onset.
Conclusions
To summarize, the choice of stimulus configuration when using the VEP or contrast sensitivity in clinical diagnosis depends to some extent on our knowledge of parallel processing in vision. In many disorders, such as multiple sclerosis, Parkinson's disease, and glaucoma, one parallel channel may be preferentially affected or affected before the other. With additional experimentation, our knowledge of parallel processing in vision may continue to improve clinical diagnosis.
References
Atkin, A., Bodis-Wollner, I., Wolkstein, M., Moss, A., and Podos, S.M. (1979). Abnormalities of contrast sensitivity in glaucoma. American Journal of Ophthalmology, 88, 205-213.
Ballard, P.A., Tetrud, J.W., and Langston, J.W. (1985). Permanent human parkinsonism due to 1-methyl-4-phenyl-1,2,5,6-tetrahydropyridine: seven cases. Neurology, 35, 949-956.
Barrett, G., Blumhardt, L., Halliday, A.M., Halliday, E. and Kriss, A. (1976). A paradox in the lateralisation of the visual evoked response. Nature, 261, 253-255.
Bobak, P., Bodis-Wollner, I., Harnois, C., Maffei, L., Mylin, L., Podos, S., and Thornton, J. (1983). Pattern electroretinograms and visual-evoked potentials in glaucoma and multiple sclerosis. American Journal of Ophthalmology, 96, 72-83.
Bobak, P., Bodis-Wollner, I. and Guillory, S. (1987). The effect of blur and contrast on VEP latency: comparison between check and sinusoidal grating patterns. Electroencephalography and Clinical Neurophysiology, 68, 247-255.
Bodis-Wollner, I. (1989). Electrophysiological and psychophysical testing of vision in glaucoma. Survey of Ophthalmology, 33, 301-307.
Bodis-Wollner, I. and Yahr, M.D. (1978). Measurements of visual evoked potentials in Parkinson's Disease. Brain, 101, 661-671.
Bodis-Wollner, I., Hendley, C.D., Mylin, L.H. and Thornton, J. (1979). Visual evoked potentials and the visuogram in multiple sclerosis. Annals of Neurology, 5, 40-47.
Bodis-Wollner, I., Yahr, M.D., Mylin, L. and Thornton, J. (1982). Dopaminergic deficiency and delayed visual evoked potentials in humans. Annals of Neurology, 11, 478-483.
Bodis-Wollner, I., Marx, M.S., Mitra, S., Bobak, P., Mylin, L. and Yahr, M. (1987). Visual dysfunction in Parkinson's disease: Loss in spatio-temporal contrast sensitivity. Brain, 110, 1675-1698.
Bodis-Wollner, I., Feldmann, R.G., Guillory, S.L. and Mylin, L. (1987). Delayed visual evoked potentials are independent of pattern orientation in macular disease. Electroencephalography and Clinical Neurophysiology, 68, 172-179.
Bulens, C., Meerwaldt, J.D., Van der Wildt, G.J. and Van Deursen, J.B.P. (1987). Effect of levodopa treatment on contrast sensitivity in Parkinson's disease. Annals of Neurology, 22, 365-369.
Burns, R.S., Chiueh, C.C., Markey, S.P., Ebert, M.H., Jacobowitz, D.M., and Kopin, I.J. (1983). A primate model of parkinsonism: selective destruction of dopaminergic neurons in the pars compacta of the substantia nigra by 1-methyl-4-phenyl-1,2,5,6-tetrahydropyridine.
Proceedings of the National Academy of Sciences USA, 80, 4546-4550.
Camisa, J., Mylin, L. and Bodis-Wollner, I. (1981). The effect of stimulus orientation on the visual evoked potential in multiple sclerosis. Annals of Neurology, 10, 532-539.
Campbell, F.W. and Green, D.G. (1965). Optical and retinal factors affecting visual resolution. Journal of Physiology, 181, 576-583.
Chiappa, K.H. (1980). Pattern shift visual, brainstem auditory and short latency somatosensory evoked potentials in multiple sclerosis. Neurology, 30, 110-123.
Collins, D.W.K., Black, J.L. and Mastaglia, F.L. (1978). Pattern reversal visual evoked potential. Method of analysis and results in multiple sclerosis. Journal of Neurological Science, 36, 83-95.
Darcey, T.M., Ary, J.P. and Fender, D.H. (1980). Spatio-temporal visually evoked scalp potentials in response to partial-field patterned stimuli. Electroencephalography and Clinical Neurophysiology, 50, 348-355.
Daw, N.W., Jensen, R.J. and Brunken, W.J. (1990). Rod pathways in mammalian retinae. Trends in Neuroscience, 13, 110-115.
Daw, N.W., Ariel, M. and Caldwell, J.H. (1982). Function of neurotransmitters in the retina. Retina, 2, 322-331.
Delwaide, P.J., Messaoua, B. and De Pasqua, V. (1980). Les potentiels evoques visuels dans la maladie de Parkinson. Electroencephalography and Clinical Neurophysiology, 10, 338-342.
Derrington, A.M. and Lennie, P. (1984). Spatial and temporal contrast sensitivity neurons in the lateral geniculate nucleus of macaque. Journal of Physiology, 357, 219-240.
Diener, H.Ch. and Scheibler, H. (1980). Follow-up studies of visual potentials in multiple sclerosis evoked by checkerboard and foveal stimulation. Electroencephalography and Clinical Neurophysiology, 49, 490-496.
Dinner, D.S., Luders, H., Hanson, M., Lesser, R.P. and Hem, G. (1985). Pattern evoked potentials (PEPs) in Parkinson's disease. Neurology, 35, 610-613.
Domenici, L., Trimarchi, C., Piccolino, M., Fiorentini, A. and Maffei, L. (1985). Dopaminergic drugs improve human visual contrast sensitivity. Human Neurobiology, 4, 195-197.
Ehle, A.L., Stewart, R.M., Lellelid, N.E. and Leventhal, N.A. (1982). Normal checkerboard pattern reversal evoked potentials in Parkinsonism. Electroencephalography and Clinical Neurophysiology, 54, 336-338.
Ellis, C.J.K., Allen, T.G.J., Marsden, C.D. and Ikeda, H. (1987). Electroretinographic abnormalities in idiopathic Parkinson's disease and the effect of levodopa administration. Clinical Vision Sciences, 66, 349-357.
Enroth-Cugell, C. and Robson, J.G. (1966). The contrast sensitivity of retinal ganglion cells of the cat. Journal of Physiology, 187, 517-552.
Eskin, T.A. and Merigan, W.H. (1986). Selective acrylamide-induced degeneration of color opponent ganglion cells in macaque. Brain Research, 378, 379-384.
Fallowfield, L. and Krauskopf, J. (1984). Selective loss of chromatic sensitivity in demyelinating disease. Investigative Ophthalmology and Visual Science, 25, 771-773.
Gambi, D., Rossini, P.M., Onofrj, M. and Marchionno, L. (1980). Visual
evoked cortical potentials (VECP) by television presentation of different patterned stimuli to patients with multiple sclerosis. Italian Journal of Neurological Science, 2, 101-106.
Gawel, M.J., Das, P., Vincent, S. and Rose, F. (1981). Visual and auditory evoked responses in patients with Parkinson's Disease. Journal of Neurology, Neurosurgery and Psychiatry, 44, 227-232.
Ghilardi, M.F., Bodis-Wollner, I., Onofrj, M., Marx, M.S. and Glover, A. (1988). Spatial frequency dependent abnormalities of pattern electroretinogram and visual evoked potential in a parkinsonian monkey model. Brain, 111, 134-149.
Ghilardi, M.F., Brannan, J.R., and Bodis-Wollner, I. (1990). Clinical applications of pattern visual evoked potentials. In J.E. Desmedt (Ed.), Visual Evoked Potentials, pp. 121-146. Amsterdam: Elsevier Science Publishers.
Ghilardi, M.F., Chung, E., Bodis-Wollner, I., Dvorzniak, M., Glover, A. and Onofrj, M. (1988). Systemic MPTP administration decreases retinal dopamine content in primates. Life Sciences, 43, 255-262.
Ghilardi, M.F., Marx, M.S., Bodis-Wollner, I., Camras, C.B. and Glover, A. (1989). The effect of intraocular 6-hydroxydopamine on the retinal processing of primates. Annals of Neurology, 25, 357-364.
Ghilardi, M.F., Sartucci, F., Brannan, J., Onofrj, M., Bodis-Wollner, I. and Stroch, R. (1991). N70 and P100 can be independently affected in multiple sclerosis. Electroencephalography and Clinical Neurophysiology, 80, 1-7.
Glovinsky, Y., Quigley, H.A., and Dunkelberger, G.R. (1991). Retinal ganglion cell loss is size dependent in experimental glaucoma. Investigative Ophthalmology and Visual Science, 32, 484-491.
Gottlob, I., Schneider, E., Heider, W. and Skrandies, W. (1987). Alteration of visual evoked potentials and electroretinograms in Parkinson's disease. Electroencephalography and Clinical Neurophysiology, 66, 349-357.
Halliday, A.M. and Michael, W.F. (1970). Changes in the pattern-evoked responses in man associated with the vertical and horizontal meridians in the visual field. Journal of Physiology, 208, 499-513.
Halliday, A.M., McDonald, W.I. and Mushin, J. (1972). Delayed visual evoked responses in optic neuritis. Lancet, 112, 982-985.
Halliday, A.M., McDonald, W.I. and Mushin, J. (1973). Visual evoked responses in the diagnosis of multiple sclerosis. British Medical Journal, 4, 661-664.
Halliday, A.M. (1982). Evoked potentials in clinical testing. Churchill Livingstone, 1982.
Hansch, E.C., Syndulko, K., Cohen, S.N., Goldberg, Z.I., Potvin, A.R. and Tourtellotte, W.W. (1982). Cognition in Parkinson's Disease: an event-related potential perspective. Annals of Neurology, 11, 599-607.
Harnois, C., Di Paolo, T., Marcotte, G., and Daigle, M. (1988). Retinal dopamine content in parkinsonian patients. Investigative Ophthalmology and Vision Research, Suppl. 29, 200.
Hennerici, M., Wenzel, D. and Freund, H.J. (1977). The comparison of small-size rectangle and checkerboard stimulation for the evaluation of delayed visual evoked responses in patients suspected of multiple sclerosis. Brain, 100, 119-136.
Hess, R.F. and Plant, G.T. (1983). The effect of temporal frequency
variation on threshold contrast sensitivity defects in optic neuritis. Journal of Neurology, Neurosurgery and Psychiatry, 46, 322-334.
Hinton, D.R., Sadun, A.A., Blanks, J.C. and Miller, C.A. (1986). Optic-nerve degeneration in Alzheimer disease. New England Journal of Medicine, 315, 485-487.
Hokoc, J.N. and Mariani, A.P. (1987). Tyrosine hydroxylase immunoreactivity in the rhesus monkey retina reveals synapses from bipolar cells to dopaminergic amacrine cells. Journal of Neuroscience, 7, 2785-2793.
Hubel, D.H. and Wiesel, T.N. (1968). Receptive fields and functional architecture of monkey striate cortex. Journal of Physiology, 195, 215-243.
Jaffe, M.J., Bruno, G., Campbell, G., Lavine, R.A., Karson, C.N., Weinberger, D.R. (1987). Ganzfeld electroretinographic findings in parkinsonism: untreated patients and the effect of levodopa intravenous infusion. Journal of Neurology, Neurosurgery, and Psychiatry, 50, 847-852.
Jeffreys, D.A. (1977). The physiological significance of pattern visual evoked potentials. In J. Desmedt (Ed.), Visual evoked potentials in man. Oxford, Clarendon, 134-167.
Jensen, R.J. and Daw, N.W. (1984). Effect of dopamine antagonists on receptive field of brisk cells and directionally selective cells in the rabbit retina. Journal of Neuroscience, 14, 2972-2985.
Jensen, R.J. and Daw, N.W. (1988). Effects of dopaminergic agents on the activity of ganglion cells. In I. Bodis-Wollner and M. Piccolino (Eds.), Dopaminergic mechanisms in vision. Alan R. Liss, New York, 163-177.
Kaufman, D. and Celesia, G. (1985). Simultaneous recording of pattern electroretinogram and visual evoked responses in neuro-ophthalmological disorders. Neurology, 35, 644-651.
Kirby, A.W., Wiley, R.W. and Harding, T.H. (1986). Cholinergic effects on the visual evoked potential. In R.Q. Cracco and I. Bodis-Wollner (Eds.), Evoked Potentials, Alan R. Liss, Inc., New York.
Kirkham, T.H. and Coupland, S.G. (1982). Diamond versus checkerboard pattern-reversal VEPs. Evidence for orientation-specific delays in MS. Documenta Ophthalmologica Proceedings Series, 31, 323-328.
Kjaer, M. (1980). Visual evoked potentials in normal subjects and patients with multiple sclerosis. Acta Neurologica Scand., 62, 1-13.
Kolb, H., Nelson, R., Mariani, A. (1981). Amacrine cells, bipolar cells and ganglion cells of the cat retina: A Golgi study. Vision Research, 21, 1081-1114.
Kriss, A. and Halliday, A.M. (1980). A comparison of occipital potentials evoked by pattern onset offset and reversal by movement. In C. Barber (Ed.), Evoked Potential, MTP Press Ltd., Falcon House, Lancaster, England, 205-211.
Kulikowski, J. (1977). Visual evoked potentials as a measure of visibility. In J.E. Desmedt (Ed.), Visual evoked potentials in man: new developments, Clarendon Press, Oxford, 168-183.
Kupersmith, M.J., Shakin, E., Siegel, I.M. and Lieberman, A. (1982). Visual system abnormalities in patients with Parkinson's Disease. Archives of Neurology, 39, 284-286.
Kupersmith, M.J., Seiple, W.H., Nelson, J.I. and Carr, R.E. (1984).
Contrast sensitivity loss in multiple sclerosis: selectivity by eye, orientation and spatial frequency measured with the evoked potential. Investigative Ophthalmology and Visual Science, 25, 632-639.
Laurian, S., Gaillard, J.M. and Wertheimer, J. (1982). Evoked potentials in the assessment of brain function in senile dementia. In J. Courjon, F. Mauguiere, M. Revol (Eds.), Clinical Applications of Evoked Potentials in Neurology. Raven Press, New York, 287-293.
Lesevre, N. (1982). Chronotopographical analysis of the human evoked potential in relation to the visual field. Annals of the N.Y. Academy of Science, 388, 156-182.
Lesevre, N. and Joseph, J.P. (1979). Modification of the pattern evoked potential in relation to the stimulated field. Electroencephalography and Clinical Neurophysiology, 47, 183-203.
Lowitzsch, K., Kuhnt, U., Sakmann, C., Maurer, K., Hopf, H.C., Schott, T., and Thater, K. (1976). Visual pattern evoked responses and blink reflexes in assessment of MS diagnosis: A clinical study of 135 MS patients. Journal of Neurology, 213, 17-32.
Lustgarten, J.S., Marx, M.S., Podos, S.M., Bodis-Wollner, I., Campeas, D., and Serle, J.B. (1990). Contrast sensitivity and computerized perimetry in early detection of glaucomatous change. Clinical Vision Sciences, 5, 407-413.
Lynden, A., Bondesson, U., Larsson, B.S., Lindquist, N.G. and Olsson, I. (1985). Autoradiography of 1-methyl-4-phenyl-1,2,5,6-tetrahydropyridine (MPTP) uptake in the monoaminergic pathways and in the melanin containing tissues. Acta Pharmacologica Toxicology, 57, 130-135.
Maier, J., Dangelie, G., Spekreijse, H. and van Dijk, B.W. (1986). Principal component analysis for source locations of VEPs in man. Vision Research, 27, 165-177.
Mariani, A.P., Kolb, H. and Nelson, R. (1984). Dopamine containing amacrine cells of rhesus monkey retina parallel rods in spatial distribution. Brain Research, 322, 1-7.
Marx, M., Podos, S., and Bodis-Wollner, I. (1988). Signs of early damage in glaucomatous monkey eyes: Low spatial frequency losses in the pattern ERG and VEP. Experimental Eye Research, 46, 173-184.
Masland, R.H. and Mills, J.W. (1979). Autoradiographic identification of acetylcholine in the rabbit retina. Journal of Cellular Biology, 83, 159-178.
Massey, S.C. and Redburn, D.A. (1987). Transmitter circuits in the vertebrate retina. Progress in Neurobiology, 28, 55-96.
May, J.G. and Lovegrove, W.J. (1987). The effects of grating complexity on transient evoked potentials. Electroencephalography and Clinical Neurophysiology, 66, 521-528.
Medjbeur, S. and Tulunay-Keesay, U. (1986). Suprathreshold responses of visual system in normals and demyelinated diseases. Investigative Ophthalmology and Visual Science, 27, 1368-1378.
Merigan, W.H. and Eskin, T.A. (1986). Spatio-temporal vision of macaques with severe loss of P retinal ganglion cells. Vision Research, 26, 1751-1761.
Michael, W.F. and Halliday, A.M. (1971). Differences between the occipital distribution of upper and lower field pattern evoked responses in man. Brain Research, 32, 311-324.
Milner, B.A., Regan, D. and Heron, J.R. (1974). Differential diagnosis of multiple sclerosis by visual evoked potential recording. Brain, 97, 755-772.
Mintz, M., Tomer, R., Radwan, H. and Myslobodsky, M.S. (1981). Visual evoked potentials in hemiparkinsonism. Electroencephalography and Clinical Neurophysiology, 52, 611-616.
Negishi, K. and Drugan, B.D. (1979). Effect of catecholamine and related compounds on horizontal cell in the fish retina. Journal of Neuroscience Research, 4, 311-334.
Neima, D. and Regan, D. (1984). Pattern visual evoked potentials and spatial vision in retrobulbar neuritis and multiple sclerosis. Archives of Neurology, 41, 198-201.
Nguyen-Le Gros, J. and Savy, C. (1988). Dopamine innervation of the vertebrate retina. In I. Bodis-Wollner and M. Piccolino (Eds.), Dopaminergic mechanisms in vision. Alan R. Liss, New York.
Nightingale, S., Mitchell, K.W. and Howe, J.W. (1986). Visual evoked cortical potentials and pattern electroretinograms in Parkinson's disease and control subjects. Journal of Neurology, Neurosurgery, and Psychiatry, 49, 1280-1287.
Novak, G.P., Wiznitzer, M., Kurtzberg, D., Giesser, B.S. and Vaughan, H.G. (1988). The utility of visual evoked potentials using hemifield stimulation and several check sizes in the evaluation of suspected multiple sclerosis. Electroencephalography and Clinical Neurophysiology, 71, 1-9.
Oepen, G., Brauner, C., Doerr, M. and Thoden, U. (1982). Visual evoked potentials by central foveal and checkerboard reversal stimulation in multiple sclerosis. In J. Courjon, F. Mauguiere, M. Revol (Eds.), Clinical Applications of Evoked Potentials in Neurology. Raven Press, New York, 427-432.
Oishi, M., Yamada, T., Dickins, Q.S., and Kimura, J. (1985). Visual evoked potentials by different check sizes in patients with multiple sclerosis. Neurology, 35, 1461-1465.
Onofrj, M., Ghilardi, M.F., Basciani, M. and Gambi, D. (1986). Visual evoked potentials in Parkinsonism: correlations with spatial frequency of the stimuli, clinical data and pharmacological manipulations. Journal of Neurology, Neurosurgery, and Psychiatry, 49, 1150-1159.
Onofrj, M. (1990). Pattern VEP generators. In J. Desmedt (Ed.), Visual Evoked Potential update, Elsevier North Holland.
Papakostopoulos, D., Dean Hart, C., Cooper, R. and Natsikos, V. (1984). Combined electrophysiological assessment of the visual system in central serous retinopathy. Electroencephalography and Clinical Neurophysiology, 59, 77-80.
Parker, D.M., Salzen, E.A. and Lishman, J.R. (1982). Visual evoked responses elicited by the onset and offset of sinusoidal gratings: latency, waveform, and topographical characteristics. Investigative Ophthalmology and Visual Science, 22, 675-680.
Piccolino, M., Neyton, J. and Gerschenfeld, H.M. (1984). Decrease of gap junction permeability induced by dopamine and cyclic AMP in horizontal cells of turtle retina. Journal of Neuroscience, 4, 2477-2488.
Plant, G.T. (1983). Transient visually evoked potentials to sinusoidal gratings in optic neuritis. Journal of Neurology, Neurosurgery, and Psychiatry, 46, 1125-1133.
Plant, G.T., Zimmern, R.L. and Durken, K. (1983). Transient visual evoked potentials to the pattern reversal and onset of sinusoidal gratings. Electroencephalography and Clinical Neurophysiology, 56, 147-158.
Previc, F.H. (1988). The neurophysiological significance of the N1 and P1 components of the visual evoked potential. Clinical Vision Science, 3, 195-202.
Quigley, H.A., Sanchez, R.M., Dunkelberger, G.R., et al. (1987). Chronic glaucoma selectively damages larger optic nerve fibers. Investigative Ophthalmology and Visual Science, 28, 913-920.
Ratliff, F. and Zemon, V. (1984). Visual evoked potentials elicited in normal subjects and epileptic patients by windmill-dartboard stimuli. In R.H. Nodar and C. Barber (Eds.), Evoked Potentials II. Boston, Butterworth Publishers, 251-254.
Regan, D. and Richards, W. (1971). Independence of evoked potentials and apparent size. Vision Research, 11, 679-684.
Regan, D., Silver, R. and Murray, T.J. (1977). Visual acuity and contrast sensitivity in multiple sclerosis: hidden visual loss. Brain, 100, 563-579.
Regan, D. and Maxner, C. (1987). Orientation-selective visual loss in patients with Parkinson's Disease. Brain, 110, 415-432.
Riemslag, F.C.C., Spekreijse, H. and Van Walbeek, H. (1982). Pattern evoked potential diagnosis of multiple sclerosis: a comparison of various contrast stimuli. In J. Courjon, F. Mauguiere, M. Revol (Eds.), Clinical Applications of Evoked Potentials in Neurology. Raven Press, New York, 417-436.
Shahrokhi, F., Chiappa, K.H. and Young, R.R. (1978). Pattern shift visual evoked responses. Two hundred patients with optic neuritis and/or multiple sclerosis. Archives of Neurology, 35, 65-71.
Sjostrand, J. and Frisen, L. (1977). Contrast sensitivity in macular disease. Acta Ophthalmologica, 55, 507-514.
Skrandies, W. (1984). Scalp potential fields evoked by grating stimuli: effects of spatial frequency and orientation. Electroencephalography and Clinical Neurophysiology, 58, 325-332.
Sokol, S. and Moskowitz, A. (1981). Effect of retinal blur on the peak of the pattern evoked potential. Vision Research, 21, 1279-1286.
Spekreijse, H., Estevez, O. and van der Tweel, L.H. (1973). Luminance responses to pattern reversal. Documenta Ophthalmologica Proceedings Series, 10, 205-211.
Tartaglione, A., Pizio, N., Bo, I., Spadavecchia, L. and Favale, E. (1985). Spatial properties of pattern as determinants of visual evoked potentials changes in Parkinson's disease. In C. Morocutti and P.A. Rizzo (Eds.), Evoked Potentials: Neurophysiological and Clinical aspects. Elsevier North Holland, Amsterdam, 321-327.
Victor, J.D., Buchwald, E. and Devinsky, O. (1986). VEP analysis of parafoveal visual loss in multiple sclerosis. Clinical Vision Sciences, 1, 113-118.
Visser, S.L., Stam, F.C., Tilburg, van W., Op den Velde, W., Blom, J.L. and De Rijke, W. (1976). Visual evoked response in senile and presenile dementia. Electroencephalography and Clinical Neurophysiology, 40, 385-392.
Visser, S.L., Tilburg, van W., Hooijer, C., Jonker, C. and De Rijke, W.
(1985). Visual evoked potentials (VEPs) in senile dementia (Alzheimer type) and in non-organic behavioral disorders in the elderly: comparison with EEG parameters. Electroencephalography and Clinical Neurophysiology, 60, 115-121.
Wright, C.E., Harding, G.F.A. and Orwin, A. (1984). Presenile dementia - the use of the flash and pattern VEP in diagnosis. Electroencephalography and Clinical Neurophysiology, 67, 405-415.
Yaar, I. (1981). The effect of Levodopa treatment on the visual evoked potential in parkinsonian patients. Electroencephalography and Clinical Neurophysiology, 50, 267-274.
Zemon, V., Kaplan, E. and Ratliff, F. (1980). Bicuculline enhances a negative component and diminishes a positive component of the visual evoked cortical potential in the cat. Proceedings of the National Academy of Science USA, 77, 7476-7478.
Zimmern, R.L., Campbell, F.W. and Wilkinson, I.M.S. (1979). Subtle disturbances of vision after optic neuritis elicited by studying contrast sensitivity. Journal of Neurology, Neurosurgery, and Psychiatry, 42, 407-412.
Author Index
A
Abney, W. 229, 245, 248, 254-255
Abramov, I. 14, 31-32, 88, 110, 255
Adams, R. 88, 101, 103, 110, 112, 246, 254
Adelson, H. 170, 211, 221
Albano, W. 37, 57, 62, 77
Albright, T. 54, 62
Alpern, M. 242
Alvarez, S. 121, 134
Alwitt, D. 139, 161
Anderson, S.J. 47, 62, 64, 76
Anstis, S. 19, 30, 53-54, 56, 62, 65, 72, 103, 115, 164
Aslin, T. 81, 99, 110, 116
Atkin, A. 318-319
Atkinson, J. 81, 83-84, 89-90, 98-99, 107, 110-111, 113, 117
Axelrod, S. 119, 132
B
Badcock, D. 44, 56, 62, 66, 263, 297, 299
Baker, C.L. 29, 34, 59, 67, 173, 221
Balogh, D.W. 61-62, 71
Banks, M. 81, 83-85, 89, 94-95, 101, 105, 107, 110-111, 113, 117
Banta, A. 41, 62
Barlow, H.B. 60, 62
Bat, B. 331
Battaglia, F. 59
Baylor, D.A. 18, 20, 30, 35, 258
Benton, R. 263, 297
Bergen, J.R. 47, 77, 121, 124, 132, 211, 218-219, 221, 223-224
Black, J.E. 3, 40, 64, 106, 138, 159, 244, 290, 299, 320
Blake, R. 15, 21, 30, 38, 43, 47-49, 62, 64, 72, 87-88, 90-91, 93-94, 100, 108, 110-111, 114
Blakemore, C.B. 15, 21, 30, 38, 47, 49, 62, 87-88, 90-91, 93-94, 100, 108, 110-111, 114, 167, 221
Bobak, P. 317-319
Bodis-Wollner, I. 304, 309-310, 315, 318-322, 324
Bologna, N.B. 272, 296, 301
Bolz, J. 39, 51, 63
Bornstein, M. 81, 101, 111, 117
Bouman, M.A. 24, 33, 76
Bowen, R.W. 55, 63, 287, 297
Bowmaker, J.K. 18, 20, 30
Boycott, B.B. 10-11, 30, 57, 77
Boynton, R.M. 19, 29-30, 34-35, 101, 111, 227, 244, 249, 251, 254, 257
Braddick, O. 28, 81, 84-85, 93, 98, 110-111, 113, 117, 173, 221
Brannan, J.R. 67, 74, 77, 119, 121-122, 124, 128, 130-132, 134, 137, 159, 165, 270, 272, 274, 279, 295-297, 301, 303, 321
Breitmeyer, B.G. 37, 39-44, 46, 55, 57, 59-60, 62-64, 68, 119-120, 132-133, 137, 139-140, 150, 157-159, 162, 264-266, 268, 278-279, 281-283, 286, 288, 292, 298, 301
Bridge, J. 43, 59, 63-64, 67, 76, 110, 165, 255, 257
Brindley, R. 227, 233-234, 254
Broadbent, D.E. 139-140, 162
Brown, J. 40, 64, 85, 92, 103, 111-112, 121, 132, 147-148, 162, 256, 263, 299
Browse, R. 195, 211, 222
Brussell, E.M. 43, 45, 61, 64
Burbeck, C. 38, 40, 42, 44-48, 64, 68, 264, 298
Burke, W. 49, 64
Burkhalter, A. 59, 64, 82, 101, 111
Burr, D.C. 47, 62, 84, 101, 112, 115
C
Cacciato, D. 55, 63, 297
Caelli, T. 221-222
Calis, A. 137, 162
Camisa, J. 318, 320
Campbell, F. 24-25, 30, 38, 47, 62, 64, 83-84, 86, 98, 108, 111-112, 114, 167, 221-222, 317, 320, 322, 326
Camras, C. 321
Carden, D. 19, 33
Casima, J. 43, 48, 62, 64
Cavanagh, P. 19, 28-30, 39, 52-56, 62, 64-65, 103, 115, 138, 155, 158, 162
Chan, C. 111, 121, 132
Chang, I. 170, 222
Charles, E.R. 26, 39, 51, 54, 65, 70, 73, 116, 163-164, 167, 223
Chiappa, K.H. 314, 320, 325
Church, K.L. 120, 134, 321
Cleland, B.G. 4, 9-10, 13, 30-31, 39, 47-48, 50, 65
Coblentz, W.W. 19-20, 31
Cohen, E. 195, 222
Colby, C.L. 23, 35, 54-57, 70, 73
Coppinger, C. 119, 132
Cowan, J.D. 174, 224
Cowey, A. 13, 34, 88, 116
Crassini, B. 121, 132
Crawford, B.H. 41, 45, 55, 65
Critchely, M. 263, 272, 298
Crone, R. 20, 22, 31
C
Cacciato, D. 55, 63, 297
Caelli, T. 221-222
Calis, A. 137, 162
Camisa, J. 318, 320
Campbell, F. 24-25, 30, 38, 47, 62, 64, 83-84, 86, 98, 108, 111-112, 114, 167, 221-222, 317, 320, 322, 326
Camras, C. 321
Carden, D. 19, 33
Casima, J. 43, 48, 62, 64
Cavanagh, P. 19, 28-30, 39, 52-56, 62, 64-65, 103, 115, 138, 155, 158, 162
Chan, C. 111, 121, 132
Chang, I. 170, 222
Charles, E.R. 26, 39, 51, 54, 65, 70, 73, 116, 163-164, 167, 223
Chiappa, K.H. 314, 320, 325
Church, K.L. 120, 134, 321
Cleland, B.G. 4, 9-10, 13, 30-31, 39, 47-48, 50, 65
Coblentz, W.W. 19-20, 31
Cohen, E. 195, 222
Colby, C.L. 23, 35, 54-57, 70, 73
Coppinger, C. 119, 132
Cowan, J.D. 174, 224
Cowey, A. 13, 34, 88, 116
Crassini, B. 121, 132
Crawford, B.H. 41, 45, 55, 65
Critchely, M. 263, 272, 298
Crone, R. 20, 22, 31

D
Darcey, D. 305, 320
Dartnall, H.J.A. 18, 30, 237, 255, 258
Daw, N. 10, 31, 45, 65, 75, 308, 314, 320, 322
De Lange, H. 38, 65
De Vries, R. 99, 112
De Weert, C. 24, 33, 55, 66
Derrington, A.M. 4, 13, 18, 20-21, 23, 25, 31, 38, 42-46, 49, 51, 54-56, 66, 92, 94, 112, 138, 159, 162, 249, 252, 255, 306, 320
Dev 169, 222
DeValois, R. 3-4, 14, 20, 22-25, 27, 31, 35, 93, 112, 164, 252, 255
DeYoe, E.A. 27, 31, 37, 39, 52-53, 56, 58, 66, 167, 222
Dieterici, L. 234
Dimmick, D. 247-248, 255
Dobson, V. 110-111, 116
Domenici, L. 310, 320
Donders, F.C. 245, 255
Dow, N. 307
Dreher, B. 13, 15, 34, 37, 46, 49, 51-52, 54, 57, 59, 66, 70, 72, 75, 138, 162, 286, 288, 298, 300
Drucker, T. 88, 92, 112
Dubin, M.W. 4, 30, 65, 100, 116
Dunlap, W. 274, 299
Dunn, K. 41, 63

E
Eisenman, D. 112
Elliott, L. 119-121, 132-133
Emerson, W. 19-20, 31
Enoch, J. 45, 61, 66
Enroth-Cugell, C. 4-5, 18, 32, 35, 38, 45, 50, 66, 68, 120, 307, 320
Eriksen, C.W. 119-120, 132-133
Eskew, R.T. 29, 32
Eskin, T.A. 26, 34, 51, 71, 87, 89, 115, 307, 320, 323
Essock, E.A. 43, 46, 66
Estevez, O. 20, 32, 259, 325

F
Farah, M.J. 58, 66
Faucheux, A. 287, 290, 292, 294, 301
Favreau, O. 28-30, 39, 56, 65, 138, 162
Fender, R. 168-169, 222
Ferrera, V.P. 47, 67
Fiorentini, A. 81, 85, 89, 91-92, 94, 101, 107-108, 112, 115-116, 320
Fletcher, J. 263, 298, 301
Flitcroft, D.I. 28, 32
Fogel, D. 222
Foster, S. 93, 113, 287, 298
Friedman, R. 233
Fukada, Y. 10, 15, 32, 138, 162
Fulton, A. 91-92, 103, 113

G
Gambi, D. 314, 320, 324
Ganz, L. 37-41, 43, 46, 57, 63, 74, 120, 132, 137, 139-140, 157, 162, 257, 264-266, 278-279, 283, 292, 298, 322
Gardner, T. 133
Gaska, A. 113
Geisler, W. 95, 113
Gelade, G. 218-219, 223
Ghilardi, M.F. 61, 67, 303, 306, 312, 315, 318, 321, 324
Gibson, E. 179, 222
Gielen, C.C.A.M. 26, 32
Gilbert, F. 62, 64, 101, 113
Glover, A. 321
Glovinsky, Y. 318, 321
Goldberg, M.E. 37, 58, 64, 67, 73, 321
Gordon, J. 30, 32, 101, 110-111
Gorea, A. 40, 56, 67, 71
Gormican 195, 211, 223
Gouras, P. 3-4, 12, 14, 23, 31-32, 36
Graham, N. 25, 32, 141-142, 144-145, 150-151, 153, 161-162, 167, 222, 227, 255
Green, M. 179, 222
Gregory, R. 28, 32, 34, 55-56, 67, 72, 164
Grossberg, S. 141, 195, 206, 220, 222
Grzywacz, F. 190, 222
Gurnsey 195, 211, 222
Guth, S.L. 244, 246, 248, 255
Gwiazda, J. 81, 95, 97, 111, 113-114

H
Halevy, H. 253, 259
Hall, S. 263, 300
Halliday, A.M. 304-306, 309, 314, 319, 321-323
Hansen, A. 91-92, 103, 113, 117
Harris, J. 32, 35, 38, 67, 75, 107, 113, 146, 165, 289, 301
Hartmann, E.E. 89, 94, 111, 113, 116
Harwerth, R.S. 19, 35, 40, 48-49, 63, 67, 298
Hawken, M.J. 26, 30, 32-33, 87, 108
Hayhoe, M. 29-30
Held, R. 37, 58, 95, 97-98, 111, 113, 116, 152
Helmholtz, H. 222
Hendrickson, D. 82, 88, 92, 110, 112, 114, 118, 304, 318
Hepler, N.K. 31
Hering, E. 244-245, 247-248, 255
Hess, R.F. 47, 59, 67, 89, 116, 315, 318, 321
Hickey, L. 82-83, 97, 114
Hicks, T.P. 18, 32-33, 49, 51, 54, 67
Hildreth, E. 184, 190, 222
Hochberg, J. 139, 162
Hochstein, S. 6-9, 14, 30, 32, 48, 67
Hokoc, J.N. 308
Hood, D. 252, 259
Horton, J.C. 27, 33, 83, 114
Hubel, D.H. 4, 12, 14, 20, 22-24, 26-28, 33-34, 36-37, 39, 51-55, 57, 70, 77, 82, 98, 109, 114, 117, 120, 133, 138, 150, 155, 158-160, 163, 252, 259, 264-265, 286, 299, 305, 322
Hull, E.M. 22, 31
Humphrey, N.K. 37, 68
Huntington, M. 119, 133
Hurlbert, H. 167, 223
Hurvich, L. 3, 32, 77, 244, 246, 248, 255-256, 301
Hutman, L. 121, 134
I
Ikeda, M. 50, 121, 133, 248, 254, 256, 284, 298, 320
Illing, R.B. 11, 33, 111, 299
Ingle, D. 37, 46, 58, 64, 68, 76, 123, 125, 245
Ingling, C.R. 4, 25, 33, 55, 57, 68, 155, 158, 162, 246, 252, 256

J
Jacobs, G.H. 14, 27, 31, 35, 87, 114, 255
Jameson, D. 3, 32, 77, 244, 246, 248, 255-256, 301
Jansson, D. 179, 222
Jensen, J. 308, 314, 320, 322
Johansson, D. 179, 222-223
Johnston, R. 282
Jonides, J. 57, 68, 77
Judd, N. 238, 243, 246, 254, 256
Julesz, B. 39, 42-43, 56, 59, 63, 68, 71, 137-140, 162-163, 168-170, 195, 200, 202, 206, 211, 218-219, 221-224
Jung, R. 38, 68
K
Kahneman, D. 43, 55, 68, 77
Kanizsa, K. 198, 223
Kaplan, E. 4, 12-18, 20-26, 30, 33-35, 49-51, 65, 68, 74, 84, 89, 92, 107, 114, 164, 326
Karis, A. 119
Kaufman, L. 315-316, 322
Kelly, D. 4, 24-25, 29, 33, 38, 40, 42, 44-48, 57, 64, 68-69, 120, 134, 158, 163
King-Smith, P.E. 42, 46, 69, 163
Klein, S. 44, 70, 75, 114
Kline, S. 119-121, 132-133
Klymenko, V. 147-148, 152-154, 163
Koffka, K. 137-139, 163
Kolb, P. 307
Kolers, O. 279
Komatsu, J. 59, 69
Konig, K. 234, 240, 242
Krantz, I. 233
Kratz, K.E. 69, 74
Krauskopf, J. 4, 13, 27, 31, 51, 66, 243, 249, 252-257, 318, 320
Kroger-Paulus, A. 34
Kropfl, W. 59, 63, 68
Krueger, J. 51, 54, 69
Krose, J. 211, 218, 223
Kulikowski, J. 38-39, 42-43, 46-47, 64, 69, 71, 98, 112, 139, 150, 152, 154, 157, 163, 264, 270, 299, 315, 322
Kupersmith, M.J. 304, 309, 318, 322

L
Lartigue, E.K. 274, 301
Leclerc, Y. 55, 65
LeCluyse, K. 60, 77, 282, 284-285, 287, 290, 292, 294, 301-302
Lee, B.B. 4, 18, 22, 32-33, 45-46, 48-49, 54, 65, 67, 69, 108, 112, 114, 137, 162-164
Leeuwenberg, L. 137, 162-163
Legge, G. 39-40, 47, 69
Lehmkuhle, S. 43, 45-46, 48-49, 51, 66-67, 69, 74, 157, 164
Lennie, P. 4, 10, 13, 16, 18, 25, 27, 31-33, 41-43, 45, 48-51, 66, 69-70, 74, 76, 89, 92, 112, 114, 138, 159, 162, 167, 223, 252-253, 255, 257, 264, 299, 306-307, 320
Lesevre, N. 305-306, 315, 323
LeVay, K. 109, 114
Leventhal, A.G. 10, 13-14, 34, 37, 52, 70, 75, 300, 320
Levi, D.M. 4, 10, 13, 20, 30-31, 39-40, 42, 48, 50, 59, 63-65, 70, 76, 95, 114, 298, 321
Levick, W. 4, 10, 13, 30-31, 39, 48, 50, 59, 64-65, 70, 76
Lewis, S. 64, 115
Livingstone, M.S. 4, 20, 22, 27-28, 33-34, 37, 39, 51-55, 57, 70, 109, 114, 120, 133, 138, 150, 155, 158-160, 163, 167, 223, 264-265, 286, 299, 321
Logothetis, N.K. 26, 39, 51, 53-56, 65, 70, 73, 116, 138, 158-159, 163-164, 167, 223
Love, R. 55, 63
Lovegrove, W. 60, 70, 77, 263, 267-272, 286, 292, 294-297, 299-301, 315-316, 323
Lund, J.S. 26, 33, 74
Lustgarten, J. 318, 323
M
MacAdam, D.L. 227, 257
MacGrath, C. 121, 133
MacLeod, D. 19, 29-30, 34, 243, 249, 251, 257, 259
MacVeigh, D. 119
Maffei, L. 83-84, 86, 108, 112, 114-116, 319-320
Maguire, W. 77, 137, 146, 152, 156-157, 161, 163-165, 265, 267, 300
Maier, J. 111, 305, 323
Malpeli, J.G. 4, 10, 12-15, 29, 35, 49-51, 54, 56, 70, 73, 300
Mariani, A.P. 307-308, 312, 322-323
Marr, D. 169, 181, 223
Marx, M. 318-319, 321, 323
Matin, E. 41, 55, 60, 70-71, 115, 266, 278-279, 283, 292, 299
Matthews, M. 282
Maunsell, J.H.R. 37-38, 49-51, 59, 71, 73, 167, 224, 264, 299
Maurer, W. 98, 103, 110, 115, 323
Maxwell, J.C. 232
May, J.G. 55
McClelland, C. 167, 223, 282
McCulloch, R. 112
McFarland, B. 119, 132-133
McFarlane, E. 47, 77
McKee, S. 140, 165
Mead, W.R. 31
Merigan, W. 12, 26, 34, 51, 71, 87, 89, 107, 109, 115, 159, 164, 307, 320, 323
Merritt, R.D. 61-62, 71
Michelson, A.A. 16, 34
Mingolla, M. 195, 206, 220, 222
Misiak, I. 119, 133
Mohn, M. 83, 88, 115
Molinet, K. 301
Mollon, J. 18, 30, 36, 62, 66, 107, 115, 244, 253, 257-259
Morgan, H.C. 22, 24, 31
Morrison, J. 117, 121, 133
Morrone, C. 84, 100-101, 104-105, 107, 111-112, 115
Moskowitz, S. 90, 101, 115, 317, 325
Mountcastle, V.B. 58, 71, 163
Movshon, A. 45-46, 75, 85, 95, 114-115, 170, 221
Mullen, K. 24-25, 29, 34, 105, 115
Mylin, L. 319-320

N
Nachmias, N. 40, 164-165, 167, 222-223, 264, 301
Nagle, M. 64, 113
Nagler, D. 113
Nagy, A.L. 237, 248, 257
Naka, V.I. 233, 257
Nakayama, K. 146, 164
Nelson, R. 169, 223
Newsome, W. 109, 115
Newton, I. 231
Nimeroff, I. 243
Norcia, A.M. 84-85, 89, 105, 107, 110, 115-116
Nuding, S.C. 120, 134
Nunn, B. 30
O
O'Connell, M. 179, 224
Oehler, R. 13, 34
Onofrj, M. 67, 303-305, 309-310, 315-317, 320-321, 324
Orme-Rogers, C. 132-133
Owsley, C. 119, 121, 133-134
Ozog, R. 41, 77, 120, 134, 139, 165, 264, 283, 301

P
Packer, O. 103, 116
Pantle, A.J. 38, 43, 71, 157, 164
Papathomas, T.V. 56, 67, 71
Parker, A.J. 26, 32-33, 40, 72, 315-316, 324
Pasternak, T. 12, 34
Patel, A.S. 45, 72
Paulus, W. 24, 34
Pearlman, A.L. 10, 31
Pentland, R. 164
Perry, V.H. 4, 9, 13-15
Petersen, S.E. 51, 58, 72
Pettigrew, J.D. 59, 64, 72, 76
Phillips, G. 47, 77, 121, 132, 161, 171, 173, 176, 181, 184, 188, 190, 224, 300
Piccolino, M. 314, 320, 322, 324
Pirchio, P. 83-84, 89, 107, 112, 116-117
Pitt, P. 240
Plant, G.T. 47, 67, 89, 116, 304, 314-316, 318, 321, 324-325
Podos, S. 319, 323
Poggio, T. 93, 116, 140, 169, 181, 223-224
Pokorny, J. 18-19, 35, 55, 63, 227, 231, 237, 242-243, 248-249, 255, 257-259, 297
Pollen, D. 113
Polson, M.C. 22, 31
Pomerantz, J. 164, 217-218, 223
Posner, M. 58, 72, 223
Powers, S. 92, 116
Previc, F. 37, 59, 72, 306, 315-316, 325
Pugh, V. 242, 244, 254, 258
Purcell, S. 279
Purpura, K. 16, 20, 26, 30, 33-34

Q
Quick, R.F. 124, 133
Quigley, H.A. 304, 318, 321, 325
R
Ramachandran, V.S. 28, 34, 52-56, 72, 164
Rashbass, C. 121, 133
Raymond, J.E. 46, 72
Rayner, K. 265, 298, 300
Reeves, A. 55, 72
Regan, D. 43, 45, 61, 73, 115, 304, 306, 309, 314-315, 317-318, 324-325
Resnikoff, R. 233
Robson, J. 4-6, 8, 10, 17-18, 24-25, 30, 32, 34, 38, 43, 46-47, 50, 66, 73, 77, 83, 85, 112, 116, 120, 133, 164-165, 167, 222-223, 307, 320
Rodieck, R.W. 5, 13, 15, 22, 34, 49, 52, 66, 70, 73, 138, 162, 227, 249, 258, 286, 288, 298
Rogers, D. 56, 72, 119, 132-133
Rosner, G. 63, 279, 298
Ross, J. 121, 134
Rubenstein, D. 223
Rubin, E. 137, 139-141, 147, 164
Rudd, M. 41, 63
Rumelhart, R. 167, 223
Rushton, W.A.H. 233, 243, 257-258
S
Saccuzzo, D.P. 61, 71, 73
Sachs, O. 167, 223
Sagi, D. 222-223
Saito, H. 50, 54, 73
Salapatek, P. 83-84, 89, 110, 114, 117
Sandini, N. 112, 117
Sartucci, F. 321
Sato, T. 56, 73
Satz, S. 263, 298, 301
Saucer, R.T. 38, 73
Schein, S.J. 23, 31, 51, 57, 66
Schieber, F. 119-121, 131, 133-134
Schiller, P. 4, 10, 12-15, 23, 26, 29, 35, 37-39, 49-51, 53-57, 70-71, 73, 87-88, 107, 109, 116, 158, 163-164, 167, 223, 286, 300
Schnapf, J.L. 18, 30, 35, 237, 243, 258
Schneider, G.E. 37, 58, 74, 321
Schubert, D.L. 61, 73
Schumann, R. 198, 223
Schwartz, S.H. 40, 74, 88, 116
Schweitzer-Tong, D.E. 66
Scobey, R.P. 47, 74
Scott, S. 55, 63
Sekuler, R. 38, 42, 70-71, 74, 121, 132-134, 157, 170-171, 173, 224
Sestokas, S. 51, 74, 307
Sevdalis, E. 44, 62
Shapley, R. 3-4, 6-10, 12-18, 20-26, 32-35, 37, 45, 48-51, 65, 67-68, 74, 76, 92, 114, 120, 134, 158, 164, 307
Shimojo, E. 95, 111, 113, 116, 146, 164
Shipp, S. 27, 35, 37, 74
Shulman, R. 140, 164
Siemsen, D. 121, 133
Silverman, M.S. 20, 27, 35, 45, 61, 74-76, 164
Simonsen, S. 119, 133
Singer, T. 41, 50, 58, 60, 74, 265, 300
Slaghuis, W. 263, 267-268, 270-271, 286, 299-300
Sloane, M. 119, 121, 131, 133-134
Smith, V.C. 18-19, 33, 35, 42, 46, 48, 66-67, 69, 71, 97, 117, 163, 231, 237, 242-243, 249, 255, 257-259, 304, 309, 318, 322
Snodderly, D.M. 22, 24, 31
Sokol, S. 86, 89-90, 101, 115, 117, 317, 325
Soodak, R. 15, 35, 49, 74, 164
Spekreijse, H. 20, 32, 112, 304, 317, 323, 325
Sperling, H.G. 19, 28, 35
Spinelli, D. 88, 112, 116-117
Srebro, R. 243
Stanley, R. 263, 300
Stewart, T. 279
Stiles, W.S. 227, 229, 231, 237, 243, 254, 258
Stone, J. 4, 10-11, 13, 37-38, 41, 43-44, 46-50, 68, 73, 75, 264, 300
Stromeyer, C.F. 35, 44, 47, 75
Sturr, J.F. 120-121, 131, 134
Switkes, E. 20, 25, 27-29, 31, 35, 51, 75-76, 164
Szog, R. 120, 134
T
Taub, H. 120, 134
Teller, D. 48-49, 75, 81, 87-88, 101, 103, 111-112, 116-117, 161
Thompson, R. 119, 121, 132-133
Todd, J. 57, 75
Todorovic, G. 195, 222
Tolhurst, W. 39-40, 42-43, 45-46, 69, 75, 139, 150, 152, 154, 157, 163, 264, 270, 299-300
Tootell, R.B.H. 20, 27-28, 35, 38, 51-52, 75-76, 138, 159, 164
Treisman, A. 195, 200, 206, 211, 218-219, 223-224
Trevarthen, C. 4, 33, 37, 58-59, 76, 114
Trick, G. 45, 74, 110
Trimarchi, T. 89, 91, 112, 320
Troscianko, T. 35, 72
Troy, J. 10, 45, 48, 51, 76
Tsou, B.H.P. 33
Tulunay-Keesey, U. 38, 40, 42, 63, 76, 306
Tyler, C.W. 28, 30, 39, 56, 65, 115-116, 138, 162
U
Ullman, L. 170, 190, 224
Uttal, W.R. 48, 76

V
Valberg, A. 4, 22, 33, 46, 54, 65, 69, 112-114, 281, 298
Van der Horst, G.J.C. 24, 33
Van Essen, D.C. 4, 27, 31, 33, 37, 39, 52-53, 56-59, 64, 66, 71, 76, 114, 167, 222, 224
Van Gisbergen, J.A.M. 32
Van Orden, K. 120, 134
Vellutino, F. 263, 298, 300-301
Vendrik, A.J.H. 32
Victor, J. 8-10, 30, 45, 50, 74, 76, 307, 315, 325
Vidyasagar, T. 18, 32-33, 49, 56, 67, 72
Vital-Durand, F. 49, 62, 87-88, 90-91, 93-94, 100, 111
Von Kries, D. 233
Voorhees, S. 224
Vos, J. 237, 242, 259
W
Wagner, G. 19, 35
Wallach, E. 179, 224
Walraven, L. 242-243
Walters, O. 218, 224
Warren, E. 119
Wassle, H. 4, 10-11, 30, 33
Watson, A. 40, 43, 46-47, 66, 77, 141-142, 150, 164-165, 264, 301
Wattam-Bell, J. 99, 110-111, 117
Weale, R.A. 121, 126, 134
Weisstein, N. 37, 41, 55, 77, 120, 134, 137, 139, 141, 146-149, 152-154, 156-157, 159, 161-163, 165-166, 264-265, 272, 278-279, 282-285, 289, 299, 301-302
Wepman, B. 55, 63
Westheimer, G. 95, 117, 140, 165
Wheeler, G. 282
Whitaker, H. 119, 121, 133
Wiesel, T.N. 12, 14, 23-24, 26, 36, 51, 54, 57, 77, 98, 114, 117, 252, 259, 305, 322
Williams, M.C. 55, 57, 60, 63-64, 77, 111, 131, 134, 146, 159, 162, 243, 254, 256, 259, 263, 270, 272, 274, 276-277, 279, 282, 284-290, 292, 294-299, 301-302
Williams, D. 167, 170-171, 173, 176, 181, 184, 188, 190, 195, 200, 202, 206, 224
Wilson, H. 39, 46-48, 67, 74, 77, 85, 94, 117, 121-122, 124, 132, 140, 164, 173-174, 224
Wong, E. 137, 141, 147-149, 152, 157, 165-166
Wright, D. 229, 231, 237, 240, 242, 244, 258-259, 284, 298, 308, 326
Wurtz, F. 37, 57, 59, 62, 69, 77, 115
Y
Yantis, S. 57, 68, 77
Young, E. 227, 233, 257, 259, 325
Yund, E.W. 31
Z
Zaidi, Q. 227, 231, 242, 252-253, 256, 258-259
Zeki, S. 27, 35, 37, 39, 70, 74, 78, 160, 166-167, 224
Zihl, J. 58-59, 67, 74, 78
Zrenner, E. 23, 36
Subject Index

A
Acrylamide 26, 307, 320
Alpha cells 10, 77, 88
Apparent motion 28, 65, 67, 70, 72, 75, 103, 164
Attention 57-58, 62, 64, 66, 68, 71-74, 77-78, 105, 139, 151, 161, 164, 272, 279, 295-297

B
Beta cells 10, 88
Blindsight 43

C
Closure 195, 198, 200, 202-203, 206, 221
Contrast 4, 6-12, 14-18, 20, 22, 24-35, 38-41, 43-47, 49, 51, 55-57, 59-60, 62-64, 66-69, 71-75, 77, 83-94, 100-101, 103-113, 115-117, 119-125, 131-134, 138-139, 141, 143-144, 146-148, 150, 152-159, 161-162, 164-165, 253, 255, 264, 268-271, 274, 277-279, 281-287
Contrast gain 14-18, 26, 34, 64, 94, 157
Contrast sensitivity 11-12, 22, 24-26, 29, 31-35, 45-47, 51, 66, 73-74, 83-85, 87, 89, 91-92, 104-111, 113, 115-116, 119, 121, 123-124, 131-134, 141, 146, 158-159, 164, 264, 268-271, 287, 295-297, 299, 304, 306-307, 309-310, 315, 317-320, 322-323, 325
Cooperative 132, 165
Cooperativity 168-169, 174, 221

D
Depth 168
Deuteranope 101, 239-240, 242-243, 254
Disparity 168
DOG 122-125, 127
Dopamine 303, 307-314, 319-324
Dyslexia 75, 297-298, 300-301

E
Equiluminance 20, 22-23, 25, 28, 30, 62
Excitation 16, 24, 34, 131-132, 240, 249, 251, 253, 257, 298

F
Facilitation 29, 45, 123-124
Figure/ground 137-139, 141, 146-152, 154-161
Fill-in 195, 206, 217-218, 220-221
Fundamental 7-8, 15, 18, 26, 43, 139, 148, 156, 160-161, 240, 243, 245, 254, 258-259, 295, 304, 316

G
Gamma 11
Gestalt 52, 55, 139, 163, 168
Glaucoma 43, 61, 74, 303-304, 307, 318-319, 321, 323, 325
H
Horseradish peroxidase 11, 13
Hyperacuity 95, 113, 133
Hysteresis 168-169, 171, 173-174, 176-178, 183-184, 188, 190-191, 193, 221, 224

I
Ibotenic acid 26, 115
Impulse response 121-122, 124, 130-131, 143-144
Inhibition 16, 24, 41, 60, 63-64, 85, 100-101, 111, 115, 124, 131-132, 244, 253, 266-268, 276-279, 281, 284, 287, 292, 294

M
M cells 4, 13-16, 18, 22-24, 26, 48-49, 51, 54-55, 82, 87-89, 94, 107-108, 307
Magnocellular 4, 11, 13-18, 21-29, 35, 37, 49, 68, 71, 81-82, 87-88, 92, 108-109, 137-139, 142, 157-159, 161-162, 167, 264, 303, 306-307, 318
Maltese cross 137-138, 147
Masking 29, 31, 35, 39-41, 43-45, 47, 55, 60-63, 67-68, 71, 73, 75, 77, 93-94, 98, 100-101, 114, 132, 159, 162, 165, 268, 274, 276-279, 281-286, 288-290, 292, 294, 296-301, 315
Mesopic 102, 108, 111
Metacontrast 41, 55, 57, 60, 62-64, 68, 73, 75, 77, 134, 159, 162, 165, 277-279, 281-289, 294, 296-299, 301-302
Miosis 121, 126, 131, 134
MPTP 311-312, 314, 321, 323
Multiple sclerosis 43, 61, 303-304, 306, 314-325
Myelin 82, 115, 306, 320, 323

N
N70 303-306, 308, 315-318, 321
Null position 6-7, 9

O
Ocular dominance 82-83, 114
Ocular hypertension 43, 61, 303, 307
Optic neuritis 61, 116, 314-315, 318, 321-322, 324-326
Optokinetic nystagmus 99, 103, 110
Orientation 27, 32, 38, 40, 42, 47, 56, 59, 62, 64, 71, 77, 86, 97-98, 100-101, 104, 107, 109-112, 114-115, 117, 142, 144, 161, 270, 299, 317-320, 322-323, 325

P
P cells 4, 13-16, 18, 24-26, 49-51, 54-55, 82, 87-88, 94, 307
P100 303-306, 309-311, 314-318, 321
Parkinson's disease 303-304, 308-309, 311-312, 319-322, 325
Parvocellular 4, 10-18, 23-30, 37, 49, 68, 81-82, 87-88, 92, 108-109, 137-139, 142, 157-159, 161-162, 167, 264, 306-308, 314, 318
Perceptual grouping 272, 295, 301
PERG 85, 89-91, 310
Perimetry 66, 87, 116, 315, 318, 323
Phasic 12-13, 69, 87, 114, 121, 124, 144
Photopic luminosity 19-22
Photoreceptors 5, 7, 9, 18, 20, 35, 81, 91-92, 95, 105, 107, 258, 307
Preferential looking 83, 88, 99, 102, 117
Protanope 101, 239-240, 242-243

R
Random dot cinematogram 28, 131, 170-171, 184, 186
Random dot stereograms 168
Reading 60, 62, 70, 75, 77, 164, 257, 263-269, 271-272, 279, 281, 287, 290-301, 306
  reading disability 60, 62, 70, 77, 263-264, 267-268, 272, 295, 297, 299-301
Rigidity 179, 181, 184, 190
Rubin figure 141

S
6-Hydroxydopamine 312, 321
Saccade 57, 263, 265-266, 279, 281, 292
Schizophrenia 62, 71, 73, 303
Second harmonic 7-8, 14, 23, 104, 108
Sinemet 309, 312
Specific reading disability 62, 70, 77, 263, 268, 299-300
Stereoacuity 95-97, 140
Stereopsis 33, 52-53, 55, 66, 95, 97, 109, 168-170, 176, 181
Summation
  probability summation 25-26, 121-122, 124
  spatial summation 5, 8, 14, 16, 35, 49-50, 66, 74, 87-88, 113, 164, 314
  temporal summation 40, 48, 92, 131, 133-134
Superposition 5, 230-231, 244
Sustained 9, 12, 15-16, 29-30, 38-51, 53, 55, 60-61, 63-65, 67, 69-70, 72, 74-76, 87, 120-121, 132-133, 141, 146, 157-158, 162-163, 264-268, 270-271, 274, 276-279, 281, 283-289, 292, 296-299
Systems analysis 5, 9

T
Texture 167, 193, 195, 197, 200-201, 206-208, 211, 216-218, 220-224
Tonic 12-13, 33, 36, 38, 40, 51, 57, 71, 150, 152, 154-155, 157, 267-268
Transient 9, 12, 16, 30, 38-51, 53, 55, 57, 60-61, 63-65, 67, 69-72, 74-77, 90, 120-121, 132-134, 142, 150, 157, 159, 162-163, 253, 264-268, 270-272, 274, 277-279, 281-290, 292, 294-299, 303, 308, 323-325
Trichromacy 228-229, 232, 234-235, 237, 257
Tritanope 239-240, 242-243

U
Ubin figure 141

V
VEP 83-86, 89-94, 97-101, 104-106, 108-110, 112, 117, 303-319, 322-326
Vernier acuity 63, 95-97, 109, 114, 116
Visual search 77, 160, 274, 301

W
W cells 10-11

X
X cells 4-11, 14-16, 45, 48-49, 56, 60, 111, 307
Y
Y cells 4-7, 9-12, 14-16, 33, 35, 41, 45, 48-51, 57, 60-61, 67-68, 74, 107, 114, 164, 307